Python Module - jsonschema Part 3
Posted on Mar 28, 2018 in Python Module/Package Recommendations by Amo Chen ‐ 4 min read
This post is part of a series on the Python module jsonschema.
In Part 2 of the Python Module - jsonschema, we covered the more complex usage of types like number, string, array, and object. However, most examples focused on validating individual data types, while in practice JSON data often mixes several data types. For example, an array might contain object elements, and an object might include nested objects. Consider the following JSON data:
[
    {
        "user_id": 1,
        "preference": {
            "cooking": true,
            "fishing": false
        }
    },
    {
        "user_id": 1,
        "preference": {
            "cooking": true,
            "fishing": false
        }
    }
]
This post shows how to write JSON Schemas that are more practical for real-world use and easier to maintain.
definitions and $ref
Different JSON documents often share common structures. JSON Schema offers the definitions keyword to keep these shared structures in one place, so that any schema that needs them can reference them via the $ref keyword.
For example, placing the definition of preference under definitions and referencing it with $ref simplifies the schema:
schema = {
    'definitions': {
        'preference': {
            'type': 'object',
            'properties': {
                'cooking': {
                    'type': 'boolean',
                },
                'fishing': {
                    'type': 'boolean',
                },
            },
            'additionalProperties': False,
        }
    },
    'type': 'array',
    'items': {
        'type': 'object',
        'properties': {
            'user_id': {
                'type': 'integer',
            },
            'preference': {
                '$ref': '#/definitions/preference',
            },
        }
    }
}
The $ref keyword uses #/definitions/preference to reference the preference entry under definitions in the current JSON Schema document. Each level is separated by a slash (/).
The pound symbol (#) refers to the current document, and the slash-separated keys that follow traverse the keys of the document's objects.
P.S. This referencing method is known as JSON Pointer.
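To make the lookup concrete, here is a minimal sketch of how a reference such as #/definitions/preference can be followed by hand. The resolve_pointer helper is hypothetical and written only for illustration; it is not part of jsonschema, which performs its own reference resolution internally. It handles same-document references only and skips the percent-decoding that a full implementation would need.

```python
def resolve_pointer(document, ref):
    """Follow a same-document reference such as '#/definitions/preference'.

    Hypothetical helper for illustration; jsonschema resolves references
    on its own and does not expose this function.
    """
    if not ref.startswith('#'):
        raise ValueError('only same-document references are handled here')
    node = document
    pointer = ref[1:]
    if pointer == '':
        # A bare '#' points at the whole document
        return node
    for token in pointer.split('/')[1:]:
        # JSON Pointer escapes: '~1' stands for '/', '~0' stands for '~'
        token = token.replace('~1', '/').replace('~0', '~')
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


schema = {
    'definitions': {
        'preference': {
            'type': 'object',
            'properties': {
                'cooking': {'type': 'boolean'},
                'fishing': {'type': 'boolean'},
            },
            'additionalProperties': False,
        }
    },
    'type': 'array',
    'items': {
        'type': 'object',
        'properties': {
            'user_id': {'type': 'integer'},
            'preference': {'$ref': '#/definitions/preference'},
        },
    },
}

ref = schema['items']['properties']['preference']['$ref']
target = resolve_pointer(schema, ref)
# The reference leads to the very object stored under 'definitions'
print(target is schema['definitions']['preference'])
```

Each token after the # simply selects one key (or list index) deeper in the document, which is all that "each level is separated by a slash" means in practice.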
Mixing Object Within Object
Once you master definitions and $ref, you can easily combine various JSON structures. Here's an example where an object contains another object:
user_schema = {
    'definitions': {
        'meta': {
            'type': 'object',
            'properties': {
                'gender': {
                    'type': 'string',
                },
                'twitter': {
                    'type': 'string',
                },
            },
            'additionalProperties': False,
        }
    },
    'type': 'object',
    'properties': {
        'username': {
            'type': 'string',
        },
        'password': {
            'type': 'string',
        },
        'meta': {
            '$ref': '#/definitions/meta',
        }
    },
    'additionalProperties': False,
}
The example data to validate is shown below, where meta contains another object:
{
    'username': 'foo',
    'password': 'bar',
    'meta': {
        'gender': 'male',
        'twitter': 'foobar'
    }
}
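As a rough sketch of how this schema behaves, the snippet below validates one matching user and one user whose meta carries an extra key. The is_valid helper and the good_user and bad_user names are assumptions added for illustration; only jsonschema.validate and jsonschema.ValidationError come from the library.

```python
import jsonschema

user_schema = {
    'definitions': {
        'meta': {
            'type': 'object',
            'properties': {
                'gender': {'type': 'string'},
                'twitter': {'type': 'string'},
            },
            'additionalProperties': False,
        }
    },
    'type': 'object',
    'properties': {
        'username': {'type': 'string'},
        'password': {'type': 'string'},
        'meta': {'$ref': '#/definitions/meta'},
    },
    'additionalProperties': False,
}


def is_valid(instance, schema):
    """Hypothetical convenience wrapper: True if validation passes."""
    try:
        jsonschema.validate(instance, schema)
    except jsonschema.ValidationError:
        return False
    return True


good_user = {
    'username': 'foo',
    'password': 'bar',
    'meta': {'gender': 'male', 'twitter': 'foobar'},
}

# 'age' is not listed in the meta definition, and
# 'additionalProperties': False forbids unlisted keys
bad_user = {
    'username': 'foo',
    'password': 'bar',
    'meta': {'gender': 'male', 'twitter': 'foobar', 'age': 30},
}

print(is_valid(good_user, user_schema))
print(is_valid(bad_user, user_schema))
```

Because the nested object is checked through $ref, the additionalProperties rule in the meta definition applies to the inner object independently of the same rule on the outer one.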
Mixing Array with Object
The combination of array and object is common in list-style data, such as search results, which may consist of an array containing multiple objects.
Here's an example simulating search result JSON data:
results = [
    {
        'title': 'foo',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        'keywords': ['a', 'b', 'c']
    },
    {
        'title': 'bar',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        'keywords': ['d', 'e', 'f']
    },
]
With the data structure known, you can describe it in a JSON Schema, placing the structure of search results and keywords in definitions:
schema = {
    'definitions': {
        'keywords': {
            'type': 'array',
            'items': {
                'type': 'string'
            }
        },
        'search_result': {
            'type': 'object',
            'properties': {
                'title': {
                    'type': 'string'
                },
                'link': {
                    'type': 'string'
                },
                'summary': {
                    'type': 'string'
                },
                'keywords': {
                    '$ref': '#/definitions/keywords'
                },
            }
        }
    },
    'type': 'array',
    'items': {
        '$ref': '#/definitions/search_result'
    }
}
The above JSON Schema has two separate definitions under definitions: keywords and search_result. The keywords property within each search_result refers back to the keywords entry in definitions.
The outermost 'type': 'array' and 'items': {'$ref': '#/definitions/search_result'} indicate that this JSON Schema represents an array of search_result elements.
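When validating a list like this, it helps to know which element failed. The following is a minimal sketch that collects every error together with its location in the data. It assumes Draft 4 semantics (hence Draft4Validator), and the non-string keyword in the second result is deliberately broken sample data made up for this illustration.

```python
from jsonschema import Draft4Validator

schema = {
    'definitions': {
        'keywords': {
            'type': 'array',
            'items': {'type': 'string'},
        },
        'search_result': {
            'type': 'object',
            'properties': {
                'title': {'type': 'string'},
                'link': {'type': 'string'},
                'summary': {'type': 'string'},
                'keywords': {'$ref': '#/definitions/keywords'},
            },
        },
    },
    'type': 'array',
    'items': {'$ref': '#/definitions/search_result'},
}

results = [
    {
        'title': 'foo',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        'keywords': ['a', 'b', 'c'],
    },
    {
        'title': 'bar',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        # Deliberately invalid: 5 is not a string
        'keywords': ['d', 5, 'f'],
    },
]

validator = Draft4Validator(schema)
# iter_errors yields every violation instead of stopping at the first one
errors = list(validator.iter_errors(results))
for error in errors:
    # absolute_path is the location of the offending value inside `results`
    print(list(error.absolute_path), error.message)
```

The reported path walks through the outer array, the keywords property, and the inner array, so the two $ref hops in the schema do not hide where the bad value sits in the data.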
Choosing Between Multiple Formats (oneOf)
JSON Schema provides keywords such as oneOf, anyOf, allOf, and not to flexibly validate JSON formats. Below is an example using the commonly used oneOf, which passes when the data matches exactly one of the listed schemas. This JSON Schema accepts either an integer or a string:
schema = {
    'oneOf': [
        {'type': 'string'},
        {'type': 'integer'},
    ]
}
Thus, any data that isn't a string or integer will not pass validation:
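import jsonschema

# passes: 123 is an integer
jsonschema.validate(123, schema)
# passes: '123' is a string
jsonschema.validate('123', schema)
# fails: a list matches neither schema, so ValidationError is raised
try:
    jsonschema.validate(['foo', 'bar'], schema)
except jsonschema.ValidationError as error:
    print(error.message)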
This is how oneOf is used; for the usage of anyOf, allOf, and not, refer to Combining schemas.
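One detail worth seeing in code is that oneOf requires exactly one of its schemas to match, while anyOf only requires at least one. The sketch below uses two overlapping schemas chosen purely for illustration (every integer is also a number); the is_valid helper is a hypothetical wrapper, not part of jsonschema.

```python
import jsonschema

# Overlapping alternatives: an integer matches both of them
one_of = {'oneOf': [{'type': 'number'}, {'type': 'integer'}]}
any_of = {'anyOf': [{'type': 'number'}, {'type': 'integer'}]}


def is_valid(instance, schema):
    """Hypothetical convenience wrapper: True if validation passes."""
    try:
        jsonschema.validate(instance, schema)
    except jsonschema.ValidationError:
        return False
    return True


# 1.5 matches only 'number', so exactly one alternative matches
print(is_valid(1.5, one_of))
# 1 matches both 'number' and 'integer', which oneOf rejects
print(is_valid(1, one_of))
# anyOf is satisfied as soon as at least one alternative matches
print(is_valid(1, any_of))
```

With non-overlapping alternatives, such as the string and integer pair used earlier, oneOf and anyOf accept the same data; the difference only shows up when one value can satisfy several alternatives at once.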
Conclusion
As JSON remains a mainstream data exchange format, JSON Schema's future looks promising. It's highly recommended to use JSON Schema for validating API payloads, which reduces the cost of writing data validation code by hand. Sharp-eyed readers might also notice how similar JSON Schema and OpenAPI (Swagger) look. This is because OpenAPI's schema definitions are based on JSON Schema, so mastering JSON Schema also makes OpenAPI easier to learn.
That’s all, Happy Coding!
References
http://json-schema.org/
https://spacetelescope.github.io/understanding-json-schema/index.html