Python Module - jsonschema Part 3

Posted on Mar 28, 2018 in Python Module/Package Recommendations by Amo Chen - 4 min read

This post is Part 3 of a series on the Python module jsonschema.

Part 2 of this series covered more advanced usage of the number, string, array, and object types. Most of those examples validated a single data type at a time, but in practice JSON data usually mixes several types: an array might contain objects, and an object might contain nested objects. Consider the following JSON data:

[
    {
        "user_id": 1,
        "preference": {
            "cooking": true,
            "fishing": false
        }
    },
    {
        "user_id": 1,
        "preference": {
            "cooking": true,
            "fishing": false
        }
    }
]

This post introduces ways to write JSON Schemas that are more practical for real-world use and easier to maintain.

definitions and $ref

The same structure often shows up in several places within JSON data. JSON Schema offers the definitions keyword to keep these shared structures in one place, so that any part of the schema that needs them can reference them with the $ref keyword.

For example, moving the definition of preference into definitions and referencing it with $ref keeps the main schema short:

schema = {
    'definitions': {
        'preference': {
            'type': 'object',
            'properties': {
                'cooking': {
                    'type': 'boolean',
                },
                'fishing': {
                    'type': 'boolean',
                },
            },
            'additionalProperties': False,
        }
    },
    'type': 'array',
    'items': {
        'type': 'object',
        'properties': {
            'user_id': {
                'type': 'integer',
            },
            'preference': {
                '$ref': '#/definitions/preference',
            },
        }
    }
}
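As a quick check, the schema can be exercised with jsonschema.validate. The sketch below repeats the schema in condensed form so that it runs on its own; the sample data and the names valid_users, invalid_users, and error_message are made up for illustration and are not part of the original example.

```python
import jsonschema

# Condensed copy of the schema above, repeated so the snippet is self-contained.
schema = {
    'definitions': {
        'preference': {
            'type': 'object',
            'properties': {
                'cooking': {'type': 'boolean'},
                'fishing': {'type': 'boolean'},
            },
            'additionalProperties': False,
        }
    },
    'type': 'array',
    'items': {
        'type': 'object',
        'properties': {
            'user_id': {'type': 'integer'},
            'preference': {'$ref': '#/definitions/preference'},
        }
    }
}

# Illustrative data: every preference follows the shared definition.
valid_users = [
    {'user_id': 1, 'preference': {'cooking': True, 'fishing': False}},
]
jsonschema.validate(valid_users, schema)  # raises nothing when the data is valid

# Illustrative data: 'cooking' is a string, so the referenced definition rejects it.
invalid_users = [
    {'user_id': 2, 'preference': {'cooking': 'yes', 'fishing': False}},
]
try:
    jsonschema.validate(invalid_users, schema)
    error_message = None
except jsonschema.ValidationError as error:
    error_message = error.message

print(error_message)
```

The rule for preference lives in one place, so changing it under definitions changes it for every element of the array.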

The $ref keyword uses #/definitions/preference to reference the preference under definitions in the current JSON Schema document. Each level is separated by a slash (/).

The pound symbol (#) refers to the root of the current document. Each slash-separated key after it steps one level deeper into the document’s nested objects.

P.S. This referencing method is known as JSON Pointer.
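To make the traversal concrete, here is a minimal sketch of how a same-document reference could be followed by hand. The helper resolve_pointer is hypothetical and written only for illustration; it is not how jsonschema resolves references internally, and it skips details such as percent-decoding of the URI fragment.

```python
def resolve_pointer(document, ref):
    """Follow a same-document reference such as '#/definitions/preference'."""
    if not ref.startswith('#'):
        raise ValueError('this sketch only handles references starting with #')
    node = document
    # Drop the leading '#', then split on '/'; the first piece is always empty.
    for token in ref[1:].split('/')[1:]:
        # JSON Pointer escapes: '~1' stands for '/', '~0' stands for '~'.
        token = token.replace('~1', '/').replace('~0', '~')
        if isinstance(node, list):
            node = node[int(token)]  # arrays are indexed by position
        else:
            node = node[token]       # objects are indexed by key
    return node


# A tiny illustrative schema, not the full one from this post.
schema = {
    'definitions': {
        'preference': {'type': 'object'},
    },
    'items': {'$ref': '#/definitions/preference'},
}

target = resolve_pointer(schema, schema['items']['$ref'])
print(target)
```

A bare '#' resolves to the whole document, which is why the pound symbol is described as referring to the current document.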

Mixing Object Within Object

Once you master definitions and $ref, you can easily combine various JSON structures. Here’s an example where an object contains another object:

user_schema = {
    'definitions': {
        'meta': {
            'type': 'object',
            'properties': {
                'gender': {
                    'type': 'string',
                },
                'twitter': {
                    'type': 'string',
                },
            },
            'additionalProperties': False,
        }
    },
    'type': 'object',
    'properties': {
        'username': {
            'type': 'string',
        },
        'password': {
            'type': 'string',
        },
        'meta': {
            '$ref': '#/definitions/meta',
        }
    },
    'additionalProperties': False,
}

Here is an example of data that passes validation, written as a Python dict, where meta contains another object:

{
    'username': 'foo',
    'password': 'bar',
    'meta': {
        'gender': 'male',
        'twitter': 'foobar'
    }
}
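The sketch below checks this data against a condensed copy of user_schema, and also shows how 'additionalProperties': False on the nested definition rejects an unexpected key. Using Draft4Validator is an assumption made here for illustration (the original post does not say which draft it targets), and the 'age' key is invented test data.

```python
import jsonschema

# Condensed copy of user_schema, repeated so the snippet is self-contained.
user_schema = {
    'definitions': {
        'meta': {
            'type': 'object',
            'properties': {
                'gender': {'type': 'string'},
                'twitter': {'type': 'string'},
            },
            'additionalProperties': False,
        }
    },
    'type': 'object',
    'properties': {
        'username': {'type': 'string'},
        'password': {'type': 'string'},
        'meta': {'$ref': '#/definitions/meta'},
    },
    'additionalProperties': False,
}

# Assumption: Draft 4 semantics are sufficient for this schema.
validator = jsonschema.Draft4Validator(user_schema)

user = {
    'username': 'foo',
    'password': 'bar',
    'meta': {'gender': 'male', 'twitter': 'foobar'},
}

# Same data plus an undeclared key inside the nested object.
user_with_extra_key = {
    'username': 'foo',
    'password': 'bar',
    'meta': {'gender': 'male', 'twitter': 'foobar', 'age': 30},
}

print(validator.is_valid(user))
print(validator.is_valid(user_with_extra_key))

# iter_errors reports each problem instead of stopping at the first one.
messages = [error.message for error in validator.iter_errors(user_with_extra_key)]
print(messages)
```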

Mixing Array with Object

The combination of array and object is common in list-style data, such as search results, which may consist of an array containing multiple objects.

Here’s an example simulating search result JSON data:

results = [
    {
        'title': 'foo',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        'keywords': ['a', 'b', 'c']
    },
    {
        'title': 'bar',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        'keywords': ['d', 'e', 'f']
    },
]

With the data structure known, you can further define it in a JSON Schema, placing the structure of search results and keywords in definitions:

schema = {
    'definitions': {
        'keywords': {
            'type': 'array',
            'items': {
                'type': 'string'
            }
        },
        'search_result': {
            'type': 'object',
            'properties': {
                'title': {
                    'type': 'string'
                },
                'link': {
                    'type': 'string'
                },
                'summary': {
                    'type': 'string'
                },
                'keywords': {
                    '$ref': '#/definitions/keywords'
                },
            }
        }
    },
    'type': 'array',
    'items': {
        '$ref': '#/definitions/search_result'
    }
}

The JSON Schema above places two separate definitions under definitions: keywords and search_result. The keywords property within each search_result refers back to the keywords definition.

The outermost 'type': 'array' and 'items': {'$ref': '#/definitions/search_result'} indicate that this JSON Schema represents an array of search_result elements.
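When an array of objects fails validation, it helps to know which element is at fault. The following sketch uses iter_errors and the path attribute of each error to locate the offending value. The schema is a condensed copy of the one above, the bad keyword (5) is invented test data, and the choice of Draft4Validator is an assumption made for illustration.

```python
import jsonschema

# Condensed copy of the search result schema, repeated so the snippet runs alone.
schema = {
    'definitions': {
        'keywords': {
            'type': 'array',
            'items': {'type': 'string'},
        },
        'search_result': {
            'type': 'object',
            'properties': {
                'title': {'type': 'string'},
                'link': {'type': 'string'},
                'summary': {'type': 'string'},
                'keywords': {'$ref': '#/definitions/keywords'},
            }
        }
    },
    'type': 'array',
    'items': {'$ref': '#/definitions/search_result'},
}

results = [
    {
        'title': 'foo',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        'keywords': ['a', 'b', 'c'],
    },
    {
        'title': 'bar',
        'link': 'https://example.com/',
        'summary': 'this is summary',
        'keywords': ['d', 5],  # invented bad value: 5 is not a string
    },
]

validator = jsonschema.Draft4Validator(schema)

# Each error carries the path (indexes and keys) to the value that failed.
error_paths = [list(error.path) for error in validator.iter_errors(results)]
print(error_paths)
```

Here the path points at the second search result, its keywords property, and the second keyword in that list.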

Choosing Between Multiple Formats (oneOf)

JSON Schema provides keywords such as oneOf, anyOf, allOf, and not to flexibly validate JSON formats. Below is an example using the commonly used oneOf. This JSON Schema accepts either an integer or a string:

schema = {
    'oneOf': [
        {'type': 'string'},
        {'type': 'integer'},
    ]
}

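Because oneOf requires the data to match exactly one of the listed schemas, any data that is neither a string nor an integer fails validation:

import jsonschema

# pass
jsonschema.validate(123, schema)

# pass
jsonschema.validate('123', schema)

# fails: a list is neither a string nor an integer
try:
    jsonschema.validate(['foo', 'bar'], schema)
except jsonschema.ValidationError as e:
    print(e.message)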

This is how oneOf is used. For anyOf, allOf, and not, refer to the “Combining schemas” chapter of Understanding JSON Schema, listed in the References.
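One detail worth keeping in mind is that oneOf passes only when exactly one subschema matches, whereas anyOf passes when at least one does. The sketch below illustrates the difference with two overlapping subschemas; the example values and the use of Draft4Validator are assumptions made for illustration.

```python
import jsonschema

# Every integer is also a number, so these two subschemas overlap.
sub_schemas = [{'type': 'integer'}, {'type': 'number'}]

one_of = jsonschema.Draft4Validator({'oneOf': sub_schemas})
any_of = jsonschema.Draft4Validator({'anyOf': sub_schemas})

# 5 matches both subschemas: oneOf rejects it, anyOf accepts it.
print(one_of.is_valid(5), any_of.is_valid(5))

# 2.5 matches only 'number': both keywords accept it.
print(one_of.is_valid(2.5), any_of.is_valid(2.5))

# 'text' matches neither subschema: both keywords reject it.
print(one_of.is_valid('text'), any_of.is_valid('text'))
```

When the alternatives cannot overlap, as with string and integer above, oneOf and anyOf behave the same.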

Conclusion

Because JSON remains a mainstream data exchange format, JSON Schema is likely to stay relevant. Using it to validate API formats is highly recommended, since it reduces the amount of hand-written validation code. Sharp-eyed readers may also notice how similar JSON Schema looks to OpenAPI (Swagger). That is because OpenAPI’s schema definitions are modeled on JSON Schema, so mastering JSON Schema also makes OpenAPI easier to learn.

That’s all, Happy Coding!

References

http://json-schema.org/

https://spacetelescope.github.io/understanding-json-schema/index.html