Python Module - jsonschema Part 1

Posted on Mar 20, 2018 in Python Module/Package Recommendations by Amo Chen ‐ 5 min read

This article is the first part of a tutorial series on the Python module jsonschema.

JSON is currently one of the mainstream data interchange formats. However, if you want to validate the format of JSON data in your program, you’ll need to spend some effort writing validation code. Fortunately, JSON Schema simplifies the process of validating JSON formats. If you’re using JSON as the data exchange format for an API, you might consider using JSON Schema for validation.

JSON Schema is a vocabulary that allows you to annotate and validate JSON documents.

This article serves as a tutorial on the Python package jsonschema and how it relates to validating JSON formats.

Currently, JSON Schema is still in the draft stage (draft-07). For more details, please refer to the Specification. Note that the Python jsonschema module (version 2.6.0, used in this article) only supports draft-03 and draft-04, so some newer syntax is not supported.

Environment for this Article

  • Python 3.5
  • jsonschema 2.6.0

Installing jsonschema

$ pip install jsonschema

Hello, jsonschema!

The Python jsonschema module validates data composed of Python types, not JSON strings. If you want to validate a JSON string, you need to first use json.loads() to convert it into Python data types.
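As a minimal sketch of that two-step flow, the snippet below decodes a JSON string with json.loads() and then validates the resulting Python object. The payload and the very small schema are made up for illustration.

```python
import json

from jsonschema import validate

# Hypothetical JSON payload, e.g. the raw body of an HTTP request.
raw = '{"username": "foobar"}'

# validate() works on Python objects, so decode the JSON string first.
data = json.loads(raw)

# A deliberately tiny schema: the value only has to be a JSON object.
schema = {'type': 'object'}

# Passes silently: the decoded payload is a dict, which matches 'object'.
validate(data, schema)
```

Passing the undecoded string raw instead of data would fail here, because a Python str matches the string type, not object.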

Let’s walk through an example to get an overview of JSON Schema. Suppose you have a login API that uses JSON as the exchange format. The JSON data must include a username and a password for login purposes. When converted into a Python dictionary, it looks like this: {'username': 'foobar', 'password': '123456'}. We can define a schema for this format to validate it.

The schema below describes the username and password properties. The example then validates a Python dictionary against it using validate().

from jsonschema import validate

schema = {
    'type': 'object',
    'properties': {
        'username': {
            'type': 'string',
            'minLength': 6,
        },
        'password': {
            'type': 'string',
            'minLength': 8,
        },
    }
}

validate({'username': 'foobar', 'password': '123456'}, schema)

Execution result:

ValidationError: '123456' is too short

Failed validating 'minLength' in schema['properties']['password']:
    {'minLength': 8, 'type': 'string'}

On instance['password']:
    '123456'

From the result, you can see that {'username': 'foobar', 'password': '123456'} failed the validation: the password is only 6 characters long, but the schema requires at least 8 (minLength).
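In a real program you would usually catch the exception instead of letting it propagate. The sketch below reuses the login schema and reads a few attributes of the raised ValidationError (message, path, and validator). The variable names are made up for illustration.

```python
from jsonschema import ValidationError, validate

schema = {
    'type': 'object',
    'properties': {
        'username': {'type': 'string', 'minLength': 6},
        'password': {'type': 'string', 'minLength': 8},
    },
}

failed_message = failed_path = failed_keyword = None

try:
    validate({'username': 'foobar', 'password': '123456'}, schema)
except ValidationError as error:
    failed_message = error.message    # human-readable description
    failed_path = list(error.path)    # where in the instance it failed
    failed_keyword = error.validator  # which schema keyword failed

print(failed_message, failed_path, failed_keyword)

# 'properties' only constrains keys that are present, so this schema
# alone does not force username or password to exist.
validate({}, schema)
```

The last call passes without an error, which shows that the schema as written checks the shape of the values but does not yet make the two keys mandatory.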

This example gives a clear picture of how jsonschema validates data formats. We’ll cover how to define JSON Schema in more detail in the following sections.

Primitive Types

The type keyword, seen in the first example’s schema, is fundamental to JSON Schema: it specifies the data type a value must have. Defining a JSON Schema largely comes down to composing these data types.

JSON Schema is built on 6 basic data types, each of which corresponds to a Python data type, as shown below. (The type keyword also accepts integer, a restricted form of number that only matches integers.)

| JSON Schema | Python                      |
|-------------|-----------------------------|
| object      | dict                        |
| array       | list                        |
| string      | str                         |
| number      | int, float, decimal.Decimal |
| boolean     | bool                        |
| null        | None                        |
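One way to explore this mapping is to ask a validator which type names a given Python value satisfies. The sketch below relies on the is_type() method of jsonschema's validator classes. The helper name matching_types is made up for illustration.

```python
from jsonschema import Draft4Validator

# An empty schema is enough here: only the validator's type checks are used.
validator = Draft4Validator({})

TYPE_NAMES = ['null', 'boolean', 'number', 'string', 'array', 'object']


def matching_types(value):
    """Return the JSON Schema type names that `value` satisfies."""
    return [name for name in TYPE_NAMES if validator.is_type(value, name)]


samples = [None, True, 3.14, 'hello', [1, 2], {'foo': 'bar'}]
for sample in samples:
    print(repr(sample), '->', matching_types(sample))
```

Each sample value matches exactly one of the six type names, which is what makes type a convenient first building block for a schema.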

null

null is the simplest type, corresponding only to None.

Any value other than None fails validation. Here’s an example:

from jsonschema import validate

schema = {'type': 'null'}

# would raise exception
validate(123, schema)

Execution result:

ValidationError: 123 is not of type 'null'

Failed validating 'type' in schema:
    {'type': 'null'}

On instance:
    123

In this example, replacing 123 with None will pass the validation.

boolean

boolean corresponds to Python’s True and False. Any other value fails validation.

schema = {'type': 'boolean'}

# would raise exception
validate(None, schema)

Execution result:

ValidationError: None is not of type 'boolean'

Failed validating 'type' in schema:
    {'type': 'boolean'}

On instance:
    None

Similarly, replacing None with True or False will pass the validation.
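Values that merely look like booleans are rejected as well. The sketch below wraps validate() in a small helper (the name passes is made up for illustration) and tries a few candidates.

```python
from jsonschema import ValidationError, validate

schema = {'type': 'boolean'}


def passes(value):
    """Return True if `value` validates against the boolean schema."""
    try:
        validate(value, schema)
    except ValidationError:
        return False
    return True


# Only real bool values pass; look-alikes such as 0, 1 or 'true' do not.
for candidate in (True, False, 0, 1, 'true'):
    print(repr(candidate), passes(candidate))
```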

number

number corresponds to Python’s int, float, and decimal.Decimal. A string fails validation, even if it contains only digits.

Example:

schema = {'type': 'number'}

# would raise exception
validate('123', schema)

Execution result:

ValidationError: '123' is not of type 'number'

Failed validating 'type' in schema:
    {'type': 'number'}

On instance:
    '123'

Simply replace '123' with 123, and it will work correctly.
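The sketch below checks several candidates against the number schema using the is_valid() method of a validator object, which returns True or False instead of raising. Note that True is rejected even though bool is a subclass of int in Python.

```python
from decimal import Decimal

from jsonschema import Draft4Validator

number_validator = Draft4Validator({'type': 'number'})

# int, float and Decimal are numbers; a digit string and a bool are not.
for candidate in (123, 1.5, Decimal('1.5'), '123', True):
    print(repr(candidate), number_validator.is_valid(candidate))
```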

string

string corresponds to Python’s str. Even the empty string '' is a valid string.

Example:

schema = {'type': 'string'}

# would pass
validate('', schema)

This example will pass validation. If you input a non-string value, it will fail.
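One non-string value that is easy to overlook in Python 3 is bytes. The sketch below assumes Python 3, as used in this article. The sample data is made up for illustration.

```python
from jsonschema import Draft4Validator

string_validator = Draft4Validator({'type': 'string'})

# Hypothetical raw bytes, e.g. as read from a socket or a binary file.
raw = b'foobar'

print(string_validator.is_valid(raw))                  # bytes is not str
print(string_validator.is_valid(raw.decode('utf-8')))  # decoded text is str
print(string_validator.is_valid(123))                  # a number is not a string
```

Decoding the bytes into str before validating avoids this kind of failure.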

array

array corresponds to Python’s list. Even an empty list is valid. Note that a tuple is not accepted as an array by default, so convert it to a list first.

schema = {'type': 'array'}

# would pass
validate([1, 2], schema)

This example also passes validation. If you replace [1, 2] with a value that is not a list, it fails.
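A small sketch of the list versus tuple distinction follows. It assumes jsonschema's default type mapping, in which only list counts as an array. The variable names are made up for illustration.

```python
from jsonschema import ValidationError, validate

schema = {'type': 'array'}
point = (1, 2)

# With the default type mapping, a tuple is not treated as an array.
try:
    validate(point, schema)
    tuple_accepted = True
except ValidationError:
    tuple_accepted = False

print('tuple accepted:', tuple_accepted)

# Converting the tuple to a list makes it a valid array.
validate(list(point), schema)

# An empty list is a valid array as well.
validate([], schema)
```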

object

object is equivalent to Python’s dict (dictionary). An empty dictionary is also valid.

schema = {'type': 'object'}

# would pass
validate({'foo': 'bar'}, schema)

In this example, the dictionary {'foo': 'bar'} passes the JSON Schema validation. You can try inputting a non-dict type value to see what happens.

Summary

After going through the 6 JSON Schema data types, you should have a basic understanding of JSON Schema.

In the next article, we’ll cover how to combine these 6 types to create complex JSON format validations.

References

http://json-schema.org/

https://spacetelescope.github.io/understanding-json-schema/index.html