Introduction to the Python Package - JMESPath (A JSON Query Language Similar to jq)

Posted on Jul 12, 2023 in Python Module Recommendations by Amo Chen ‐ 5 min read

JMESPath is a JSON query language similar to jq, and the jmespath Python package lets Python developers query and restructure JSON data with a jq-like syntax. It operates on Python's native data types, so JSON text is first converted with the json module. Used properly, JMESPath can simplify code and improve readability.

This article introduces how to use JMESPath.

Environment for This Article

Python 3
JMESPath

$ pip install jmespath

Common Methods for Reading Nested Dictionary Data

Suppose we have a dictionary variable data with several nested layers, specifically a > b > c:

data = {
    'a': {
        'b': {
            'c': 'foo',
        }
    }
}

To access the value stored in c, we typically chain get() calls, passing an empty dictionary as the default, to walk down to c layer by layer:

c = data.get('a', {}).get('b', {}).get('c')
if c is None:
    print('Not found')
else:
    print(c)

The output of the above example is foo.

This method, however, could cause readability issues as the data becomes more complex.

Tools such as jq or lodash offer a more elegant approach: a path-style query, something like data.get('a.b.c'), that fetches the value of c directly. Unfortunately, Python dictionaries do not offer this convenience natively.
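To make the idea of a path-style lookup concrete, here is a minimal hand-rolled sketch. The helper name get_path is hypothetical (it is not part of Python or of any library mentioned here), and it only handles nested dictionaries with string keys:

```python
from functools import reduce


def get_path(data, path, default=None):
    """Hypothetical helper: follow a dotted path such as 'a.b.c' through nested dicts."""
    def step(current, key):
        # Only dictionaries can be descended into; anything else ends the lookup.
        if isinstance(current, dict):
            return current.get(key, default)
        return default

    return reduce(step, path.split('.'), data)


data = {'a': {'b': {'c': 'foo'}}}

found = get_path(data, 'a.b.c')    # every key exists
missing = get_path(data, 'a.x.c')  # 'x' does not exist under 'a'

print(found)
print(missing)
```

A helper like this covers only simple key lookups; indexing, slicing, and filtering would each need more code.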

Thankfully, JMESPath provides us with this simple functionality.

JMESPath

JMESPath has implementations not only for Python but also for other languages such as Go, PHP, and Rust. Once you have learned JMESPath, you can use the same query syntax in those languages.

Using the same nested data example:

data = {
    'a': {
        'b': {
            'c': 'foo',
        }
    }
}

With JMESPath, the lookup can be rewritten as follows. jmespath.compile('a.b.c') works much like re.compile() in the re module: it checks that the expression a.b.c is syntactically valid and converts it into a compiled JMESPath expression object, whose search() method is then used to query the data:

import jmespath

expr = jmespath.compile('a.b.c')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)

This produces the same output as the previous example, but isn’t it much more elegant?

Python’s Index/Slice Syntax Support

For list-type data, JMESPath supports the same index syntax as Python. Consider this data:

data = {
    'a': {
        'b': {
            'c': [1, 2, 3, 4, 5, 6, 7, 8],
        }
    }
}

To fetch the last value in c, you can use a.b.c[-1]:

import jmespath

expr = jmespath.compile('a.b.c[-1]')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)

The output is:

8

Beyond the index syntax, slicing is also supported. For example, a.b.c[0:2] retrieves from index 0 to 1 (exclusive of 2):

import jmespath

expr = jmespath.compile('a.b.c[0:2]')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)

The output is:

[1, 2]

Python-style negative steps work too. For example, [::-1] reverses the list:

import jmespath

expr = jmespath.compile('a.b.c[::-1]')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)

The output shows the reversed c list:

[8, 7, 6, 5, 4, 3, 2, 1]

Filter Usage

JMESPath supports filtering syntax similar to jq. Take this data as an example:

servers = [
    {"host": "a", "state": "up"},
    {"host": "b", "state": "down"},
    {"host": "c", "state": "up"},
]

If we want to list the hosts whose state is down, the following Python code can achieve that:

for s in servers:
    if s['state'] == 'down':
        print(s['host'])

Alternatively, with JMESPath, you can use [?state=='down'].host, where [?<expression>] signifies the filter, and state=='down' looks for data where state equals down:

import jmespath

expr = jmespath.compile("[?state=='down'].host")
bad_servers = expr.search(servers)
if not bad_servers:  # a filter with no matches returns [], not None
    print('No bad servers')
else:
    print(bad_servers)

The output is:

['b']

To chain multiple filter conditions, use && or || for AND and OR, respectively. For example, [?state=='down' && host == 'b'] filters data where state == 'down' AND host == 'b'.

Selecting Multiple Fields

Consider the following data. Suppose we want to list only name and gender fields for users:

data = {
    'users': [
        {'id': 1, 'name': 'a', 'gender': 'f'},
        {'id': 2, 'name': 'b', 'gender': 'm'},
        {'id': 3, 'name': 'c', 'gender': 'f'},
    ]
}

Using JMESPath, you can write users[*][name, gender] to extract name and gender from all users:

import jmespath

expr = jmespath.compile("users[*][name, gender]")
users = expr.search(data)
print(users)

The output is a list of name and gender:

[['a', 'f'], ['b', 'm'], ['c', 'f']]

You can unpack this result with Python code like:

for name, gender in users:
    pass

To keep the field names in the output, use the {key: expression} syntax described in the next section.

Projection Usage

Using the same data as before, if you want to preserve field names in the output, change users[*][name, gender] to users[*].{Name: name, Gender: gender}. The {Name: name, Gender: gender} part (which the JMESPath specification calls a multiselect hash, applied here to each element by the [*] projection) maps the name and gender values to two new fields, Name and Gender:

import jmespath

expr = jmespath.compile("users[*].{Name: name, Gender: gender}")
users = expr.search(data)
print(users)

The output (reformatted here for readability) is:

[
    {"Name": "a", "Gender": "f"},
    {"Name": "b", "Gender": "m"},
    {"Name": "c", "Gender": "f"},
]

To keep the original field names, simply map them in the same way, e.g., {name: name, gender: gender}.

Conclusion

JMESPath offers many features not covered in this article, such as sorting, built-in functions, and piping. This article focuses on the commonly used, intuitive features because neither JMESPath's syntax nor jq's is a universal standard that every developer already knows. Unless the team has adopted JMESPath as a primary tool, overly complex expressions can be less readable than the equivalent Python code.

Ultimately, when a package requires learning an additional syntax, you should assess whether it really improves readability and maintainability, or whether it mainly adds to the team's learning curve and to code complexity.

Happy Coding!

References

JMESPath — JMESPath

https://github.com/jmespath/jmespath.py
