Introduction to the Python Package - JMESPath (A JSON Query Language Similar to jq)
Posted on Jul 12, 2023 in Python Module/Package Recommendations by Amo Chen ‐ 5 min read
JMESPath is a JSON query language, and the jmespath Python package implements it, letting Python developers query and restructure JSON data with a syntax akin to jq. It operates on Python’s native data types, so JSON text is first converted with the json module. Proper use of JMESPath can simplify code and enhance readability.
This article introduces how to use JMESPath.
Environment for This Article
- Python 3
- JMESPath
$ pip install jmespath
Common Methods for Reading Nested Dictionary Data
Suppose we have a nested dictionary named data with several nested layers, specifically a > b > c:
data = {
    'a': {
        'b': {
            'c': 'foo',
        }
    }
}
In this scenario, to access the data stored in c, we typically chain several get() calls to fetch the value of c layer by layer:
c = data.get('a', {}).get('b', {}).get('c')
if c is None:
    print('Not found')
else:
    print(c)
The output of the above example is foo.
This method, however, becomes harder to read as the data grows more deeply nested.
A more elegant approach, like using jq or lodash, would allow querying with something like data.get('a.b.c') to directly get the value of c. Unfortunately, Python does not offer this convenient feature natively.
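To make that gap concrete, a hand-rolled version might look like the sketch below. The name get_path and its default parameter are hypothetical, not part of Python or any library, and the helper handles only plain dictionary keys.

```python
def get_path(data, path, default=None):
    """Hypothetical helper: follow a dotted path such as 'a.b.c' through nested dicts."""
    current = data
    for key in path.split('.'):
        # Give up as soon as a level is missing or is not a dictionary.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


data = {'a': {'b': {'c': 'foo'}}}
print(get_path(data, 'a.b.c'))
print(get_path(data, 'a.x.c', default='Not found'))
```

Even this small helper has to decide how to treat missing keys and non-dictionary values, and it knows nothing about lists or filters.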
Thankfully, JMESPath provides us with this simple functionality.
JMESPath
JMESPath has implementations not only for Python but also for other languages such as Go, PHP, and Rust. Once you master JMESPath, you can comfortably use the same syntax in these languages.
Using the same nested data example:
data = {
    'a': {
        'b': {
            'c': 'foo',
        }
    }
}
With JMESPath, you can rewrite it as follows. jmespath.compile('a.b.c') works similarly to re.compile in the regex module re: it checks that the expression a.b.c is syntactically valid and converts it into a compiled JMESPath expression object, which is then used to query data:
import jmespath

expr = jmespath.compile('a.b.c')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)
This produces the same output as the previous example, but isn’t it much more elegant?
Python’s Index/Slice Syntax Support
For list-type data, JMESPath supports the same index syntax as Python. Consider this data:
data = {
    'a': {
        'b': {
            'c': [1, 2, 3, 4, 5, 6, 7, 8],
        }
    }
}
To fetch the last value in c, you can use a.b.c[-1]:
import jmespath

expr = jmespath.compile('a.b.c[-1]')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)
The output is:
8
Beyond the index syntax, slicing is also supported. For example, a.b.c[0:2] retrieves the elements at indices 0 and 1 (index 2 is excluded):
import jmespath

expr = jmespath.compile('a.b.c[0:2]')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)
The output is:
[1, 2]
A negative slice step works as it does in Python, too. The [::-1] slice below reverses the list:
import jmespath

expr = jmespath.compile('a.b.c[::-1]')
c = expr.search(data)
if c is None:
    print('Not found')
else:
    print(c)
The output shows the reversed c list:
[8, 7, 6, 5, 4, 3, 2, 1]
Filter Usage
JMESPath supports filtering syntax similar to jq. Take this data as an example:
servers = [
    {"host": "a", "state": "up"},
    {"host": "b", "state": "down"},
    {"host": "c", "state": "up"},
]
If we want to select the host of every server whose state is down, the following Python code can achieve that:
for s in servers:
    if s['state'] == 'down':
        print(s['host'])
Alternatively, with JMESPath, you can use [?state=='down'].host, where [?<expression>] signifies the filter, and state=='down' looks for data where state equals down:
import jmespath

expr = jmespath.compile("[?state=='down'].host")
bad_servers = expr.search(servers)
# A filter that matches nothing returns an empty list, not None.
if not bad_servers:
    print('No bad servers')
else:
    print(bad_servers)
The output is:
['b']
To chain multiple filter conditions, use && or || for AND and OR, respectively. For example, [?state=='down' && host == 'b'] filters data where state == 'down' AND host == 'b'.
Selecting Multiple Fields
Consider the following data. Suppose we want to list only the name and gender fields for users:
data = {
    'users': [
        {'id': 1, 'name': 'a', 'gender': 'f'},
        {'id': 2, 'name': 'b', 'gender': 'm'},
        {'id': 3, 'name': 'c', 'gender': 'f'},
    ]
}
Using JMESPath, you can write users[*][name, gender] to extract name and gender from all users:
import jmespath
expr = jmespath.compile("users[*][name, gender]")
users = expr.search(data)
print(users)
The output is a list of [name, gender] pairs:
[['a', 'f'], ['b', 'm'], ['c', 'f']]
You can unpack this result with Python code like:
for name, gender in users:
    pass
To keep the field names in the output, select the fields into a hash (dictionary) instead of a list, as described next.
Projection Usage
Using the earlier data example, if you want to preserve field names in the output, change users[*][name, gender] to users[*].{Name: name, Gender: gender}. Here users[*] is the projection that applies the expression to every user, and {Name: name, Gender: gender} is a multiselect hash that maps the name and gender values to two new fields, Name and Gender:
import jmespath
expr = jmespath.compile("users[*].{Name: name, Gender: gender}")
users = expr.search(data)
print(users)
The output is:
[{'Name': 'a', 'Gender': 'f'}, {'Name': 'b', 'Gender': 'm'}, {'Name': 'c', 'Gender': 'f'}]
To keep the original field names, simply map them in the same way, e.g., {name: name, gender: gender}.
Conclusion
JMESPath offers many features not covered in this article, such as sorting, built-in functions, and piping. This article focuses on commonly used, intuitive features because JMESPath and jq each define their own query syntax, and neither is part of Python itself. Unless a team adopts JMESPath as a primary development tool, overly complex expressions can be less readable than the equivalent Python code.
Ultimately, when using packages that require learning an additional syntax, you should weigh whether they improve readability and maintainability enough to justify the steeper learning curve for team members and the added code complexity.
Happy Coding!
References
https://github.com/jmespath/jmespath.py