JSON Validation

Learn about JSON Schema and how to validate JSON documents against it.

Overview

We noted that our mypy type hint doesn’t guarantee that the JSON document is what we expected. The jsonschema package, available from the Python Package Index, fills this gap: it lets us provide a specification for a JSON document and then confirm whether the document meets that specification.
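As a minimal sketch of that idea, assuming the jsonschema package is installed: the schema and the two sample documents below are hypothetical and far simpler than a real specification would be.

```python
from jsonschema import Draft7Validator

# Hypothetical schema: an object with a required numeric "sepal_length".
SCHEMA = {
    "type": "object",
    "properties": {"sepal_length": {"type": "number"}},
    "required": ["sepal_length"],
}

# Build a validator once and reuse it for every document.
validator = Draft7Validator(SCHEMA)

good = {"sepal_length": 5.1}
bad = {"sepal_length": "5.1"}  # a string, not a number

good_ok = validator.is_valid(good)
bad_ok = validator.is_valid(bad)
print(good_ok, bad_ok)
```

The second document would pass a type hint such as a plain dictionary of values, but the schema check rejects it because the measurement is a string.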

JSON Schema validation is a runtime check, unlike the mypy type hint, so using it will make our program slower. In return, it can help diagnose subtly incorrect JSON documents. For details, visit https://json-schema.org. The JSON Schema specification is evolving toward standardization, and validators are available for several versions of it.

We’ll focus on newline-delimited JSON. This means we need a schema that describes a single sample document, and we apply it to each document within the larger collection. This kind of additional validation might be relevant when receiving a batch of unknown samples to classify. Before doing anything, we’d like to be sure each sample document has the right attributes.
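One way to check a newline-delimited batch is to validate each line on its own and collect the problems by line number. The sketch below assumes the jsonschema package is installed; the helper name validate_ndjson, the schema, and the sample batch are all hypothetical.

```python
import json
from typing import Iterator, List, Tuple

from jsonschema import Draft7Validator

# Hypothetical schema for one sample; the field names are assumptions.
SAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "sepal_length": {"type": "number"},
        "species": {"type": "string"},
    },
    "required": ["sepal_length", "species"],
}


def validate_ndjson(text: str, schema: dict) -> Iterator[Tuple[int, List[str]]]:
    """Yield (line_number, error_messages) for each invalid line."""
    validator = Draft7Validator(schema)
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between documents
        document = json.loads(line)
        messages = [error.message for error in validator.iter_errors(document)]
        if messages:
            yield line_number, messages


batch = (
    '{"sepal_length": 5.1, "species": "Iris-setosa"}\n'
    '{"sepal_length": "wide"}\n'
)
problems = list(validate_ndjson(batch, SAMPLE_SCHEMA))
for line_number, messages in problems:
    print(line_number, messages)
```

Using iter_errors() rather than stopping at the first failure reports every problem in a document, which is useful when diagnosing a batch from an unknown source.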

A JSON Schema document is also written in JSON. It includes some metadata to help clarify the purpose and meaning of the document. It’s often a little easier to create a Python dictionary with the JSON Schema definition.

Example

Here’s a candidate definition for the Iris schema for an individual sample:
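As a sketch of what such a definition might look like, written as a Python dictionary: the field names, descriptions, species labels, and choice of draft version below are assumptions made for illustration, not a definitive schema.

```python
import json

# Hypothetical schema for a single Iris sample (all names are assumptions).
IRIS_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Iris Data Schema",
    "description": "Schema of a single Iris sample",
    "type": "object",
    "properties": {
        "sepal_length": {"type": "number", "description": "Sepal length in cm"},
        "sepal_width": {"type": "number", "description": "Sepal width in cm"},
        "petal_length": {"type": "number", "description": "Petal length in cm"},
        "petal_width": {"type": "number", "description": "Petal width in cm"},
        "species": {
            "type": "string",
            "description": "Name of the species",
            "enum": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
        },
    },
    "required": [
        "sepal_length",
        "sepal_width",
        "petal_length",
        "petal_width",
        "species",
    ],
}

# The dictionary serializes directly to a JSON Schema document.
schema_text = json.dumps(IRIS_SCHEMA, indent=2)
print(schema_text)
```

The title and description keys are the metadata mentioned earlier; the properties and required keys carry the actual constraints on each sample.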
