JSON Schemas in Go

In 2024 I did some work on a project at Google (the Go version of Genkit) that used JSON schemas.

JSON schemas let programs specify how JSON data should be structured. Basically, you can say things like ‘this JSON data must have a field “name” which is a string and a field “age” which is a number.’ Of course you can get much more complicated than that. JSON schemas can themselves be represented as JSON values, and there is a metaschema that verifies whether a JSON value is a valid JSON schema.
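
As a small illustration, the "name"/"age" example above might be written as the schema in the Go string below. Because a schema is itself a JSON value, the standard encoding/json package can decode it like any other document. The names personSchema and decodeSchema are made up for this sketch, and the choice of the 2020-12 draft is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// personSchema is a JSON schema that requires a string field "name"
// and a numeric field "age". The "$schema" draft is an assumption.
const personSchema = `{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age":  {"type": "number"}
  },
  "required": ["name", "age"]
}`

// decodeSchema parses the schema text as an ordinary JSON value.
// It does not check the result against the metaschema.
func decodeSchema(text string) (map[string]any, error) {
	var schema map[string]any
	if err := json.Unmarshal([]byte(text), &schema); err != nil {
		return nil, err
	}
	return schema, nil
}

func main() {
	schema, err := decodeSchema(personSchema)
	if err != nil {
		panic(err)
	}
	fmt.Println(schema["type"], schema["required"])
}
```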

For Genkit we needed Go code that could

  • Read the JSON form of a JSON schema
  • Construct a JSON schema by hand
  • Construct a JSON schema that described a Go struct type
  • Validate a JSON value against a JSON schema

The overall Genkit system could use JSON schemas to help people enter data in the expected format, although that was implemented in TypeScript, not Go.

At the time we couldn’t find one Go package that could do all of those operations, so we used a couple of packages (github.com/invopop/jsonschema and github.com/xeipuuv/gojsonschema). We converted between their data structures as needed by marshaling one implementation to JSON and unmarshaling into the other implementation.
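
The round trip through JSON looks roughly like the sketch below. The types builderSchema and validatorSchema are hypothetical stand-ins for the schema types of two unrelated packages, not the real types from the packages named above, and convert is a helper invented for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// builderSchema and validatorSchema are hypothetical stand-ins for the
// schema types of two unrelated packages. They share no Go type, only
// the JSON form they marshal to and from.
type builderSchema struct {
	Type       string                    `json:"type,omitempty"`
	Properties map[string]*builderSchema `json:"properties,omitempty"`
	Required   []string                  `json:"required,omitempty"`
}

type validatorSchema struct {
	Type       string                      `json:"type,omitempty"`
	Properties map[string]*validatorSchema `json:"properties,omitempty"`
	Required   []string                    `json:"required,omitempty"`
}

// convert marshals src to JSON and unmarshals the result into a new To.
func convert[To any](src any) (*To, error) {
	data, err := json.Marshal(src)
	if err != nil {
		return nil, fmt.Errorf("marshal: %w", err)
	}
	dst := new(To)
	if err := json.Unmarshal(data, dst); err != nil {
		return nil, fmt.Errorf("unmarshal: %w", err)
	}
	return dst, nil
}

func main() {
	src := &builderSchema{
		Type: "object",
		Properties: map[string]*builderSchema{
			"name": {Type: "string"},
		},
		Required: []string{"name"},
	}
	dst, err := convert[validatorSchema](src)
	if err != nil {
		panic(err)
	}
	fmt.Println(dst.Type, dst.Properties["name"].Type, dst.Required)
}
```

This works only to the extent that both packages agree on the JSON form of a schema, and it costs an extra encode and decode on every conversion.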

Since then, Jonathan Amsterdam and others at Google have written a package that does everything necessary: github.com/google/jsonschema-go. It did not exist when we needed it.

JSON schemas are moderately complex, with multiple drafts of the specification, support for cross-referencing within a schema and to external schemas, and checks like “every field that wasn’t explicitly mentioned must satisfy this subschema.” The specification is somewhat abstract and seems to have evolved over time as people have found interesting uses for schemas, especially when working in a dynamic language like JavaScript.

I’ve always enjoyed the implementation of complex specifications, and it’s led me to side projects like demangling C++ identifiers (github.com/ianlancetaylor/demangle in Go and the first draft of the current GCC demangler in C) and doing stack backtraces in C code (github.com/ianlancetaylor/libbacktrace, which had to be async signal safe and required implementing three (3) different decompression algorithms).

So I started working on a JSON schema implementation on the side. It took a while, but I finally published it at github.com/ianlancetaylor/jsonschema.

Rather than implementing a JSON schema as a struct, as the other Go implementations do, my package represents a schema as a slice of keyword/value pairs. This is somewhat more space efficient, which doesn’t matter much. It is also more efficient at validating JSON objects: rather than checking every field of a JSON schema struct, it only has to check the keywords that are actually specified.
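
The idea can be sketched as follows. This is not the actual API of github.com/ianlancetaylor/jsonschema; the names part, schema, validate, checkType, and checkMinimum are hypothetical, and only two keywords are handled.

```go
package main

import "fmt"

// part is one keyword/value pair of a schema (hypothetical type).
type part struct {
	keyword string
	value   any
}

// schema is a slice of the keywords that were actually specified.
type schema []part

// validate checks instance against each keyword present in s.
// The work is proportional to the number of keywords in the schema,
// not to the number of keywords the specification defines.
func (s schema) validate(instance any) error {
	for _, p := range s {
		var err error
		switch p.keyword {
		case "type":
			err = checkType(p.value.(string), instance)
		case "minimum":
			err = checkMinimum(p.value.(float64), instance)
		}
		// Keywords this sketch does not know about are ignored.
		if err != nil {
			return fmt.Errorf("%s: %w", p.keyword, err)
		}
	}
	return nil
}

// checkType reports an error if instance is not of the JSON type want.
func checkType(want string, instance any) error {
	got := "unknown"
	switch instance.(type) {
	case string:
		got = "string"
	case float64:
		got = "number"
	case map[string]any:
		got = "object"
	}
	if got != want {
		return fmt.Errorf("got %s, want %s", got, want)
	}
	return nil
}

// checkMinimum applies only to numbers; other values pass.
func checkMinimum(limit float64, instance any) error {
	if n, ok := instance.(float64); ok && n < limit {
		return fmt.Errorf("%v is less than %v", n, limit)
	}
	return nil
}

func main() {
	s := schema{{"type", "number"}, {"minimum", 0.0}}
	fmt.Println(s.validate(42.0))
	fmt.Println(s.validate(-1.0))
	fmt.Println(s.validate("x"))
}
```

With a struct representation, a validator would typically have to look at every field to see whether it is set. Here an empty schema does no work at all.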

Not using a single struct also makes it easier to implement multiple drafts of the JSON schema specification. The current implementation supports draft-07, draft2019-09, and draft2020-12. Adding support for more drafts should be straightforward.

Now that I’ve written this package, I have no particular use for it. If other people find it useful, I’ll fix bugs and tweak the API for usability. In particular it’s a bit awkward to change any aspect of an existing JSON schema in Go code, which might be useful for some applications. It also doesn’t have full support for JSON schema annotations, as it’s not clear to me that they are useful in Go. And I’m sure there are a number of other infelicities.

Let me know if you find this package useful. Happy hacking.

