
Go JSON Cookbook

Recently I’ve gotten into answering Go questions on StackOverflow, and one of the patterns I noticed is the number of repetitive questions about JSON processing. The goal of this post is to collect a “cookbook” of JSON processing code and examples; think of it as a vastly expanded version of the JSON gobyexample page. It’s a living document – I will update it once in a while when I find new patterns/problems people ask about.

The code samples here should be reasonably self-contained; if you want actual code, full go run-able files are available here.

Some background on JSON

The acronym JSON stands for JavaScript Object Notation. It originally came into existence as a serialization format for JS, and can actually be considered a subset of JS. That said, it’s no longer considered good style to pass JSON to JS’s eval(); in fact, some valid JSON is not valid JS. Newer editions of the ECMAScript standard provide JSON.stringify and JSON.parse for serialization and deserialization, respectively.

These days JSON is a very popular language-independent serialization format, with simple syntax that’s described here. In brief, JSON has values that can be either strings, numbers, null, true/false boolean constants, arrays or objects. Arrays are linear, ordered collections of values; in Go they are mapped to slices. Objects are unordered sets of key/value pairs; in Go they are mapped to maps. JSON object keys are strings, and values are arbitrary JSON values. Note that this is a recursive definition – objects can hold other objects, or arrays, which themselves can hold other objects, etc. This should be very familiar to programmers coming from dynamic languages like Python, Perl or JavaScript, or from the Lisp family where such nested data structures are common and idiomatic.

Marshaling of basic Go data types

Go uses the term marshaling to refer to the kind of serialization done by converting Go data structures to JSON [1]. Therefore, the json package’s main convenience functions are called json.Marshal and json.Unmarshal. These functions are generic, in the sense that they work with interface{} values, and they have the proper runtime logic to figure out which actual type is being (un)marshaled. Here is an example that uses the basic types:
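As a minimal sketch of the round trip described above (the helper name roundTripInt is made up for illustration and is not part of the json package), encoding and decoding basic types could look like this:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTripInt marshals n to JSON and unmarshals the bytes back into an int.
// The name is hypothetical; only json.Marshal and json.Unmarshal are real API.
func roundTripInt(n int) (string, int, error) {
	data, err := json.Marshal(n)
	if err != nil {
		return "", 0, err
	}
	var back int
	if err := json.Unmarshal(data, &back); err != nil {
		return "", 0, err
	}
	return string(data), back, nil
}

func main() {
	// Errors are ignored for brevity; Marshal does not fail for these values.
	boolJSON, _ := json.Marshal(true)
	floatJSON, _ := json.Marshal(4.5)
	strJSON, _ := json.Marshal("hello")
	fmt.Println(string(boolJSON), string(floatJSON), string(strJSON)) // true 4.5 "hello"

	text, n, err := roundTripInt(42)
	fmt.Println(text, n, err) // 42 42 <nil>
}
```

Note that json.Unmarshal always takes a pointer to the destination, since it has to write into it.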

This will print:

Note that while JSON doesn’t distinguish between integers and floating-point numbers, Go lets us do this by having the static type information on the objects passed into json.Marshal or json.Unmarshal. For more details on numbers see the “JSON numbers – ints or floats?” section.

Null and nil

Another basic type supported by JSON is null, which is just a way of saying “nothing to see here”. In Go, nil pointers are marshaled to null:

Prints null. Demonstrating unmarshaling of null is a little bit trickier, because it’s not clear what to pass to json.Unmarshal. Passing it any uninitialized pointer (nil) results in an error. A trick that works is:
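One way to sketch both directions, assuming the trick is to pass a pointer to a pointer (the helper decodeNull is a hypothetical name used only here):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeNull unmarshals the JSON literal null into a pointer that starts out
// non-nil, and returns the pointer as it is afterwards.
func decodeNull() (*int, error) {
	n := 7
	p := &n
	// We pass &p (a **int), so Unmarshal can reset p itself to nil.
	err := json.Unmarshal([]byte("null"), &p)
	return p, err
}

func main() {
	var ip *int
	data, _ := json.Marshal(ip) // a nil pointer encodes as null
	fmt.Println(string(data))   // null

	p, err := decodeNull()
	fmt.Println(p == nil, err) // true <nil>

	// Passing a nil pointer directly is rejected with an error.
	var bad *int
	fmt.Println(json.Unmarshal([]byte("null"), bad) != nil) // true
}
```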

Prints:

(Un)marshaling with generic interface{} values

As was briefly mentioned above, the definitions of json.Marshal and json.Unmarshal are generic, in the sense that they use an interface{} for the Go object which is to be encoded or decoded into:

The samples above showcased how special logic inside these functions handles basic types that are known at compile time. We can do similar things using interface{} directly. While not overly useful for these basic types, this knowledge will come in handy when we’re discussing collection types (JSON representations of slices and maps) later on.

We saw earlier that to encode a boolean into JSON we can do:

Alternatively, we can do:

As far as json.Marshal is concerned, these two code samples are equivalent because the function really accepts an interface{}. It becomes a bit more interesting when unmarshaling. Where we previously did:

We can also do:

But now what do we do with ib? It’s not a bool, so we can’t treat it as such. Since we already know it should contain a bool, we can do a type assertion:

If we don’t know what type to expect here exactly, we’ll likely need a type switch:
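A possible shape for such a type switch is sketched below. The function describe is a hypothetical helper; the set of dynamic types in the cases (bool, float64, string, nil) is what json.Unmarshal produces for JSON scalars when decoding into an interface{}.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describe decodes a single JSON value generically and reports its Go type.
func describe(data []byte) (string, error) {
	var v interface{}
	if err := json.Unmarshal(data, &v); err != nil {
		return "", err
	}
	switch t := v.(type) {
	case bool:
		return fmt.Sprintf("bool %v", t), nil
	case float64:
		return fmt.Sprintf("number %v", t), nil
	case string:
		return fmt.Sprintf("string %q", t), nil
	case nil:
		return "null", nil
	default:
		// Arrays and objects arrive as []interface{} and map[string]interface{}.
		return fmt.Sprintf("other %T", t), nil
	}
}

func main() {
	for _, in := range []string{`true`, `12`, `"hi"`, `null`} {
		out, err := describe([]byte(in))
		fmt.Println(out, err)
	}
}
```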

JSON numbers – ints or floats?

One common issue with decoding JSON is distinguishing between different numeric types. Unlike Go, which has separate types for integers and floats, JavaScript only has floats; this fact is reflected in JSON. The JSON standard doesn’t acknowledge the distinction between the two, and treats both as “numbers”, though it’s clear from the specification that the more general type (floating point) is inferred.

As long as we know the type of field to decode statically, everything will work fine. As the very first example in this post demonstrates, when we pass a pointer to int to Unmarshal, it will know to parse properly into an integer. But what happens when we don’t know the type at compile time?

When using generic interface{} decoding, floating point is always chosen. Consider this:

It will print:

This is not a bug; it’s the logical thing for Unmarshal to do, given that it doesn’t know what type to expect. 1234 looks like an integer, but it might as well be a float with the decimal point omitted. Unmarshal has to decode the most general type based on the JSON specification.

If this is a real issue, one way to work around it is to use the alternative json.Decoder API for unmarshaling. This API is slightly different from json.Unmarshal; it’s designed to parse JSON streams, which could result from reading over HTTP, for example. Here’s the same code using Decoder:

It gives a similar result to using Unmarshal. However, here’s the twist. Decoder has an option to not parse numbers into concrete types. Instead, a number can be left unparsed as a json.Number, which is just a string used to represent number literals. This is accomplished by calling Decoder.UseNumber(), as follows:
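The following sketch contrasts the default float64 decoding with Decoder.UseNumber; decodeNumber is a hypothetical helper written for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// decodeNumber decodes a JSON number generically, but keeps it as a
// json.Number instead of converting it to float64.
func decodeNumber(s string) (json.Number, error) {
	dec := json.NewDecoder(strings.NewReader(s))
	dec.UseNumber()
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	n, ok := v.(json.Number)
	if !ok {
		return "", fmt.Errorf("not a number: %T", v)
	}
	return n, nil
}

func main() {
	// Default behavior: generic decoding yields a float64.
	var v interface{}
	_ = json.Unmarshal([]byte("1234"), &v)
	fmt.Printf("%T %v\n", v, v) // float64 1234

	// With UseNumber the literal is preserved as text.
	n, _ := decodeNumber("1234")
	i, _ := strconv.Atoi(n.String())
	fmt.Printf("%T %v %d\n", n, n, i) // json.Number 1234 1234
}
```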

Now this will print:

And we’re free to parse the string as we wish, for example with strconv.Atoi.

You may think this is unnecessary – can’t we just convert the float64 read by Unmarshal into an int? Things are not so simple, however. Floating point numbers have limited representation accuracy, and for big integers we may get wrong results. We might even want to marshal arbitrary precision integers (big.Int), and these also have to be properly parsed to not lose precision.
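To make the precision problem concrete, here is a small sketch using 9007199254740993 (2^53 + 1), the first positive integer a float64 cannot represent. The helper names viaFloat and viaBigInt are made up for this example; decoding straight into a big.Int relies on that type implementing json.Unmarshaler.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/big"
)

// viaFloat decodes generically (as float64) and converts the result to int64.
func viaFloat(s string) (int64, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(s), &v); err != nil {
		return 0, err
	}
	f, ok := v.(float64)
	if !ok {
		return 0, fmt.Errorf("not a number: %T", v)
	}
	return int64(f), nil
}

// viaBigInt decodes the same literal into an arbitrary-precision integer.
func viaBigInt(s string) (*big.Int, error) {
	var n big.Int
	if err := json.Unmarshal([]byte(s), &n); err != nil {
		return nil, err
	}
	return &n, nil
}

func main() {
	const lit = "9007199254740993" // 2^53 + 1

	f, _ := viaFloat(lit)
	fmt.Println(f) // 9007199254740992: the last digit is lost

	b, _ := viaBigInt(lit)
	fmt.Println(b.String()) // 9007199254740993
}
```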

JSON arrays and Go slices

Go slices are encoded to JSON arrays, and vice-versa:
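A minimal sketch of that mapping, with a hypothetical helper roundTripStrings:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTripStrings encodes a []string as a JSON array and decodes it back.
func roundTripStrings(in []string) (string, []string, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return "", nil, err
	}
	var back []string
	if err := json.Unmarshal(data, &back); err != nil {
		return "", nil, err
	}
	return string(data), back, nil
}

func main() {
	text, back, err := roundTripStrings([]string{"a", "b"})
	fmt.Println(text, back, err) // ["a","b"] [a b] <nil>
}
```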

Prints:

When we unmarshaled the JSON bytes above, we used a []string since we knew all the elements of the JSON array are strings. But what happens when JSON array elements have different types? While in Go the types of all slice elements have to be uniform, the same is not true in JSON (due to its JavaScript roots). Let’s try this:

This code panics:

The error message makes sense: we gave json.Unmarshal a []string to unmarshal into, but the JSON bytes contain a bool. We can’t unmarshal into a []bool, for a similar reason. So what is there to do?

This is where generic JSON comes in again. If we don’t know – ahead of time – the types of elements contained in a JSON array we’re unmarshaling, we have to fall back to generic interface{}s and type switches:
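The fallback could be sketched as follows. The helper kinds and the sample array are made up for illustration; the sketch also shows that decoding the same bytes into a []string reports an error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kinds decodes a JSON array with mixed element types and names each
// element's type, using a type switch over the generic values.
func kinds(data []byte) ([]string, error) {
	var elems []interface{}
	if err := json.Unmarshal(data, &elems); err != nil {
		return nil, err
	}
	var out []string
	for _, e := range elems {
		switch e.(type) {
		case string:
			out = append(out, "string")
		case bool:
			out = append(out, "bool")
		case float64:
			out = append(out, "number")
		case nil:
			out = append(out, "null")
		default:
			out = append(out, "nested")
		}
	}
	return out, nil
}

func main() {
	data := []byte(`["a", true, 1.5, null]`)

	// A uniformly typed slice cannot hold this array.
	var strs []string
	fmt.Println(json.Unmarshal(data, &strs) != nil) // true

	ks, err := kinds(data)
	fmt.Println(ks, err) // [string bool number null] <nil>
}
```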

This outputs:

JSON objects and Go maps

Go maps are encoded to JSON objects, and vice-versa:
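A short sketch of the map case (roundTripMap is a hypothetical helper). One detail worth knowing: json.Marshal emits map keys in sorted order, so the output is deterministic even though Go map iteration is not.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// roundTripMap encodes a map as a JSON object and decodes it back.
func roundTripMap(in map[string]int) (string, map[string]int, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return "", nil, err
	}
	var back map[string]int
	if err := json.Unmarshal(data, &back); err != nil {
		return "", nil, err
	}
	return string(data), back, nil
}

func main() {
	text, back, err := roundTripMap(map[string]int{"b": 2, "a": 1})
	fmt.Println(text, back["a"], back["b"], err) // {"a":1,"b":2} 1 2 <nil>
}
```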

Prints:

Similarly to the case of slices, this works well as long as the types of JSON elements are known ahead of time. If these types are not known, or values can have one of several types, we’ll need to use generic capabilities, by unmarshaling into an interface{} and following up with type assertions or switches.

We’ve seen how to do these in the slice example, so let’s play with something slightly different now. JSON objects are sometimes nested, and we don’t even know at compile-time what their level of nesting is. Consider this sample JSON:

Say we want to find the key “fizz” in it, and to see what it maps to. How can we do that?

Let’s think this through. First, it’s obvious that we’ll have to use interface{} unmarshaling, because the values of keys in each object can have different types (some are booleans, some are nested objects). Second, since JSON is a tree structure, a recursive approach is natural. Here’s a function that will do that:
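One way such a recursive search might look is sketched here. The function names findKey and findInJSON, as well as the sample document in main, are assumptions made for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// findKey walks a generically decoded JSON value depth-first and returns the
// value stored under the first occurrence of key it encounters.
func findKey(v interface{}, key string) (interface{}, bool) {
	switch t := v.(type) {
	case map[string]interface{}:
		if val, ok := t[key]; ok {
			return val, true
		}
		for _, child := range t {
			if val, ok := findKey(child, key); ok {
				return val, true
			}
		}
	case []interface{}:
		for _, child := range t {
			if val, ok := findKey(child, key); ok {
				return val, true
			}
		}
	}
	// Scalars (and exhausted containers) hold no further keys.
	return nil, false
}

// findInJSON decodes data generically and searches it for key.
func findInJSON(data []byte, key string) (interface{}, bool, error) {
	var root interface{}
	if err := json.Unmarshal(data, &root); err != nil {
		return nil, false, err
	}
	val, ok := findKey(root, key)
	return val, ok, nil
}

func main() {
	doc := []byte(`{"foo": {"bar": {"fizz": "buzz"}}, "enabled": true}`)
	val, ok, err := findInJSON(doc, "fizz")
	fmt.Println(val, ok, err) // buzz true <nil>
}
```

If the key appears more than once, which occurrence is found first depends on map iteration order, so this sketch is only suitable when the key is unique.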

And here’s how we can use it:

Golang structs as JSON objects

Nested slices and maps are great, but in Go it’s idiomatic to assign more semantics to structured data with structs. Go’s json package supports marshaling structs into JSON objects and vice versa. Here’s a simple example:

It will print:

Here we’re using the json.MarshalIndent function, which indents JSON output for easier visual scanning.

You’ll notice that the JSON objects have their key names set from the struct field names automatically. This is nice for low-effort dumping, but isn’t always satisfactory in real applications. We often don’t control the format of the JSON data we’re consuming, so we may get something like:

If we call json.Unmarshal for data like this, it will expect a struct with similarly named fields; but these names aren’t idiomatic in Go, and moreover they start with a lowercase letter, so the fields would be unexported and the json package would ignore them entirely. We have a problem – we either sacrifice the style of our Go code, or have to enforce schemas on JSON we don’t necessarily control.

The solution is to use custom field tags, which is an esoteric feature of Go that was designed specifically for such use cases. Here’s a complete example:
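A sketch of field tags in action follows. The Food struct here is an assumed shape (only the type name and its Id field are mentioned in the text); the other fields, the JSON key names, and the helpers decodeFood and encodeFood are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Food maps idiomatic, exported Go field names to lowercase JSON keys.
// The field set is hypothetical.
type Food struct {
	Id         int     `json:"id"`
	Name       string  `json:"name"`
	FatPerServ float64 `json:"fat_per_serv"`
}

// decodeFood parses a JSON object into a Food using the tags above.
func decodeFood(data []byte) (Food, error) {
	var f Food
	err := json.Unmarshal(data, &f)
	return f, err
}

// encodeFood serializes a Food; the tags determine the emitted key names.
func encodeFood(f Food) (string, error) {
	data, err := json.Marshal(f)
	return string(data), err
}

func main() {
	f, err := decodeFood([]byte(`{"id": 200, "name": "Lobster", "fat_per_serv": 0.5}`))
	fmt.Println(f.Id, f.Name, f.FatPerServ, err) // 200 Lobster 0.5 <nil>

	text, err := encodeFood(Food{Id: 1, Name: "Tofu", FatPerServ: 1.5})
	fmt.Println(text, err) // {"id":1,"name":"Tofu","fat_per_serv":1.5} <nil>
}
```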

It prints:

Field tags let us map between the Go internal view of the struct’s fields and its external materialization as a JSON object.

These techniques work just as well with nested structs. Check out these code samples: sample 1, sample 2.

Partial encoding and decoding of structs

It’s common for JSON data to omit some fields that are then assumed to not exist or take on their default values. Think about passing many options, where the full list of options is so large that a lot of time and bandwidth would be wasted to transfer them all fully; usually we only want to modify a small number of options for every given call.

The json package supports this with partial unmarshaling. Here’s an example using the Food struct shown above:

Note that the JSON string doesn’t have the Id field populated. The result will be:

The unmarshaling is successful, and fD.Id is left at the default value for its type (0 for numbers, empty string for strings, etc). The opposite case, a JSON key with no matching struct field, is silently ignored by default; it can be turned into an error via the Decoder.DisallowUnknownFields method when using the json.Decoder API.
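The sketch below illustrates both behaviors with an assumed two-field Food struct and a hypothetical helper decodeFood. Note that DisallowUnknownFields only rejects JSON keys that have no matching struct field; keys missing from the JSON are accepted either way.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Food is an assumed shape used only for this illustration.
type Food struct {
	Id   int    `json:"id"`
	Name string `json:"name"`
}

// decodeFood decodes data into a Food; when strict is true, JSON keys with
// no matching struct field cause an error.
func decodeFood(data []byte, strict bool) (Food, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	if strict {
		dec.DisallowUnknownFields()
	}
	var f Food
	err := dec.Decode(&f)
	return f, err
}

func main() {
	// "id" is missing from the JSON: Id keeps its zero value.
	f, err := decodeFood([]byte(`{"name": "Tofu"}`), true)
	fmt.Println(f.Id, f.Name, err) // 0 Tofu <nil>

	// "color" has no matching field: ignored by default, an error when strict.
	extra := []byte(`{"name": "Tofu", "color": "white"}`)
	_, errLax := decodeFood(extra, false)
	_, errStrict := decodeFood(extra, true)
	fmt.Println(errLax == nil, errStrict != nil) // true true
}
```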

For a similar effect during marshaling, we can use the special "omitempty" field tag; it tells the json package to not emit a struct field if it has the default value for its type. Here’s an example:

Note how the id field is left out of the JSON, because it was given the empty value 0.

We can also tell the encoder to omit certain fields. For example, we may have a struct with a field that should be kept private to the application and not sent over the wire. Even when the field has a non-empty value, we want it out of the serialized JSON string. We can do this with the json:"-" field tag:
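Both tags can be sketched together. The Account struct, its fields, and the encode helper are hypothetical names chosen for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Account demonstrates the two tags: Id is dropped when it is zero, and
// Secret is never serialized.
type Account struct {
	Id     int    `json:"id,omitempty"`
	Name   string `json:"name"`
	Secret string `json:"-"`
}

// encode serializes an Account to a JSON string.
func encode(a Account) (string, error) {
	data, err := json.Marshal(a)
	return string(data), err
}

func main() {
	text, _ := encode(Account{Name: "ann", Secret: "s3cret"})
	fmt.Println(text) // {"name":"ann"}

	text, _ = encode(Account{Id: 7, Name: "ann", Secret: "s3cret"})
	fmt.Println(text) // {"id":7,"name":"ann"}
}
```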

Delayed parsing with json.RawMessage

Sometimes the data you need to parse is not on the top level of the JSON string, and/or you’d like to ignore a lot of the JSON contents, focusing just on the piece you need to parse. Consider this JSON string:

And suppose we’re interested only in the event key, as we already have a structure to parse it into:

How do we do that? The json package relies on static typing quite a bit, unless we go full generic with interface{}. But in that case, we may need to convert large maps into large structs manually, which is undesirable.

The solution is json.RawMessage, which exists for this purpose. It tells the json package to not parse some parts of the input and leave them as raw bytes, which we can then parse again later. Here’s a complete solution to the issue discussed above:
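A sketch of the two-stage parse is shown below. The Event struct, its fields, the sample document, and the helper extractEvent are all assumptions made for illustration; only the "event" key comes from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical shape for the part of the document we care about.
type Event struct {
	Name string `json:"name"`
	Url  string `json:"url"`
}

// extractEvent parses only the "event" key of a larger JSON object. All
// top-level values are kept as raw bytes and only one is decoded further.
func extractEvent(data []byte) (Event, error) {
	var top map[string]json.RawMessage
	if err := json.Unmarshal(data, &top); err != nil {
		return Event{}, err
	}
	raw, ok := top["event"]
	if !ok {
		return Event{}, fmt.Errorf("no event key in input")
	}
	var ev Event
	err := json.Unmarshal(raw, &ev)
	return ev, err
}

func main() {
	doc := []byte(`{
		"id": 1234,
		"event": {"name": "joined", "url": "https://example.com/e/1"},
		"other": {"deeply": {"nested": [1, 2, 3]}}
	}`)
	ev, err := extractEvent(doc)
	fmt.Println(ev.Name, ev.Url, err) // joined https://example.com/e/1 <nil>
}
```

The unneeded parts of the document still have to be scanned for syntactic validity, but no Go values are built for them.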

JSON and pointer/reference types

The json package has special handling for pointer and reference types. Consider this sample structure:

We can marshal it as follows:

This will print:

This works because json.Marshal does the right thing here – it “sees through” the pointer to string and emits the string itself as the "Name" field. It works in reverse as well:

Note that when we create npD, its Name field is initialized with the default value – a nil for pointers. json.Unmarshal allocates an actual value and sets the pointer to its address when unmarshaling. If Name is not present in the JSON string being decoded, the pointer will be left as nil.
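The pointer behavior can be sketched as follows. The struct name NamePtr, its Id field, and the helper decodeName are assumptions; only the Name pointer field comes from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NamePtr has a pointer field, which lets us tell "absent" from "empty".
type NamePtr struct {
	Id   int
	Name *string
}

// decodeName unmarshals data into a fresh NamePtr.
func decodeName(data []byte) (NamePtr, error) {
	var np NamePtr
	err := json.Unmarshal(data, &np)
	return np, err
}

func main() {
	// Marshaling sees through the pointer and emits the string itself.
	name := "Sam"
	data, _ := json.Marshal(NamePtr{Id: 1, Name: &name})
	fmt.Println(string(data)) // {"Id":1,"Name":"Sam"}

	// Unmarshaling allocates a string when the key is present...
	np, _ := decodeName(data)
	fmt.Println(*np.Name) // Sam

	// ...and leaves the pointer nil when it is absent.
	np, _ = decodeName([]byte(`{"Id": 2}`))
	fmt.Println(np.Name == nil) // true
}
```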

The same applies to other reference types, like slices:

This will print:

When we declare the variable bvD, its Vals field is an unallocated slice, but json.Unmarshal will allocate it for us if the Vals field is present in the decoded JSON object.

This behavior is very useful for being able to multiplex several struct types in a single container, implementing a sum type. Here’s a complete example:
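One possible sketch of such a sum type uses a container struct with one pointer field per variant; whichever pointer is non-nil after decoding tells us which variant was sent. The Circle, Square and Shape types, the JSON key names, and the helper which are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Circle struct {
	Radius float64 `json:"radius"`
}

type Square struct {
	Side float64 `json:"side"`
}

// Shape multiplexes the variants: at most one pointer is expected to be set.
type Shape struct {
	Circle *Circle `json:"circle,omitempty"`
	Square *Square `json:"square,omitempty"`
}

// which decodes a Shape and reports which variant it carries.
func which(data []byte) (string, error) {
	var s Shape
	if err := json.Unmarshal(data, &s); err != nil {
		return "", err
	}
	switch {
	case s.Circle != nil:
		return "circle", nil
	case s.Square != nil:
		return "square", nil
	default:
		return "none", nil
	}
}

func main() {
	// Encoding emits only the variant that is set.
	data, _ := json.Marshal(Shape{Square: &Square{Side: 2}})
	fmt.Println(string(data)) // {"square":{"side":2}}

	kind, err := which(data)
	fmt.Println(kind, err) // square <nil>
}
```

Nothing in this sketch prevents both fields from being set at once; a real implementation would likely validate that.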

This prints: