From Go to Rust - JSON and YAML

Jun 12 2018

One of Go's big selling points for me was its novel approach to JSON encoding. Learning about Rust's encoding has made me even more excited. In this post, we'll start with Go's JSON encoder, and then see how Rust does encoding. And we'll even throw in some YAML!

In the first post in this series, we looked at a simple Go program, and then converted it to Rust.

In the second part, we created a simple REST server in Go, then re-implemented it in Rust.

This time, we'll focus on encoding structured data into common formats. Then we'll quickly look at decoding data as well.

Along the way, we'll get introduced to some new Rust concepts:

  • Development dependencies
  • Vectors (the built-in list type)
  • Creating structs
  • Using attributes
  • Basic file I/O

As I've said in the previous posts, I'm still a Rust learner. I've done my best to link to more authoritative information in this post and the earlier ones. And I apologize at the outset for my mistakes.

Go, Struct Tags, and JSON

Go provides a built-in encoding/json library for working with JSON data. We can take a quick look at this in action by writing a struct and then serializing it in Go:

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

type Person struct {
    FirstName  string
    MiddleName string
    LastName   string
    Aliases    []string
}

func main() {

    person := Person{
        FirstName: "Alfred",
        LastName:  "Tennyson",
        Aliases: []string{
            "Alfie",
            "Tenny",
        },
    }

    data, err := json.Marshal(person)
    if err != nil {
        // Report the error and stop; there is no data to write.
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }

    os.Stdout.Write(data)
}

We can use go run to run the above, whose output will be:

{"FirstName":"Alfred","MiddleName":"","LastName":"Tennyson","Aliases":["Alfie","Tenny"]}

One More Pass over the Go Program

There are three things we'd like to change about the above, though:

  1. Most of the time we don't want our JSON keys to have initial capitals
  2. We want to omit MiddleName if it's empty
  3. We want to format the output to be pretty

We can accomplish the first two with struct tags (aka annotations). The last one we can fix with a call to json.MarshalIndent.

package main

import (
    "encoding/json"
    "fmt"
    "os"
)

type Person struct {
    FirstName  string   `json:"firstName"`
    MiddleName string   `json:"middleName,omitempty"`
    LastName   string   `json:"lastName"`
    Aliases    []string `json:"aliases"`
}

func main() {

    person := Person{
        FirstName: "Alfred",
        LastName:  "Tennyson",
        Aliases: []string{
            "Alfie",
            "Tenny",
        },
    }

    data, err := json.MarshalIndent(person, "", "  ")
    if err != nil {
        // Report the error and stop; there is no data to write.
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }

    os.Stdout.Write(data)
}

We added a handful of struct annotations like this:

    MiddleName string   `json:"middleName,omitempty"`

And we changed our json.Marshal call to this:

json.MarshalIndent(person, "", "  ")

And now, our output is:

{
  "firstName": "Alfred",
  "lastName": "Tennyson",
  "aliases": [
    "Alfie",
    "Tenny"
  ]
}

For those less familiar with Go, you may want to check out the official Go JSON docs to learn more about the code above.

With this basic tour of Go's JSON features, we can now switch over and look at Rust's.

Simple JSON Serializing in Rust

Let's recreate the program above in Rust. As with our previous posts, we'll kick things off by creating a new project using cargo, and then adding some libraries:

$ cargo new --bin hello_serde

In our Cargo.toml, we'll want to add some of the serde (SERialize/DEserialize) Rust libraries to the [dependencies] section:

[dependencies]
serde = "1.0"
serde_json = "1.0"
serde_yaml = "0.7"
serde_derive = "1.0"

A learning moment for me: As with NPM (JavaScript), Glide (Go), and Composer (PHP), it is possible to declare that some dependencies are only needed for development (dev-dependencies). Originally, I put serde_derive in this section. But upon reading the documentation, I discovered that this section is really only for dependencies for "tests, examples, and benchmarks".

Okay, now we're ready to start coding up our src/main.rs. We'll start by creating the struct, instantiating it, then printing it:

extern crate serde_json;

#[derive(Debug)]
struct Person {
    first_name: String,
    middle_name: String,
    last_name: String,
    aliases: Vec<String>,
}

fn main() {
    let person = Person{
        first_name: String::from("Alfred"),
        middle_name: String::from("Clayton"),
        last_name: String::from("Tennyson"),
        aliases: vec![
            String::from("Alfie"),
            String::from("Tenny"),
        ],
    };
    println!("{:?}", person)
}

The struct is straightforward, looking remarkably like a Go struct (with a few minor syntax variations). There are a few things to note about the choices I made, though:

  • Rust has more than one string type. I chose String, the owned type, because (in the Rust ownership model) I want the struct to own its string data rather than borrow it. In a later post, I'll come back to lifetimes and ownership.
    • One way to create a String from a literal is String::from("literal")
  • In Rust, as in Go, an array is not growable. In Go, we use a slice as a growable data type. In Rust, we use a vector (Vec<T>), which is a generic growable list.

So while Go uses []string (slice of strings), Rust uses Vec<String> (vector of Strings).
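To make the array-versus-vector distinction concrete, here is a minimal sketch that uses only the standard library. The build_aliases helper and the extra "Al" alias are made up for illustration and are not part of the program we are building.

```rust
// Hypothetical helper: copies a fixed-size array into a growable Vec<String>.
fn build_aliases() -> Vec<String> {
    // An array's length is part of its type ([&str; 2]) and cannot change.
    let fixed: [&str; 2] = ["Alfie", "Tenny"];

    // A vector starts empty and grows as we push onto it.
    let mut aliases: Vec<String> = Vec::new();
    for name in fixed.iter() {
        aliases.push(String::from(*name));
    }

    // Illustrative extra entry, not from the original program.
    aliases.push(String::from("Al"));
    aliases
}

fn main() {
    let aliases = build_aliases();
    println!("{} aliases: {:?}", aliases.len(), aliases);
}
```

The vec![] macro used in the struct literal above is a shorthand for building a vector like this in one expression.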

In the main() function, we created a new Person instance almost the same way we would do so in Go. Then we printed the resulting struct to STDOUT using the println!() macro that we've seen throughout this series.

Looking closer at that println!(), once again we are using the {:?} formatter to dump debugging output. But in order to make that work, we had to do something special to the struct.

Attributes

You may have noticed that the struct was preceded by a derive attribute.

#[derive(Debug)]

While we've seen attributes in previous posts, let's talk about them a little bit here.

In our previous Go example, we saw how structs could be tagged with backtick-enclosed annotations. You may also be familiar with Go's build tags, where build instructions can be embedded in a comment line at the top of a file.

Rust provides a single facility, attributes, that covers both of these kinds of annotation, and it is more general and more flexible than either. In today's project, we'll see a few examples of this flexibility.

The #[derive] attribute is a commonly used code generation annotation. In Go terms, you can think of it as a tool for automatically implementing certain interfaces by calling out to a generator during compile time.

Go supports generators via go generate and a build-style annotation. But it is not flexible enough to implement the kinds of code generation that Rust does, because the system for marking up generators is unsophisticated.

In our case, in order to use println!("{:?}", person), we need to provide an implementation of the Debug trait. We don't want to do anything fancy. We just want a straight dump of the struct. And Rust can create that for us. We just annotate the struct:

#[derive(Debug)]

At compile time, Rust implements the Debug trait for us, and we don't even have to see the generated code.
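To get a feel for what the derive saves us from writing, here is a rough hand-written equivalent. This is a sketch under my own assumptions: the code the compiler actually generates may differ in its details, the struct is trimmed to two fields, and sample_person is a hypothetical helper.

```rust
use std::fmt;

struct Person {
    first_name: String,
    last_name: String,
}

// Hand-written approximation of what #[derive(Debug)] provides.
impl fmt::Debug for Person {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // debug_struct builds the "Name { field: value, ... }" layout.
        f.debug_struct("Person")
            .field("first_name", &self.first_name)
            .field("last_name", &self.last_name)
            .finish()
    }
}

// Hypothetical helper that builds a sample value.
fn sample_person() -> Person {
    Person {
        first_name: String::from("Alfred"),
        last_name: String::from("Tennyson"),
    }
}

fn main() {
    println!("{:?}", sample_person());
}
```

Writing this by hand for every struct would get tedious quickly, which is why the one-line attribute is so pleasant.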

This leads us handily to the next thing we want to do: generate JSON output.

Marking a Struct for JSON

Go's encoding/json (and, in fact, most Go encoders) uses runtime reflection to serialize data. In the case of our struct example, it reflects on the type of the value we pass it, figures out the fields, then reflects over the values and generates a JSON representation.

The Rust Serde approach to encoding is a little different: If a type is fully defined at compile time (which it is in both Go and Rust), then it is possible to derive a serializer at compile time, rather than use runtime reflection. In theory, this should be both safer (since the compiler can perform all of its checks) and faster (since there is no reflection).
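As a toy illustration of the idea (and only that), here is what a serializer known at compile time might look like if we wrote it by hand. The ToJson trait is hypothetical and is not Serde's API. It also leans on {:?} to quote the strings, which is close to JSON for simple ASCII text but is not a real JSON encoder.

```rust
// Toy stand-in for a compile-time serializer. NOT Serde's real API.
trait ToJson {
    fn to_json(&self) -> String;
}

struct Person {
    first_name: String,
    last_name: String,
}

// Written by hand here. The point is that every field access is ordinary
// code the compiler can check, with no runtime reflection involved.
impl ToJson for Person {
    fn to_json(&self) -> String {
        // {{ and }} are literal braces; {:?} quotes each string.
        format!(
            "{{\"first_name\":{:?},\"last_name\":{:?}}}",
            self.first_name, self.last_name
        )
    }
}

// Hypothetical helper that builds a sample value.
fn sample_person() -> Person {
    Person {
        first_name: String::from("Alfred"),
        last_name: String::from("Tennyson"),
    }
}

fn main() {
    println!("{}", sample_person().to_json());
}
```

If a field were renamed or removed, this code would simply fail to compile, which is the safety benefit described above.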

Serde provides generator macros for serializers and deserializers, and we can use the #[derive] attribute to request that it generates those for us.

extern crate serde_json;

#[macro_use]
extern crate serde_derive;

#[derive(Debug, Serialize, Deserialize)]
struct Person {
    first_name: String,
    middle_name: String,
    last_name: String,
    aliases: Vec<String>,
}

fn main() {
    let person = Person{
        first_name: String::from("Alfred"),
        middle_name: String::from("Clayton"),
        last_name: String::from("Tennyson"),
        aliases: vec![
            String::from("Alfie"),
            String::from("Tenny"),
        ],
    };
    let json = serde_json::to_string(&person);
    println!("{}",  json.unwrap())
}

We did two things to make the Serde generator work:

  • We used the #[macro_use] extern crate serde_derive statement to import the macros for deriving serializers and deserializers
  • We modified our #[derive] attribute to generate three things for us: #[derive(Debug, Serialize, Deserialize)]

Because of these two changes, when the code is compiled, Serde will generate implementations of the Serialize and Deserialize traits for our Person. To serialize our person, we just do this:

let json = serde_json::to_string(&person);
println!("{}",  json.unwrap())

Running the above will produce:

{"first_name":"Alfred","middle_name":"Clayton","last_name":"Tennyson","aliases":["Alfie","Tenny"]}

It's important to note the separation of concerns going on in the Serde system:

  • serde provides the general serializer/deserializer (marshal/unmarshal) system. It is, we might say, format agnostic.
  • serde_derive is a crate that provides generators specifically for creating the format-neutral serializers and deserializers.
  • serde_json is the crate whose responsibility is to handle the mechanics of the JSON format.

This division can easily be seen when we add a second format to our code.

Bonus Section: YAML!

Because Serde is a general-purpose encoding library, we can swap out our JSON for YAML very simply: do a find-and-replace, replacing json with yaml:

extern crate serde_yaml;

#[macro_use]
extern crate serde_derive;

#[derive(Debug, Serialize, Deserialize)]
struct Person {
    first_name: String,
    middle_name: String,
    last_name: String,
    aliases: Vec<String>,
}

fn main() {
    let person = Person{
        first_name: String::from("Alfred"),
        middle_name: String::from("Clayton"),
        last_name: String::from("Tennyson"),
        aliases: vec![
            String::from("Alfie"),
            String::from("Tenny"),
        ],
    };

    let yaml = serde_yaml::to_string(&person);
    println!("{}",  yaml.unwrap())
}

That should have replaced each reference to serde_json with a reference to serde_yaml. In other words, we just swapped out the library that does the formatting.

To be clear, there is no reason why you cannot have both the JSON and YAML serializers in the same code. I just thought the find-and-replace swap was a neat way to demonstrate the mechanics.

If you cargo run that, you'll see this:

---
first_name: Alfred
middle_name: Clayton
last_name: Tennyson
aliases:
  - Alfie
  - Tenny

Because Serde is a general framework, we didn't have to change anything about the serializer/deserializer generation.

Switch your code back to using JSON, since that's the format we're currently interested in working with.

Fixing Output

When we first serialized our Go struct, the field names weren't what we wanted. We had leading capitals (FirstName) instead of camel case (firstName).

Likewise, in Rust we now have snake_case names. How do we change those? Once again, the answer is annotations:

#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all="camelCase")]
struct Person {
    first_name: String,
    middle_name: String,
    last_name: String,
    aliases: Vec<String>,
}

This time, we use the serde attribute to tell Serde that we want all of the field names on the struct to be rewritten in camel case. This generates the following:

{"firstName":"Alfred","middleName":"Clayton","lastName":"Tennyson","aliases":["Alfie","Tenny"]}

Serde defines dozens of attributes for serializing and deserializing.

In our second iteration of the Go program, we formatted the output. This, too, can be easily done with the serde_json library in the same way it is done with Go: by switching out the marshal (serialize) function:

let json = serde_json::to_string_pretty(&person);

And now we have:

{
  "firstName": "Alfred",
  "middleName": "Clayton",
  "lastName": "Tennyson",
  "aliases": [
    "Alfie",
    "Tenny"
  ]
}

Things have been moving along well, but now we're about to hit a snag. There's been a field present in our Rust person that was not there in the Go person: the middle name. Getting rid of it will teach us more about Rust.

Rust Just Doesn't Have One of Those

There is something that Go has that Rust simply does not: nil. It doesn't have a null, either, nor can it produce null pointer errors.

Not having a nil/null is a pleasant experience most of the time. The compiler catches uninitialized fields, which is nice. There's never a need to do null checks, which is nice. And a whole nasty family of runtime exceptions just doesn't happen, which is nice.

But then there are some complications. If we edit our program by just commenting out this one line, we can see the result:

let person = Person{
    first_name: String::from("Alfred"),
    // middle_name: String::from("Clayton"),
    last_name: String::from("Tennyson"),
    aliases: vec![
        String::from("Alfie"),
        String::from("Tenny"),
    ],
};

Alfred, Lord Tennyson doesn't appear to have had a middle name [citation needed], so we should comment that out. But now if we run this code...

$ cargo run
   Compiling hello_serde v0.1.0 (file:///Users/mbutcher/Code/Rust/hello_serde)
error[E0063]: missing field `middle_name` in initializer of `Person`
  --> src/main.rs:16:18
   |
16 |     let person = Person{
   |                  ^^^^^^ missing `middle_name`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0063`.
error: Could not compile `hello_serde`.

To learn more, run the command again with --verbose.

We cannot leave middle_name uninitialized. The program simply won't compile. This is definitely suboptimal. What we want is to tell Rust that this is optional, and that we might have some value or we might not.

This requires reworking our struct slightly:

extern crate serde_json;

#[macro_use]
extern crate serde_derive;

#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all="camelCase")]
struct Person {
    first_name: String,
    middle_name: Option<String>,
    last_name: String,
    aliases: Vec<String>,
}

fn main() {
    let person = Person{
        first_name: String::from("Alfred"),
        middle_name: None,
        last_name: String::from("Tennyson"),
        aliases: vec![
            String::from("Alfie"),
            String::from("Tenny"),
        ],
    };

    let json = serde_json::to_string_pretty(&person);
    println!("{}",  json.unwrap())
}

Now we have marked middle_name as Option<String>. An Option<T> (where T is a type) can have one of two values: It can have a Some(t) or a None.
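Here is a small standard-library sketch of how code typically consumes an Option: a match forces us to handle both the Some and the None case. The full_name helper is hypothetical and is not part of our program.

```rust
// Hypothetical helper: joins name parts, skipping the middle name when absent.
fn full_name(first: &str, middle: Option<&str>, last: &str) -> String {
    match middle {
        // Some(m) carries a value we can use.
        Some(m) => format!("{} {} {}", first, m, last),
        // None carries nothing, so we leave the middle name out.
        None => format!("{} {}", first, last),
    }
}

fn main() {
    println!("{}", full_name("Alfred", None, "Tennyson"));
    println!("{}", full_name("Alfred", Some("Clayton"), "Tennyson"));
}
```

If we forgot the None arm, the compiler would reject the match, which is how Rust gets by without null checks.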

Since we don't want a middle_name for Tennyson, when we create our struct literal later, we set middle_name: None.

And now, it is utterly without irony that we shall run the program and see...

{
  "firstName": "Alfred",
  "middleName": null,
  "lastName": "Tennyson",
  "aliases": [
    "Alfie",
    "Tenny"
  ]
}

Yup, that's right! Our middleName field's None value was converted to a JSON null. (And were we to deserialize the JSON, it would be imported back in as a None.)

But we don't want a null in our JSON. We just want the field to be omitted. Go allowed us to use the json:",omitempty" modifier. Rust gives us a similar method using an attribute:

struct Person {
    first_name: String,

    #[serde(skip_serializing_if="Option::is_none")]
    middle_name: Option<String>,

    last_name: String,
    aliases: Vec<String>,
}

Rust does not have a single built-in notion of an "empty value" the way Go's omitempty does. So when we tell Serde to omit a value, we also have to tell it under what conditions to omit it. In our example above, we tell it to skip serializing if the function Option::is_none() returns true for middle_name.

As a further example, if we wanted to omit aliases when the vector was empty, we could use #[serde(skip_serializing_if="Vec::is_empty")].
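The strings in those attributes are just paths to ordinary functions that take a reference to the field and return a bool. Here is a sketch of that idea using only the standard library. The should_skip function is my own hypothetical stand-in for the check a serializer would make; it is not something Serde exposes.

```rust
// Hypothetical stand-in for a serializer's check: call the predicate with a
// reference to the field, and skip the field if it returns true.
fn should_skip<T, F>(value: &T, predicate: F) -> bool
where
    F: Fn(&T) -> bool,
{
    predicate(value)
}

fn main() {
    let no_middle: Option<String> = None;
    let middle = Some(String::from("Clayton"));
    let no_aliases: Vec<String> = Vec::new();

    // Option::is_none and Vec::is_empty are passed as plain functions.
    println!("{}", should_skip(&no_middle, Option::is_none));
    println!("{}", should_skip(&middle, Option::is_none));
    println!("{}", should_skip(&no_aliases, Vec::is_empty));
}
```

Because the predicate is just a function, we could also point the attribute at a function of our own, as long as it has that same shape.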

At this point, our Go and Rust programs are both producing the same output. As one last exercise, let's see if we can modify the program to read a JSON file into a Person, and then print that output.

Reading and Parsing a File

This version of the code is going to be a little shorter. Let's save the JSON output of our program to a file called person.json in the project's root directory (next to Cargo.toml).

{
  "firstName": "Alfred",
  "lastName": "Tennyson",
  "aliases": [
    "Alfie",
    "Tenny"
  ]
}

Now let's change around our program to read that file and deserialize it. Since we're at the end of a long article, we're going to be lazy about our error handling and use unwrap() a few times, which means our program will exit (via panic) if anything goes wrong.

extern crate serde_json;

#[macro_use]
extern crate serde_derive;

use std::fs::File;

#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all="camelCase")]
struct Person {
    first_name: String,
    #[serde(skip_serializing_if="Option::is_none")]
    middle_name: Option<String>,
    last_name: String,
    aliases: Vec<String>,
}


fn main() {
    let f = File::open("person.json").unwrap();
    let person: Person = serde_json::from_reader(f).unwrap();
    println!("{:?}", person)
}

To read the file, we're using std::fs::File. Just as Go's os.File is a reader, so is Rust's. So we can pass that file handle directly to serde_json::from_reader(). As we saw in the previous article, since we declare that person is a Person on the left side, Rust will read the JSON file into a Person. (Contrast this with Go's Unmarshal, where we have to pass a pointer into the Unmarshal function so that it can reflect and determine where to put the data).
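The same "the declared type picks the result" behavior shows up in the standard library, which makes for a small dependency-free illustration. In this sketch, as_int and as_float are hypothetical helpers; the identical parse() call produces a different type depending on what the compiler needs.

```rust
// Hypothetical helpers: the body is identical, and the return type alone
// decides what parse() produces.
fn as_int(s: &str) -> i32 {
    s.parse().unwrap()
}

fn as_float(s: &str) -> f64 {
    s.parse().unwrap()
}

fn main() {
    // The annotation on the left side drives the parse, just as
    // `let person: Person = ...` drives the deserialization above.
    let n: i32 = "42".parse().unwrap();
    let x: f64 = "42".parse().unwrap();
    println!("{} {}", n, x);
    println!("{} {}", as_int("7"), as_float("7"));
}
```

As far as I can tell, serde_json::from_reader is generic over its return type in much the same way.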

From this point, we have a Person object. In the last line of our main() function, we just use the println!() macro to output the debugging version of the person object.

The result we get will be:

Person { first_name: "Alfred", middle_name: None, last_name: "Tennyson", aliases: ["Alfie", "Tenny"] }

We can see here that even though the middle_name field was not present in our person.json input, it was deserialized into a None to fulfill the requirements of the Option<String>.
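Since we were lazy with unwrap() above, here is a sketch of what less lazy file handling might look like, using only the standard library. The read_file helper is hypothetical, and it stops at reading the text; handing the result to a deserializer is left out.

```rust
use std::fs::File;
use std::io::{self, Read};

// Hypothetical helper: returns the file's contents, or the I/O error
// instead of panicking.
fn read_file(path: &str) -> io::Result<String> {
    // The ? operator returns early with the error if either step fails.
    let mut f = File::open(path)?;
    let mut contents = String::new();
    f.read_to_string(&mut contents)?;
    Ok(contents)
}

fn main() {
    match read_file("person.json") {
        Ok(text) => println!("read {} bytes", text.len()),
        Err(err) => eprintln!("could not read person.json: {}", err),
    }
}
```

With this shape, a missing person.json produces a readable message on STDERR rather than a panic.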

Conclusion

In this third part of the "From Go to Rust" series, we compared Go's encoding/json to Rust's serde_json (and family). Along the way, we got a much deeper look at attributes -- especially serde's attributes. We also took a look at vectors and options, two core pieces of Rust's standard library. We created our first struct, and also did our first bit of file I/O. We even took a quick look at the serde_yaml library to see how Serde provides a common decoding and encoding framework.