Efficiently deserialize polymorphic JSON maps in Rust


I have a JSON with a structure like this:

[
  {
    "id": 123,
    "type": "a",
    "field1": "...",
    ...
  },
  {
    "id": 456,
    "type": "b",
    "field2": "...",
    ...
  },
  ...
]

I want to deserialize it into a structure that stores elements of different types in separate maps. I managed to do this with serde_json by first parsing the array into a single list of polymorphic elements.

use serde::Deserialize;

#[derive(Deserialize)]
struct A {
  field1: String,
  ...
}

#[derive(Deserialize)]
struct B {
  field2: String,
  ...
}

#[derive(Deserialize)]
#[serde(tag = "type", rename_all = "lowercase")]
enum ElementType {
  A(A),
  B(B),
}

#[derive(Deserialize)]
struct Element {
    id: u64,
    #[serde(flatten)]
    type_: ElementType,
}

Then I convert this structure to my desired format.

use std::collections::HashMap;

struct Document {
  a_fields: HashMap<u64, A>,
  b_fields: HashMap<u64, B>,
}

impl From<Vec<Element>> for Document {
    fn from(input: Vec<Element>) -> Document {
        let mut result = Document { a_fields: HashMap::new(), b_fields: HashMap::new() };
        // A Vec is already iterable by value, so no explicit into_iter() is needed.
        for element in input {
            match element.type_ {
                ElementType::A(a) => {
                    result.a_fields.insert(element.id, a);
                }
                ElementType::B(b) => {
                    result.b_fields.insert(element.id, b);
                }
            }
        }
        result
    }
}

This works, but it is not very efficient: it first reads the entire JSON into an intermediate Vec<Element> and then moves every element into the maps. What's the best way to deserialize the JSON directly into Document?

Answer by drewtato

This is easily done with a custom Deserialize implementation. It's basically pure boilerplate code wrapping the match statement you already wrote.

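use std::collections::HashMap;

use serde::Deserialize;

// These derives compile only if A and B also derive Debug, PartialEq, and Eq.
#[derive(Debug, PartialEq, Eq)]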
pub struct Document {
    pub a_fields: HashMap<u64, A>,
    pub b_fields: HashMap<u64, B>,
}

impl<'de> Deserialize<'de> for Document {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        struct DocumentVisitor;

        impl<'de> serde::de::Visitor<'de> for DocumentVisitor {
            type Value = Document;

            fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
                write!(formatter, "a list of A and B items")
            }

            fn visit_seq<S>(self, mut seq: S) -> Result<Self::Value, S::Error>
            where
                S: serde::de::SeqAccess<'de>,
            {
                let mut document = Document {
                    a_fields: HashMap::new(),
                    b_fields: HashMap::new(),
                };

                while let Some(element) = seq.next_element::<Element>()? {
                    match element.type_ {
                        ElementType::A(a) => {
                            document.a_fields.insert(element.id, a);
                        }
                        ElementType::B(b) => {
                            document.b_fields.insert(element.id, b);
                        }
                    }
                }

                Ok(document)
            }
        }

        deserializer.deserialize_seq(DocumentVisitor)
    }
}
