Parallel JSON deserialization fails with valid JSON

I want to deserialize JSON values in parallel using rayon. A valid JSON document from the serde_json example fails to deserialize inside par_iter, even though it parses correctly without parallelization. This is the code:

use rayon::prelude::*; // 1.7.0
use serde_json::{Result, Value};

fn main() -> Result<()> {
    let data = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;
    let v: Value = serde_json::from_str(data)?;
    println!("Please call {} at the number {}", v["name"], v["phones"][0]);

    let mut batch = Vec::<String>::new();
    batch.push(data.to_string());
    batch.push(data.to_string());
    
    let _values = batch.par_iter()
        .for_each(|json: &String| {
            serde_json::from_str(json.as_str()).unwrap()
        });
        
    Ok(())
}

and this is the error:

thread 'thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Error("invalid type: map, expected unit", line: 2, column: 8)', src/main.rs:23:49

IIRC, I've seen other par_iter examples that use unwrap inside. Is this not recommended? In my case, I want to do it because I need the program to panic if an invalid input comes in.

Best answer, by Finomnis:

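serde_json::from_str infers its output type from the context its result is used in, such as the type of the variable it is assigned to. In your case, however, the closure passed to for_each must return (), so from_str attempts to deserialize the JSON into a (). That fails because the input is a map, hence "invalid type: map, expected unit".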

Use map().collect() together with a : Vec<Value> annotation to make this work:

use rayon::prelude::*; // 1.7.0
use serde_json::{Result, Value};

fn main() -> Result<()> {
    let data = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;
    let v: Value = serde_json::from_str(data)?;
    println!("Please call {} at the number {}", v["name"], v["phones"][0]);

    let mut batch = Vec::<String>::new();
    batch.push(data.to_string());
    batch.push(data.to_string());

    let values: Vec<Value> = batch
        .par_iter()
        .map(|json: &String| serde_json::from_str(json.as_str()).unwrap())
        .collect();

    println!("Values:\n{:#?}", values);

    Ok(())
}

Output:

Please call "John Doe" at the number "+44 1234567"
Values:
[
    Object {
        "age": Number(43),
        "name": String("John Doe"),
        "phones": Array [
            String("+44 1234567"),
            String("+44 2345678"),
        ],
    },
    Object {
        "age": Number(43),
        "name": String("John Doe"),
        "phones": Array [
            String("+44 1234567"),
            String("+44 2345678"),
        ],
    },
]

Although honestly, it's a little weird to use serde_json::Value; usually people deserialize directly into a struct:

use rayon::prelude::*;
use serde::{Deserialize, Serialize};
use serde_json::Result;

#[derive(Debug, Serialize, Deserialize)]
struct Entry {
    name: String,
    age: u32,
    phones: Vec<String>,
}

fn main() -> Result<()> {
    let data = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;
    let v: Entry = serde_json::from_str(data)?;
    println!("Please call {} at the number {}", v.name, v.phones[0]);

    let mut batch = Vec::<String>::new();
    batch.push(data.to_string());
    batch.push(data.to_string());

    let values: Vec<Entry> = batch
        .par_iter()
        .map(|json: &String| serde_json::from_str(json.as_str()).unwrap())
        .collect();

    println!("Values:\n{:#?}", values);

    Ok(())
}

Output:

Please call John Doe at the number +44 1234567
Values:
[
    Entry {
        name: "John Doe",
        age: 43,
        phones: [
            "+44 1234567",
            "+44 2345678",
        ],
    },
    Entry {
        name: "John Doe",
        age: 43,
        phones: [
            "+44 1234567",
            "+44 2345678",
        ],
    },
]