Using rayon's parallel iterators for a tree object

70 Views Asked by At

I have a Vec<ValidRelationshipTree> and I would like to use rayon's parallel iteration to process each tree in parallel. I cannot do so as my ValidRelationshipTree doesn't satisfy std::marker::Sized.

Definition of ValidRelationshipTree, whose size is clearly known at compile time:

// size = 16 (0x10), align = 0x8
struct ValidRelationshipTree {
    root: RefCell<Rc<ValidRelationshipNode>>,
}

Compilation error:

the method `into_par_iter` exists for struct `Vec<ValidRelationshipTree>`, but its trait bounds were not satisfied
the following trait bounds were not satisfied:
`[metadata::relationships::ValidRelationshipTree]: std::marker::Sized`
which is required by `[metadata::relationships::ValidRelationshipTree]: rayon::iter::IntoParallelIterator`
`[metadata::relationships::ValidRelationshipTree]: rayon::iter::ParallelIterator`
which is required by `[metadata::relationships::ValidRelationshipTree]: rayon::iter::IntoParallelIterator`

Since my structure's size is known at compile time, I don't understand how std::marker::Sized isn't implemented. I've tried to read through this trait's documentation, researching similar rayon issues, but I just don't understand. I even tried creating a wrapper struct around my Vec<ValidRelationshipTree> but no dice.

If anyone could help fill in my gap of knowledge that would be greatly appreciated <3.

In case it's relevant, here is the definition of ValidRelationshipNode:

struct ValidRelationshipNode {
    payload: String,
    extended: String,                             
    parent: RefCell<Weak<ValidRelationshipNode>>, // Use Weak<T> to avoid reference cycles
    children: RefCell<Vec<Rc<ValidRelationshipNode>>>,
}

Concrete code example

The following sequential block compiles just fine:

let forest: Vec<ValidRelationshipTree> = self.generate_relationship_trees(n_tables, true);

let mut extended: Vec<String> = forest
    .into_iter()
    .flat_map(|tree: ValidRelationShipTree| {
        tree.nodes_with_length(n_tables)
            .into_iter()
            .map(|node: Rc<ValidRelationshipNode>| node.extended().clone())
    })
    .collect();

And a trivial test case that errors:

let forest: Vec<ValidRelationshipTree> = self.generate_relationship_trees(n_tables, true);

let extended_par: Vec<String> = forest
    .into_par_iter() // ERROR: the following trait bounds were not satisfied:
                     // [metadata::relationships::ValidRelationshipTree]: std::marker::Sized`
    .map(|node| node.extended().clone())
    .collect();
2

There are 2 best solutions below

0
drewtato On BEST ANSWER

If you instead call into_par_iter as an associated function on Vec, you get a more helpful error:

Vec::into_par_iter(forest)
error[E0277]: `Rc<ValidRelationshipNode>` cannot be sent between threads safely
  --> src/lib.rs:44:5
   |
44 |     Vec::into_par_iter(forest)
   |     ^^^ `Rc<ValidRelationshipNode>` cannot be sent between threads safely
   |
   = help: the trait `Send` is not implemented for `Rc<ValidRelationshipNode>`
   = help: the following other types implement trait `rayon::iter::IntoParallelIterator`:
             Vec<T>
             &'data Vec<T>
             &'data mut Vec<T>
   = note: required for `RefCell<Rc<ValidRelationshipNode>>` to implement `Send`
note: required because it appears within the type `ValidRelationshipTree`
  --> src/lib.rs:5:12
   |
5  | pub struct ValidRelationshipTree {
   |            ^^^^^^^^^^^^^^^^^^^^^
   = note: required for `Vec<ValidRelationshipTree>` to implement `rayon::iter::IntoParallelIterator`

The simplest way to get around this is to use Arc instead of Rc and Mutex or RwLock instead of RefCell.

use std::sync::Mutex;
use std::sync::{Arc, Weak};

struct ValidRelationshipTree {
    root: Mutex<Arc<ValidRelationshipNode>>,
}

struct ValidRelationshipNode {
    payload: String,
    extended: String,
    parent: Mutex<Weak<ValidRelationshipNode>>,
    children: Mutex<Vec<Arc<ValidRelationshipNode>>>,
}

Then this code probably works, depending on the other functions involved:

forest
    .into_par_iter()
    .flat_map(|tree| {
        tree.nodes_with_length(n_tables)
            .into_par_iter()
            .map(|node| node.extended().clone())
    })
    .collect()
0
ejovo13 On

The error is probably misleading and the actual problem likely lies in the fact that the ValidRelationshipTree and ValidRelationshipNode structures are using Rc and RefCell which are not thread-safe.