Creating dataframe with missing values

I am experienced in Polars' Python package and just starting to use the Rust crate.

In a function I want to return a DataFrame that almost certainly has columns with missing values. My current approach is to create vectors with sentinel values as a starting point for a DataFrame and then I hope to replace those values with nulls. But I'm not having much success.

I can create the vectors and DataFrame with something like this:

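let mut a_vec: Vec<i64> = Vec::with_capacity(10);

for i in 0..10 {
    if <condition> {
        // push appends an element; with_capacity only reserves space,
        // so indexing with a_vec[i] on the still-empty Vec would panic
        a_vec.push(1);
    } else {
        // sentinel standing in for a missing value
        a_vec.push(std::i64::MAX);
    }
}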

let mut df: DataFrame = df!("a" => a_vec).unwrap();

Now I want to replace std::i64::MAX with null.

In Python Polars I can use the replace method, but I haven't found a (good) way to do this in Rust.

If there is a better way to do this where I can avoid the sentinel values I'm all ears.

Chayim Friedman On BEST ANSWER

The proper way is to create a vector of Options:

let mut a_vec: Vec<Option<i64>> = Vec::with_capacity(10);

for i in 0..10 {
    if i != 5 {
        a_vec.push(Some(1));
    } else {
        a_vec.push(None);
    }
}

let mut df: DataFrame = df!("a" => a_vec).unwrap();
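
If a vector with sentinel values already exists, it can be converted into a vector of Options before the DataFrame is built. This is a minimal sketch using only the standard library; sentinel_to_nulls is a hypothetical helper name, not a Polars function:

```rust
/// Hypothetical helper (not part of Polars): maps every element equal to
/// `sentinel` to `None` and wraps every other element in `Some`.
fn sentinel_to_nulls(values: &[i64], sentinel: i64) -> Vec<Option<i64>> {
    values
        .iter()
        .map(|&v| if v == sentinel { None } else { Some(v) })
        .collect()
}
```

The returned Vec<Option<i64>> can then be passed to df! in the same way as a_vec above.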

You can also use chunked array builders, which should be more efficient because they avoid the intermediate Vec:

let mut a_vec: PrimitiveChunkedBuilder<Int64Type> = PrimitiveChunkedBuilder::new("a", 10);

for i in 0..10 {
    if i != 5 {
        a_vec.append_value(1);
    } else {
        a_vec.append_null();
    }
}

let a_vec = a_vec.finish();

let mut df: DataFrame = df!("a" => a_vec).unwrap();