Concise way to retrieve a row from a Polars DataFrame with an iterator of column-value pairs


I often need to retrieve a row from a Polars DataFrame given a collection of column values, much as I would use a composite key in a database. This is possible in Polars using DataFrame.row with by_predicate, but the resulting expression is very verbose:

import polars as pl

row_index = {'treatment': 'red', 'batch': 'C', 'unit': 76}

row = df.row(by_predicate=(
    (pl.col('treatment') == row_index['treatment'])
    & (pl.col('batch') == row_index['batch'])
    & (pl.col('unit') == row_index['unit'])
))

The most succinct method I've found is:

from functools import reduce
from operator import and_

expr = reduce(and_, (pl.col(k) == v for k, v in row_index.items()))

row = df.row(by_predicate=expr)

But that is still verbose and hard to read. Is there an easier way, possibly built-in Polars functionality I'm missing?

Best answer, by jqurious

(a == b) & (c == d) evaluates to true only when all of the conditions are true.

Another way to express this is with pl.all_horizontal(), which combines any number of boolean expressions with a logical AND:

pl.all_horizontal(a == b, c == d)

You can pass your generator expression to it directly:

expr = pl.all_horizontal(
    pl.col(k) == v for k, v in row_index.items()
)

df.row(by_predicate=expr)