Fastest way to structure 3D array in vaex for filtering


I have an application that deals with a large three-dimensional array containing an index mapping to a geographic location (~10k unique values), a timestamp (hourly for a full year, so ~9k values), and ~20 different values for each index and timestamp. These are supposed to be displayed in a Dash dashboard, where users can filter on both index and timestamp and obtain the values with as little delay as possible.

I saw the taxi data example at https://dash.vaex.io/, which has impressive performance on a similarly sized application, and decided to experiment with vaex.

I could replicate the taxi example by reshaping, but I would lose the ordering of either the index or the timestamps. This seems counter-intuitive to me, as you lose the structure and can no longer filter using both row and column.
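To make the reshaping option concrete, here is a minimal sketch of a long layout in which both keys are kept as explicit columns, so filtering on both is still possible. It uses plain NumPy only; the array sizes, the column names `location`, `hour` and `value_k`, and the chosen filter are assumptions for illustration, not taken from the real data. A dict of 1D arrays like this could then be handed to a dataframe constructor such as `vaex.from_arrays`.

```python
import numpy as np

# Hypothetical small sizes; the real data has ~10k locations, ~9k hours, ~20 values.
n_loc, n_time, n_val = 4, 6, 3
cube = np.arange(n_loc * n_time * n_val, dtype=np.float64).reshape(n_loc, n_time, n_val)

# One row per (location, hour) pair, with both keys stored as explicit columns.
loc_idx, time_idx = np.meshgrid(np.arange(n_loc), np.arange(n_time), indexing="ij")
columns = {"location": loc_idx.ravel(), "hour": time_idx.ravel()}
for k in range(n_val):
    # Row-major flattening matches the order of the key columns above.
    columns[f"value_{k}"] = cube[:, :, k].ravel()

# Filter on both keys at once with a boolean mask.
mask = (
    np.isin(columns["location"], [1, 3])
    & (columns["hour"] >= 2)
    & (columns["hour"] < 5)
)
selected = {name: col[mask] for name, col in columns.items()}
```

In this layout the table has `n_loc * n_time` rows and only about 20 value columns, so it is long rather than wide.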

I could also generate one dataframe for each value, keeping the index as rows and the timestamps as columns. This way both the index and the timestamps are preserved, but I end up with multiple dataframes. I have also read that vaex does not work well with wide dataframes.
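For comparison, a minimal sketch of this second layout, again in plain NumPy with hypothetical sizes and names: one 2D table per value, with locations as rows and timestamps as columns, filtered on both axes via `np.ix_`.

```python
import numpy as np

# Hypothetical small sizes; the real data has ~10k locations, ~9k hours, ~20 values.
n_loc, n_time, n_val = 4, 6, 3
cube = np.arange(n_loc * n_time * n_val, dtype=np.float64).reshape(n_loc, n_time, n_val)

# One 2D table per value: rows are locations, columns are timestamps.
tables = {f"value_{k}": cube[:, :, k] for k in range(n_val)}

# Select a set of locations and a range of hours in every table.
rows = np.array([1, 3])
cols = np.arange(2, 5)
subset = {name: table[np.ix_(rows, cols)] for name, table in tables.items()}
```

Each table here has one column per timestamp, which at full scale would mean ~9k columns per table.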

What is the best practice? Is there a better way I have not thought of? Should I look into Spark or other tools?
