Flask FileSystem Cache for Large Datasets

I have a Dash application that takes in multiple CSV files and creates a combined DataFrame for analysis and visualization. This computation usually takes around 30-35 seconds for datasets of 600-650 MB. I'm using the Flask filesystem cache to store this DataFrame once, so that every subsequent request gets the data from the cache.

I used the code from Dash's example here.

I have two problems here:

  1. Since the cache lives on the filesystem, the first request takes roughly twice as long (nearly 70 seconds) to get the DataFrame; only subsequent requests come back quickly. Can I use any other cache type to avoid this overhead (for example, something like the Redis-backed config sketched right after this list)?

  2. I tried to have my cache cleared automatically by setting CACHE_THRESHOLD (for example, I set it to 1), but it isn't working and I keep seeing new files being added to the cache directory.
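
For reference, the kind of alternative I have in mind for problem 1 is something like the Redis-backed config below. This is only a sketch: the Redis URL and timeout are placeholders, and I haven't actually tried it yet.

# Hypothetical alternative backend: Redis instead of the filesystem.
# Assumes Flask-Caching and a Redis server reachable at localhost:6379
# (the URL below is just a placeholder); untested sketch.
from flask_caching import Cache

cache = Cache(app.server, config={
    'CACHE_TYPE': 'redis',
    'CACHE_REDIS_URL': 'redis://localhost:6379/0',
    'CACHE_DEFAULT_TIMEOUT': 300  # seconds; placeholder value
})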

Sample code:

import dash
from flask_caching import Cache

app = dash.Dash(__name__)

# Filesystem cache attached to the underlying Flask server
cache = Cache(app.server, config={
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': 'my-cache-directory',
    'CACHE_THRESHOLD': 1
})

app.layout = app_layout  # layout defined elsewhere

@cache.memoize()
def getDataFrame():
    # Expensive step: builds the combined DataFrame from the CSV files
    df = createLargeDataFrame()
    return df

@app.callback(...)  # Callback that uses the DataFrame
def useDataFrame():
    df = getDataFrame()
    # Using the DataFrame here
    return value
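
For context, createLargeDataFrame essentially just reads the CSV files and concatenates them. A simplified sketch (the data/*.csv path is a placeholder) looks like this:

import glob
import pandas as pd

def createLargeDataFrame():
    # Read every CSV and concatenate them into one combined DataFrame.
    # This is the step that takes ~30-35 seconds for 600-650 MB of data.
    files = glob.glob('data/*.csv')  # placeholder path
    frames = [pd.read_csv(f) for f in files]
    return pd.concat(frames, ignore_index=True)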

Can someone help me with this? Thanks.
