What is the state-of-the-art way to build LSTM data: data_generator or tf.data.Dataset.window?


I have always written my own code to format my data (3D reshaping, normalization, ...) for my LSTM models. Now I have to work with bigger datasets and need to ingest many CSV files. What is the best way to do all of this fast (reducing I/O) and in a memory-efficient way?
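
To make the question concrete, here is a minimal sketch of what I mean by streaming many CSVs lazily with tf.data (the file pattern, column count, and header assumption are placeholders, not my real data):

import tensorflow as tf

CSV_PATTERN = "data/*.csv"   # placeholder file pattern
NUM_COLUMNS = 8              # placeholder: feature columns + label, all numeric

# List files deterministically so the time order across files is preserved.
files = tf.data.Dataset.list_files(CSV_PATTERN, shuffle=False)

# Read rows lazily, file by file, instead of loading everything into memory.
rows = files.flat_map(
    lambda path: tf.data.experimental.CsvDataset(
        path,
        record_defaults=[tf.float32] * NUM_COLUMNS,
        header=True,
    )
)

# Each element is a tuple of scalars (one per column); stack them into a row
# vector so windowing code can slice features and label by column index.
rows = rows.map(lambda *cols: tf.stack(cols), num_parallel_calls=tf.data.AUTOTUNE)

I used flat_map rather than interleave so the rows stay in time order across files; from there a windowing pipeline like the one below would apply, but I don't know whether this is actually the I/O- and memory-efficient way to do it.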

TensorFlow suggests using a data generator and finally converting the data to a tf.data.Dataset, and I found someone doing something like this:

WINDOW_SIZE = 72
BATCH_SIZE = 32
dataset = (
    tf.data.Dataset.from_tensor_slices(dataset_train)
    # Sliding windows of WINDOW_SIZE rows; drop_remainder avoids short trailing
    # windows that would break the fixed-shape batching below.
    .window(WINDOW_SIZE, shift=1, drop_remainder=True)
    # .window() yields a dataset of sub-datasets; flatten each window back into
    # a single (WINDOW_SIZE, n_columns) tensor.
    .flat_map(lambda seq: seq.batch(WINDOW_SIZE))
    # All columns but the last are features; the label is the last column of
    # the last timestep in the window.
    .map(lambda seq_and_label: (seq_and_label[:, :-1], seq_and_label[-1:, -1]))
    .batch(BATCH_SIZE)
)
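
For comparison, the data-generator route I'm referring to would, as far as I understand it, look roughly like this (the feature count is a placeholder, and it reuses WINDOW_SIZE, BATCH_SIZE, and dataset_train from above):

import tensorflow as tf

N_FEATURES = 7  # placeholder: number of feature columns in dataset_train

def window_generator(array):
    # `array` is a 2D array: feature columns plus the label as the last column.
    for start in range(len(array) - WINDOW_SIZE + 1):
        chunk = array[start:start + WINDOW_SIZE]
        yield chunk[:, :-1], chunk[-1:, -1]  # (WINDOW_SIZE, N_FEATURES), (1,)

dataset_gen = tf.data.Dataset.from_generator(
    lambda: window_generator(dataset_train),
    output_signature=(
        tf.TensorSpec(shape=(WINDOW_SIZE, N_FEATURES), dtype=tf.float32),
        tf.TensorSpec(shape=(1,), dtype=tf.float32),
    ),
).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

Both versions can be passed straight to model.fit(); what I can't judge is which one holds up better for many large CSV files.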

I really want to learn the best way; my goal is to use my code in production and to learn more about MLOps in the future. Thanks for your help, and if you have a well-explained example of setting up a 3D LSTM tf.data.Dataset, I'll take all suggestions.
