Problem
I have a *.tfrecords file that I want to feed into a ConvLSTM2D model built with TensorFlow. Here is the model structure.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import ConvLSTM2D, BatchNormalization, Flatten, Dense

model = Sequential([
    # Each sample is a clip of 20 frames, 224x224 RGB
    ConvLSTM2D(64, (3, 3), activation='relu', input_shape=(20, 224, 224, 3), return_sequences=True),
    BatchNormalization(),
    ConvLSTM2D(64, (3, 3), activation='relu', return_sequences=True),
    BatchNormalization(),
    Flatten(),
    Dense(1, activation='sigmoid')
])
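For scale: with the default valid padding, the two ConvLSTM2D layers shrink 224×224 to 220×220, and return_sequences=True keeps all 20 timesteps, so Flatten emits 20 × 220 × 220 × 64 ≈ 62 million features. The final Dense(1) therefore carries ~62 million weights (~248 MB as float32), and Adam adds two moment buffers of the same size.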
When I try to fit my data into the model, it consumes all of the system RAM. Tested on an M1 MacBook (2020) in Jupyter Notebook and PyCharm, and on Google Colab.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_input_fn(), steps_per_epoch=5, validation_data=val_input_fn(), epochs=10)
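On the input side, one sample at (20, 224, 224, 3) is about 3 million values, roughly 12 MB as float32, so a batch of 64 is ~770 MB of input tensors alone, before counting the per-timestep activations the two return_sequences=True layers keep. Any input pipeline that materializes more than a few batches at once will exhaust RAM quickly.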
What we are doing
We have the Shanghai dataset, which contains fight and non-fight videos. We are trying to classify fight vs. non-fight videos using a Convolutional Long Short-Term Memory (ConvLSTM) network.
We have 800 training videos. We capture frames at 250 ms intervals, convert them into NumPy arrays, and store all the arrays in a TFRecord file.
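Roughly, the preprocessing looks like this (a minimal sketch; the helper names, the uint8 storage, and the feature keys are simplifications for illustration, not necessarily what the notebook does):

import cv2
import numpy as np
import tensorflow as tf

def video_to_frames(path, interval_ms=250, size=(224, 224), max_frames=20):
    # Grab one frame every interval_ms milliseconds, resized to `size`.
    cap = cv2.VideoCapture(path)
    frames, t = [], 0.0
    while len(frames) < max_frames:
        cap.set(cv2.CAP_PROP_POS_MSEC, t)
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.resize(frame, size))
        t += interval_ms
    cap.release()
    return np.stack(frames).astype(np.uint8)  # (20, 224, 224, 3)

def make_example(frames, label):
    # Serialize one clip plus its label as a tf.train.Example.
    return tf.train.Example(features=tf.train.Features(feature={
        'frames': tf.train.Feature(bytes_list=tf.train.BytesList(value=[frames.tobytes()])),
        'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))

with tf.io.TFRecordWriter('train.tfrecords') as writer:
    for path, label in train_files:  # (video path, 0/1) pairs, built below
        writer.write(make_example(video_to_frames(path), label).SerializeToString())

In this sketch the frames are written as uint8, a quarter the size of float32; the cast back to float happens later, in the input pipeline.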
We pass the dataset into our model using the function train_input_fn(), which reads the TFRecord file and feeds the data to the model.
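A simplified version of that input function, assuming the feature keys from the sketch above:

def train_input_fn(batch_size=16):
    feature_spec = {
        'frames': tf.io.FixedLenFeature([], tf.string),
        'label': tf.io.FixedLenFeature([], tf.int64),
    }

    def parse(record):
        parsed = tf.io.parse_single_example(record, feature_spec)
        frames = tf.io.decode_raw(parsed['frames'], tf.uint8)
        frames = tf.reshape(frames, (20, 224, 224, 3))
        # Cast per element so only the current batch exists as float32.
        return tf.cast(frames, tf.float32) / 255.0, parsed['label']

    return (tf.data.TFRecordDataset('train.tfrecords')
            .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(batch_size)
            .prefetch(1))

The point that matters for the RAM question: nothing here calls .cache() or loads the file whole, so tf.data should stream records on demand and keep only a few batches in memory at a time.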
You can see the Colab notebook here.
Dataset structure is given below:

Dataset
- train
  - Fight     # has 800 *.avi files
  - NonFight  # has 800 *.avi files
- val
  - Fight     # has 200 *.avi files
  - NonFight  # has 200 *.avi files
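The labels come straight from the directory names; a small (hypothetical) helper like this builds the (path, label) pairs the TFRecord-writing sketch above iterates over:

from pathlib import Path

def list_videos(split):
    # Fight -> 1, NonFight -> 0, derived from the folder layout above.
    root = Path('Dataset') / split
    return [(str(p), 1 if p.parent.name == 'Fight' else 0)
            for p in sorted(root.glob('*/*.avi'))]

train_files = list_videos('train')
val_files = list_videos('val')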
What we have tried
- Minimized batch_size from 64 down to 16
- Reduced the whole dataset from 800 videos down to 200 videos in the train set
- Tried to reduce the filter size of ConvLSTM2D
- Did all of the same things with *.mp4 files
- Removed one BatchNormalization() and ConvLSTM2D layer
Try doing it with PySpark rather than using up your own RAM. You will find that most big-data ML/DL solutions are built on Spark.
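As a rough sketch of what that could look like (cluster setup, paths, and partition count are placeholders), the heavy frame-extraction stage can be distributed across Spark executors so no single machine decodes all of the videos:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('fight-detection-preprocess').getOrCreate()
sc = spark.sparkContext

def extract(path_label):
    # Runs on the executors; assumes OpenCV is installed on the workers
    # and that they can reach the video files (shared FS or object store).
    path, label = path_label
    return video_to_frames(path), label  # reuses the helper sketched in the question

pairs = sc.parallelize(train_files, numSlices=32)  # (path, label) pairs
features = pairs.map(extract)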