TensorFlow GELU out-of-memory error, but not ReLU


The GELU activation causes my model to hit an OOM error during training, but when I switch to ReLU the problem goes away. Even with the model doubled in size, the ReLU version trains fine.

import tensorflow_addons as tfa
from tensorflow.keras import layers as KL

# Select the activation layer for this block of the model
if activation == "gelu":
    out = tfa.layers.GELU()(out)
elif activation == "relu":
    out = KL.ReLU()(out)
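
For reference (I have not verified whether it changes the memory behavior), the same branch without the Addons dependency, using the built-in activation, would look like this, assuming TF >= 2.4 where tf.nn.gelu is available:

import tensorflow as tf
from tensorflow.keras import layers as KL

# Built-in GELU via a generic Activation layer; tf.nn.gelu
# defaults to the exact (erf-based) formulation.
out = KL.Activation(tf.nn.gelu)(out)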

The OOM is reported at an upsampling node rather than at the GELU op itself (traceback below), but since the two models are identical apart from the activation function, I don't think the upsampling layer is the real cause.
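
In case the allocator setup matters for diagnosis: TensorFlow's allocator can be switched from its default up-front reservation to on-demand growth (a standard TF 2.x setting), which can make the reported peak usage easier to interpret:

import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all up front
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)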

    File ".../python3.9/site-packages/keras/backend.py", line 3693, in resize_images
  x = tf.image.resize(x, new_shape, method=interpolations[interpolation])
Node: 'model/up_sampling2d_2/resize/ResizeNearestNeighbor'
2 root error(s) found.
  (0) RESOURCE_EXHAUSTED:  OOM when allocating tensor with shape[8,320,240,64] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node model/up_sampling2d_2/resize/ResizeNearestNeighbor}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. This isn't available when running in Eager mode.
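
For scale, the single tensor in the failing allocation is itself fairly modest, so the OOM presumably reflects cumulative usage rather than this one buffer:

# Size of the tensor the allocator could not fit:
# shape [8, 320, 240, 64], float32 = 4 bytes per element
elems = 8 * 320 * 240 * 64      # 39,321,600 elements
print(elems * 4 / 2**20)        # 150.0 MiB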