Can a low gradient norm be an indicator of a problem in my deep learning model?


Does anyone know how to troubleshoot deep learning model training using the gradient norm? I am reproducing the work of a research paper but not getting the same results. I am training a model of 16 residual blocks with ReLU activations, categorical cross-entropy loss, the Adam optimizer, and a step LR scheduler: 10 epochs, batch size 12, starting learning rate 0.001, decayed by 0.5 every epoch after the 6th. The dataset is imbalanced, so I tried a weighted loss and focal loss, but neither helped. To troubleshoot the training I computed the total gradient norm every 1000 batches and noticed the values are small (around 1e-4). Is this normal, or can it be an indicator of some problem?

[Plot: Total Gradient Norm]
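For reference, this is roughly how the total gradient norm is computed after a backward pass. This is a minimal sketch assuming PyTorch (the question mentions Adam and a step LR scheduler); the tiny `nn.Sequential` model here is a hypothetical stand-in for the 16-residual-block network:

```python
import torch
import torch.nn as nn

# Hypothetical small model standing in for the 16-residual-block network.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
criterion = nn.CrossEntropyLoss()  # categorical cross-entropy
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

x = torch.randn(12, 8)            # batch size 12, as in the question
y = torch.randint(0, 3, (12,))

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()

# Total gradient norm: L2 norm taken over all parameter gradients at once.
total_norm = torch.sqrt(sum(
    p.grad.detach().pow(2).sum()
    for p in model.parameters() if p.grad is not None
))
print(f"total grad norm: {total_norm.item():.4e}")
```

Equivalently, `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=float('inf'))` returns the same total norm without clipping anything, which is a common way to log it.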

I looked for references on interpreting the gradient norm and using it to troubleshoot a model, but found nothing. My second question is: does anyone have good resources on how to troubleshoot and diagnose deep learning models?
