Am I overfitting?


[Plot: validation loss (blue) and training loss (orange)]

[The same plot with less smoothing]

Hi! I am currently training my model with Darkflow YOLOv2. The optimiser is SGD with a learning rate of 0.001. Based on this graph, my validation loss > training loss, which would mean that it is overfitting? If it is, what would be the recommended course of action? It seems odd because both losses are decreasing, but the validation loss decreases more slowly.

For more info: my training dataset consists of 400 images per class with single annotations, for a total of 2,800 images. I did this to prevent class imbalance, by annotating only one class instance per image. My validation dataset consists of 350 images with multiple annotations; basically, I annotated every object within those images. I have 7 classes, and my train-val-test split is 80-10-10. Is this the cause of the validation loss?
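As a quick sanity check, the counts described above are internally consistent, assuming the 80-10-10 split is taken over a single pool of images:

```python
# Hypothetical sanity check of the dataset split described in the question.
num_classes = 7
train_per_class = 400

train_total = num_classes * train_per_class  # 7 * 400 = 2800 training images
pool_total = train_total / 0.80              # 80% train -> pool of 3500 images
val_total = pool_total * 0.10                # 10% validation -> 350 images

print(train_total, int(pool_total), int(val_total))  # 2800 3500 350
```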

1 Answer (by Prune)

Over-fitting shows up as a mismatch: training accuracy diverges from test (validation) accuracy. Since you haven't provided accuracy data, we can't evaluate your model.

It might help to clarify stages and terms; this should let you answer the question for yourself in the future:

"Convergence" is the point in training at which we believe that the model

  • has learned something useful;
  • has reached this point via a reproducible process;
  • isn't going to get significantly better;
  • is about to get worse.

Convergence is where we want to stop training and save (checkpoint) the model for production use.

We detect convergence by use of training passes and testing (validation) passes. At convergence, we expect:

  • validation loss (error function, perplexity, etc.) is at a relative minimum;
  • validation accuracy is at a relative maximum;
  • validation and training metrics are "reasonably stable", with respect to the model's general behaviour;
  • training accuracy and validation accuracy are essentially equal.
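The "relative minimum" criterion above can be sketched as a simple early-stopping check over recorded per-epoch validation losses. This is a minimal sketch, not the answer's prescription: the `patience` window and `min_delta` threshold are illustrative assumptions.

```python
def should_checkpoint(val_losses, patience=3, min_delta=1e-3):
    """Return True when validation loss has stopped improving, i.e. the
    last `patience` epochs gained less than `min_delta` over the best
    loss seen before them -- a rough signal of convergence."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta

# Example: loss flattens out around 0.41 -> time to checkpoint
history = [1.2, 0.9, 0.7, 0.55, 0.45, 0.4100, 0.4099, 0.4101, 0.4102]
print(should_checkpoint(history))  # True
```

In practice the same idea is built into common training frameworks (e.g. Keras's `EarlyStopping` callback), so you rarely need to hand-roll it.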

Once a training run passes this point, it often transitions into "over-fitting", in which the model learns things so specific to the training data that it is no longer as good at inferring about new observations. In this state,

  • training loss drops; validation loss rises;
  • training accuracy rises; validation accuracy drops.
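That divergence pattern can likewise be detected mechanically from the two loss curves. A minimal sketch, assuming per-epoch loss lists; the window size is an illustrative assumption:

```python
def looks_overfit(train_losses, val_losses, window=3):
    """Flag the classic over-fitting signature: over the last `window`
    epochs, training loss keeps falling while validation loss rises."""
    if len(train_losses) < window + 1 or len(val_losses) < window + 1:
        return False
    train_falling = all(train_losses[i] > train_losses[i + 1]
                        for i in range(-window - 1, -1))
    val_rising = all(val_losses[i] < val_losses[i + 1]
                     for i in range(-window - 1, -1))
    return train_falling and val_rising

train = [1.0, 0.8, 0.6, 0.5, 0.42, 0.36, 0.31]
val   = [1.1, 0.9, 0.75, 0.70, 0.73, 0.78, 0.85]
print(looks_overfit(train, val))  # True: train still drops, val climbs
```

Note that in your plot both losses are still decreasing, which does not match this signature; validation loss merely being above training loss is normal and not, by itself, over-fitting.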