I trained a custom BERT (base, uncased) model on 1.2 million texts for a text classification task with 97 categories. The validation and test data sets are around 250k. Because predictions for the entire test set do not fit in memory, I predict the probabilities in batches. However, at inference time I get different probability scores for the same text when it appears in different batches. I would appreciate guidance on resolving this issue.

Expected behavior: the BERT model should predict the same probability for the same text, regardless of which batch it is passed in.
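To make the discrepancy measurable, here is a minimal, framework-free sketch of batched prediction plus a consistency check. Everything in it is hypothetical: `predict_in_batches`, `max_spread_per_text`, and `make_toy_model` are illustrative names, and the toy scorer is not BERT. The `dropout_active` flag only emulates one possible cause (randomness left switched on at inference), which is an assumption and not a confirmed diagnosis of my setup.

```python
import random
from collections import defaultdict


def predict_in_batches(predict_fn, texts, batch_size):
    """Run predict_fn on consecutive slices of texts and concatenate the results."""
    probs = []
    for start in range(0, len(texts), batch_size):
        probs.extend(predict_fn(texts[start:start + batch_size]))
    return probs


def max_spread_per_text(texts, probs):
    """For every text that occurs more than once, return max - min of its scores."""
    seen = defaultdict(list)
    for text, p in zip(texts, probs):
        seen[text].append(p)
    return {t: max(ps) - min(ps) for t, ps in seen.items() if len(ps) > 1}


def make_toy_model(dropout_active, seed=0):
    """Hypothetical stand-in for a classifier (NOT BERT).

    The score depends only on the text, unless dropout_active is True,
    in which case each prediction is scaled by a random factor.
    """
    rng = random.Random(seed)

    def predict_fn(batch):
        out = []
        for text in batch:
            score = (sum(map(ord, text)) % 100 + 1) / 101.0
            if dropout_active:
                # Emulates randomness left on at inference time (assumption).
                score *= 0.5 + rng.random()
            out.append(score)
        return out

    return predict_fn


# The duplicated text lands in three different batches of size 2.
texts = ["refund request", "login issue", "refund request", "billing", "refund request"]
stable = max_spread_per_text(texts, predict_in_batches(make_toy_model(False), texts, 2))
noisy = max_spread_per_text(texts, predict_in_batches(make_toy_model(True), texts, 2))
print("deterministic model spread:", stable)
print("randomized model spread:", noisy)
```

Running the same kind of check on the real predictions would show how large the differences are. If the model is a PyTorch module, one thing worth ruling out is that it is still in training mode during prediction; calling `model.eval()` and predicting under `torch.no_grad()` is the usual way to turn dropout off.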
