Loss and CER increase when training a fine-tuned TrOCR model on a custom dataset


I want to fine-tune TrOCR on my custom dataset of receipts. Since the OCR will be used on receipts, we chose the “printed” fine-tuned model. Our dataset consists of 5,000 bounding boxes, each containing a single word. However, all metrics (CER, precision) and the loss worsen with every epoch we run, and we can't figure out why the model performs more poorly each epoch.

The processor, model, and optimizer are shown below:

from transformers import TrOCRProcessor, VisionEncoderDecoderModel
import torch.optim as optim

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")
optimizer = optim.AdamW(model.parameters(), lr=5e-5)
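
For completeness, here is a simplified sketch of how each training example can be built (the ReceiptDataset class and the (image_path, text) sample layout are placeholders, and the -100 masking of pad tokens follows the standard TrOCR fine-tuning recipe):

import torch
from PIL import Image
from torch.utils.data import Dataset

class ReceiptDataset(Dataset):  # hypothetical name, simplified
    def __init__(self, samples, processor, max_target_length=32):
        self.samples = samples  # list of (image_path, text) pairs
        self.processor = processor
        self.max_target_length = max_target_length

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        image_path, text = self.samples[idx]
        image = Image.open(image_path).convert("RGB")
        pixel_values = self.processor(image, return_tensors="pt").pixel_values
        labels = self.processor.tokenizer(
            text, padding="max_length", max_length=self.max_target_length
        ).input_ids
        # Pad tokens must be ignored by the loss, so replace them with -100.
        labels = [l if l != self.processor.tokenizer.pad_token_id else -100
                  for l in labels]
        return {"pixel_values": pixel_values.squeeze(), "labels": torch.tensor(labels)}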

Training method:

from tqdm import tqdm

self.model.to(self.device)  # make sure the model is on the same device as the batches
for epoch in range(self.epochs):
    self.model.train()
    train_loss = 0.0
    for batch in tqdm(self.train_dataloader):
        # Move every tensor in the batch to the target device.
        batch = {k: v.to(self.device) for k, v in batch.items()}
        # The model computes the cross-entropy loss itself when labels are passed.
        outputs = self.model(**batch)
        loss = outputs.loss
        loss.backward()
        self.optimizer.step()
        self.optimizer.zero_grad()
        train_loss += loss.item()

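For reference, a simplified sketch of how CER can be tracked after each epoch (using the jiwer package here as one common choice; the exact metric code is omitted from our setup above):

import torch
import jiwer

def evaluate_cer(model, processor, dataloader, device):
    model.eval()
    references, hypotheses = [], []
    with torch.no_grad():
        for batch in dataloader:
            pixel_values = batch["pixel_values"].to(device)
            generated_ids = model.generate(pixel_values)
            hypotheses += processor.batch_decode(generated_ids, skip_special_tokens=True)
            # Restore the pad tokens that were masked with -100 before decoding.
            labels = batch["labels"].clone()
            labels[labels == -100] = processor.tokenizer.pad_token_id
            references += processor.batch_decode(labels, skip_special_tokens=True)
    return jiwer.cer(references, hypotheses)
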
Does anyone know what we might be doing wrong?

Evaluated out of the box, the model performs well; we simply want to continue training on our receipts to improve it further.

Answer from Ajeet Singh:

You have chosen a model that is already fine-tuned: "microsoft/trocr-base-printed" was fine-tuned on the SROIE dataset. There is little point in fine-tuning an already fine-tuned model again. Instead, start from a pre-trained checkpoint such as trocr-base-stage1 or trocr-small-stage1.
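
A minimal sketch of what that swap could look like (the decoder_start_token_id and pad_token_id settings follow the standard TrOCR fine-tuning recipe and are worth setting explicitly when starting from a stage1 checkpoint):

from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-stage1")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-stage1")

# Token ids the decoder needs for training and generation.
model.config.decoder_start_token_id = processor.tokenizer.cls_token_id
model.config.pad_token_id = processor.tokenizer.pad_token_id
model.config.vocab_size = model.config.decoder.vocab_size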