I've been trying for weeks to build a conversational model based on Transformers. I've tried three different examples:
- Keras English-to-Spanish translation
- How To Create A Chatbot With Transformers
- Text generation using FNet
In all cases, the model converges correctly on the English-to-Spanish dataset, but when I switch to a dialogue dataset (I've tried several, including the Cornell Movie-Dialogs Corpus) the result is always the same: the loss stays above 4 or 5 and the model never converges.
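For the Cornell data, my preprocessing is roughly the sketch below (the file names and the `+++$+++` separator are from the corpus distribution; the paths are local to my machine, and I pair each conversation line with the one that follows it):

```python
import ast

# movie_lines.txt: lineID +++$+++ charID +++$+++ movieID +++$+++ name +++$+++ text
lines = {}
with open("movie_lines.txt", encoding="iso-8859-1") as f:
    for row in f:
        parts = row.split(" +++$+++ ")
        if len(parts) == 5:
            lines[parts[0]] = parts[4].strip()

# movie_conversations.txt lists each conversation's line IDs in order;
# consecutive lines become (input, reply) training pairs.
pairs = []
with open("movie_conversations.txt", encoding="iso-8859-1") as f:
    for row in f:
        ids = ast.literal_eval(row.split(" +++$+++ ")[-1])
        for a, b in zip(ids, ids[1:]):
            if a in lines and b in lines:
                pairs.append((lines[a], lines[b]))
```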
I've tried the models exactly as shown in the examples, and also many different configurations: varying the number of samples (from 1,000 to 100,000), the batch size (from 1 to 64), the initial learning rate and ReduceLROnPlateau settings, the optimizer, and the number of epochs. The result is always the same: after 10 or 20 epochs, loss and accuracy (both training and validation) level off and the model stops learning.
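For reference, my training configuration looks roughly like this (a representative sketch: `model`, `train_ds`, and `val_ds` come from the linked Keras examples, and the hyperparameter values shown are one point from the ranges I swept):

```python
import tensorflow as tf

# Compile with the loss/metric used in the Keras translation example.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Reduce the learning rate when validation loss stops improving.
callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=3, min_lr=1e-6
    ),
]

model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=30,  # loss and accuracy flatten around epoch 10-20 regardless
    callbacks=callbacks,
)
```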
What am I doing wrong? Have these examples actually worked for anyone on dialogue data? Is a sequence-to-sequence model trained with sparse categorical cross-entropy loss and an accuracy metric the right setup for conversations?
Thanks