Annotation specs - AutoML (GCP)


I'm using the Natural Language module on Google Cloud Platform, specifically AutoML for text classification. After I finished importing my data and the text had been processed, I got this error, which I do not understand:

Error: The dataset has too many annotation specs, the maximum allowed number is 5000.

What does it mean? Has anyone else run into it?

Thanks


There are 2 best solutions below

Kim On BEST ANSWER

Take a look at the AutoML Quotas & Limits documentation for better understanding.

It seems that you have hit the upper limit on labels per dataset. Check it under AutoML limits --> Labels per dataset --> 2 - 5000 (for classification).

Take into account that limits, unlike quotas, cannot be increased.
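To check how many distinct labels the import file actually produces, something like the following rough sketch may help. It assumes a header-less CSV where the first column is the text and every later column is a label; that layout, the function name `distinct_labels`, and the sample data are assumptions for illustration, not taken from the question.

```python
import csv
import io

# Limit quoted in the error message from the question.
MAX_LABELS = 5000


def distinct_labels(csv_text, first_label_col=1):
    """Collect the distinct non-empty labels found in a CSV string.

    Assumes (hypothetically) that column 0 holds the text and every
    column from first_label_col onward holds a label.
    """
    labels = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row[first_label_col:]:
            cell = cell.strip()
            if cell:
                labels.add(cell)
    return labels


# Made-up sample: the last row has an unquoted comma in its text,
# so part of the text is parsed as if it were a label.
sample = 'hello world,greeting\n"red, green",colors\nred, green,colors\n'
found = distinct_labels(sample)
print(len(found), "distinct labels; over the limit:", len(found) > MAX_LABELS)
```

If the count is far higher than the number of labels you intended, the file is probably being split into columns differently from what you expect.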

ML_noob On

I also got this error even though I was certain that my number of labels was below 5000. It turned out to be a problem with my CSV formatting.

When you create your text data using to_csv() in Pandas, it only quotes the text fields that contain a comma, while AutoML Text wants every text line to be quoted. I have written up the solution in this Stack Overflow answer.
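The linked answer is not reproduced here, but one way to make Pandas quote every field is the `quoting` parameter of `to_csv()`, as in this minimal sketch. The column names, the sample rows, and the helper name `to_fully_quoted_csv` are made up for illustration, and whether a header row or index column should be written depends on your import format (both are dropped here as an assumption).

```python
import csv
import io

import pandas as pd


def to_fully_quoted_csv(df):
    """Serialize a DataFrame to CSV text with every field quoted."""
    buffer = io.StringIO()
    # csv.QUOTE_ALL quotes every field, not only those containing commas.
    # index=False and header=False are assumptions about the import format.
    df.to_csv(buffer, index=False, header=False, quoting=csv.QUOTE_ALL)
    return buffer.getvalue()


# Hypothetical sample data: only the second text contains a comma.
df = pd.DataFrame(
    {
        "text": ["hello world", "red, green"],
        "label": ["greeting", "colors"],
    }
)

# With default settings, only the field containing a comma is quoted.
print(df.to_csv(index=False, header=False))

# With QUOTE_ALL, every field is wrapped in double quotes.
print(to_fully_quoted_csv(df))
```

To write straight to disk, the same `quoting=csv.QUOTE_ALL` argument can be passed along with a file path instead of a buffer.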