Tesseract whitelist is not accepting special characters

I'm using Tesseract (version 5.3.1) on Windows to recognize text that includes special characters such as ñ, ü, and á. Most of these characters belong to the Latin script, so I pass -l Latin on the command line.

In this image, the special characters are ñ, Ñ, á, and é.

[Image: example of the text]

The command line I'm using is:

 tesseract text.png stdout --psm 6 -l Latin -c tessedit_char_whitelist=aáeéiocfhklmnñtÑ

However, the output is missing the white space between words, and the special characters are ignored entirely, resulting in:

aoloaalcalmoo
okonioniachillalif

Do you know why Tesseract is not taking into account the characters I've declared in the whitelist? Maybe I'm not specifying the special characters correctly.
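One way to test whether the special characters are being altered before they reach Tesseract (for example by the Windows console code page) is to move the whitelist out of the command line and into a UTF-8 config file, which is then passed as the last argument. The sketch below is only a diagnostic under stated assumptions, not a confirmed fix: the file name whitelist.cfg and the helper functions are hypothetical, and it assumes Tesseract falls back to reading a config argument as a plain file path. It also normalizes the whitelist to NFC, so that a character typed as a base letter plus a combining accent becomes a single code point.

```python
import subprocess
import unicodedata
from pathlib import Path


def normalize_whitelist(chars: str) -> str:
    # NFC composes sequences such as "n" + combining tilde into the single
    # code point "ñ" (an assumption about what the model expects to match).
    return unicodedata.normalize("NFC", chars)


def write_whitelist_config(path, chars: str) -> Path:
    # Tesseract config files hold one "name value" pair per line.
    # Writing the file as UTF-8 keeps the console code page out of the picture.
    path = Path(path)
    line = f"tessedit_char_whitelist {normalize_whitelist(chars)}\n"
    path.write_text(line, encoding="utf-8")
    return path


def build_command(image, config_path) -> list:
    # Same options as the original command; the config file goes last.
    return ["tesseract", str(image), "stdout",
            "--psm", "6", "-l", "Latin", str(config_path)]


def run_ocr(image, config_path) -> str:
    # Requires tesseract on PATH; decode the output explicitly as UTF-8.
    result = subprocess.run(build_command(image, config_path),
                            capture_output=True, check=True)
    return result.stdout.decode("utf-8")
```

Calling write_whitelist_config("whitelist.cfg", "aáeéiocfhklmnñtÑ") and then run_ocr("text.png", "whitelist.cfg") would repeat the original command with the whitelist read from the file. If the accented characters then appear in the output, the problem is likely in how the console passes them; if not, the cause lies elsewhere.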

Any help is greatly appreciated.
