Understanding pocketsphinx keyword list file format


I'm testing an app based on the demo available on GitHub, using the Spanish language model. I want it to listen continuously for a small set of keywords and act accordingly, but I'm still an amateur on this subject. My main questions right now are the following:

Given my current setupRecognizer method

private void setupRecognizer(File assetsDir) throws IOException {

    recognizer = SpeechRecognizerSetup.defaultSetup()
            .setAcousticModel(new File(assetsDir, "es-ptm"))
            .setDictionary(new File(assetsDir, "es.dict"))
            .setRawLogDir(assetsDir)
            .getRecognizer();
    recognizer.addListener(this);

    File actionGrammar = new File(assetsDir, "actions.list");
    recognizer.addKeywordSearch(SEARCH, actionGrammar);

    File languageModel = new File(assetsDir, "es_model.lm");
    recognizer.addNgramSearch(SEARCH, languageModel);

    startSearch(SEARCH);
}

What happens when I add both addKeywordSearch and addNgramSearch under the same identifier string ("SEARCH" in my code)? Am I improving the recognition or making it worse?

In a desperate attempt, I reduced the dictionary to only the words I want to be recognized, such as this:

atrás a t r a s 
listo l i s t o 
listo(2) l i s t a
listo(3) l i s t a s
listos(4) l i s t o s
repetir rr e p e t i r
repetir(2) rr e p e t i d o
repetirse(3) rr e p e t i r s e

It is now reduced to recognizing only these words, but it misbehaves a lot, identifying words I didn't say. I'm guessing PocketSphinx is probability-based, and since I reduced the dictionary, these words have a high probability of being recognized. Am I correct?

Also, in an attempt to improve accuracy, I made this actions.list:

listo /1.0/
atrás /1.0/
repetir /1.0/

However, I'm not really sure what this value means. The documentation says to use 1e-1 for shorter words and to go up to 1e-50 for longer words. What notation is this and what does it mean?

I'm really concerned about making it as accurate as possible. Am I on the right path?

Thanks in advance!

Nikolay Shmyrev

What happens by adding both addKeywordSearch and addNGramSearch, under the same identifier string ("SEARCH" in my code)?

The ngram search replaces the keyword search; the keyword search is garbage collected.
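
To make the replacement concrete, here is a toy model in plain Java. It is not PocketSphinx code: SearchRegistrySketch and its methods are hypothetical, and it only assumes what the answer states, namely that searches are stored by name, so a second registration under the same name displaces the first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a recognizer that stores its searches by name.
public class SearchRegistrySketch {
    private final Map<String, String> searches = new HashMap<>();

    // Registers a search under a name and returns the description of the
    // search it displaced, or null if the name was unused.
    public String addSearch(String name, String description) {
        return searches.put(name, description);
    }

    // Returns the search currently registered under the name, or null.
    public String get(String name) {
        return searches.get(name);
    }

    public static void main(String[] args) {
        SearchRegistrySketch registry = new SearchRegistrySketch();
        registry.addSearch("SEARCH", "keyword search from actions.list");
        String replaced = registry.addSearch("SEARCH", "ngram search from es_model.lm");
        System.out.println("replaced: " + replaced);
        System.out.println("active:   " + registry.get("SEARCH"));
    }
}
```

If keyword spotting is the goal, one option is to register only the keyword search, or to give each search its own name (for example, hypothetical names such as "KWS_SEARCH" and "NGRAM_SEARCH") and start only the one that is wanted.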

What notation is this

It is scientific (E) notation for floating-point numbers: 1e-50 means 1 × 10^-50. See "What is E in floating point?"
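
As a small illustration of the notation, the sketch below parses the values mentioned in the question with the standard Double.parseDouble. The class and method names are made up for this example, and nothing here calls PocketSphinx.

```java
// Minimal sketch: reading threshold values written in E notation.
public class SciNotationSketch {
    // Parses a value written as it appears between the slashes in the list,
    // e.g. "1e-50" is 1 x 10^-50 and "1.0" is 1 x 10^0.
    public static double parseThreshold(String text) {
        return Double.parseDouble(text);
    }

    public static void main(String[] args) {
        double shortWord = parseThreshold("1e-1");   // one tenth
        double longWord = parseThreshold("1e-50");   // a far smaller number
        double fromQuestion = parseThreshold("1.0"); // the value used in actions.list

        System.out.println(shortWord);
        System.out.println(longWord);
        System.out.println(fromQuestion);
        // 1e-50 is numerically smaller than 1e-1, even though its exponent is larger in magnitude.
        System.out.println(longWord < shortWord);
    }
}
```

Note that the /1.0/ used in actions.list is 1e0, which lies outside the 1e-1 to 1e-50 range that the question quotes from the documentation.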