Some problems when using CTCLoss for ASR tasks

47 Views Asked by mixxis At 02 October 2023 at 03:15

When using torch.nn.CTCloss, why does my loss curve converge but the model seems to repeatedly output only a few tokens? For example: My label is: [220, 1122, 172, 26, 460, 836, 171, 1813, 113, 39, 83, 61, 267, 38, 202, 223] However, the output of the model is: [100, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1].

Here, 1 is a whitespace character. The label and the model output are very different.
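For comparing a frame-level output like this with a label, the usual CTC greedy decoding step is to merge consecutive repeats and then drop the blank token. The sketch below is only an illustration: the helper name `ctc_greedy_collapse` is hypothetical, the frame list is a shortened stand-in for the output above, and treating index 1 as the CTC blank is an assumption (torch.nn.CTCLoss uses blank=0 unless told otherwise).

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id):
    """Merge consecutive repeated ids, then drop the blank id."""
    merged = [token for token, _ in groupby(frame_ids)]
    return [token for token in merged if token != blank_id]


# Shortened stand-in for the frame-level output in the question.
frames = [100] + [10] * 5 + [1] * 8

# Assumption: index 1 plays the role of the CTC blank.
decoded = ctc_greedy_collapse(frames, blank_id=1)
print(decoded)  # [100, 10]
```

Under that assumption the long output collapses to just two tokens, far shorter than the 16-token label, so the mismatch is not an artifact of comparing raw frames with the label.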

Why does this happen?

What I did:

Removed the silent parts at the beginning and end of each utterance, resampled the audio to 16000 Hz, and padded all utterances to the same length.
Used wav2vec 2.0 as the feature extractor, followed by two fully connected layers for token classification.
Applied log_softmax(-1) to the output.
Used CTCLoss with these inputs: the model output after log_softmax, shape (T, N, C); the targets, shape (sum(label_length)), formed by concatenating the labels of each sentence, with no whitespace characters in them; the sequence lengths, shape (N); and the label lengths, shape (N).
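The steps above can be sketched roughly as follows, mainly to make the tensor shapes concrete. This is a minimal sketch, not the actual training code: all sizes are hypothetical, a random tensor stands in for the wav2vec 2.0 features, and the blank index is assumed to be 0 (the torch.nn.CTCLoss default).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
T, N, C = 50, 2, 2000   # frames, batch size, number of classes (incl. blank)
FEAT_DIM = 768          # assumed encoder feature size
BLANK_ID = 0            # assumption: index reserved for the CTC blank

# Stand-in for the encoder features, shape (N, T, FEAT_DIM).
features = torch.randn(N, T, FEAT_DIM)

# Two fully connected layers for per-frame token classification.
head = nn.Sequential(nn.Linear(FEAT_DIM, 256), nn.ReLU(), nn.Linear(256, C))

logits = head(features)                    # (N, T, C)
log_probs = logits.log_softmax(dim=-1)     # normalise over the class dimension
log_probs = log_probs.transpose(0, 1)      # CTCLoss expects (T, N, C)

# Concatenated targets, shape (sum(target_lengths),); no BLANK_ID inside.
target_lengths = torch.tensor([16, 10])
targets = torch.randint(1, C, (int(target_lengths.sum()),))

# Number of valid frames per utterance, shape (N,). With padded audio these
# would typically be the unpadded frame counts rather than T for every item.
input_lengths = torch.tensor([50, 42])

ctc = nn.CTCLoss(blank=BLANK_ID, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```

Two things worth checking against a setup like this are whether the `blank` argument matches the index the vocabulary actually reserves for the blank, and whether `input_lengths` reflects the real number of frames for each utterance.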

I expected the model output to be close to the label.
