What WAV audio data format does the Vosk Node.js library require for speech recognition?

59 views. Asked by JRichardsz on 14 February 2024 at 02:11

Vosk is a speech recognition framework. The provided samples use audio recorded directly from the local microphone (native), and that works.

My requirement is to get the audio from a stream (socket) instead of the local microphone, but Vosk does not treat the received buffer as valid WAV data.

rec.acceptWaveform returns true when the buffer comes from the microphone:

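const vosk = require("vosk");
const mic = require("mic");

// model and sampleRate are defined elsewhere
const rec = new vosk.Recognizer({model: model, sampleRate: sampleRate});
const micInstance = mic({...});
const micInputStream = micInstance.getAudioStream();

// each 'data' event delivers one chunk of audio captured from the microphone
micInputStream.on('data', (buffer) => {
    if (rec.acceptWaveform(buffer)) {
      console.log("rec.result():", rec.result());
    }
});
micInstance.start();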

rec.acceptWaveform returns false when the buffer comes from a socket client:

const vosk = require("vosk");
// server, model and sampleRate are defined elsewhere
const io = require('socket.io')(server, { maxHttpBufferSize: 1e7 });
const rec = new vosk.Recognizer({model: model, sampleRate: sampleRate});
io.on('connection', function (socket) {
  // each 'send-audio' event carries one audio buffer sent by the client
  socket.on('send-audio', function (data) {
    console.log("received:", data);
    if (rec.acceptWaveform(data)) {
      console.log("rec.acceptWaveform:", true);
      console.log("rec.result():", rec.result());
    } else {
      console.log("rec.acceptWaveform:", false);
    }
  });
});
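
One thing that may be worth checking before concluding the buffer is rejected: as far as I understand the Vosk API documentation, acceptWaveform returns true when an utterance has ended (silence was detected), not as a format check, so false may only mean that decoding is still in progress. The following is a minimal sketch under that assumption. The helper name feedPcm and the default chunk size are hypothetical; rec is assumed to be any object exposing acceptWaveform, result and finalResult, as the Vosk recognizer does.

```javascript
// Hypothetical helper: feed raw 16-bit mono PCM to a recognizer in small chunks.
// `rec` is assumed to expose acceptWaveform(), result() and finalResult().
// chunkBytes should be even so that a 16-bit sample is never split across chunks.
function feedPcm(rec, pcm, chunkBytes = 8000) {
  // Normalize the payload in case it arrives as an ArrayBuffer or a typed array.
  let buf;
  if (Buffer.isBuffer(pcm)) {
    buf = pcm;
  } else if (ArrayBuffer.isView(pcm)) {
    buf = Buffer.from(pcm.buffer, pcm.byteOffset, pcm.byteLength);
  } else {
    buf = Buffer.from(pcm);
  }

  const results = [];
  for (let offset = 0; offset < buf.length; offset += chunkBytes) {
    const chunk = buf.subarray(offset, offset + chunkBytes);
    // true is taken to mean "an utterance just ended", so a result is available
    if (rec.acceptWaveform(chunk)) {
      results.push(rec.result());
    }
  }
  // Collect whatever is still pending once all audio has been fed.
  results.push(rec.finalResult());
  return results;
}
```

In the socket handler this would be called as feedPcm(rec, data), with data being headerless PCM; a short clip could then yield text from finalResult even though every acceptWaveform call returned false.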

Attempts and Research

According to the Vosk Node.js library, the buffer should be audio data in PCM 16-bit mono format.
I'm using the Node.js library wavefile to inspect the WAV buffer.
The buffer received from the socket can be read as a WAV file using wavefile. The details it reports confirm that it is a valid WAV file, but Vosk does not accept it. I can also save it directly to a file, and Audacity is able to read it.
The buffer received directly from the microphone cannot be read with wavefile, but Vosk accepts it. If I save that buffer to a file, Audacity reports that the WAV file is not valid.
I also tried sending only the bytes of the data section, without success.
I raised two issues:
- https://github.com/alphacep/vosk-api/issues/1504
- https://github.com/rochars/wavefile/issues/42
I will try with Python to test whether this is a bug with WAV data coming from socket clients.
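
To compare the two buffers it may help to look at the container directly. A WAV file is a RIFF container: a 12-byte RIFF/WAVE header followed by chunks, where the "fmt " chunk describes the encoding and the "data" chunk holds the samples. The sketch below walks those chunks with plain Buffer reads and returns the format fields together with the raw PCM bytes. parseWav and isPcm16Mono are hypothetical helpers, not part of vosk or wavefile, and the code assumes a well-formed little-endian RIFF buffer.

```javascript
// Hypothetical helper: locate the "fmt " and "data" chunks of a RIFF/WAVE buffer.
function parseWav(buf) {
  if (buf.length < 12 ||
      buf.toString('ascii', 0, 4) !== 'RIFF' ||
      buf.toString('ascii', 8, 12) !== 'WAVE') {
    return null; // no RIFF/WAVE header: possibly headerless raw PCM
  }
  let fmt = null;
  let pcm = null;
  let offset = 12;
  while (offset + 8 <= buf.length) {
    const id = buf.toString('ascii', offset, offset + 4);
    const size = buf.readUInt32LE(offset + 4);
    const start = offset + 8;
    if (id === 'fmt ' && start + 16 <= buf.length) {
      fmt = {
        audioFormat: buf.readUInt16LE(start),        // 1 means uncompressed PCM
        channels: buf.readUInt16LE(start + 2),
        sampleRate: buf.readUInt32LE(start + 4),
        bitsPerSample: buf.readUInt16LE(start + 14),
      };
    } else if (id === 'data') {
      // a streamed file may declare more bytes than were actually received
      pcm = buf.subarray(start, Math.min(start + size, buf.length));
    }
    offset = start + size + (size % 2); // chunks are padded to an even length
  }
  return fmt && pcm ? { ...fmt, pcm } : null;
}

// True when the header describes what the Vosk docs ask for: PCM, 16-bit, mono.
function isPcm16Mono(wav) {
  return wav !== null && wav.audioFormat === 1 &&
         wav.channels === 1 && wav.bitsPerSample === 16;
}
```

If parseWav reports a channel count, bit depth or sample rate that differs from what the recognizer was created with, the audio would presumably need converting first. When the format matches, the pcm field (the samples without the header) is what would be passed to rec.acceptWaveform.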

Question

What WAV audio data format does the Vosk Node.js library require for speech recognition?

Reproducible sample

I created a reproducible sample:

https://github.com/jrichardsz/nodejs-wav-vosk-transcription
