Get an audio stream record of user voice as it streams in


We currently use VoiceXML and Plum Voice (https://www.plumvoice.com/) to capture voice recordings, which are then sent to our backend server for processing. Our setup is similar to the example in the docs for recording user input: https://www.plumvoice.com/docs/dev/developer_reference:tutorial

<?xml version="1.0"?>
<vxml version="2.0">
    <form>
        <record name="myrecording" type="audio/x-wav" beep="true">
            <prompt>
                Please record a message after the beep.
            </prompt>

            <filled>
                You just recorded the following message:
                <value expr="myrecording"/>
                <submit next="submitrecording.php" namelist="myrecording"
                method="post" enctype="multipart/form-data"/>
            </filled>
        </record>
    </form>
</vxml>

This works fine and gives us a WAV file once the user has finished speaking. Is there a way to get the user input as an audio stream while the user is speaking, instead of a file at the end?


There are 2 best solutions below

1
Dr Yuan Shenghai On

Rather than reinventing the wheel, you can use FFmpeg, advertised as “A complete, cross-platform solution to record, convert and stream audio and video.”

ffmpeg -re -i input -f rtsp -muxdelay 0.1 rtsp://server/live.sdp

You can add options as needed: "-preset ultrafast -tune zerolatency" to minimize latency, "-i /dev/video0" to capture from a camera such as a Logitech C930, or "-i your_file_location" to read from a video file.

One example I can give is how I stream my webcam with sound to an online server

lxterminal -e ffmpeg -f v4l2 -framerate 30 -video_size 800x448 -i /dev/video0 -i /home/pi/Desktop/sound/ic_ch.png -codec:v h264 -r 30 -s 800x448 -bf 0 -g 30 -bufsize 8000k -maxrate 8000k -filter_complex "[0:v][1:v] overlay=(W-w)/2:(H-h)/2:enable='gte(t,1)'" -preset ultrafast -tune zerolatency -f h264 udp://192.168.5.10:23003 & sleep 0.1

Don't be scared, you don't need all of these options. Just pick the input, the output, and the encoding standard, and you are good to go.
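Since the question is about voice rather than video, here is a rough audio-only sketch of the same idea. It assumes a Linux machine with an ALSA capture device named "default" and a receiver listening for RTP on port 5004 (the port is a made-up example; the IP is the one from my command above). It encodes to 8 kHz mono G.711 mu-law, which is a common telephony format. Note that this captures from a local sound device; it does not pull audio out of a VoiceXML platform. The script only prints the command unless you set RUN=1.

```shell
#!/bin/sh
# Hypothetical values: change the ALSA device and the receiver address to match your setup.
AUDIO_DEVICE="${AUDIO_DEVICE:-default}"
DEST="${DEST:-rtp://192.168.5.10:5004}"

# Print the ffmpeg command line for an audio-only RTP stream:
#   -f alsa -i ...   capture from an ALSA device (Linux only)
#   -vn              no video
#   -c:a pcm_mulaw   G.711 mu-law audio
#   -ar 8000 -ac 1   8 kHz, mono
#   -f rtp           send as RTP to the destination URL
stream_cmd() {
    echo "ffmpeg -f alsa -i $AUDIO_DEVICE -vn -c:a pcm_mulaw -ar 8000 -ac 1 -f rtp $DEST"
}

# Dry run by default; set RUN=1 to actually start streaming.
if [ "${RUN:-0}" = "1" ]; then
    exec $(stream_cmd)
else
    stream_cmd
fi
```

Run it once as-is to check the command it prints, then run it with RUN=1 to start streaming. On macOS or Windows the capture input ("-f alsa -i default") would need to be replaced with that platform's capture device.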

0
gawi On

No. According to the W3C VoiceXML recommendation, the content of the recording is only available once the recording is complete (i.e. after final silence or DTMF input). VoiceXML has no streaming facility.

If you need this kind of streaming API, you might want to take a look at Live Media Streaming in Amazon Connect.