I'm writing a Node.js app that hosts a webhook for answering a call. The basics are all working: I can receive a call and provide a TwiML response. However, I also want to receive the caller's audio stream, and I want to stream audio to the caller from an external source.
I'm using the TwiML response to set up this flow.
I've discovered that the audio I receive from the websocket is packaged in a JSON object, and I'm presuming the actual audio payload in that object is base64 encoded. So I can receive the caller audio, extract the payload, and forward it to my external source. However, my understanding of the sequence is that it provides bi-directional audio flow. So my question is: how do I package the upstream audio packets I want to send to the caller?
Do I have to create a JSON object containing a similarly formatted payload? Do I have to first start the upstream audio with a "start" message? Or do I simply create "media" messages with the base64-encoded payload?
Here's what I'm currently doing:
- Parse the 'media' message from Twilio, extract and decode its payload, and send the decoded audio to the external resource.
- When I receive a G.711 ulaw encoded packet from my external resource, I currently just send it as is through the websocket. I don't hear anything at the far end, however.
I do know that the packets from the external resource do in fact contain valid G.711 ulaw data.
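
The two steps above can be sketched as follows. This is a minimal sketch, not a confirmed solution: it assumes inbound messages have the shape `{ event: "media", media: { payload } }`, and that the outbound direction accepts a mirrored "media" message carrying the `streamSid` reported in the "start" message, which is the format I understand the Media Streams documentation to describe for bi-directional streams. The helper names `extractInboundAudio` and `buildOutboundMedia` are hypothetical.

```javascript
// Decode the base64 audio payload from an inbound "media" message.
// Returns null for any other event (e.g. "connected", "start", "stop").
function extractInboundAudio(rawMessage) {
  const msg = JSON.parse(rawMessage);
  if (msg.event !== 'media' || !msg.media) {
    return null;
  }
  return Buffer.from(msg.media.payload, 'base64');
}

// Wrap raw G.711 ulaw bytes in a JSON "media" message for the outbound
// direction. ASSUMPTION: the outbound message mirrors the inbound shape and
// must carry the streamSid taken from the "start" message.
function buildOutboundMedia(streamSid, ulawBytes) {
  return JSON.stringify({
    event: 'media',
    streamSid: streamSid,
    media: { payload: Buffer.from(ulawBytes).toString('base64') },
  });
}
```

With a websocket object `ws` (hypothetical name), the second bullet would then become `ws.send(buildOutboundMedia(streamSid, packet))` instead of sending the raw packet. The packet should be headerless 8 kHz ulaw audio, and the stream has to be a bi-directional one for the caller to hear anything.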