I'm working on a ReactJS project with TypeScript where I need to implement an audio recording feature that captures audio from both the microphone and speaker simultaneously. The recorded audio should then be sent to the server for transcription.
I have already set up the basic audio recording using the MediaRecorder API to capture audio from the microphone. However, I'm unsure about how to capture audio from the speaker simultaneously. I also need guidance on how to send the recorded audio to the server for transcription.
I'm using Socket.io to communicate with the server, and the server is set up to handle audio transcription.
My questions are:
1. How can I modify the TranscriptComponent to record audio from both the microphone and speaker simultaneously?
2. How can I send the recorded audio to the server for transcription using Socket.io?
Any guidance, code examples, or resources would be greatly appreciated. Thank you!
Here's what I have so far in my TranscriptComponent:
// TranscriptComponent.tsx
import React, { useState } from 'react';
// ... (other imports and interfaces)

export const TranscriptComponent = (props: TranscriptComponentProps) => {
  // ... (other state variables and logic)

  const startRecording = (id: string) => {
    socket.connect();
    console.log("recording started");
    navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
      // Pick the first container format this browser can record.
      const mimeTypes = ["audio/mp4", "audio/webm"].filter((type) =>
        MediaRecorder.isTypeSupported(type)
      );
      if (mimeTypes.length === 0) return alert("Browser not supported");
      setIsRecording(true);
      setStream(stream);
      setTimerInterval(
        setInterval(() => {
          setTranscriptLength((t) => t + 1);
        }, 1000)
      );
      const recorder = new MediaRecorder(stream, { mimeType: mimeTypes[0] });
      recorder.addEventListener("dataavailable", async (event) => {
        console.log("checking data available to send");
        if (event.data.size > 0 && socket.connected) {
          console.log("sending audio");
          socket.emit("audio", { roomId: props.roomId, data: event.data });
        } else {
          console.log("no data available");
        }
      });
      recorder.start(1000); // emit a chunk every second
    });
  };

  const stopRecording = () => {
    stream!.getTracks().forEach((track) => track.stop());
    setIsRecording(false);
    clearInterval(timerInterval);
    socket.emit("stop-transcript", { roomId: props.roomId });
    console.log("recording stopped");
    // socket.close();
  };

  // ... (return and rendering logic)
};
To record from both the microphone and the speaker at the same time, you'll need the Web Audio API in addition to MediaRecorder. A MediaRecorder instance records exactly one MediaStream, so you can't hand it the microphone and the system audio as separate inputs. The Web Audio API lets you create a source node from each captured stream and mix them into a single destination stream, which you can then record.
Here's how you can modify the TranscriptComponent to achieve simultaneous audio recording:
1. Use navigator.mediaDevices.getDisplayMedia to capture the speaker (system/tab) audio, alongside navigator.mediaDevices.getUserMedia for the microphone. Note that getDisplayMedia prompts the user to share a screen or tab, and system-audio capture is only supported in some browsers (chiefly Chromium-based ones).
2. Create an AudioContext and connect both sources to a single destination, using createMediaStreamSource for each input stream and createMediaStreamDestination for the output.
3. Create a MediaRecorder from the destination's combined stream, which now carries both microphone and speaker audio.
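Putting the steps above together, the startRecording logic might look roughly like this. This is a sketch, not production code: it assumes socket and props are in scope as in your component, error handling is minimal, and the user must tick "Share audio" in the getDisplayMedia picker for a speaker track to be delivered at all.

```typescript
// Sketch: capture microphone + system audio and mix them into one stream.
const startRecording = async (id: string) => {
  // 1. Microphone stream.
  const micStream = await navigator.mediaDevices.getUserMedia({ audio: true });

  // 2. Speaker (system/tab) audio via screen capture. A video track must be
  //    requested for the picker to appear; we simply ignore it afterwards.
  const displayStream = await navigator.mediaDevices.getDisplayMedia({
    video: true,
    audio: true,
  });

  // 3. Mix both sources into a single destination stream.
  const audioContext = new AudioContext();
  const destination = audioContext.createMediaStreamDestination();
  audioContext.createMediaStreamSource(micStream).connect(destination);
  if (displayStream.getAudioTracks().length > 0) {
    // Only present if the user shared audio and the browser supports it.
    audioContext.createMediaStreamSource(displayStream).connect(destination);
  }

  // 4. Record the combined stream, as in the original component.
  const mimeTypes = ["audio/webm", "audio/mp4"].filter((type) =>
    MediaRecorder.isTypeSupported(type)
  );
  if (mimeTypes.length === 0) return alert("Browser not supported");

  const recorder = new MediaRecorder(destination.stream, {
    mimeType: mimeTypes[0],
  });
  recorder.addEventListener("dataavailable", (event) => {
    if (event.data.size > 0 && socket.connected) {
      socket.emit("audio", { roomId: props.roomId, data: event.data });
    }
  });
  recorder.start(1000); // emit a chunk every second
};
```

You'd keep the setIsRecording / timer bookkeeping from your existing code, and in stopRecording stop the tracks of both micStream and displayStream and close the AudioContext.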
To send the recorded audio to the server for transcription via Socket.io, your client can keep emitting the recorder's chunks exactly as it already does in the dataavailable handler. On the server, register a Socket.io event handler that receives the audio data and passes it to a speech-recognition library or API of your choice.
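A minimal server-side sketch of that handler might look like this. Everything here is illustrative: the transcribe function is a hypothetical placeholder for whatever speech-to-text service you call, the in-memory Map is the simplest possible buffering strategy, and it assumes clients have already joined the room named by roomId.

```typescript
// server.ts — sketch of a Socket.io handler for incoming audio chunks.
import { Server } from "socket.io";

// Hypothetical stand-in for your speech-to-text call
// (e.g. a cloud STT API or a local model).
async function transcribe(audio: Buffer): Promise<string> {
  // Placeholder: call your speech-recognition API here.
  return "";
}

const io = new Server(3000);
const buffers = new Map<string, Buffer[]>(); // roomId -> audio chunks

io.on("connection", (socket) => {
  socket.on("audio", ({ roomId, data }: { roomId: string; data: Buffer }) => {
    // Socket.io delivers the browser Blob as a Buffer on the Node side.
    const chunks = buffers.get(roomId) ?? [];
    chunks.push(data);
    buffers.set(roomId, chunks);
  });

  socket.on("stop-transcript", async ({ roomId }: { roomId: string }) => {
    const audio = Buffer.concat(buffers.get(roomId) ?? []);
    buffers.delete(roomId);
    const text = await transcribe(audio);
    // Assumes clients joined the room via socket.join(roomId) elsewhere.
    io.to(roomId).emit("transcript", { roomId, text });
  });
});
```

Whether you transcribe per-chunk (for live captions) or only on "stop-transcript" (as sketched here) depends on the speech-recognition backend you pick; streaming APIs can consume the one-second chunks directly.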