I am using the Google Cloud Speech API with Python, but I'm encountering an issue where the response from the API does not include the transcript. Here is my code:
from google.cloud import texttospeech, speech
from livekit import rtc

audio_stream = rtc.AudioStream(track)
print("audio_stream", audio_stream)
client = speech.SpeechClient()

async for audio_frame in audio_stream:
    print("audio_frame_data", audio_frame.data)
    # Convert audio frame to bytes
    audio_data = audio_frame.data.tobytes()
    config = {
        "encoding": speech.RecognitionConfig.AudioEncoding.LINEAR16,
        "sample_rate_hertz": 16000,
        "language_code": "en-US",
    }
    # One recognize request is sent per audio frame
    audio = speech.RecognitionAudio(content=audio_data)
    response = client.recognize(config=config, audio=audio)
    print("response", response)
This is the output I get. The response from the API contains only the total_billed_time and request_id fields:
audio_stream <livekit.rtc.audio_stream.AudioStream object at 0x7ff41fcd4c10>
audio_frame_data <memory at 0x7ff41fcd8280>
response total_billed_time {}
request_id: unique_id
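For reference, this is where I expected the text to appear. It is a minimal sketch, assuming the response follows the usual results, alternatives, transcript layout of a recognize response. The helper name `extract_transcripts` is my own, and the stand-in objects only mimic the response shape so the snippet can be tried without calling the API:

```python
from types import SimpleNamespace


def extract_transcripts(response):
    """Return the top transcript of each result, or [] if there are none."""
    transcripts = []
    for result in response.results:
        # Each result lists its alternatives, most likely first.
        if result.alternatives:
            transcripts.append(result.alternatives[0].transcript)
    return transcripts


# Stand-in objects that mimic the response shape (not real API output).
empty_response = SimpleNamespace(results=[])
fake_response = SimpleNamespace(
    results=[
        SimpleNamespace(alternatives=[SimpleNamespace(transcript="hello world")])
    ]
)

print(extract_transcripts(empty_response))
print(extract_transcripts(fake_response))
```

With the response shown above, the results list is empty, so a helper like this returns an empty list.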
I expected the response to include the transcript of the audio. Could someone please help me understand why the transcript is not being returned in the response? Any insights or suggestions would be greatly appreciated. Thank you!
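In case it is relevant to the diagnosis: the loop sends one frame per recognize call, so each request may carry only a very short slice of audio. Below is a small sketch for checking that, assuming the frames are 16-bit mono PCM at the configured 16000 Hz (the actual frame format coming from LiveKit may differ). The name `pcm_duration_seconds` and the 160-sample frame size are only illustrative:

```python
def pcm_duration_seconds(num_bytes, sample_rate_hertz=16000, channels=1, bytes_per_sample=2):
    """Duration in seconds of raw PCM audio, given its size in bytes."""
    bytes_per_second = sample_rate_hertz * channels * bytes_per_sample
    return num_bytes / bytes_per_second


# Hypothetical frame: 160 samples of 16-bit mono audio (an assumed size).
frame_bytes = 160 * 2
frame_seconds = pcm_duration_seconds(frame_bytes)
print(frame_seconds)
```

Inside the loop, calling `pcm_duration_seconds(len(audio_data))` would show how much audio each request actually contains.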