Pause Discrepancies in Azure Speech Studio and Speech SDK


I have the following SSML:

<speak xmlns:ns0="http://www.w3.org/2001/10/synthesis" version="1.0" xml:lang="en-US"><voice name="en-US-GuyNeural"><prosody contour="(15%, +96%) (37%, -5%) (61%, -98%) (94%, +98%)">Hello Jacob</prosody><s>This is a test</s><prosody contour="(17%, +99%) (77%, -96%)">script</prosody></voice></speak>

When I synthesize it with the Speech SDK, there is an audible pause between "This is a test" and "script":

import azure.cognitiveservices.speech as speechsdk
import os

# Configure the synthesizer
speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))
audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
speech_config.speech_synthesis_language = "en-US" 
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

# Create ssml variable
ssml = '<speak xmlns:ns0="http://www.w3.org/2001/10/synthesis" version="1.0" xml:lang="en-US"><voice name="en-US-GuyNeural"><prosody contour="(15%, +96%) (37%, -5%) (61%, -98%) (94%, +98%)">Hello Jacob</prosody><s>This is a test</s><prosody contour="(17%, +99%) (77%, -96%)">script</prosody></voice></speak>'

# Create speech
speech_synthesis_result = speech_synthesizer.speak_ssml_async(ssml).get()
stream = speechsdk.AudioDataStream(speech_synthesis_result) 

I've tried uploading the same SSML to Azure Speech Studio, and there it plays correctly, without the pause.

How do I get rid of the pause when using the SDK?
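
One thing that may be worth ruling out, offered as a diagnostic sketch rather than a confirmed cause: the SSML above declares the synthesis namespace under the prefix `ns0` (`xmlns:ns0=...`), but none of the elements use that prefix, so a standard XML parser sees `speak`, `voice`, `prosody` and `s` as belonging to no namespace. Whether the Speech SDK and Speech Studio treat this differently is an assumption, not something verified here. The helper `describe_ssml` below is hypothetical (it is not part of the Speech SDK) and uses only the Python standard library to show which namespace each element ends up in.

```python
import xml.etree.ElementTree as ET

# Namespace that SSML documents normally declare as the default (xmlns="...").
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def describe_ssml(ssml: str) -> list[tuple[str, str]]:
    """Return (namespace, local_name) for every element in the SSML string.

    Hypothetical helper for inspection only; not part of the Speech SDK.
    """
    root = ET.fromstring(ssml)
    described = []
    for element in root.iter():
        tag = element.tag
        # ElementTree writes namespaced tags as "{namespace}local_name".
        if tag.startswith("{"):
            namespace, local_name = tag[1:].split("}", 1)
        else:
            namespace, local_name = "", tag
        described.append((namespace, local_name))
    return described


# Same SSML string as in the question.
ssml = '<speak xmlns:ns0="http://www.w3.org/2001/10/synthesis" version="1.0" xml:lang="en-US"><voice name="en-US-GuyNeural"><prosody contour="(15%, +96%) (37%, -5%) (61%, -98%) (94%, +98%)">Hello Jacob</prosody><s>This is a test</s><prosody contour="(17%, +99%) (77%, -96%)">script</prosody></voice></speak>'

for namespace, local_name in describe_ssml(ssml):
    print(f"{local_name}: {namespace or '(no namespace)'}")
```

If every element reports no namespace, one experiment is to change `xmlns:ns0=` to `xmlns=` in the `speak` tag and listen again. Separately, `<s>` marks a sentence boundary in SSML, so a pause after "This is a test" may simply be the voice's normal sentence-final break; trying the text without the `<s>` wrapper would show whether that element is the source.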
