I am developing a Chrome extension that uses 'microsoft-cognitiveservices-speech-sdk' to convert the selected text on any webpage into speech. The core code of the extension is below:
import { SpeakerAudioDestination } from "microsoft-cognitiveservices-speech-sdk";
import { getSettings } from "./utils/setting";

// Player for the current utterance, kept so that a new request can stop it.
let player: SpeakerAudioDestination | null = null;

export const textToSpeech = async (text: string) => {
  // Stop any speech that is still playing.
  if (player) {
    player.pause();
    player.close();
  }
  // Read the voice setting once, outside the Promise executor.
  const voiceName = await getSettings("voice");
  const speechConfig = window.SpeechSDK.SpeechConfig.fromSubscription(...);
  speechConfig.speechRecognitionLanguage = "...";
  speechConfig.speechSynthesisVoiceName = voiceName;
  player = new window.SpeechSDK.SpeakerAudioDestination();
  const audioConfig = window.SpeechSDK.AudioConfig.fromSpeakerOutput(player);
  const speechSynthesizer = new window.SpeechSDK.SpeechSynthesizer(speechConfig, audioConfig);
  // Wrap only the callback-based SDK call in a Promise.
  return new Promise((resolve, reject) => {
    speechSynthesizer.speakTextAsync(
      text,
      result => {
        resolve(result);
        speechSynthesizer.close();
      },
      error => {
        console.log(error);
        reject(error);
        speechSynthesizer.close();
      }
    );
  });
};
However, some websites set a Content Security Policy (CSP) that prevents the extension from playing the speech. For example, the error message below, enclosed in [], comes from one such website:
[Refused to load media from 'blob:https://developer.mozilla.org/57fb5680-66f6-49f8-b953-0fb9f2b140df' because it violates the following Content Security Policy directive: "media-src 'self' archive.org videos.cdn.mozilla.net".]
How can I solve this? If the SpeechSDK.AudioConfig.fromSpeakerOutput(player) method generates a blob URL that violates the webpage's security policy, is there another way to resolve this issue? For instance, can we use audio stream playback instead? I am not very familiar with this area.
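One direction that might avoid the blob URL, as a minimal sketch rather than a confirmed fix: have the synthesizer return the audio bytes instead of playing them, then play those bytes through the Web Audio API, which decodes an ArrayBuffer directly and does not load media from a URL. The names SynthesizerLike, AudioContextLike, BufferSourceLike, synthesizeToBuffer and playAudioData are hypothetical helpers introduced for illustration. The interfaces list only the members the sketch uses and are assumed to match the SDK's SpeechSynthesizer (with result.audioData as an ArrayBuffer) and the browser's AudioContext. The sketch also assumes the synthesized output is in a format decodeAudioData can decode, and it has not been verified that the page's CSP leaves this path alone.

```typescript
// Structural types listing only the members this sketch uses. They are
// assumptions about the Speech SDK's SpeechSynthesizer and the browser's
// AudioContext, not definitions taken from either library.
interface SynthesisResultLike {
  audioData: ArrayBuffer;
}
interface SynthesizerLike {
  speakTextAsync(
    text: string,
    onResult?: (result: SynthesisResultLike) => void,
    onError?: (error: string) => void
  ): void;
  close(): void;
}
interface BufferSourceLike {
  buffer: unknown;
  onended: ((event: any) => void) | null;
  connect(destination: unknown): unknown;
  start(): void;
}
interface AudioContextLike {
  destination: unknown;
  decodeAudioData(data: ArrayBuffer): Promise<unknown>;
  createBufferSource(): BufferSourceLike;
}

// Synthesize text and resolve with the raw audio bytes instead of playing them.
function synthesizeToBuffer(
  synthesizer: SynthesizerLike,
  text: string
): Promise<ArrayBuffer> {
  return new Promise((resolve, reject) => {
    synthesizer.speakTextAsync(
      text,
      result => {
        synthesizer.close();
        resolve(result.audioData);
      },
      error => {
        synthesizer.close();
        reject(new Error(error));
      }
    );
  });
}

// Decode the bytes and play them through the Web Audio graph. Resolves when
// playback ends. No blob: URL or <audio> element is created here.
async function playAudioData(
  ctx: AudioContextLike,
  audioData: ArrayBuffer
): Promise<void> {
  const decoded = await ctx.decodeAudioData(audioData);
  const source = ctx.createBufferSource();
  source.buffer = decoded;
  source.connect(ctx.destination);
  await new Promise<void>(resolve => {
    source.onended = () => resolve();
    source.start();
  });
}
```

In the extension this might be wired up by creating the synthesizer as new window.SpeechSDK.SpeechSynthesizer(speechConfig, null), then calling playAudioData(new AudioContext(), await synthesizeToBuffer(synthesizer, text)). My understanding is that passing null as the audio config stops the SDK from playing the audio itself, but that should be checked against the SDK documentation. Browser autoplay rules may also keep a new AudioContext suspended until the user interacts with the page.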
Code:
Output: Local output:
With the above port, I can hear the speech of the text in the browser: