I'm currently working on a web project using the SpeechSynthesis API to read paragraphs of text on a webpage. I've been trying to synchronize the spoken words with the color changes in the text, but I'm facing some challenges.
Here's a brief overview of the issue:
- I have a function that reads out the content of <p> tags on a page using the SpeechSynthesis API.
- The goal is to synchronize the spoken words with color changes in real-time.
- Specifically, I'd like each word to change to red while it's being spoken and revert to the original color when the word is completed.
- Every attempt led to the whole paragraph turning red.
My working code without the sync is below.
function speakAllParagraphs(page) {
  // Get all <p> elements within the current page
  var paragraphs = document
    .getElementById("page" + page)
    .getElementsByTagName("p");
  // Iterate through each <p> tag
  Array.from(paragraphs).forEach(function (paragraph, index) {
    // Speak the text of the paragraph
    var text = paragraph.innerText;
    // Create a new SpeechSynthesisUtterance
    var utterance = new SpeechSynthesisUtterance();
    utterance.text = text;
    // Find the voice by name
    const voices = speechSynthesis.getVoices();
    const targetVoice = voices.find(
      (voice) =>
        voice.name === "Microsoft Emily Online (Natural) - English (Ireland)"
    );
    if (targetVoice) {
      utterance.voice = targetVoice;
    } else {
      // Fallback: if the target voice is not available, use default voice
      utterance.voice = voices[0];
    }
    // Play the synthesized speech
    speechSynthesis.speak(utterance);
  });
}
- I attempted to use the onboundary event to change the color of individual words, but it didn't work as expected.
- I've tried a few approaches, including using timers and events, but I haven't been able to achieve the desired synchronization.
- The goal is to have each word change to red while it's being spoken, and revert to the original color when the word is completed.
You can accomplish the required behavior by listening for the boundary event on the SpeechSynthesisUtterance instance. This event gives you charIndex and charLength properties, which indicate where in the string the utterance is at that specific moment.
This allows you to grab a specific part of your string and wrap it in HTML, such as a <mark> tag, to highlight the currently spoken text. Replace the text of the paragraph with the version that includes the highlight.
Also listen for the end event to restore the original text in the paragraph when the utterance is finished.
To handle multiple paragraphs, have the function that speaks and highlights a single paragraph (speakAndHighlightText) return a Promise that resolves on the end event, so that we can await the speech finishing before moving on to the next paragraph.
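The approach described above could be sketched roughly as follows. This is a minimal reconstruction under stated assumptions, not the original snippet: the helpers escapeHtml and highlightRange, the voice parameter, the whitespace fallback for a missing charLength, and resolving on the error event are all additions made for illustration. Only the names speakAndHighlightText and speakAllParagraphs, the voice name, and the "page" + page element id come from the text above.

```javascript
// Hypothetical helper: escape text so it can be assigned to innerHTML safely.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap the word starting at charIndex in <mark>.
// Assumption: if charLength is missing or 0, fall back to scanning up to
// the next whitespace character.
function highlightRange(text, charIndex, charLength) {
  let length = charLength;
  if (!length) {
    const match = text.slice(charIndex).match(/^\S+/);
    length = match ? match[0].length : 0;
  }
  const end = charIndex + length;
  return (
    escapeHtml(text.slice(0, charIndex)) +
    "<mark>" + escapeHtml(text.slice(charIndex, end)) + "</mark>" +
    escapeHtml(text.slice(end))
  );
}

// Speak one paragraph and highlight each word while it is spoken.
// Resolves once the utterance ends (or errors), after restoring the markup.
function speakAndHighlightText(paragraph, voice) {
  return new Promise((resolve) => {
    const originalHtml = paragraph.innerHTML;
    const text = paragraph.innerText;
    const utterance = new SpeechSynthesisUtterance(text);
    if (voice) utterance.voice = voice;

    utterance.addEventListener("boundary", (event) => {
      // Boundary events can also fire for sentences; only react to words.
      if (event.name && event.name !== "word") return;
      paragraph.innerHTML = highlightRange(text, event.charIndex, event.charLength);
    });

    const finish = () => {
      paragraph.innerHTML = originalHtml; // remove the highlight
      resolve();
    };
    utterance.addEventListener("end", finish);
    utterance.addEventListener("error", finish);

    speechSynthesis.speak(utterance);
  });
}

// Speak every <p> on the page, one after another.
async function speakAllParagraphs(page) {
  const paragraphs = document
    .getElementById("page" + page)
    .getElementsByTagName("p");
  const voices = speechSynthesis.getVoices();
  const voice =
    voices.find(
      (v) => v.name === "Microsoft Emily Online (Natural) - English (Ireland)"
    ) || voices[0];
  for (const paragraph of Array.from(paragraphs)) {
    await speakAndHighlightText(paragraph, voice);
  }
}
```

For instance, highlightRange("Hello brave world", 6, 5) returns "Hello <mark>brave</mark> world". To get red text instead of the default mark styling, a CSS rule such as mark { color: red; background: none; } should work. Note that this sketch replaces the paragraph's inner markup with plain text while it is being spoken and only restores the original HTML at the end, so any links or formatting inside the paragraph disappear temporarily.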