Extracting subtitle from a teachable/hotmart video


After many hours, I found a half-automatic way to extract the subtitle text from a teachable/hotmart video. Is there any way to automate the process?

The obstacle to a more sophisticated solution is that the video data is referenced inside an <iframe>, so accessing its contents from outside is prevented by the browser's cross-origin restrictions (CORS rules). A further problem is that the video containing the subtitle cannot be loaded through any of the normal user interface that is provided.

My process consists of the following steps:

  1. Open the video, e.g. in Chrome or Edge
  2. Pause the video and move to the position where you want to start subtitle extraction (the first subtitle to be recorded should not yet be shown)
  3. Open the developer panel with the context menu item "Inspect"
  4. In the "Elements" tab, follow the element tree down through the elements that highlight the video area until you find an "iframe" node whose "src" attribute value starts with "https://player.hotmart.com"
  5. Under that "iframe" node, open the nodes #document/<html>/<head>
  6. Under that <head>, replace the empty script node (<script></script>) with the code below, e.g. using the "Edit as HTML" context menu
  7. Clear the console tab
  8. Let the video play through the whole part that you want to extract the subtitle from
  9. Save the console log as a text file
  10. Using a text editor, remove the unnecessary file name and line number references from the saved console log
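
Step 10 could probably be scripted as well. Below is a minimal sketch, assuming that every line of the saved log starts with a source reference of the form "VM123:8 " (a name, a colon, a line number, then whitespace). That format is my assumption about the saved file, and stripLogPrefixes is a hypothetical helper name. A subtitle that itself starts with something like "Note:5 " would be cut by mistake.

```javascript
// Hypothetical helper: removes a leading "source:line " reference from each
// line of a saved console log and drops lines that end up empty.
// Assumption: the prefix looks like "VM123:8 " (non-space text, colon, digits).
function stripLogPrefixes(logText) {
    return logText
        .split(/\r?\n/)
        .map(function (line) {
            // Remove the prefix only when it is at the start of the line.
            return line.replace(/^\S+:\d+\s+/, "");
        })
        .filter(function (line) {
            return line.trim() !== "";
        })
        .join("\n");
}
```

The function takes the whole log as one string and returns the cleaned text, so it can be run on the file contents in Node or pasted into a browser console.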

Code to insert in place of the empty script node:

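<script>
// Last subtitle text that was logged, used to skip repeats.
var txt = "";
(function checkSend() {
    // The current subtitle is read from the first <pre> element.
    var pres = document.getElementsByTagName("pre");
    if (pres.length > 0) {
        var t = pres[0].innerHTML;
        // Log only when the text has changed since the last check.
        if (t !== txt) {
            txt = t;
            window.console.log(t);
        }
    }
    // Poll again in one second.
    setTimeout(checkSend, 1000);
})();
</script>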
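
One variation that might remove steps 9 and 10 is to collect the subtitles in memory instead of logging them. This is only a sketch under assumptions: createSubtitleCollector and window.subtitleCollector are hypothetical names, and it reads textContent instead of innerHTML on the assumption that the first <pre> element holds plain subtitle text. I have not verified it against the hotmart player.

```javascript
// Hypothetical helper: keeps subtitle strings in memory and skips a text
// that is identical to the one stored immediately before it.
function createSubtitleCollector() {
    var lines = [];
    return {
        add: function (text) {
            var t = String(text).trim();
            if (t !== "" && t !== lines[lines.length - 1]) {
                lines.push(t);
                return true;
            }
            return false;
        },
        toText: function () {
            return lines.join("\n");
        }
    };
}

// Browser-only wiring; skipped when no DOM is available (e.g. in Node).
if (typeof document !== "undefined") {
    var collector = createSubtitleCollector();
    setInterval(function () {
        // Assumption: the current subtitle is in the first <pre> element.
        var pres = document.getElementsByTagName("pre");
        if (pres.length > 0) {
            collector.add(pres[0].textContent);
        }
    }, 1000);
    // Expose the collector so the text can be fetched after playback.
    window.subtitleCollector = collector;
}
```

After playback, the collected text would be available as window.subtitleCollector.toText() inside the iframe's context. In the Chrome DevTools console that presumably requires selecting the player frame as the execution context first.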
