How to get the "innerHTML" of an HTML element in a downloaded web page?

I'm trying to create a Google Apps Script function that returns the inner HTML of the element with a given ID on a webpage.

In the web browser's JavaScript console, this would do it:

document.getElementById("myID").innerHTML

In Google Apps Script:

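function getValue(symbol) {
  // Hard-coded for testing; this overrides whatever argument is passed in.
  symbol = 'ABCD';

  const url = `https://example.com/${symbol}`;

  const options = {
    headers: {'Content-Type':'application/xml'},
    method: 'GET'
  };

  // Download the page and read the response body as a string.
  const res = UrlFetchApp.fetch(url, options);
  const contentText = res.getContentText();

  // ????
}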

With the code above, I've managed to download the whole web page content, but how do I get the inner HTML of the element whose id is myID?

Answer by TheMaster

Server-side JavaScript is different from client-side (browser) JavaScript. On the server there is no window or document API, nor any of their methods. To parse HTML, you can use a server-side HTML parser such as Cheerio. There's a GAS fork of it by @tani/@3846masa, which may work for you. There are also differences between the HTML the browser ends up rendering and the HTML you download: content that the page builds with JavaScript will not be in the response from UrlFetchApp. See "Scraping data to Google Sheets from a website that uses JavaScript".
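As a rough sketch of the Cheerio route, something like the following might work. It assumes the Cheerio GAS library has been added to the script project under the identifier Cheerio. The helper name innerHtmlById is made up for illustration, and the URL and myID are the placeholders from the question. The loader is passed in as a parameter only so the helper can be tried outside Apps Script too.

```javascript
/**
 * Returns the inner HTML of the element with the given id,
 * or null if no such element exists in the markup.
 * `load` is a Cheerio-style loader (assumption: Cheerio.load from the GAS library).
 */
function innerHtmlById(html, id, load) {
  const $ = load(html);
  const el = $('#' + id);            // same selector syntax as in the browser
  return el.length ? el.html() : null;
}

function getValue(symbol) {
  const url = `https://example.com/${encodeURIComponent(symbol)}`;

  // muteHttpExceptions lets us inspect the status code instead of throwing.
  const res = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  if (res.getResponseCode() !== 200) {
    throw new Error(`Request failed with status ${res.getResponseCode()}`);
  }

  // Assumption: `Cheerio` is the identifier chosen when adding the library.
  return innerHtmlById(res.getContentText(), 'myID', Cheerio.load);
}
```

If this returns null even though the element is visible in the browser, the element is probably created by client-side JavaScript and is not present in the downloaded HTML.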

If you're looking for an XML parser, there's a built-in one: XmlService.
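For the built-in parser, a minimal sketch could look like this. It only applies when the downloaded document is well-formed XML (for example XHTML); XmlService.parse throws on ordinary, loosely written HTML. The function names findElementById and getTextById are hypothetical, and note that getValue() returns the element's text content rather than its markup.

```javascript
/**
 * Depth-first search for the first element whose id attribute equals `id`.
 * Works on XmlService Element objects (getAttribute, getChildren).
 */
function findElementById(element, id) {
  const attr = element.getAttribute('id');
  if (attr && attr.getValue() === id) {
    return element;
  }
  for (const child of element.getChildren()) {
    const found = findElementById(child, id);
    if (found) {
      return found;
    }
  }
  return null;
}

/**
 * Parses `xml` and returns the text content of the element with the given id,
 * or null if there is none. Assumes `xml` is well-formed.
 */
function getTextById(xml, id) {
  const root = XmlService.parse(xml).getRootElement();
  const el = findElementById(root, id);
  return el ? el.getValue() : null;
}
```

For typical web pages, which are rarely valid XML, an HTML parser such as Cheerio is likely the more robust choice.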