How to parse HTML that reflects the updated DOM in Swift?

258 Views Asked by aadi sach At 19 August 2021 at 02:54

I am fairly new to coding and am parsing HTML data from a website. The problem is that the elements I can inspect manually when I view the website are very different from the source code. I understand this is because 'inspecting elements' shows the state of the DOM tree after the browser has applied its error correction and after any JavaScript has manipulated the DOM.

Here is the relevant code:

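import SwiftSoup

// `link` holds the page address and is defined elsewhere.
guard let url = URL(string: link) else { return }

let task = URLSession.shared.dataTask(with: url) { (data, response, error) in
    // Stop on a transport error or a body that is not valid UTF-8.
    guard let data = data, let htmlContent = String(data: data, encoding: .utf8) else {
        print(error?.localizedDescription ?? "No data received")
        return
    }
    do {
        // This parses only the HTML the server sent; scripts are not executed.
        let doc: Document = try SwiftSoup.parse(htmlContent)

        let elements = try doc.getAllElements().array()

    } catch Exception.Error(type: let type, Message: let message) {
        print(type)
        print(message)
    } catch {
        print("error")
    }

}
// The request does not start until resume() is called.
task.resume()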

My question is: what can I do to parse the elements of the website that appear when I inspect them manually? Sorry if this is a beginner question.

There is 1 best solution below

Alex On 19 August 2021 at 06:05

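As you noticed, the page you see after it loads in a browser differs from what you get when you request it in code. That is because some web pages load data or additional HTML 'lazily', only when it is needed, to improve performance. SwiftSoup parses the response as plain text and does not run JavaScript, so that content never appears in the parsed document.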

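To get this HTML in code, open the developer tools of your browser, go to the Network tab, and filter by 'XHR'. Reload the page and look through the requests listed there; you should be able to find the one that returns the missing HTML, or the data it is built from. You can then request that URL directly from your code.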
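Once you have found the request in the XHR tab, calling it from Swift might look roughly like the sketch below. It is only an illustration under assumptions: the `LazyPayload` type, its `html` field, and the helper names `extractHTML` and `fetchLazyHTML` are hypothetical, and the real response shape and any required headers have to be read from what the Network tab shows for your site.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLSession lives here on Linux
#endif

// Hypothetical shape of the JSON returned by the XHR endpoint.
// Replace the field name with whatever the real response contains.
struct LazyPayload: Decodable {
    let html: String
}

/// Decodes the XHR response body and returns the HTML fragment it carries.
func extractHTML(from data: Data) throws -> String {
    try JSONDecoder().decode(LazyPayload.self, from: data).html
}

/// Requests the endpoint found in the XHR tab and passes back the HTML fragment.
func fetchLazyHTML(from endpoint: URL,
                   completion: @escaping (Result<String, Error>) -> Void) {
    var request = URLRequest(url: endpoint)
    // Assumption: some endpoints only answer requests that look like the browser's call.
    request.setValue("XMLHttpRequest", forHTTPHeaderField: "X-Requested-With")
    request.setValue("application/json", forHTTPHeaderField: "Accept")

    URLSession.shared.dataTask(with: request) { data, _, error in
        if let error = error {
            completion(.failure(error))
            return
        }
        guard let data = data else {
            completion(.failure(URLError(.badServerResponse)))
            return
        }
        // Turns a thrown decoding error into a .failure result.
        completion(Result { try extractHTML(from: data) })
    }.resume()
}
```

The string delivered to `completion` can then be passed to `SwiftSoup.parse` in the same way as the original page source. If the endpoint returns plain JSON data instead of an HTML fragment, you can decode the fields you need directly and skip HTML parsing altogether.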
