I am fairly new to coding and am parsing HTML data from a website. The problem is that the elements I can inspect manually when I view the website are very different from the source code. I understand this is because 'Inspect Element' shows the state of the DOM tree after the browser has applied its error correction and after any JavaScript has manipulated the DOM.
Here is the relevant code:
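import Foundation
import SwiftSoup

// `link` is the page URL string, defined elsewhere.
guard let url = URL(string: link) else { return }
let task = URLSession.shared.dataTask(with: url) { [self] (data, response, error) in
    // Stop early if the request failed or the body is not valid UTF-8.
    guard let data = data, let htmlContent = String(data: data, encoding: .utf8) else {
        print(error?.localizedDescription ?? "no data")
        return
    }
    do {
        let doc: Document = try SwiftSoup.parse(htmlContent)
        let elements = try doc.getAllElements().array()
        print(elements.count)
    } catch Exception.Error(type: let type, Message: let message) {
        print(type)
        print(message)
    } catch {
        print("error")
    }
}
task.resume() // a data task does not start until it is resumed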
My question is: how can I parse the elements that appear when I inspect the website manually? Sorry if this is a beginner question.
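As you noticed, the page you see after it has loaded in a browser differs from what you get when you request it in code. That is because some web pages load data or additional HTML 'lazily', only when it is needed, to improve performance. That content arrives through separate requests made by JavaScript, so it is not part of the initial HTML response.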
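To get this HTML in code, open your browser's developer tools, go to the Network tab, and filter by 'XHR'. Reload the page and look through the listed requests; one of them should return the missing HTML (or the data it is built from). You can then make that same request from your code.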
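As a rough sketch of that last step, the code below requests an endpoint copied from the XHR list and hands back the response body as a string. The function names `makeXHRRequest` and `fetchLazyContent` are made up for illustration, and the two headers are only an assumption about what a server might expect; copy the real URL, method, and headers from the request you found in the developer tools.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLSession and URLRequest live here on Linux
#endif

/// Builds a request that imitates one copied from the browser's Network tab.
/// `endpoint` is whatever URL the XHR entry shows (hypothetical helper).
func makeXHRRequest(endpoint: String) -> URLRequest? {
    guard let url = URL(string: endpoint) else { return nil }
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    // Assumption: some servers only respond as expected when these headers
    // resemble what the browser sent. Replace them with the real ones.
    request.setValue("XMLHttpRequest", forHTTPHeaderField: "X-Requested-With")
    request.setValue("text/html, application/json", forHTTPHeaderField: "Accept")
    return request
}

/// Fetches the endpoint and passes the decoded body (an HTML fragment or
/// JSON text, depending on the site) to `completion`, or nil on failure.
func fetchLazyContent(endpoint: String, completion: @escaping (String?) -> Void) {
    guard let request = makeXHRRequest(endpoint: endpoint) else {
        completion(nil)
        return
    }
    let task = URLSession.shared.dataTask(with: request) { data, _, error in
        guard let data = data, error == nil else {
            completion(nil)
            return
        }
        completion(String(data: data, encoding: .utf8))
    }
    task.resume() // the request is not sent until the task is resumed
}
```

If the body turns out to be HTML, you can pass the string to `SwiftSoup.parse` as in the question. If it is JSON, which is common for XHR responses, decode it with `JSONDecoder` or `JSONSerialization` instead.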