I can't figure out how to load a WebView2 document into HTML Agility Pack. I'm using JavaScript to get the DOM as a string. However, when I load the DOM string into an HtmlAgilityPack document, every attempt to parse it returns null.
This compiles:
    string dom = await webView21.CoreWebView2.ExecuteScriptAsync("document.body.outerHTML"); // Get the DOM with JavaScript
    if (dom.Contains("div"))
        System.Diagnostics.Debug.WriteLine("At least one div in the DOM"); // Prints
    HtmlAgilityPack.HtmlDocument htmlDocument = new HtmlAgilityPack.HtmlDocument();
    htmlDocument.LoadHtml(dom);
    var divs = htmlDocument.DocumentNode.SelectNodes("//div");
    if (divs == null)
        System.Diagnostics.Debug.WriteLine("divs is null"); // Prints
When I run this snippet, the first if clause confirms that the string dom contains at least one div. However, when the string is loaded into the htmlDocument, the second if clause shows that the variable divs is null. The variable divs should have a count of at least 1. I'm doing something stupid, but I don't know what.
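ExecuteScriptAsync returns the script result as a JSON-encoded string, so the string dom is wrapped in quotes and contains escape sequences, e.g. "\u003C" in place of "<". HtmlAgilityPack therefore never sees a real tag, which is why SelectNodes returns null. After getting the DOM, decode the string back into plain HTML before calling LoadHtml.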
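A minimal sketch of removing those escapes, assuming System.Text.Json is available (built into .NET Core 3.0 and later, a NuGet package on .NET Framework) and that the code runs inside an async method of the form that owns webView21. It treats the returned value as a JSON string literal and deserializes it. This is one way to do the decoding, not necessarily the only one:

```csharp
using System.Text.Json;

// ExecuteScriptAsync returns the result JSON-encoded:
// surrounding quotes, \u003C for '<', \" for '"', \n for newlines.
string json = await webView21.CoreWebView2.ExecuteScriptAsync("document.body.outerHTML");

// Deserializing the JSON string literal restores the raw HTML.
// A JSON "null" result (no body yet) becomes an empty string here.
string dom = JsonSerializer.Deserialize<string>(json) ?? string.Empty;

var htmlDocument = new HtmlAgilityPack.HtmlDocument();
htmlDocument.LoadHtml(dom);

// SelectNodes still returns null when nothing matches, so keep the null check.
var divs = htmlDocument.DocumentNode.SelectNodes("//div");
if (divs != null)
    System.Diagnostics.Debug.WriteLine($"Found {divs.Count} div(s)");
```

Deserializing handles every JSON escape at once, including escaped quotes inside attribute values, rather than only the "\u003C" case.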
That answers the question.
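As an aside, using "documentElement" instead of "body" gets more of the DOM, i.e. the whole <html> element including <head>, by passing "document.documentElement.outerHTML" to ExecuteScriptAsync.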
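As a rough sketch of that variant, the helper below fetches the whole document and returns it already parsed. The method name LoadPageAsync is hypothetical, and it assumes the same webView21 control and System.Text.Json as above:

```csharp
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper: fetches <html>...</html> from the WebView2 control
// and parses it with HtmlAgilityPack.
private async Task<HtmlAgilityPack.HtmlDocument> LoadPageAsync()
{
    // documentElement covers <head> as well as <body>.
    string json = await webView21.CoreWebView2.ExecuteScriptAsync(
        "document.documentElement.outerHTML");

    // The result is JSON-encoded, so decode it before parsing.
    string html = JsonSerializer.Deserialize<string>(json) ?? string.Empty;

    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);
    return doc;
}
```

With the full document loaded, nodes outside the body become reachable, for example `doc.DocumentNode.SelectSingleNode("//head/title")`, which returns null if the page has no title element.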