Web scraping with Python-Windmill (How to accurately wait till a page fully loads)

1.3k Views Asked by Renl At 21 January 2012 at 06:27

I have been playing around with Windmill to try out some web scraping, but the API call waits.forPageLoad does not seem able to check whether the page is fully rendered.
When I need to reload a page whose DOM already exists, I use waits.forElement to detect an element so that the script can "decide" the page has loaded. This occasionally detects the element before the page has actually reloaded.
Also, loading a page with the Windmill test client in Firefox seems to take forever. The same page takes about 2 seconds in my regular Firefox browser but can take up to a minute in the test client. Is it normal for it to take so long?
Lastly, are there better alternatives to Windmill for web scraping? The documentation seems a bit sparse.

Please advise. Thanks :P

There is 1 solution below

TangibleDream On 10 April 2012 at 17:44

 client.waits.sleep(milliseconds=u'2000')

This is an unconditional pause of 2 seconds.

 client.waits.forPageLoad(timeout=u'20000')

This holds back the following lines until the page loads or until 20 seconds have elapsed, whichever comes first. Think of it as a time-bounded assert: if the page loads in under 20 seconds it passes; if not, it fails.
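To make the "pass if it happens in time, otherwise fail" idea concrete, here is a minimal sketch in plain Python. It is not Windmill's implementation and does not use the Windmill API; the names wait_until, assert_within and page_loaded are hypothetical, and the "page" is simulated with a timer so the snippet runs on its own.

```python
import time


def wait_until(condition, timeout_ms=20000, poll_ms=100):
    """Poll condition() until it is truthy or timeout_ms has elapsed.

    Returns True if the condition was met in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_ms / 1000.0
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Sleep briefly between checks instead of spinning the CPU.
        time.sleep(poll_ms / 1000.0)


def assert_within(condition, timeout_ms=20000, poll_ms=100):
    """Time-bounded assert: pass if condition holds before the timeout, else fail."""
    if not wait_until(condition, timeout_ms, poll_ms):
        raise AssertionError("condition not met within %d ms" % timeout_ms)


if __name__ == "__main__":
    start = time.monotonic()

    # Hypothetical stand-in for a real "page has loaded" check:
    # it becomes true roughly 0.2 seconds after the script starts.
    def page_loaded():
        return time.monotonic() - start > 0.2

    # Returns as soon as page_loaded() is true, well before the 2 second limit.
    assert_within(page_loaded, timeout_ms=2000)
```

Unlike a fixed sleep, a polling wait of this kind returns as soon as the condition is met, so the timeout is only an upper bound on how long the script is held up.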

I hope this helps,
