Scrape a Dynamic Website using Java with Selenium?

497 Views Asked by breaktop At 19 February 2022 at 00:02

I'm trying to scrape https://www.rspca.org.uk/findapet#onSubmitSetHere to get a list of all pets for adoption.

I've built web scrapers before using crawler4j but the websites were static.

Since https://www.rspca.org.uk/findapet#onSubmitSetHere is not a static website, how can I scrape it? Is it possible? What technologies should I use and how?

Update:

When you fill in the search form (Select type of pet and Enter postcode/town or county) in the UI, the results are then displayed below the search box.

The search bar is highlighted in red and the results are highlighted in black.

I'm trying to scrape the results and also the content of each result.

I've had a look at the requests the browser makes to retrieve the results, but from Chrome dev tools it isn't obvious which request is being made.


There is 1 best solution below

tgdavies On 19 February 2022 at 03:09

You could use Selenium to extract information from the DOM once a browser has rendered it, but I think a simpler solution is to use "developer tools" to find the request that the browser makes when the "search" button is clicked, and try to reproduce that.

In this case, the browser makes a POST request to https://www.rspca.org.uk/findapet?p_p_id=petSearch2016_WAR_ptlPetRehomingPortlets&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&_petSearch2016_WAR_ptlPetRehomingPortlets_action=search

The body of the POST request contains a lot of parameters, including animalType and location. The content-type of the request is application/x-www-form-urlencoded.

To see these parameters, open the "Network" tab in Chrome dev tools, click on the "findapet" request (it's the first one in the list when I do this), and click on the "Payload" tab. It shows both the query string parameters and the form parameters (the form parameters include animalType and location).
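As a rough illustration of what an application/x-www-form-urlencoded body looks like, here is a minimal Java sketch that encodes name/value pairs. The class name `FormBody` is hypothetical, and only the parameter names animalType and location come from the request described above; their values, and the many other parameters the real request carries, would have to be copied from the "Payload" tab.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class FormBody {
    private FormBody() {}

    /** Encodes name/value pairs as an application/x-www-form-urlencoded string. */
    static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
                // Percent-encode each name and value, then join them as name=value
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "="
                        + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                // Pairs are separated by '&'
                .collect(Collectors.joining("&"));
    }
}
```

Passing a `LinkedHashMap` keeps the parameters in insertion order, which makes the output easier to compare with what the browser sent.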

The response contains HTML.

I would try making a request to that endpoint and then parsing the HTML in the response.
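A minimal sketch of that request, using the `java.net.http.HttpClient` that ships with Java 11+. The class and method names are hypothetical, and this assumes the endpoint accepts a plain POST; the site may also expect cookies or other headers that the browser sends, so treat it as a starting point only.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client class, for illustration only.
final class PetSearchClient {
    // Endpoint as quoted in the answer above.
    static final String SEARCH_URL =
            "https://www.rspca.org.uk/findapet"
            + "?p_p_id=petSearch2016_WAR_ptlPetRehomingPortlets"
            + "&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view"
            + "&_petSearch2016_WAR_ptlPetRehomingPortlets_action=search";

    private PetSearchClient() {}

    /** Builds the POST request; formBody must already be URL-encoded. */
    static HttpRequest buildSearchRequest(String formBody) {
        return HttpRequest.newBuilder(URI.create(SEARCH_URL))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody))
                .build();
    }

    /** Sends the request and returns the response body (HTML) as a string. */
    static String fetchResultsHtml(String formBody) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildSearchRequest(formBody), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status: " + response.statusCode());
        }
        return response.body();
    }
}
```

A call might look like `PetSearchClient.fetchResultsHtml("animalType=DOG&location=London")`, where the values are placeholders and the real body would include the remaining form parameters copied from dev tools. The returned HTML can then be passed to an HTML parser such as jsoup to pull out the individual results.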
