Cannot find html element using css or xpath selectors in Scrapy

115 Views Asked by Chris At 08 May 2023 at 20:17

I'm using Scrapy to scrape this website. I want to grab all the div elements with class="data1", but neither CSS nor XPath selectors find them, even though I can see them in the HTML in the browser.

In the scrapy shell after fetching the url:

In [6]: response.css('div#my_div')
Out[6]: [<Selector query="descendant-or-self::div[@id = 'my_div']" data='<div id="my_div">Results will be show...'>]

In [7]: response.css('div#my_div div')
Out[7]: []

In [8]: response.xpath('//div[@class="data1"]')
Out[8]: []

The HTML shown in the browser looks something like this:

<div id="my_div" style>
 <div class="data1">...</div>
 <div class="data1">...</div>
 <div class="data1">...</div>
 ...
</div>

Alexander On 09 May 2023 at 03:58 BEST ANSWER

This is because that portion of the page is rendered with JavaScript. You can see this by calling .get() on the first query in your example:

In [1]: response.css('div#my_div').get()

Out[1]: '<div id="my_div">Results will be shown here.</div>'
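One way to confirm this kind of mismatch outside the shell is to count the target elements in the raw HTML string the server sends, before any JavaScript runs. The sketch below uses only the standard library; the ClassCounter class and count_in_raw_html function are hypothetical helper names, and rendered_html is a stand-in built from the snippet in the question, not the real page. In a spider you would pass response.text instead.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of a given name that carry a given class."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.class_name in classes:
            self.count += 1


def count_in_raw_html(html, tag, class_name):
    """Return how many <tag class="... class_name ..."> elements html contains."""
    parser = ClassCounter(tag, class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# What the server actually returned, as shown by .get() above.
raw_html = '<div id="my_div">Results will be shown here.</div>'

# Stand-in for what the browser shows after JavaScript has run.
rendered_html = (
    '<div id="my_div">'
    '<div class="data1">...</div>'
    '<div class="data1">...</div>'
    '<div class="data1">...</div>'
    '</div>'
)

raw_count = count_in_raw_html(raw_html, "div", "data1")
rendered_count = count_in_raw_html(rendered_html, "div", "data1")
print(raw_count, rendered_count)
```

A count of zero on the raw HTML, while the browser shows the elements, is a strong hint that the content is added client-side.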

If you look in the Network tab of the browser developer tools, you can see that all of that information comes from an API call to 'https://data.crn.com/2023/wotc2023.php?st1=1&st2=a'. Fetching that URL in the scrapy shell returns a JSON array with all the information in that section.

In [3]: fetch('https://data.crn.com/2023/wotc2023.php?st1=1&st2=a')
2023-05-08 20:57:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://data.crn.com/2023/wotc2023.php?st1=1&st2=a> (referer: None)

In [4]: response.json()
Out[4]: 
[{'Pkey': '617',
  'Company': 'F5',
  'Name_First': 'Barbara',
  'Name_Last': 'Abboud',
  'Image': 'f5-abboud-barbara.jpg'},
 {'Pkey': '1208',
  'Company': 'Samsung Electronics America',
  'Name_First': 'Shpresa',
  'Name_Last': 'Abdullaj',
  'Image': 'samsung-electronics-america-abdullaj-shpresa.jpg'},
 {'Pkey': '499',
  'Company': 'Davenport Group',
  'Name_First': 'Kim',
  'Name_Last': 'Abrams',
  'Image': 'davenport-group-abrams-kim.jpg'},
 {'Pkey': '35',
  'Company': 'Alteryx',
  'Name_First': 'Daniella',
  'Name_Last': 'Aburto Valle',
  'Image': 'alteryx-aburto-valle-daniella.jpg'},
  .......]
