The page I'm trying to scrape is https://www.biggerpockets.com/forums/88/topics/895460-cap-rate-vs-interest-rate
In the browser's developer console, the XPath returns the text node that corresponds to the title of the post.
However, when I run the Scrapy spider, the same XPath doesn't match and the title comes back as None.
yield SplashRequest("https://www.biggerpockets.com/forums/88/topics/895460-cap-rate-vs-interest-rate", self.parse_post, args={'wait': 2})
def parse_post(self, response):
    title = response.xpath('//div[contains(@class, "simplified-forums__discussion")]//div[contains(@class, "simplified-forums__discussion__first-post")]//div[contains(@class, "simplified-forums__card__content")]//h1/text()').get()
    print(title)
2023-11-01 00:16:11 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.biggerpockets.com/forums/49/topics/276013-interest-rate>
None
When I access
http://localhost:8050/render.html?url=https://www.biggerpockets.com/forums/88/topics/895460-cap-rate-vs-interest-rate
the page renders fine as well. I'm not sure what exactly is wrong, because I am confident that the XPath is correct.
If I am missing anything, please help me out.
As I mentioned in the comment, your xpath seems to be wrong.
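One way to see where an XPath like this stops matching is to test it one step at a time. The sketch below is not from the original post: `xpath_prefixes` and `first_failing_prefix` are hypothetical helper names, and the split assumes the expression starts with `//` and has no `//` inside a quoted predicate value (both hold for the XPath in the question).

```python
def xpath_prefixes(xpath):
    """Return cumulative prefixes of an XPath built from '//'-separated steps."""
    # Assumption: the expression starts with '//' and '//' never occurs
    # inside a quoted string, so a plain split is safe.
    steps = [step for step in xpath.split("//") if step]
    return ["//" + "//".join(steps[:i]) for i in range(1, len(steps) + 1)]


def first_failing_prefix(xpath, count_matches):
    """Return the shortest prefix that selects nothing, or None if all match."""
    # count_matches is any callable that returns how many nodes a prefix selects.
    for prefix in xpath_prefixes(xpath):
        if count_matches(prefix) == 0:
            return prefix
    return None
```

Inside `parse_post`, calling `first_failing_prefix(xpath, lambda p: len(response.xpath(p)))` would report the first step that selects nothing in the HTML Scrapy actually received, which may differ from what the browser shows.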