The code I used was:

    import scrapy

    class JobSpider(scrapy.Spider):
        name = 'job'
        start_urls = [
            'https://jobs.goodlifefitness.com/listjobs/'
        ]

In the Scrapy shell I used the following selector for the link:

    response.css('div.jobTitle a::attr(href)')

and I got an empty list, [].

This happens because the entire page is rendered by JavaScript. If you save the fetched response to a local file and open it, you will see that 99% of the HTML is <script> tags, so the elements your selector targets do not exist until the scripts have run. Fortunately, these types of pages are easy to scrape with the requests-html library (not to be confused with the requests library). For example:

    pip install requests-html
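
Once the library is installed, a minimal sketch of collecting the job links might look like the following. It assumes the `div.jobTitle a` selector from the question matches the page after rendering, which has not been verified here. `extract_job_links` and `fetch_job_links` are hypothetical helper names; `HTMLSession`, `render()`, and `find()` come from requests-html. Note that `render()` downloads a Chromium build the first time it is called.

```python
from urllib.parse import urljoin


def extract_job_links(html, base_url, selector="div.jobTitle a"):
    """Return absolute URLs for every matching <a> that has an href.

    `html` is expected to be a rendered requests_html HTML object.
    The default selector is taken from the question and is an assumption.
    """
    links = []
    for anchor in html.find(selector):
        href = anchor.attrs.get("href")
        if href:
            # Resolve relative hrefs against the page URL.
            links.append(urljoin(base_url, href))
    return links


def fetch_job_links(url):
    """Fetch `url`, run its JavaScript, and return the job links."""
    # Imported here so the helper above works without requests-html installed.
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get(url)
    # Execute the page's scripts in headless Chromium so the DOM is populated.
    response.html.render()
    return extract_job_links(response.html, url)


# Example call (performs a network request):
# fetch_job_links("https://jobs.goodlifefitness.com/listjobs/")
```

If the list still comes back empty, the content may load after the initial render; `render()` accepts `sleep` and `timeout` arguments that may help in that case.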