Thank you for your attention, and sorry for my poor English. I have been trying to get the HTML from https://www.skiddle.com/festivals/dates.html without any success. I understand that some parts of the page are loaded by JavaScript, but I don't know how to get them. I've also tried using a session, but I get the same results. Please advise me on what I need to use in my code or what I should look into.
Thanks in advance!
Here is my code:
import requests
from bs4 import BeautifulSoup
import lxml
from selenium import webdriver
import time
import undetected_chromedriver
import json

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0'
}
proxies = {
    'https': 'http://146.247.105.71:4827'
}


def get_location(url):
    response = requests.get(url, headers=headers, proxies=proxies)
    soup = BeautifulSoup(response.text, 'lxml')
    print(soup, '\n\n\nlox\n\n\n')
    # options = undetected_chromedriver.ChromeOptions()
    # options.add_argument('--proxy-server=146.247.105.71:4827')
    # driver = undetected_chromedriver.Chrome(
    #     options=options
    # )
    # driver.get(url)
    # time.sleep(5)
    # response = driver.page_source
    # driver.close()
    # driver.quit()
    # print(response)


def main():
    get_location(url='https://www.skiddle.com/festivals/dates.html')


if __name__ == '__main__':
    main()
I need the link to each festival's page.
Here is an example of how you can print each festival's name and URL:
Prints:
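As a minimal sketch of the link-collecting step, assuming you already have the rendered HTML as a string: the code below keeps every `<a>` tag whose `href` contains `/festivals/` and prints its text and absolute URL. The `/festivals/` marker, the helper name `extract_festival_links`, and the sample markup are illustrative assumptions, not taken from the real page, so inspect the page in your browser and adjust the marker. It uses only the standard library so it runs without extra packages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FestivalLinkParser(HTMLParser):
    """Collect (name, url) pairs from <a> tags whose href contains a marker."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker
        self.links = []      # collected (name, absolute_url) pairs
        self._href = None    # href of the matching <a> currently being read
        self._text = []      # text fragments found inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.marker in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            name = " ".join("".join(self._text).split())
            if name:
                self.links.append((name, urljoin(self.base_url, self._href)))
            self._href = None


def extract_festival_links(html, base_url="https://www.skiddle.com",
                           marker="/festivals/"):
    """Return a list of (name, url) tuples; `marker` is an assumed URL pattern."""
    parser = FestivalLinkParser(base_url, marker)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample markup for demonstration, NOT copied from the real page.
sample_html = """
<ul>
  <li><a href="/festivals/example-fest/">Example  Fest</a></li>
  <li><a href="/news/some-article/">Some article</a></li>
  <li><a href="https://www.skiddle.com/festivals/another-fest/"><span>Another</span> Fest</a></li>
</ul>
"""

for name, url in extract_festival_links(sample_html):
    print(name, url)
```

If the links only appear after JavaScript runs, the string passed in would need to come from a browser, for instance `extract_festival_links(driver.page_source)` with the driver from the commented-out block in the question, rather than from `response.text`.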