I'm following a YouTube tutorial on scraping a video's view count and upload date: https://www.youtube.com/watch?v=Cc3mMH8XWC4
While following an earlier tutorial, I built a dataframe of over 1000 videos with the columns views, clean_views, video_url, video_age and title.
The video I'm trying to parse is https://www.youtube.com/watch?v=2s6Oxh3JkG0. If I can successfully get its views and video_date, I plan to loop through my whole dataframe and update the views and video_date for every video.
YouTube has since changed how the view count is rendered, which makes it harder to parse than in the tutorial I'm watching.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup  # needed for the parsing step further down
- Selenium to scrape the pages.
- ChromeService and ChromeDriverManager to set up the webdriver.
- Keys so the script can press the "End" key and scroll all the way down the page to load more videos.
- By so I don't get a "name 'By' is not defined" error.
driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()))
for url in youtube_df['video_url']:
    driver.get(url)
    break
This tells the driver to go through every URL in my dataframe. For now it only loads the first one (https://www.youtube.com/watch?v=2s6Oxh3JkG0), because of the break.
html = driver.page_source
This saves the page source the driver currently holds into a variable called html.
soup = BeautifulSoup(html,'html.parser')
This uses Beautiful Soup to parse the html variable.
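A minimal sketch of another place to look in the same page source, offered as an assumption rather than a confirmed fix: some pages expose machine-readable values in `<meta itemprop="..." content="...">` tags. Whether the YouTube watch page currently carries a view count or publish date this way, and under which itemprop names, is not confirmed here, so check the page source first. The names `interactionCount` and `datePublished` and the sample markup below are hypothetical stand-ins. The sketch uses only the standard library so it runs without a browser.

```python
from html.parser import HTMLParser


class MetaItempropParser(HTMLParser):
    """Collect <meta itemprop="..." content="..."> pairs from an HTML string."""

    def __init__(self):
        super().__init__()
        self.items = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("itemprop")
        content = attr_map.get("content")
        if name and content is not None:
            # Keep the first value seen for each itemprop name.
            self.items.setdefault(name, content)


def extract_itemprops(html):
    """Return a dict mapping itemprop names to their content values."""
    parser = MetaItempropParser()
    parser.feed(html)
    return parser.items


# Hypothetical sample markup standing in for driver.page_source.
sample_html = (
    '<html><head>'
    '<meta itemprop="interactionCount" content="1234567">'
    '<meta itemprop="datePublished" content="2021-03-15">'
    '</head><body></body></html>'
)
props = extract_itemprops(sample_html)
views = int(props["interactionCount"])
date = props["datePublished"]
```

With a real page you would pass the html variable to extract_itemprops and print the returned dict to see which names, if any, are present. The same lookup can be done on the soup object with soup.find('meta', {'itemprop': ...}).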
soup.find_all('div',{'id':'view-count','class':'style-scope ytd-watch-info-text'})
This is where I thought the view count was. The aria-label attribute showed me the views, but I think there is markup in there that sets aria-hidden on it.
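If the label text can be read, it still has to be turned into a number before it can go back into the dataframe. A minimal sketch of that step, assuming the text looks like "1,234,567 views" with comma thousands separators. The function name parse_view_count is hypothetical, and abbreviated counts such as "1.2M views" are not handled.

```python
import re


def parse_view_count(label):
    """Return the first integer found in text like '1,234,567 views', or None."""
    match = re.search(r"\d[\d,]*", label)
    if match is None:
        return None
    # Drop the thousands separators before converting.
    return int(match.group().replace(",", ""))


count = parse_view_count("1,234,567 views")
```

Returning None instead of raising makes it easier to loop over many videos and skip the ones where no count was found.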
What can I do to get the video's view count and its exact upload date?
I tried looking it up on YouTube, but all the tutorials are too old.
The problem you're encountering suggests that the soup.find_all('div',{'id':'view-count','class':'style-scope ytd-watch-info-text'}) call is returning an empty result, meaning it isn't finding any elements that match the given criteria. This is likely because the class name 'style-scope ytd-watch-info-text' isn't unique, so the data you need isn't inside your selection.
You don't need a for loop to select the correct HTML element. You can get the desired data by using the element selection from my code, which is working fine.
Script:
Output: