Having trouble downloading content with requests

I'm trying to use the Python requests module to download an mp4 video off a website behind a paywall. Here's my code:

import requests

link_href = "..."  # Found the mp4 link on the website
with open('D:/filename','wb') as f:
    response = requests.get(link_href)
    f.write(response.content)

I looked at response.content and it's the HTML for the site's login page. How do I get the mp4?

There is 1 best solution below

Adri Sir On

It looks like you're trying to download an mp4 video from a website that requires you to log in first. You get the login page's HTML back because your request carries no credentials or session cookies, so the site serves its login page instead of the video. To get past the paywall, you need to authenticate first, typically by sending a POST request with your credentials to the site's login endpoint, and then request the video with the same session so the login cookies are sent along.
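
A quick way to confirm that this is what's happening is to check whether the response is HTML rather than video data. Here's a minimal sketch; the helper name `looks_like_html` is my own (it is not part of requests), and the byte sniffing is only a rough heuristic:

```python
def looks_like_html(content_type: str, body_start: bytes) -> bool:
    """Return True if a response appears to be an HTML page rather than a media file.

    `content_type` is the value of the Content-Type header (may be empty),
    `body_start` is the first few bytes of the response body.
    """
    if "text/html" in content_type.lower():
        return True
    # Fall back to sniffing the first bytes when the header is missing or generic.
    head = body_start.lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

With your original code you could call it as `looks_like_html(response.headers.get("Content-Type", ""), response.content[:64])`. If it returns True, you got a web page (most likely the login page) instead of the mp4.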

Below is a general outline of how to handle this with the requests module (BeautifulSoup is used to parse the login form):

import requests
from bs4 import BeautifulSoup

# Replace these placeholders with your actual login information
login_url = "https://www.example.com/login"
username = "your_username"
password = "your_password"
mp4_link = "https://www.example.com/video.mp4"  # Found the mp4 link on the website

# Keep every request inside the session block so the login cookies are reused
with requests.Session() as session:
    # Step 1: Visit the login page
    login_page_response = session.get(login_url)

    # Step 2: Parse the login page to get a CSRF token or other required information (if necessary)
    soup = BeautifulSoup(login_page_response.content, "html.parser")
    csrf_input = soup.find("input", {"name": "_csrf"})  # Example field name; check the site's form
    csrf_token = csrf_input["value"] if csrf_input else None

    # Step 3: Prepare the login data
    login_data = {
        "username": username,
        "password": password,
    }
    if csrf_token:
        login_data["_csrf"] = csrf_token

    # Step 4: Send a POST request to the login endpoint with the login data
    login_response = session.post(login_url, data=login_data)

    # Step 5: Verify that the login succeeded by checking the response
    # (e.g., by checking that the redirected URL is the expected page after login)

    # Step 6: Download the mp4 file
    if login_response.status_code == 200:
        video_response = session.get(mp4_link)
        if video_response.status_code == 200:
            with open('D:/filename.mp4', 'wb') as f:
                f.write(video_response.content)
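
If the video is large, you may prefer to stream it to disk in chunks rather than hold the whole file in memory via `.content`. Here's a rough sketch of that idea. `download_video` is a hypothetical helper name of mine, and it assumes `session` is a `requests.Session` that has already logged in; the `text/html` check is just a heuristic for spotting a login page that came back in place of the video:

```python
def download_video(session, url, dest_path, chunk_size=1024 * 1024):
    """Stream `url` to `dest_path` using an already logged-in session.

    `session` is expected to be a requests.Session (or any object with the
    same `get(url, stream=True)` interface). Returns the number of bytes written.
    """
    # stream=True defers downloading the body until iter_content() is consumed
    with session.get(url, stream=True) as response:
        response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        content_type = response.headers.get("Content-Type", "")
        if "text/html" in content_type.lower():
            # An HTML body here usually means the login did not stick.
            raise RuntimeError(f"Expected a video but got {content_type!r}")
        written = 0
        with open(dest_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written
```

You would call it after logging in, while the session is still open, e.g. `download_video(session, mp4_link, 'D:/filename.mp4')`.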

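This should give you an idea of how to fix your code. The login URL, form field names, and CSRF field will differ for your site, so adjust them to match its login form. If you run into any errors, reply and let me know.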