How can I extract more than ten results from Google's custom search API?


I've managed to wrangle this code together to turn a JSON file returned from a Google custom search engine into a .csv file. It works. The issue is that Google only returns ten results per page.

import csv
import json
import requests

url = 'https://www.googleapis.com/customsearch/v1?key=APIKEY&cx=SEARCHENGINEID&q=QUERY&sort=date:r:20231001:20231102'

response = requests.get(url)
response_dict = response.json()

with open("Testing.csv", "w", newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Site', 'Title', 'URL', 'Author', 'Date'])

    out_rows = 0

    for item in response_dict.get('items', []):
        title = item.get('title', '')
        link = item.get('link', '')  # don't reuse `url`, which holds the request URL

        # All three metadata fields come from the first metatags entry
        metatags = item.get('pagemap', {}).get('metatags', [{}])[0]
        author = metatags.get('author', '')
        date = metatags.get('article:published_time', '')
        site = metatags.get('og:site_name', '')

        writer.writerow([site, title, link, author, date])
        out_rows += 1

print(f"OK! Published file with {out_rows} articles.")

with open("Testing.json", "w") as json_file:
    json.dump(response_dict, json_file, indent=2)

You can get to the next page of results by adding &start=11, &start=21, etc. to the API request URL. What I would like to do, but don't know how to structure, is have another loop do that for me: if out_rows == 10, request the next page and append those results to the same sheet, and keep going until a page comes back with fewer than 10 results. What would be the easiest way to do this?
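One way to structure that loop, as a sketch: separate "fetch one page" from "keep paging until a short page". The function names (`fetch_page`, `all_items`) and the `params` values below are my own placeholders, not part of the API; the real `key`, `cx`, and `q` values go where APIKEY, SEARCHENGINEID, and QUERY are. Note the Custom Search JSON API won't return more than 100 results total, so the loop should also stop there.

```python
import requests

BASE_URL = "https://www.googleapis.com/customsearch/v1"

def fetch_page(params, start):
    """Request one page of results; returns the list of items (possibly empty)."""
    response = requests.get(BASE_URL, params={**params, "start": start})
    response.raise_for_status()
    return response.json().get("items", [])

def all_items(fetch, page_size=10, max_results=100):
    """Call fetch(start) with start = 1, 11, 21, ... and collect the items,
    stopping when a page returns fewer than page_size results (or the
    API's 100-result ceiling is reached)."""
    items = []
    start = 1
    while start <= max_results:
        page = fetch(start)
        items.extend(page)
        if len(page) < page_size:
            break
        start += page_size
    return items

if __name__ == "__main__":
    params = {
        "key": "APIKEY",
        "cx": "SEARCHENGINEID",
        "q": "QUERY",
        "sort": "date:r:20231001:20231102",
    }
    items = all_items(lambda start: fetch_page(params, start))
    # `items` is one flat list, so the existing csv-writing loop can
    # iterate over it unchanged and produce a single sheet.
```

Passing the fetch step in as a callable keeps the paging logic testable without hitting the API.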
