Batching API Requests


I have a list of 1,000 airports that I send to an API to get flight data for each airport. The API cannot handle the entire list in one run, even when I delay the calls: with all 1,000 airports the returned data is malformed, but when I test with only 100 airports, all data comes back properly. I therefore want to split the airport list into batches of 100. My code below iterates over the list and sends the airports to the API one by one. I'm unsure where to place the batching code in my API call loop.

import time
from time import sleep

import requests
from tqdm import trange  # trange is assumed to come from tqdm

# Sample dataset for this post
airport = [['HLZN'], ['HLLQ'], ['HLLB'], ['HLGT'], ['HLMS'], ['HLLS'], ['HLTQ'], ['HLLT'], ['HLLM']]

payload = {'max_pages': 500, 'type': 'Airline'}
seconds = 1
count = 1

# Create an empty list to hold responses
json_responses = []

# Iterate through list (apiUrl and auth_header are defined elsewhere)
for airports in airport:
    response = requests.get(apiUrl + f"airports/{airports[0]}/flights", params=payload,
                            headers=auth_header)
    if response.status_code == 200:
        print(count, airports)
        count += 1
        # Short progress bar between calls
        for i in trange(100):
            time.sleep(0.01)
    json_responses.append(response.json())
    sleep(seconds)

Below is my attempt at adding batching inside the API call loop. I'm new to batching API calls, and to loops in general, so any help will be appreciated.

total_count = len(airport)

#Iterate through list
for airports in airport:
    response = requests.get(apiUrl + f"airports/{airports[0]}/flights",params=payload,
               headers=auth_header)
    chunks = (total_count - 1) // 100 + 1
    for i in range(chunks):
        batch = airport[i*100:(i+1)*100] #Tried batch code here
        if response.status_code == 200:
            print(count, airports)
            count +=1
            for i in trange(100):
                time.sleep(0.01)
        else:
            pass
        results = response.json()
        json_responses.append(response.json())
        sleep(seconds)

There is 1 best solution below

xprilion

I believe this is what you're trying to do: keep sending one request per airport, but pause after every 100 requests.

import requests
from time import sleep

# Sample dataset for this post
airports = [['HLZN'], ['HLLQ'], ['HLLB'], ['HLGT'], ['HLMS'], ['HLLS'], ['HLTQ'], ['HLLT'], ['HLLM']]

payload = {'max_pages': 500, 'type': 'Airline'}
seconds = 1

# Create an empty list to hold responses
json_responses = []

# Counter variable
counter = 0

# Chunk size
chunk_size = 100

# Iterate through list (apiUrl and auth_header are defined as in the question)
for airport in airports:
    # Use the loop variable: airport[0] is the code string, e.g. 'HLZN'
    response = requests.get(apiUrl + f"airports/{airport[0]}/flights", params=payload,
                            headers=auth_header)
    json_responses.append(response.json())

    # Increment counter and check if it is a multiple of the chunk size; if yes, sleep for a defined number of seconds
    counter += 1
    if counter % chunk_size == 0:
        sleep(seconds)
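
If you would rather have explicit batches than a counter, the same idea can be sketched by slicing the list first and pausing between slices. This is only a sketch under assumptions: `chunked`, `fetch_in_batches`, and `fetch_one` are hypothetical names that appear neither in the question nor in any library, and `fetch_one` stands in for whatever single-airport request you already make. The API is still called once per airport; the batching only controls where the pauses happen.

```python
import time
from typing import Any, Callable, Iterator, List, Sequence


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_in_batches(
    airports: Sequence[Sequence[str]],
    fetch_one: Callable[[str], Any],
    batch_size: int = 100,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Call `fetch_one` for every airport code, pausing between batches.

    `airports` uses the question's shape: a list of one-element lists.
    `sleep` is injectable so the pauses can be skipped or recorded in tests.
    """
    results: List[Any] = []
    batches = list(chunked(airports, batch_size))
    for index, batch in enumerate(batches):
        for airport in batch:
            results.append(fetch_one(airport[0]))
        # Pause after each batch except the last one
        if index < len(batches) - 1:
            sleep(pause_seconds)
    return results


# Hypothetical usage with the names from the question (not executed here):
#
# def fetch_one(code):
#     response = requests.get(apiUrl + f"airports/{code}/flights",
#                             params=payload, headers=auth_header)
#     return response.json()
#
# json_responses = fetch_in_batches(airport, fetch_one, batch_size=100)
```

Keeping the slicing in its own small function is what the attempt in the question was missing: there, the request was made before the batch was computed, so the `batch` slice was never actually used.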