How to Improve Performance of API Calls in Pandas DataFrame Loop?


I have a DataFrame df_flights where each row holds the input data for one API call. The DataFrame is fairly large (it can have 1000 rows). For each row, I need to read the values from every column and use them to make an API call, and the result of each call becomes a new row in a second DataFrame.

Here's a simplified version of my current approach:

import pandas as pd

def clean_up(df, advertiser, product, target_group):
    # do something (cleaning steps omitted)
    return df

def get_new_df(index, row, target_group, token):
    # Read the inputs for the API call from the current row
    advertiser = row["advertiser"]
    product = row["product"]
    # etc. for the remaining columns

    # call_api is my own function; it returns a DataFrame
    df = call_api(target_group, advertiser, product)

    # I also need to clean the API data
    df = clean_up(df, advertiser, product, target_group)  # etc.

    return df

# Initialize an empty DataFrame
df_bereik = pd.DataFrame()

# Loop through each row of df_flights
for index, row in df_flights.iterrows():
    # Make an API call using values from the current row
    df = get_new_df(index, row, variable, token)

    # Append the result DataFrame to df_bereik
    df_bereik = pd.concat([df_bereik, df], ignore_index=True)

This is all very slow, which I think is due to iterrows, but I can't get anything to work that returns the same result and is faster. Is there another way?

Thanks!
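
One direction that might help, as a rough sketch rather than a definitive fix: with around 1000 rows, the time is likely spent waiting on the API and re-copying df_bereik on every pd.concat, not in iterrows itself. The sketch below collects the per-row results in a list, concatenates once at the end, and runs the calls in a thread pool. It assumes the API is I/O-bound and tolerates concurrent requests (check any rate limits). Here call_api is a hypothetical stand-in that just echoes its inputs, fetch_all and max_workers=8 are illustrative names and values, and "adults" is made-up sample data.

```python
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def call_api(target_group, advertiser, product):
    # Hypothetical stand-in for the real API call: returns a one-row DataFrame.
    return pd.DataFrame(
        [{"target_group": target_group, "advertiser": advertiser, "product": product}]
    )


def get_new_df(row, target_group):
    # `row` is a plain dict here, which is cheaper to build than a Series.
    return call_api(target_group, row["advertiser"], row["product"])


def fetch_all(df_flights, target_group, max_workers=8):
    # Convert the rows to dicts once instead of iterating with iterrows.
    rows = df_flights.to_dict("records")
    if not rows:
        return pd.DataFrame()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so row order is preserved.
        frames = list(pool.map(lambda r: get_new_df(r, target_group), rows))
    # A single concat avoids re-copying the growing result on every iteration.
    return pd.concat(frames, ignore_index=True)


# Made-up sample input standing in for df_flights.
df_flights = pd.DataFrame(
    {"advertiser": ["A", "B", "C"], "product": ["x", "y", "z"]}
)
df_bereik = fetch_all(df_flights, target_group="adults")
print(df_bereik)
```

If the API does not allow parallel requests, setting max_workers=1 keeps the single-concat benefit while making the calls one at a time.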
