Making an apply function faster in Python

331 Views Asked by Benn At 05 June 2020 at 15:50

I am running the following code on about 6 million rows. It is so slow that it never finishes.

df['City'] = df['POSTAL_CODE'].apply(lambda x: nomi.query_postal_code(x).county_name)

It assigns a corresponding city to each postal code. When I run it on a slice of the dataset (e.g., 1000 rows) it works well, but running it on the whole dataset never produces any output.

Can anyone modify the code to make it faster?

Thank you!


There is 1 best solution below

DejaVuSansMono On 05 June 2020 at 16:03

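!pip3 install multiprocess

import numpy as np
import pandas as pd
from multiprocess import Pool  # dill-based fork of multiprocessing, so lambdas can be sent to workers

def parallelize_dataframe(data, func, n_cores=4):
    # Split the data into n_cores chunks and process each chunk in a separate worker.
    data_split = np.array_split(data, n_cores)
    pool = Pool(n_cores)
    # Each worker returns a Series that keeps its original index, so concat restores the row order.
    data = pd.concat(pool.map(func, data_split))
    pool.close()
    pool.join()
    return data


# func receives a whole chunk (a Series), so the per-code lookup is applied inside the chunk.
df['City'] = parallelize_dataframe(
    df['POSTAL_CODE'],
    lambda chunk: chunk.apply(lambda x: nomi.query_postal_code(x).county_name),
    4,
)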
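
If the column contains many repeated postal codes, which seems likely with about 6 million rows, another option is to look up each distinct code only once and then broadcast the results with Series.map. The following is a minimal sketch, assuming the lookup always returns the same result for the same code. map_unique is a hypothetical helper name, and fake_lookup with its data is only a stand-in for nomi.query_postal_code(x).county_name so that the example runs on its own:

```python
import pandas as pd


def map_unique(series, lookup):
    """Call lookup once per distinct non-null value, then broadcast the results."""
    distinct = series.dropna().unique()
    mapping = {value: lookup(value) for value in distinct}
    # Series.map leaves values that are missing from the mapping (such as NaN) as NaN.
    return series.map(mapping)


# Stand-in for the real lookup (hypothetical data, for illustration only).
calls = []


def fake_lookup(code):
    calls.append(code)  # record each call so the number of lookups can be checked
    return {"10001": "New York", "94103": "San Francisco"}.get(code)


codes = pd.Series(["10001", "94103", "10001", None, "94103"])
cities = map_unique(codes, fake_lookup)
```

With the real lookup this would be called as df['City'] = map_unique(df['POSTAL_CODE'], lambda x: nomi.query_postal_code(x).county_name). How much it helps depends on how many distinct codes the column holds; it can also be combined with the parallel version above by splitting the distinct codes instead of the full column.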
