How to add numbers that have the same units in Python


I have a pandas DataFrame in which one column holds values in kilotons, abbreviated 'kt'. When I group by the Country and Year columns and apply the sum aggregation to the Value column, the values are not actually summed.

The dataset: [image not available]

After performing the above action, I get the following: [image of the result after groupby and aggregation, not available]

However, the expected output should be: [image not available]

Also the 'Value' column is of type object.

Any help will be useful.

Answer by user19077881:

If the values mix numbers and letters, they are strings, which pandas stores with dtype object, and summing strings concatenates them instead of adding them. You need to split off the numeric part, convert it to an integer, put it into a new column, and then use groupby with sum (or another aggregation) on that column. For example:

import pandas as pd

df = pd.DataFrame({'Country': ['Algeria', 'Algeria','Algeria','Angola', 'Angola'],
                   'Item': ['Wheat and products', 'Wheat and products','Wheat and products','Wheat and products','Wheat and products'],
                   'Year': [2004, 2004,2005,2004,2004],
                   'Value':['2731 kt', '2415 kt','2688 kt','2000 kt','1111 kt']
                   })

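# Pull the digits out of each string (e.g. '2731 kt' -> 2731) into a numeric column
df['ValNum'] = df['Value'].str.extract(r"(\d+)", expand=False).astype('int')

# Sum the numeric column per country and year
df2 = df.groupby(['Country', 'Year'])['ValNum'].sum()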

print(df2)

gives:

Country  Year
Algeria  2004    5146
         2005    2688
Angola   2004    3111
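
The pattern above keeps only whole digits and ignores the unit. If the real data might contain decimals or more than one unit, a slightly more defensive variant could look like the sketch below. The decimal value '2415.5 kt', the regular expression, and the names parts, units, totals and result are illustrative assumptions, not part of the original data or answer.

```python
import pandas as pd

# Illustrative data; the decimal value is an assumption added for the example
df = pd.DataFrame({
    'Country': ['Algeria', 'Algeria', 'Algeria', 'Angola', 'Angola'],
    'Year': [2004, 2004, 2005, 2004, 2004],
    'Value': ['2731 kt', '2415.5 kt', '2688 kt', '2000 kt', '1111 kt'],
})

# Capture the number (optionally with decimals) and the unit in separate columns
pattern = r'^\s*(?P<ValNum>\d+(?:\.\d+)?)\s*(?P<Unit>[A-Za-z]+)\s*$'
parts = df['Value'].str.extract(pattern)

# Stop early if any row failed to parse or if the units are mixed,
# because adding values with different units would be meaningless
if parts.isna().any().any():
    raise ValueError('Some values could not be parsed as "<number> <unit>"')
units = parts['Unit'].unique()
if len(units) != 1:
    raise ValueError(f'Expected a single unit, found: {list(units)}')

# Convert to float so decimals survive, then aggregate
df['ValNum'] = parts['ValNum'].astype(float)
totals = df.groupby(['Country', 'Year'])['ValNum'].sum()

# Keep the unit next to the numeric totals instead of gluing it back into a string
result = totals.reset_index()
result['Unit'] = units[0]
print(result)
```

Keeping the unit in its own column leaves ValNum numeric, so further arithmetic or sorting still works on the totals.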