When doing division operations in pandas, I always get NaN results. How can I solve the problem?

96 Views Asked by Paul At 06 August 2023 at 17:10

I want to create a column 'ratio' that holds each value of the column 'amount' divided by the last value of 'amount'. The data type of the amount column is int64. After changing the data type to float, I still got the same NaN values.


There are 4 best solutions below

Vitalizzare On 06 August 2023 at 17:27 BEST ANSWER

When you do arithmetic on several DataFrames or Series, pandas aligns them on index and columns by default. tail(1) returns not a single value (scalar) but a Series holding the last value together with its original index label. When you divide the column by that Series, the two are aligned on index and then divided element by element. Since the tail contains only the last index label, every other row ends up with NaN as its divisor. That's why you got NaN everywhere except at the last position.

To avoid this behavior, pass the divisor either as a number or as a numpy.array. In this case, that can be either of:

dt['amount'] / dt['amount'].tail(1).values    # divide by a numpy.array
dt['amount'] / dt['amount'].iloc[-1]          # divide by a number
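The alignment described above can be seen in a small sketch. The sample values here are made up for illustration (the question did not include its data), and the variable names `last`, `aligned`, and `by_scalar` are hypothetical:

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question
dt = pd.DataFrame({'amount': [4, 5, 6, 7]})

# tail(1) keeps the original index label (3), so it is a one-row Series
last = dt['amount'].tail(1)

# What the divisor effectively looks like after index alignment:
# NaN for labels 0, 1, 2 and 7.0 for label 3
print(last.reindex(dt.index))

# Series / Series aligns on index, so only the last row has a divisor
aligned = dt['amount'] / last
print(aligned)

# Dividing by a scalar skips alignment and applies to every row
by_scalar = dt['amount'] / dt['amount'].iloc[-1]
print(by_scalar)
```

Only the row whose index label appears in both operands gets a real result, which matches the NaN pattern described in the question.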

RomanPerekhrest On 06 August 2023 at 17:16

Instead of tail, specify the location of the last value:

df['amount'] / df['amount'].iloc[-1]

user2314737 On 06 August 2023 at 17:27

You could use shift like this, which divides each value by the one in the previous row:

import pandas as pd

data = {'amount': range(4,8), 'user_input': ['a', 'b', 'c', 'd']}

dt = pd.DataFrame.from_dict(data)

dt
# Out: 
#    amount user_input
# 0       4          a
# 1       5          b
# 2       6          c
# 3       7          d

dt['ratio'] = dt['amount']/dt['amount'].shift(1)

dt
# Out: 
#    amount user_input     ratio
# 0       4          a       NaN
# 1       5          b  1.250000
# 2       6          c  1.200000
# 3       7          d  1.166667

Note that dividing a nonzero value by zero gives inf, and the first value in the 'ratio' column is NaN because the first row has no previous value to divide by.
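A minimal sketch of that note, using made-up values that include a zero. Replacing inf with NaN afterwards is only one possible way to handle it and is an assumption here, not part of the answer above:

```python
import numpy as np
import pandas as pd

# Hypothetical data containing a zero
s = pd.Series([2, 0, 3, 6], name='amount')

# Each value divided by the previous row's value
ratio = s / s.shift(1)
# row 0: no previous value -> NaN
# row 1: 0 / 2             -> 0.0
# row 2: 3 / 0             -> inf
# row 3: 6 / 3             -> 2.0
print(ratio)

# Optionally treat infinite ratios as missing values
cleaned = ratio.replace([np.inf, -np.inf], np.nan)
print(cleaned)
```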

ragas On 06 August 2023 at 17:42

A different take on the same approach:

import pandas as pd

data = {
    'col1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
    'col2': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
}

# Convert data into DataFrame
df = pd.DataFrame(data)
df = df.assign(new_col = df['col2']/df['col2'].values[-1])
print(df)
