Why does Pivot table return Int64 Type Error?

1.6k Views Asked by dataaaaaaaaa At 30 April 2022 at 11:34

I'm trying to pivot a dataframe but it keeps returning an Int64 Error. A similar question was not actually answered - What causes these Int64 columns to cause a TypeError?

Here's the type of my dataframe:

#   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   price         30159 non-null  Int64 
 1   type          30159 non-null  object
 2   size          30155 non-null  Int64 
 3   location      30159 non-null  object
 4   neighborhood  30159 non-null  object
dtypes: Int64(2), object(3)

the pivot table code:

pfraw = pd.pivot_table(pfraw, values = 'price', index = 'neighborhood', columns = 'type')

and the lat bit of the error message:

    273     dtype = np.dtype(dtype)
    275 if not isinstance(dtype, np.dtype):
    276     # enforce our signature annotation
--> 277     raise TypeError(dtype)  # pragma: no cover
    279 converted = maybe_downcast_numeric(result, dtype, do_round)
    280 if converted is not result:

TypeError: Int64

I dont understand why would it return an error with Int64.

Original Q&A

There are 1 best solutions below

Gonçalo Peres On 30 April 2022 at 12:26 BEST ANSWER

First of all, let's create a df similar to the one OP has

import pandas as pd

df = pd.DataFrame( {'price': [10, 12, 18, 10, 12], 'type': ['A', 'A', 'A', 'B', 'B'], 'size': [10, 12, 18, 10, 12], 'location': ['A', 'A', 'A', 'B', 'B'], 'neighborhood': ['A', 'A', 'A', 'B', 'B']})

If one prints the df one will see that this one has int64 and not Int64 (as opposed to OP's). Note: On my answer here one finds the difference between the two dtypes.

print(df.info(verbose=True))

[Out]:

 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   price         5 non-null      int64 
 1   type          5 non-null      object
 2   size          5 non-null      int64 
 3   location      5 non-null      object
 4   neighborhood  5 non-null      object

And, with an int64 one will be able to create the pivot table with index "neighborhood", columns "type", and values "price", with the following

df_pivot = df.pivot_table(index='neighborhood', columns='type', values='price')

This is the output

type                  A     B
neighborhood                 
A             13.333333   NaN
B                   NaN  11.0

However, with Int64 the Pivot Table can generate an error.

In order to handle that, one will need convert the type to int64

df[['price', 'size']] = df[['price', 'size']].astype('int64')

import numpy as np

df[['price', 'size']] = df[['price', 'size']].astype(np.int64)

Also, most likely, OP has missing values. The fastest way to handle that is to remove the rows with missing values. In order to find and remove the missing values, my answer here may be of help.

For the reference, this is a direct link to the module maybe_downcast_to_dtype that is raising the error that OP is having.

Why does Pivot table return Int64 Type Error?

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in PIVOT-TABLE

Related Questions in INT64

Trending Questions

Popular # Hahtags

Popular Questions