How do I remap a Python Pandas Series of ints to 0,1,2,3,4 based on their order from smallest to largest?


I would like to remap a Python Pandas Series of ints to 0,1,2,3,4,... based on their order from smallest to largest. Equal ints should be mapped to the same int.

For example, if I have a pandas Series [1, 1, 4, 4, 7, 12, 18, 18], I would like it mapped to [0, 0, 1, 1, 2, 3, 4, 4]. Basically it's like squishing the ints so that they're next to each other.

I've tried converting to a standard list and using a naive implementation, but I'm wondering if there's a more idiomatic way to do it.

There are 2 best solutions below

Andrej Kesely (best answer)

You can use pd.Categorical:

import pandas as pd

s = pd.Series([1, 1, 4, 4, 7, 12, 18, 18])

print(pd.Categorical(s).codes)

Prints:

[0 0 1 1 2 3 4 4]
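
If the result is needed as a Series aligned with the original index rather than a bare array, a minimal sketch (assuming the values are sortable, so the categories come out in ascending order) is to go through the category dtype. The sample data and the letter index below are made up for illustration:

```python
import pandas as pd

# Hypothetical unsorted data with a non-default index
s = pd.Series([18, 1, 7, 4], index=list("abcd"))

# Categories are the sorted unique values [1, 4, 7, 18];
# .cat.codes gives each value's position in that list and keeps the index
codes = s.astype("category").cat.codes

print(codes.tolist())  # [3, 0, 2, 1]
```

The codes come back with a small integer dtype (int8 here), so add .astype(int) if a regular integer Series is required.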
Corralien

Use rank with method='dense':

>>> sr = pd.Series([1, 1, 4, 4, 7, 12, 18, 18])
>>> sr.rank(method='dense').sub(1).astype(int)
0    0
1    0
2    1
3    1
4    2
5    3
6    4
7    4
dtype: int64

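Or pd.factorize, passing sort=True so that the codes follow value order rather than order of first appearance:

>>> pd.factorize(sr, sort=True)[0]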
array([0, 0, 1, 1, 2, 3, 4, 4])
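
One caveat with pd.factorize: by default it numbers values in order of first appearance, which only matches smallest-to-largest order when the input happens to be sorted already, as in the question's example. The sketch below uses a small unsorted Series (made up for illustration) to show the difference the sort=True parameter makes:

```python
import pandas as pd

# Hypothetical unsorted input
sr = pd.Series([7, 1, 4, 1])

# Default: codes follow order of first appearance (7 -> 0, 1 -> 1, 4 -> 2)
by_appearance = pd.factorize(sr)[0]

# sort=True: codes follow sorted value order (1 -> 0, 4 -> 1, 7 -> 2)
by_value = pd.factorize(sr, sort=True)[0]

print(by_appearance.tolist())  # [0, 1, 2, 1]
print(by_value.tolist())       # [2, 0, 1, 0]
```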

Or np.unique:

>>> np.unique(sr, return_inverse=True)[1]
array([0, 0, 1, 1, 2, 3, 4, 4])

There are many ways to do this with pandas/NumPy, including the pd.Categorical approach posted by @AndrejKesely.
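
As a quick sanity check, the sketch below runs the four approaches on an unsorted Series (sample data made up for illustration) and confirms they agree. It assumes there are no missing values; pd.factorize is called with sort=True so that its codes follow value order.

```python
import numpy as np
import pandas as pd

# Hypothetical unsorted input with repeated values
sr = pd.Series([18, 1, 7, 4, 1, 12, 4, 18])

results = {
    "categorical": pd.Categorical(sr).codes.tolist(),
    "rank": sr.rank(method="dense").sub(1).astype(int).tolist(),
    "factorize": pd.factorize(sr, sort=True)[0].tolist(),
    "unique": np.unique(sr.to_numpy(), return_inverse=True)[1].tolist(),
}

for name, codes in results.items():
    print(name, codes)  # each prints [4, 0, 2, 1, 0, 3, 1, 4]
```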