Column with ranking based on other columns but like tie breaker

Question

Column with ranking based on other columns but like tie breaker

62 Views Asked by Michał Surdej At 05 January 2024 at 11:37

I've got a dataframe:

df = pd.DataFrame({
    "DScore": [2, 2, 3, 4, 5],
    "EScore": [6, 7, 9, 9, 10],
    "Total Score": [17, 15, 15, 23, 25]
})

I want to write the code that will create a ranking column containing the classification of rows in the table based on the 'Total Score' column. If these values are equal - you should pay attention to the values of EScore points, if they are equal, then it will be based on the values from the DScore column, and if these are also equal - we will assign them the same value. Expected result:

df = pd.DataFrame({
    "DScore": [2, 2, 4, 4, 5],
    "EScore": [6, 7, 9, 9, 10],
    "Total Score": [17, 15, 23, 23, 25],
    "Rank": [3,4,2,2,1]
})

Original Q&A

There are 3 best solutions below

mozway On 05 January 2024 at 12:35

Define the order of columns, then sort_values and groupby.ngroup:

# desired order of tie-breakers
cols = ['Total Score', 'EScore', 'DScore']

df['Rank'] = (df.sort_values(by=cols, ascending=False)
                .groupby(cols, sort=False).ngroup().add(1)
             )

Output:

   DScore  EScore  Total Score  Rank
0       2       6           17     3
1       2       7           15     4
2       4       9           23     2
3       4       9           23     2
4       5      10           25     1

Laurent B. On 05 January 2024 at 17:28

You could also use the pandas rank function

df['Rank']= df['Total Score'].rank(ascending=False).astype(int)

   DScore  EScore  Total Score  Rank
0       2       6           17     4
1       2       7           15     5
2       4       9           23     2
3       4       9           23     2
4       5      10           25     1

Note you can modify the ranking method (5 values) :

method{‘average’, ‘min’, ‘max’, ‘first’, ‘dense’}, default ‘average’

And with parameter set to dense it "compresses the counting" :

df['Rank']= df['Total Score'].rank(ascending=False, method='dense').astype(int)

   DScore  EScore  Total Score  Rank
0       2       6           17     3
1       2       7           15     4
2       4       9           23     2
3       4       9           23     2
4       5      10           25     1

**Panda Kim** · Accepted Answer · 2024-01-05T11:44:24.547000

For example, you can multiply the EScore by 0.01 and the DScore by 0.001 to weight them more lightly. Then, you can add these values to Total Score and calculate the rank.

rank with method=dense & ascending=False

df['Rank'] = df['Total Score'].add(df['EScore'].mul(0.01)).add(df['DScore'].mul(0.0001)).rank(ascending=False, method='dense').astype('int')

df

   DScore  EScore  Total Score  Rank
0       2       6           17     3
1       2       7           15     4
2       4       9           23     2
3       4       9           23     2
4       5      10           25     1

The example you provided is sufficient with 0.01 and 0.0001, but these numbers should be adjusted to fit the dataset.

Column with ranking based on other columns but like tie breaker

There are 3 best solutions below

Related Questions in PYTHON

Related Questions in PANDAS

Related Questions in DATAFRAME

Related Questions in RANK

Trending Questions

Popular # Hahtags

Popular Questions