ROC for multiclass classification using sklearn

I am trying to generate a ROC curve for data that is highly imbalanced and multiclass (I know this is not ideal, but it was requested by a reviewer of the paper). scikit-learn has an example for this here: https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html

The specific code I am using is this:

import matplotlib.pyplot as plt
from sklearn.metrics import RocCurveDisplay

# y_onehot_test and y_score are defined earlier, as in the linked example
RocCurveDisplay.from_predictions(
    y_onehot_test.ravel(),
    y_score.ravel(),
    name="micro-average OvR",
    color="darkorange",
    plot_chance_level=True,
)
plt.axis("square")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Micro-averaged One-vs-Rest\nReceiver Operating Characteristic")
plt.legend()
plt.show()

I am confused about the averaging. The title says the curve is "micro-averaged OvR", but where do I actually pass this information to the function?

y_onehot_test looks like this: 1 1 1 0 0 ...

and y_score looks like this: 0.783307 0.832748 0.619186 0.645178 0.654100 ...

Thanks for any insights and explanations :)

There are 1 best solutions below

orly064 On

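If anyone has this same question in the future: the answer lies in understanding micro-averaging better. Micro-averaging gives each sample equal weight, so no class information needs to be passed to the function. In the code from the question, the averaging is done by the .ravel() calls: flattening the one-hot labels and the scores pools every (sample, class) pair into a single binary problem, and the name argument is only the legend label. If you do want to weight classes by their size, weighted averaging is needed instead.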
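To make this concrete, here is a minimal sketch on synthetic data. The dataset, the LogisticRegression model, and the variable names other than y_onehot_test and y_score are assumptions chosen for illustration, not the setup from the question. It checks that the AUC of the raveled curve matches what roc_auc_score reports with average="micro", and computes the weighted variant for comparison.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

# Synthetic, imbalanced 3-class data (illustrative assumption only)
X, y = make_classification(
    n_samples=600,
    n_features=10,
    n_informative=5,
    n_classes=3,
    weights=[0.7, 0.2, 0.1],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_score = clf.predict_proba(X_test)  # shape (n_samples, n_classes)

# One-hot encode the test labels, shape (n_samples, n_classes)
y_onehot_test = LabelBinarizer().fit(y_train).transform(y_test)

# Micro-average "by hand": flatten both matrices so that every
# (sample, class) pair becomes one binary decision.
fpr, tpr, _ = roc_curve(y_onehot_test.ravel(), y_score.ravel())
micro_auc_manual = auc(fpr, tpr)

# The same quantity requested explicitly from scikit-learn
micro_auc_sklearn = roc_auc_score(y_onehot_test, y_score, average="micro")

# Weighted average: per-class OvR AUCs weighted by class support
weighted_auc = roc_auc_score(y_onehot_test, y_score, average="weighted")

print(f"micro (ravel):   {micro_auc_manual:.4f}")
print(f"micro (sklearn): {micro_auc_sklearn:.4f}")
print(f"weighted:        {weighted_auc:.4f}")
```

The two micro values should agree, which shows that the flattening alone carries the averaging. The weighted value generally differs, because it averages one AUC per class instead of pooling all decisions.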