ROC Curve Plot using R (Error code: Predictor must be numeric or ordered)


I am trying to make a ROC curve using pROC with the 2 columns below (the list goes on for more than 300 entries):

Actual_Findings_% Predicted_Finding_Prob
0.23 0.6
0.48 0.3
0.26 0.62
0.23 0.6
0.48 0.3
0.47 0.3
0.23 0.6
0.6868 0.25
0.77 0.15
0.31 0.55

The code I tried to use is:

roccurve<- plot(roc(response = data$Actual_Findings_% <0.4, predictor = data$Predicted_Finding_Prob >0.5),
                legacy.axes = TRUE, print.auc=TRUE, main = "ROC Curve", col = colors)

The thresholds for a positive finding are Actual_Findings_% < 0.4 AND Predicted_Finding_Prob > 0.5 (i.e., to be a TRUE POSITIVE, Actual_Findings_% would be LESS than 0.4 AND Predicted_Finding_Prob would be GREATER than 0.5).

but when I try to plot this roc curve, I get the error:

"Setting levels: control = FALSE, case = TRUE
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'plot': Predictor must be numeric or ordered."

Any help would be much appreciated!

Sirius On BEST ANSWER

This should work:


data <- read.table( text=
"Actual_Findings_%  Predicted_Finding_Prob
0.23    0.6
0.48    0.3
0.26    0.62
0.23    0.6
0.48    0.3
0.47    0.3
0.23    0.6
0.6868  0.25
0.77    0.15
0.31    0.55
", header=TRUE, check.names=FALSE )

library(pROC)

roccurve <- plot(
    roc(
        response = data$"Actual_Findings_%" <0.4,
        predictor = data$"Predicted_Finding_Prob"
    ),
    legacy.axes = TRUE, print.auc=TRUE, main = "ROC Curve"
)

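Importantly, the ROC curve is there to show you what happens when you vary your classification threshold. So the one thing you did wrong was to enforce a threshold yourself, by setting predictions > 0.5. That comparison turns the predictor into a logical TRUE/FALSE vector, which is neither numeric nor ordered, hence the error. Pass the raw probabilities instead and let roc() sweep over the thresholds.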

With this sample data the predictor separates the two classes perfectly (AUC = 1), which is nice I guess. (Though bad for educational purposes.)
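If you still want to know how the classifier does at your 0.5 cutoff, you can ask the fitted ROC object for that one point rather than thresholding the predictor beforehand. The following is only a sketch on the ten sample rows: the names thresholded, roc_obj and at_half are made up for illustration, and direction = "<" assumes that a higher predicted probability means a positive finding.

```r
library(pROC)

# Same ten sample rows as above; check.names = FALSE keeps the "%" in the name.
data <- read.table(text =
"Actual_Findings_%  Predicted_Finding_Prob
0.23    0.6
0.48    0.3
0.26    0.62
0.23    0.6
0.48    0.3
0.47    0.3
0.23    0.6
0.6868  0.25
0.77    0.15
0.31    0.55
", header = TRUE, check.names = FALSE)

# Why the original call failed: a comparison yields a logical vector,
# and roc() only accepts a numeric or ordered predictor.
thresholded <- data$Predicted_Finding_Prob > 0.5
class(thresholded)  # "logical"

# Build the ROC from the raw probabilities.
# direction = "<" is an assumption: controls score lower than cases.
roc_obj <- roc(
    response  = data[["Actual_Findings_%"]] < 0.4,
    predictor = data$Predicted_Finding_Prob,
    direction = "<"
)

# Look up sensitivity and specificity at the 0.5 cutoff after the fact.
at_half <- coords(
    roc_obj, x = 0.5, input = "threshold",
    ret = c("threshold", "sensitivity", "specificity")
)
print(at_half)
```

This way the 0.5 cutoff is just one point read off the curve, and you can compare it with other cutoffs without rebuilding anything.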