I am running a grid search with leave-one-out cross-validation for a random forest model. I used the F1 score to select the best estimator and score. From here, how can I get the precision and recall scores so that I can plot the precision-recall curve? X is the sample dataset and y is the target.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import LeaveOneOut

RF = RandomForestClassifier()
param_grid = {
    'n_estimators': [10, 20, 30, 50],
    'criterion': ['gini', 'entropy'],
    'max_depth': [10, 20, 30, None]}
# Select the best parameter combination by F1 under leave-one-out CV
grid_search = GridSearchCV(RF,
                           param_grid=param_grid,
                           cv=LeaveOneOut(),
                           scoring='f1')
grid_search.fit(X, y)
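You can collect the predictions from your model in an array and use them to calculate the data for the precision-recall curve (or any other performance metric you need). For the curve, the predictions should be class probabilities or scores rather than hard labels, since the curve is traced by varying the decision threshold.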
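One way to collect such predictions under leave-one-out is sketched below. It is a minimal sketch, not necessarily the only approach: the synthetic data from `make_classification` stands in for your `X` and `y`, and `best_rf` with its hyperparameters is a placeholder for `grid_search.best_estimator_`. It assumes a binary target whose positive class is the second column of `predict_proba`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Synthetic stand-in for the X and y from the question (an assumption).
X, y = make_classification(n_samples=60, n_features=8, random_state=0)

# Placeholder for grid_search.best_estimator_; these hyperparameters are illustrative.
best_rf = RandomForestClassifier(n_estimators=20, random_state=0)

# Each sample is scored by a model that was fitted on all the other samples.
proba = cross_val_predict(best_rf, X, y, cv=LeaveOneOut(), method="predict_proba")
scores = proba[:, 1]  # predicted probability of the positive class

# Precision and recall at every threshold implied by the collected scores.
precision, recall, thresholds = precision_recall_curve(y, scores)
```

Plotting `recall` against `precision`, for instance with matplotlib's `plt.plot(recall, precision)`, then gives the curve. Pooling the out-of-fold scores first matters with leave-one-out, because a single held-out sample cannot yield a meaningful precision or recall by itself.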
It is highly recommended that you split your dataset, train the model on the majority of it, and hold out the rest purely for testing. That way you check how well the model generalizes to data it has never seen.
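A held-out split along those lines might look like the following sketch. The synthetic data, the 25% test size, and the fixed hyperparameters are all assumptions for illustration; in your case you would run the grid search on the training part only and use its best estimator as `model`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the X and y from the question (an assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Keep 25% of the samples aside; stratify to preserve the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative model; a tuned estimator would be fitted on X_train in the same way.
model = RandomForestClassifier(n_estimators=20, random_state=0)
model.fit(X_train, y_train)

# Score only the unseen test samples and build the curve from them.
test_scores = model.predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, test_scores)
```

Because the test samples play no part in fitting or in parameter selection, the resulting curve is not biased by the tuning step.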