I am doing hyperparameter tuning for an SVM model with GridSearchCV on the Heart Attack Analysis & Prediction dataset from Kaggle. Training runs without errors, but when I checked grid_search.cv_results_ I noticed that split3_test_score is NaN in every row.
I don't have any missing values in the dataset.
I have also checked the pipeline; it works fine on its own.
When I train the model with the same pipeline but without grid search, it gives results.
I have also tried without passing any parameters (model defaults only), and the grid search test scores still contain NaN, while the same model with the same parameters gives results without grid search.
The model trains and even reports a 72% F1 score through grid search, yet no errors are raised during training or testing.
This is the code I have written for the grid search with the pipeline:
# Preprocessing pipeline
preprocessing = ColumnTransformer([
    ("One Hot Encoding", OneHotEncoder(), df_cat.columns),                 # categorical features
    ("Scalers", StandardScaler(), df_num.drop("output", axis=1).columns)   # numeric features
])

# Create the model pipeline
SVC_model = Pipeline([("preprocessing", preprocessing), ("SVC", SVC())])
# Define the hyperparameters for SVC
svc_params = {
    'kernel': ["linear", "poly", "rbf", "sigmoid", "precomputed"],
    'C': [0.5, 0.7, 0.8, 0.9, 1, 2, 3, 10],
    'probability': [True, False],
    'degree': [3, 5, 7, 10],
    'gamma': ["scale", "auto"],
    'shrinking': [True, False],
    ## 'decision_function_shape': ["ovo", "ovr"],
    ## "ovo" is one-vs-one (used for multiclass), "ovr" is one-vs-rest; the default is "ovr"
}
param_grid = {'SVC__' + key: value for key, value in svc_params.items()}
grid_search = GridSearchCV(SVC_model, param_grid, cv=5)
grid_search.fit(X_train, y_train)
# View the cross-validation results of the grid search
grid_search.cv_results_
data2 = pd.DataFrame(grid_search.cv_results_)
This is the result I got for cv_results_; I am getting a NaN value in split3_test_score:
{'mean_fit_time': array([0.01162343]),
'std_fit_time': array([0.002411]),
'mean_score_time': array([0.00743232]),
'std_score_time': array([0.00106932]),
'params': [{}],
'split0_test_score': array([0.84]),
'split1_test_score': array([0.82352941]),
'split2_test_score': array([0.79069767]),
'split3_test_score': array([nan]),
'split4_test_score': array([0.86956522]),
'mean_test_score': array([nan]),
'std_test_score': array([nan]),
'rank_test_score': array([1], dtype=int32)}
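Since cv_results_ is just a plain dict of arrays, the affected candidates and splits can be isolated by loading it into a DataFrame and filtering on the per-split score columns. A sketch using a small hand-built cv_results_-style dict for illustration:

```python
import numpy as np
import pandas as pd

# A cv_results_-style dict (stand-in for grid_search.cv_results_).
cv_results = {
    "params": [{"C": 0.5}, {"C": 1.0}],
    "split0_test_score": np.array([0.84, 0.82]),
    "split1_test_score": np.array([0.82, np.nan]),
    "mean_test_score": np.array([0.83, np.nan]),
}

df = pd.DataFrame(cv_results)
# Keep only the rows where at least one split score is NaN.
split_cols = [c for c in df.columns if c.startswith("split")]
failed = df[df[split_cols].isna().any(axis=1)]
print(failed[["params"] + split_cols])
```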
But when I fit the model without grid search (same model, default parameters) with stratified cross-validation:
model = SVC_model.fit(X_train, y_train)
## To compute stratified k-fold cross-validation scores with the SVM
from sklearn.model_selection import StratifiedKFold, cross_val_score
# Create a StratifiedKFold object for cross-validation
stratified_kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
# 'n_splits' is the number of folds; adjust it as needed.
# Perform cross-validation and get the validation scores
scores = cross_val_score(model, X, y, cv=stratified_kfold, scoring='accuracy')
# 'X' is your feature matrix, 'y' is your target variable, and 'scoring' can be set to the metric you want to evaluate, e.g., 'accuracy', 'f1', etc.
# Print the cross-validation scores
print("Cross-validation scores:", scores)
print("Mean accuracy:", scores.mean())
I am getting these results:
Cross-validation scores: [0.90163934 0.83606557 0.68333333 0.8 0.83333333]
Mean accuracy: 0.8108743169398906
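For completeness, GridSearchCV accepts the same StratifiedKFold object through its cv parameter, so both runs can be made to use identical folds. A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

X_demo, y_demo = make_classification(n_samples=120, random_state=42)

# Reuse one StratifiedKFold object so grid search and cross_val_score
# evaluate on exactly the same splits.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
gs = GridSearchCV(SVC(), {"C": [0.5, 1.0]}, cv=skf, scoring="accuracy")
gs.fit(X_demo, y_demo)
print(gs.best_params_, gs.best_score_)
```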