I am a beginner in machine learning. I am trying to run a t-test for the difference of means to assess which algorithm achieves the higher F1 score. I have the results of both algorithms: the F1 score is 0.63 for algorithm A and 0.89 for algorithm B.
I applied the following code, but I cannot get it to work and do not understand the error. How can I compare the two algorithms and get the performance results from hypothesis testing?
X = data_frame.iloc[:, 3:]
y = data_frame.iloc[:,2:-7]
from mlxtend.evaluate import paired_ttest_5x2cv
t, p = paired_ttest_5x2cv(estimator1=f1_score_Algo_A, estimator2=f1_score_Algo_B, X=X, y=y)
alpha = 0.05
print('t statistic: %.3f' % t)
print('alpha ', alpha)
print('p value: %.3f' % p)
if p > alpha:
    print("Fail to reject null hypothesis")
else:
    print("Reject null hypothesis")
from mlxtend.evaluate import paired_ttest_5x2cv
----> t, p = paired_ttest_5x2cv(estimator1=lr_f1,estimator2=dt_f1, X=X, y=y, random_seed=1)
alpha = 0.05
print('t statistic: %.3f' % t)
AttributeError: 'numpy.float64' object has no attribute '_estimator_type'
The expected outcome is a decision on which algorithm performed better on the basis of the F1 score.
The function paired_ttest_5x2cv() expects the models to be compared as inputs, not their F1 scores. A score is a plain numpy.float64, which has no _estimator_type attribute, hence the AttributeError.
The error is reproducible with the iris dataset and a couple of models (an LR and a DT model; try it with your own dataset and models) whenever the F1 scores are passed. Passing the trained models instead works.
Note that the F1 scores computed on the held-out test dataset are not used when the paired t-tests are executed (they are computed only to give an idea of the models' performance). The actual scores on the CV splits are computed with the scoring function while the t-tests are run.
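To make that last point concrete, here is a rough sketch of how a 5x2cv paired t statistic is formed from the per-split scores, following Dietterich's published procedure. It is not mlxtend's source code. The function name and the input layout (five pairs of fold scores per model) are hypothetical, chosen only for this example.

```python
import math

def five_by_two_cv_t_test(scores_a, scores_b):
    """Return (t, two-sided p) for a 5x2cv paired t-test.

    scores_a, scores_b: five (fold1, fold2) score pairs per model,
    one pair for each repetition of 2-fold cross-validation.
    """
    if len(scores_a) != 5 or len(scores_b) != 5:
        raise ValueError("expected five repetitions of 2-fold CV per model")

    variance_sum = 0.0
    first_diff = None
    for (a1, a2), (b1, b2) in zip(scores_a, scores_b):
        d1, d2 = a1 - b1, a2 - b2          # score differences on the two folds
        if first_diff is None:
            first_diff = d1                # numerator: first difference only
        mean = (d1 + d2) / 2.0
        variance_sum += (d1 - mean) ** 2 + (d2 - mean) ** 2

    if variance_sum == 0.0:
        raise ValueError("zero variance in score differences; t is undefined")

    t = first_diff / math.sqrt(variance_sum / 5.0)

    # Two-sided p-value from the closed-form Student t CDF with 5 degrees
    # of freedom, so that no SciPy dependency is needed.
    theta = math.atan(abs(t) / math.sqrt(5.0))
    cos_sq = math.cos(theta) ** 2
    central_mass = (2.0 / math.pi) * (
        theta + math.sin(theta) * math.cos(theta) * (1.0 + 2.0 * cos_sq / 3.0)
    )
    return t, 1.0 - central_mass
```

The inputs would be the F1 scores each model gets on the two folds of each of the five repetitions, which is why a single F1 number per algorithm (such as 0.63 and 0.89) is not enough to run the test.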