Confusion about the code for choosing "stumps" in the AdaBoost algorithm

This question refers to the following step in the classical procedure of AdaBoost classification. [Image: the step of the algorithm that chooses the weak classifier c_b.]
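As I understand it, this step chooses the weak classifier (stump) c_b that minimizes the weighted training error under the current weights w_i, roughly:

$$
c_b \;=\; \arg\min_{c} \; \frac{\sum_{i=1}^{n} w_i \,\mathbf{1}\{c(x_i) \neq y_i\}}{\sum_{i=1}^{n} w_i}.
$$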

Suppose that we assign a weight array W and generate training points x with labels y (taking only the values -1 and 1) as follows:

W = [0.05, 0.032732683535398856, 0.05, 0.05, 0.032732683535398856, 
0.05, 0.05, 0.05, 0.032732683535398856, 0.05, 
0.05, 0.05, 0.05, 0.05, 0.05, 
0.05, 0.05, 0.032732683535398856, 0.032732683535398856, 0.032732683535398856]

from sklearn.datasets import make_blobs
x,y = make_blobs(n_samples = 20, n_features = 5, centers = 2, cluster_std = 20.0, random_state = 100) 
y[y==0] = -1

Then my textbook uses the following code A to generate c_b.

from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier(max_depth=1)
clf.fit(x, y, sample_weight = W)  # Here clf is the weak classifier c_b. 
training_pred = clf.predict(x)
print(training_pred)
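(To see what code A actually fits, one can inspect the split of the fitted stump; the snippet below is just a diagnostic sketch using scikit-learn's tree_ attributes, not part of the textbook code.)

# Diagnostic sketch (not from the textbook): show the split chosen by code A's stump.
# For a depth-1 tree, node 0 is the single split node.
print(clf.tree_.feature[0])    # index of the feature used for the split
print(clf.tree_.threshold[0])  # threshold value of the split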

However, the following code B, which I wrote based on the definition of c_b, gives a different result:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

error_rate = 100000  # start with a large value so the first stump is always kept

for k in range(5):
    # Fit a depth-1 tree (a stump) on feature k alone
    clf = DecisionTreeClassifier(max_depth=1)
    clf.fit(x[:, [k]], y)

    local_training_pred = clf.predict(x[:, [k]])

    # Weighted error rate of this stump under the weights W
    local_error_rate = 0
    for i in range(len(x)):
        if local_training_pred[i] != y[i]:
            local_error_rate += W[i] / np.sum(W)

    # Keep the stump with the lowest weighted error seen so far
    if local_error_rate < error_rate:
        error_rate = local_error_rate
        training_pred = local_training_pred

print(training_pred)

Here the code fits one stump per feature, compares their weighted error rates, selects the stump with the lowest error rate, and then computes the predictions of that stump on the training set x.
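(For what it's worth, the weighted error computed by the inner loop in code B can also be written in one vectorized line; W_arr below is just a helper name I introduce for the array version of W.)

W_arr = np.asarray(W)
# Same quantity as the inner loop above: weighted fraction of misclassified points
local_error_rate = np.sum(W_arr[local_training_pred != y]) / np.sum(W_arr)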

Nonetheless, codes A and B do not return the same predictions for our choice of W. Does anyone know the reason for this? Have I misunderstood the definition of a stump?
