I'm curious why a single-estimator AdaBoost "ensemble", a single-estimator gradient boosted "ensemble", and a single decision tree give different values.
The code below compares three models, all using the same base estimator (a regression tree with max_depth=4 and an MSE-based loss):
- The base estimator as a bare tree model
- A single-estimator AdaBoost using the base estimator as a prototype
- A single-estimator GBR using the base estimator as a prototype
Extracting and inspecting the trees indicates that they are very different, even though each should have been trained in the same fashion.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor, export_text
data = load_diabetes()
X = data['data']
y = data['target']
simple_model = DecisionTreeRegressor(max_depth=4)
prototype = DecisionTreeRegressor(max_depth=4)
simple_ada = AdaBoostRegressor(prototype, n_estimators=1)
# criterion='mse' was renamed to 'squared_error' in scikit-learn 1.0
simple_gbr = GradientBoostingRegressor(max_depth=4, n_estimators=1, criterion='squared_error')
simple_model.fit(X, y)
simple_ada.fit(X, y)
simple_gbr.fit(X, y)
ada_one = simple_ada.estimators_[0]
gbr_one = simple_gbr.estimators_[0][0]
print(export_text(simple_model))
print(export_text(ada_one))
print(export_text(gbr_one))
AdaBoostRegressor performs weighted bootstrap sampling for each of its trees (unlike AdaBoostClassifier, which IIRC just fits the base classifier using sample weights); see the source. So there's no way to make a single-tree AdaBoost regressor match a single decision tree (short of, I suppose, doing the bootstrap sampling manually and fitting the single decision tree on that sample).

GradientBoostingRegressor has an initial value for each sample to boost from, controlled by its init parameter. So the main difference between your tree and the single-estimator GBM is that the latter's leaf values are shifted by the average target value. Setting init='zero' gets us much closer, but I do see some differences in the chosen splits further down the tree. Those are due to ties in optimal split values and can be fixed by setting a common random_state throughout.