I'm trying to replicate the behaviour of the "l1" objective in LGBMRegressor using a custom objective function.
I define the L1 loss function and compare the resulting regression against one fitted with the built-in "l1" objective. I expect the two results to be almost identical, but they are not.
What did I miss?
Here is the code to illustrate my point.
import lightgbm as lgb
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
def l1_loss(y_true, y_pred):
    # gradient of |y_pred - y_true| with respect to y_pred
    grad = np.sign(y_pred - y_true)
    # the second derivative of the L1 loss is zero almost everywhere,
    # so a constant hessian of 1 is used as a placeholder
    hess = np.ones_like(y_pred)
    return grad, hess
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
reg1 = lgb.LGBMRegressor(objective="l1").fit(X_train, y_train)
yh_test1 = reg1.predict(X_test)
reg2 = lgb.LGBMRegressor(objective=l1_loss).fit(X_train, y_train)
yh_test2 = reg2.predict(X_test)
np.abs((yh_test1 - yh_test2) / yh_test1).mean() # should be close to zero but isn't
I am unsure why you chose np.abs((yh_test1 - yh_test2) / yh_test1).mean() to compare the two models, but that value is very inconsistent across runs: sometimes it is close to 0, other times it is over 1000.
A better approach to comparing the accuracy of the two regressions on the test set is to compare the root mean squared error (RMSE) of each model's predictions against the mean of the test set. To do this, first import math and mean_squared_error, as below.
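A minimal version of those imports might look like this (assuming mean_squared_error comes from sklearn.metrics, which the answer does not state explicitly):

import math
from sklearn.metrics import mean_squared_error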
Then you can compute the root mean squared error of each regression's predictions on the test set and compare its size to the mean of the test set.
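For example, reusing the variable names from the question (a sketch; rmse1 and rmse2 are names introduced here for illustration):

# RMSE of each model's predictions against the held-out test labels
rmse1 = math.sqrt(mean_squared_error(y_test, yh_test1))  # built-in "l1" objective
rmse2 = math.sqrt(mean_squared_error(y_test, yh_test2))  # custom l1_loss objective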
The mean of y_test is obtained as below:
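# mean of the test-set target, used as a scale reference for the RMSE values
y_test.mean()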
Then we can see that the root mean squared error is almost identical when comparing the predictions of the two regressions to the test set.
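A quick way to put the three numbers side by side (again a sketch; the answer does not include its actual output, so no concrete values are shown here):

print("RMSE (built-in l1):   ", rmse1)
print("RMSE (custom l1_loss):", rmse2)
print("mean of y_test:       ", y_test.mean())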
When assessing model performance in this manner, the lower the RMSE relative to the mean of the test set, the better. In this case, the two RMSE values are almost identical, which means the performance of the two models is virtually identical.