Problem
I am calling the fit_transform() and transform() methods on a Pipeline object, but Python is raising an AttributeError whenever I try to do so. Here is what I'm trying to run, with imports. (Note: train/test splitting has been done already)
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
pipe = Pipeline([('mean_impute', SimpleImputer()),
                 ('norm', StandardScaler()),
                 ('sklearn_lm', LinearRegression())])
pipe.fit_transform(x_train, y_train) #<-- error here
x_transform = pipe.transform(x_test) #<-- and here if previous line is absent
The text of the error is as follows:
AttributeError: This 'Pipeline' has no attribute 'fit_transform'
What went wrong? I'm sure it's something simple.
Things I have tried:
- Looked over the scikit-learn documentation to confirm that these methods exist for the Pipeline object in sklearn
- Checked the sizes of x_train and y_train to make sure they were the same, and that they both had headers
- Reinstalled scikit-learn
Documentation for sklearn.pipeline.Pipeline.fit_transform states that it's "[o]nly valid if the final estimator either implements fit_transform or fit and transform." Wording may be a bit ambiguous, but it means two possibilities: (i) final estimator implements fit_transform, or (ii) final estimator implements fit and transform.
Your final estimator is sklearn.linear_model.LinearRegression, which implements fit, but not transform. This is why the error is raised.
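One possible way around this, sketched under some assumptions: either fit the full pipeline and call predict (the regressor supports that), or keep the imputer and scaler in a separate, transformer-only pipeline when the transformed features themselves are needed. The toy arrays below are hypothetical stand-ins for the asker's x_train, y_train, and x_test, and the name preprocess is invented for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the asker's already-split sets.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(20, 3))
x_train[0, 1] = np.nan              # a missing value for the imputer to fill
y_train = rng.normal(size=20)
x_test = rng.normal(size=(5, 3))

# Option 1: keep the regressor as the final step, and use fit/predict
# instead of fit_transform/transform.
pipe = Pipeline([('mean_impute', SimpleImputer()),
                 ('norm', StandardScaler()),
                 ('sklearn_lm', LinearRegression())])
pipe.fit(x_train, y_train)
y_pred = pipe.predict(x_test)

# Option 2: a pipeline made only of transformers. Its final step
# (StandardScaler) implements fit and transform, so fit_transform
# and transform are available.
preprocess = Pipeline([('mean_impute', SimpleImputer()),
                       ('norm', StandardScaler())])
x_train_transformed = preprocess.fit_transform(x_train)
x_test_transformed = preprocess.transform(x_test)
```

Which option fits depends on the goal: Option 1 if predictions are what you want, Option 2 if you want to inspect or reuse the imputed and scaled features.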