Suppose I have
- a data matrix X (numpy ndarray, all features are numeric)
- labels y (numpy array of strings).
I want to apply SimpleImputer(strategy='mean') and StandardScaler() to X, and OrdinalEncoder() to y. After the data has been transformed, I want to use a LogisticRegression() estimator inside RFE() to select the 2 most important features.
Is there a nice way to create a single Pipeline() to perform this task?
If not, how can this be done?
I want to use the Pipeline() instance like this:
pipe = Pipeline(<some code>)
important_matrix = pipe.fit_transform(X, y)
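
A minimal sketch of one way this might be set up, under a few stated assumptions. The steps of a scikit-learn Pipeline transform only X and pass y through unchanged, so the labels are encoded outside the pipeline here. LabelEncoder is swapped in for OrdinalEncoder because it accepts a 1-D target array, whereas OrdinalEncoder expects 2-D input. The toy data and the step names ("impute", "scale", "select") are made up for illustration.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy data: 60 samples, 5 numeric features, string labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = np.where(X[:, 0] + X[:, 3] > 0, "pos", "neg")
X[rng.random(X.shape) < 0.05] = np.nan  # inject some missing values

# Pipeline steps do not transform y, so encode the labels separately.
# (LogisticRegression also accepts string labels directly, so this
# step is optional for the feature selection itself.)
y_encoded = LabelEncoder().fit_transform(y)

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("select", RFE(LogisticRegression(), n_features_to_select=2)),
])

# RFE is a transformer, so fit_transform returns only the selected columns.
important_matrix = pipe.fit_transform(X, y_encoded)
print(important_matrix.shape)

# Boolean mask over the original columns showing which features were kept.
print(pipe.named_steps["select"].support_)
```

If the encoded labels are also needed downstream, keeping the fitted LabelEncoder in a variable makes it possible to map predictions back with inverse_transform.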