Doesn't splitting the whole dataset into only a training set and a test set result in the validation set also undergoing whatever preprocessing steps the training set went through? My understanding is that, ideally, continuous features should be scaled like this:
# standardization of continuous features
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

num_ct = ColumnTransformer([('standardize', StandardScaler(), numerical)])
X_train = num_ct.fit_transform(X_train)  # fit the scaler on the training set only
X_val = num_ct.transform(X_val)          # apply the training-set statistics to validation...
X_test = num_ct.transform(X_test)        # ...and test data
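(For completeness: I'm assuming X_val in the snippet above comes from a second split, something like the following, with X, y, and random_state as in the code further down.)

from sklearn.model_selection import train_test_split

# hold out the test set first, then carve a validation set out of the remainder
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.15, random_state=random_state)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.2, random_state=random_state)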
But suppose I did:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=random_state)

# standardization of continuous features
num_ct = ColumnTransformer([('standardize', StandardScaler(), numerical)])
X_train = num_ct.fit_transform(X_train)  # fit on the entire 85% training portion
X_test = num_ct.transform(X_test)
to prepare the data for a neural network, and then used Skorch like so:
from skorch import NeuralNet
from skorch.dataset import ValidSplit

net = NeuralNet(
    module=MyDNN,
    ...,
    train_split=ValidSplit(0.2),  # Skorch holds out 20% of whatever is passed to fit() for validation
)
Doesn't this mean that the 20% Skorch holds out for validation was already included in the fit_transform I did earlier, i.e. the scaler's statistics leak information from the validation set?
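If so, I assume the fix is to carve out the validation set myself before fitting the scaler and hand it to Skorch explicitly. Here is a minimal sketch of what I have in mind, assuming Skorch's Dataset wrapper and predefined_split helper (MyDNN and the MSELoss criterion are just placeholders for my own module and loss):

import numpy as np
import torch
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from skorch import NeuralNet
from skorch.dataset import Dataset
from skorch.helper import predefined_split

# split everything before any fitting, so the scaler never sees validation or test rows
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.15, random_state=random_state)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.2, random_state=random_state)

num_ct = ColumnTransformer([('standardize', StandardScaler(), numerical)])
X_train = num_ct.fit_transform(X_train)  # scaling statistics come from training rows only
X_val = num_ct.transform(X_val)
X_test = num_ct.transform(X_test)

# pass the pre-scaled validation set to Skorch instead of letting it re-split X_train
valid_ds = Dataset(X_val.astype(np.float32), y_val)
net = NeuralNet(
    module=MyDNN,
    criterion=torch.nn.MSELoss,
    train_split=predefined_split(valid_ds),
)
net.fit(X_train.astype(np.float32), y_train)

Is that the intended pattern?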