I am using Keras Tuner to hypertune my model. I am setting `validation_split=0.2` in the `search()` call. Does it still make sense to pass `shuffle=True`, or is that redundant / counter-productive?
tuner = RandomSearch(
    hypermodel=build_model,
    objective=kt.Objective("val_loss", direction="min"),
    max_trials=100,
    executions_per_trial=2,
    directory="V5",
    project_name="case8",
    seed=RANDOM_SEED,
    overwrite=True,
)
tuner.search(
    x=x_train_new,
    y=y_train.values,
    batch_size=1024,
    epochs=100,
    validation_split=0.2,
    shuffle=True,
    callbacks=[model_early_stopping],
)
`validation_split=0.2` reserves the last 20% of the data for validation and trains on the remaining 80%. Note that the split is taken before any shuffling, so the validation set is always the same trailing slice of the data.

`shuffle` defaults to `True`, so passing it explicitly is redundant (though harmless): only the training portion is shuffled, before each epoch.

For time-series data you should not shuffle, since that would destroy the temporal order; in that case set `shuffle=False`.

When `x` is a generator (or a `tf.data.Dataset`), the `shuffle` argument is ignored even if you pass it.
You can find more details in the Keras `fit()` implementation: https://github.com/tensorflow/tensorflow/blob/80117da12365720167632761a61e0e32e4db2dcc/tensorflow/python/keras/engine/training.py#L1003
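To make the split-before-shuffle behavior concrete, here is a minimal NumPy sketch of what `fit()` does conceptually (the function name `split_then_shuffle` is my own, not a Keras API): the last fraction is sliced off for validation first, and only the training portion is then shuffled.

```python
import numpy as np

def split_then_shuffle(x, y, validation_split=0.2, seed=0):
    # Mimic Keras fit(): validation_split slices off the LAST fraction
    # of the data *before* any shuffling takes place.
    n_val = int(len(x) * validation_split)
    x_train, x_val = x[:-n_val], x[-n_val:]
    y_train, y_val = y[:-n_val], y[-n_val:]
    # shuffle=True then reorders only the training portion (each epoch).
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x_train))
    return x_train[idx], y_train[idx], x_val, y_val

x = np.arange(10)
y = np.arange(10) * 10
x_tr, y_tr, x_val, y_val = split_then_shuffle(x, y)
print(x_val)  # always the last 20% of the original order: [8 9]
```

This is why, for ordered data such as a time series, `validation_split` conveniently validates on the most recent samples, but you must also keep `shuffle=False` so training batches preserve temporal order.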