I am working on a multi-label classification task on tabular data. I have already used LightGBM and XGBoost, and I wanted to try a Keras Sequential model as well.
To begin with, I created a model with four layers: one input, two hidden, and one output:
model = tf.keras.Sequential([
    tf.keras.Input(shape=(input_shape,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(output_shape, activation='sigmoid')
])
I use Adam as the optimizer, binary cross-entropy as the loss, and binary accuracy as the metric:
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.BinaryAccuracy()]
)
I have also scaled my data with StandardScaler, and I added an early stopping callback with a patience of 3.
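For reference, here is a minimal sketch of that preprocessing and callback setup. The variable names (X_train, X_test) and the dummy data are placeholders for illustration only, not my actual dataset:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Dummy stand-ins for my real feature arrays.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 8))
X_test = rng.normal(loc=5.0, scale=2.0, size=(20, 8))

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)  # fit on training data only
X_test_s = scaler.transform(X_test)        # reuse the training statistics

# The early stopping callback, passed later as model.fit(..., callbacks=[es]):
# es = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,
#                                       restore_best_weights=True)
```

After this, each training column has mean ~0 and standard deviation ~1; the test set is transformed with the same statistics so the two stay comparable.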
The model stops training after very few epochs (around 10-15), and the binary accuracy on both the training and test data is extremely high for every epoch, including the first one.
When I call model.predict(), it returns the same probability vector for every single row in my test set: each label gets one fixed value, regardless of the input. I have double-checked my test set, and the rows all have different feature values.
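To show concretely what I mean, here is a small check I run on the predictions. The preds array below is fabricated to mimic my output (one constant row repeated); in my real code it comes from model.predict(X_test):

```python
import numpy as np

# Stand-in for model.predict(X_test): every row is the same
# probability vector (values made up for illustration).
preds = np.tile([0.02, 0.97, 0.01, 0.85], (5, 1))

# Count distinct prediction rows — for me this is always 1.
n_unique_rows = np.unique(preds.round(6), axis=0).shape[0]
print(n_unique_rows)  # → 1
```

With properly varying inputs I would expect this count to equal (or be close to) the number of test rows, not 1.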
At first, I suspected my loss function: the model seems to converge to constant outputs that minimize the total binary cross-entropy regardless of the input. I tried different loss functions, but that did not help. I cannot diagnose what the issue is, so I wanted to ask whether anyone has seen something similar before.
In addition, I would appreciate pointers to good reference notebooks for Keras multi-label classification on tabular data. Every example notebook I found covers either image or text classification. Is that because algorithms like XGBoost and LightGBM are so much better than deep learning for tabular data classification?