Why is my Keras custom loss function not working properly when training my model?

I'm trying to implement a weighted loss function and took two different approaches: first using plain Python functions, then by subclassing keras.losses.Loss. Both yield the same loss when I take my model's predictions and compute the loss between the predictions and the ground truths. However, when I pass each loss to model.compile and then call model.fit, the results are vastly different: the first approach trains well, while the second trains terribly. Any ideas why that could be?

First Approach (just using python functions):

    def get_weighted_loss(pos_weights, neg_weights, epsilon=1e-7):
        def weighted_loss(y_true, y_pred):
            # computes the weighted binary cross entropy loss
            loss = -1 * keras.src.backend.numpy.mean(
                pos_weights * y_true[:] * keras.src.backend.numpy.log(y_pred[:] + epsilon) +
                neg_weights * (1 - y_true[:]) * keras.src.backend.numpy.log(
                    1 - y_pred[:] + epsilon))
            return loss
        return weighted_loss
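
For reference, the formula above can be checked by hand on a few values. The following is a minimal pure-Python sketch of the same weighted binary cross-entropy, not the code used in training; the function name `weighted_bce`, the labels, the probabilities, and the weights are all made up for illustration.

```python
import math

def weighted_bce(y_true, y_pred, pos_weight, neg_weight, epsilon=1e-7):
    """Mean weighted binary cross-entropy over flat lists of labels and probabilities."""
    terms = [
        pos_weight * t * math.log(p + epsilon)
        + neg_weight * (1 - t) * math.log(1 - p + epsilon)
        for t, p in zip(y_true, y_pred)
    ]
    return -sum(terms) / len(terms)

# Made-up labels, predictions, and weights, for illustration only.
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.8, 0.1]
loss = weighted_bce(labels, probs, pos_weight=0.75, neg_weight=0.25)
print(loss)
```

Running the Keras loss on the same small batch and comparing against a value like this is one way to confirm that both implementations compute the intended quantity.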

Implementation:

    def train_model(self, epochs=10):
        # calculates weights 
        freq_pos, freq_neg = dsu.compute_class_frequency(self.y_train)
        # Compiles model for training purposes
        self.model.compile(optimizer=keras.optimizers.AdamW(),
                           loss=self.get_weighted_loss(freq_neg, freq_pos),
                           metrics=['accuracy'])
        # Trains or fits the models training data
        history = self.model.fit(self.x_train, self.y_train, validation_data=(self.x_valid, self.y_valid),
                                 epochs=epochs, batch_size=32)
        return history

Result: great accuracy after about 3 training epochs, roughly 0.95-0.98.
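
Note that the call above passes freq_neg as the positive weight and freq_pos as the negative weight. The helper dsu.compute_class_frequency is not shown in the post, so the following is only a hypothetical sketch of what such a helper might compute for binary labels, to illustrate why swapping the frequencies balances the two classes.

```python
def compute_class_frequency(labels):
    """Hypothetical stand-in: return (freq_pos, freq_neg) for a flat list of 0/1 labels."""
    n = len(labels)
    freq_pos = sum(labels) / n
    freq_neg = 1 - freq_pos
    return freq_pos, freq_neg

# Made-up, imbalanced labels: one positive out of four samples.
freq_pos, freq_neg = compute_class_frequency([1, 0, 0, 0])

# Using freq_neg as the positive weight (and freq_pos as the negative weight)
# makes the total weight carried by each class equal.
pos_contribution = freq_pos * freq_neg   # share of positives times their weight
neg_contribution = freq_neg * freq_pos   # share of negatives times their weight
print(freq_pos, freq_neg, pos_contribution, neg_contribution)
```

Under this assumption, the rarer class receives the larger weight, which matches the argument order used in both train_model versions.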

Second Approach (using keras loss class inheritance):

    class SingleClassWeightedLoss(keras.losses.Loss):
        def __init__(self, pos_weight, neg_weight, epsilon=1e-7):
            super(SingleClassWeightedLoss, self).__init__()
            self.name = 'WeightedLoss'
            self.neg_weight = neg_weight
            self.pos_weight = pos_weight
            self.epsilon = epsilon

        def call(self, y_true, y_pred):
            # computes the weighted binary cross entropy loss
            loss = -1 * keras.src.backend.numpy.mean(
                self.pos_weight * y_true[:] * keras.src.backend.numpy.log(y_pred[:] + self.epsilon) +
                self.neg_weight * (1 - y_true[:]) * keras.src.backend.numpy.log(
                    1 - y_pred[:] + self.epsilon))
            return loss

Implementation:

    def train_model(self, epochs=10):
        freq_pos, freq_neg = dsu.compute_class_frequency(self.y_train)
        # Compiles model for training purposes
        self.model.compile(optimizer=keras.optimizers.AdamW(),
                           loss=lsu.SingleClassWeightedLoss(freq_neg, freq_pos),
                           metrics=['accuracy'])
        # Trains or fits the models training data
        history = self.model.fit(self.x_train, self.y_train, validation_data=(self.x_valid, self.y_valid),
                                 epochs=epochs, batch_size=32)
        return history

Result: terrible accuracy, roughly 0.40-0.45 after 10 epochs.

Paradox (I compute the loss using both approaches):

    my_preds = my_nn.batch_predict(my_nn.x_train, normalize=True)
    my_ground_truth = my_nn.y_train
    my_loss_fn = my_nn.get_weighted_loss(my_neg_freq, my_pos_freq)
    loss = my_loss_fn(my_ground_truth, my_preds)
    print(f"First approach loss: {loss}")
    class_loss_fn = lsu.SingleClassWeightedLoss(my_neg_freq, my_pos_freq)
    loss = class_loss_fn(my_ground_truth, my_preds)
    print(f"Second approach loss: {loss}")

Both loss functions yield the same result (shown in a screenshot in the original post).
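
One possibility worth ruling out is a shape mismatch between y_true and y_pred. This is an assumption, since the post does not show the tensor shapes. If the labels are stored flat, with shape (N,), while the model outputs shape (N, 1), the elementwise products in the formula broadcast to an (N, N) grid and the mean is taken over the wrong set of terms. As far as I know, Keras aligns the ranks of y_true and y_pred before calling a plain function passed to compile, but a Loss subclass's call receives them as given, which would also be consistent with both versions agreeing when called directly on the same arrays. The sketch below uses NumPy, whose broadcasting rules the Keras ops follow; the function name `weighted_bce` and the data are made up.

```python
import numpy as np

def weighted_bce(y_true, y_pred, pos_weight, neg_weight, epsilon=1e-7):
    """Same formula as in the post, written with NumPy; returns (loss, shape of the terms)."""
    terms = (pos_weight * y_true * np.log(y_pred + epsilon)
             + neg_weight * (1 - y_true) * np.log(1 - y_pred + epsilon))
    return -np.mean(terms), terms.shape

# Made-up data: labels stored flat, predictions with a trailing axis of size 1,
# as a single-unit output layer would produce.
y_true = np.array([1.0, 0.0, 1.0, 0.0])          # shape (4,)
y_pred = np.array([[0.9], [0.2], [0.8], [0.1]])  # shape (4, 1)

# Mismatched ranks: (4,) against (4, 1) broadcasts to a (4, 4) grid of terms.
loss_broadcast, shape_broadcast = weighted_bce(y_true, y_pred, 0.5, 0.5)

# Aligned ranks: reshape the labels to the prediction shape first.
loss_aligned, shape_aligned = weighted_bce(
    y_true.reshape(y_pred.shape), y_pred, 0.5, 0.5)

print(shape_broadcast, float(loss_broadcast))
print(shape_aligned, float(loss_aligned))
```

If printing the shapes of y_true and y_pred inside call shows such a mismatch during fit, reshaping y_true to the shape of y_pred at the top of call would be a natural thing to try.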

Why does the first approach give me good results while the second gives me terrible results?
