So I'm trying to implement a weighted loss function, and I took two different approaches: first with plain Python functions (a closure), then by subclassing keras.losses.Loss. Both yield the same loss value when I take my model's predictions and compute the loss against the ground truths. However, when I pass each loss function to model.compile and then call model.fit, the results are vastly different: the first approach trains well and the second trains terribly. Any ideas why that could be?
First approach (plain Python functions):
def get_weighted_loss(pos_weights, neg_weights, epsilon=1e-7):
    def weighted_loss(y_true, y_pred):
        # computes the weighted binary cross entropy loss
        loss = -1 * keras.src.backend.numpy.mean(
            pos_weights * y_true[:] * keras.src.backend.numpy.log(y_pred[:] + epsilon) +
            neg_weights * (1 - y_true[:]) * keras.src.backend.numpy.log(
                1 - y_pred[:] + epsilon))
        return loss
    return weighted_loss
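For reference, the returned closure can be called directly on small arrays to sanity-check the math. This is a minimal sketch: the toy labels, predictions, and weight values are made up for illustration, and it assumes get_weighted_loss is in scope and the backend ops accept NumPy inputs.

import numpy as np

# hypothetical toy batch: 4 samples with a single binary label each
toy_true = np.array([1., 0., 1., 0.], dtype="float32")
toy_pred = np.array([0.9, 0.2, 0.6, 0.4], dtype="float32")

# made-up weights, just for this check
toy_loss_fn = get_weighted_loss(pos_weights=0.7, neg_weights=0.3)
print(toy_loss_fn(toy_true, toy_pred))  # single scalar: weighted BCE averaged over the batch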
Implementation:
def train_model(self, epochs=10):
    # calculates the class weights
    freq_pos, freq_neg = dsu.compute_class_frequency(self.y_train)
    # compiles the model for training
    self.model.compile(optimizer=keras.optimizers.AdamW(),
                       loss=self.get_weighted_loss(freq_neg, freq_pos),
                       metrics=['accuracy'])
    # trains (fits) the model on the training data
    history = self.model.fit(self.x_train, self.y_train,
                             validation_data=(self.x_valid, self.y_valid),
                             epochs=epochs, batch_size=32)
    return history
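For context, dsu.compute_class_frequency essentially returns the positive and negative label frequencies of the training labels; a simplified stand-in (not the exact helper) could look like this:

import numpy as np

def compute_class_frequency(y):
    # hypothetical stand-in: fraction of positive and negative labels in a binary label array
    y = np.asarray(y, dtype="float32")
    freq_pos = float(np.mean(y))
    freq_neg = 1.0 - freq_pos
    return freq_pos, freq_neg

Note that compile receives the frequencies swapped (freq_neg as pos_weights, freq_pos as neg_weights), which looks like the usual class-balancing convention of weighting each term by the opposite class's frequency.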
Result: great accuracy after about 3 training epochs, roughly 0.95-0.98.
Second approach (subclassing keras.losses.Loss):
class SingleClassWeightedLoss(keras.losses.Loss):
    def __init__(self, pos_weight, neg_weight, epsilon=1e-7):
        super(SingleClassWeightedLoss, self).__init__()
        self.name = 'WeightedLoss'
        self.neg_weight = neg_weight
        self.pos_weight = pos_weight
        self.epsilon = epsilon

    def call(self, y_true, y_pred):
        # computes the weighted binary cross entropy loss
        loss = -1 * keras.src.backend.numpy.mean(
            self.pos_weight * y_true[:] * keras.src.backend.numpy.log(y_pred[:] + self.epsilon) +
            self.neg_weight * (1 - y_true[:]) * keras.src.backend.numpy.log(
                1 - y_pred[:] + self.epsilon))
        return loss
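As with the functional version, an instance of the class can be called directly on small arrays outside of fit. A minimal sketch with made-up values, assuming the class above is in scope; calling the instance goes through keras.losses.Loss.__call__, which wraps call() and applies the configured reduction:

import numpy as np

toy_true = np.array([1., 0., 1., 0.], dtype="float32")
toy_pred = np.array([0.9, 0.2, 0.6, 0.4], dtype="float32")

# made-up weights, just for this check
class_loss = SingleClassWeightedLoss(pos_weight=0.7, neg_weight=0.3)
print(class_loss(toy_true, toy_pred))  # scalar, with the Loss reduction applied on top of call()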
Implementation:
def train_model(self, epochs=10):
    # calculates the class weights
    freq_pos, freq_neg = dsu.compute_class_frequency(self.y_train)
    # compiles the model for training
    self.model.compile(optimizer=keras.optimizers.AdamW(),
                       loss=lsu.SingleClassWeightedLoss(freq_neg, freq_pos),
                       metrics=['accuracy'])
    # trains (fits) the model on the training data
    history = self.model.fit(self.x_train, self.y_train,
                             validation_data=(self.x_valid, self.y_valid),
                             epochs=epochs, batch_size=32)
    return history
Result: terrible accuracy, roughly 0.40-0.45 even after 10 epochs.
Paradox (I compute the loss with both approaches outside of training):
my_preds = my_nn.batch_predict(my_nn.x_train, normalize=True)
my_ground_truth = my_nn.y_train
my_loss_fn = my_nn.get_weighted_loss(my_neg_freq, my_pos_freq)
loss = my_loss_fn(my_ground_truth, my_preds)
print(f"First approach loss: {loss}")
class_loss_fn = lsu.SingleClassWeightedLoss(my_neg_freq, my_pos_freq)
loss = class_loss_fn(my_ground_truth, my_preds)
print(f"Second approach loss: {loss}")
Both loss functions print the same loss value.
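The same equality check can also be reproduced without the model or the real dataset by feeding both loss objects identical synthetic inputs. A minimal sketch with hypothetical data and placeholder weights, assuming get_weighted_loss and SingleClassWeightedLoss are both in scope:

import numpy as np

rng = np.random.default_rng(0)
fake_truth = rng.integers(0, 2, size=(8,)).astype("float32")
fake_preds = rng.uniform(0.05, 0.95, size=(8,)).astype("float32")

fn_loss = get_weighted_loss(0.6, 0.4)(fake_truth, fake_preds)
cls_loss = SingleClassWeightedLoss(0.6, 0.4)(fake_truth, fake_preds)
print(fn_loss, cls_loss)  # in my runs these two print the same value, mirroring the check above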
Why does the first approach give me good results while the second gives terrible results?