I am evaluating my segmentation model's performance using Dice and Jaccard; however, the calculated mean Jaccard is a lot smaller than the calculated mean Dice coefficient, as shown here: Mean Dice and Jaccard
From my understanding of the equations of the two metrics, the values should not have an inverse relationship; rather, they should be similar to each other, which is not the case here.
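To make "similar" concrete: for binary masks the two metrics are tied by Jaccard = Dice / (2 - Dice), so Jaccard is always the smaller of the two, but the two should move together. A quick toy check (arbitrary example arrays):

import numpy as np

pred = np.array([1, 1, 0, 1, 0], dtype=np.uint8)
gt = np.array([1, 0, 0, 1, 1], dtype=np.uint8)

intersection = np.logical_and(pred, gt).sum()
union = np.logical_or(pred, gt).sum()

dice = 2 * intersection / (pred.sum() + gt.sum())   # 0.666...
jaccard = intersection / union                      # 0.5
print(dice, jaccard, dice / (2 - dice))             # the last value equals jaccard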
Here is how I am calculating the two metrics:
import numpy as np
import torch
from sklearn.metrics import jaccard_score

def dice_metrics(inputs, targets, smooth=1):
    # flatten label and prediction tensors
    inputs = inputs.view(-1)
    targets = targets.view(-1)
    intersection = (inputs * targets).sum()
    dice = (2. * intersection + smooth) / (inputs.sum() + targets.sum() + smooth)
    return dice
def testinge_loop(model, loader, path, device=torch.device('cuda')):
    dice = []
    jaccard = []
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            # split the batch into image and mask
            image = x
            mask = y
            image = image.to(device)
            mask = mask.to(device)
            # test_outputs = predicted mask
            test_outputs = model(image)
            test_outputs = torch.sigmoid(test_outputs)
            dice_metric = dice_metrics(test_outputs, mask)
            dice.append(dice_metric.cpu().numpy())
            gt = mask.detach().cpu().numpy()
            gt = gt.astype(np.uint8)
            gt = gt.reshape(-1)
            pm = test_outputs.detach().cpu().numpy()
            pm = pm.astype(np.uint8)
            pm = pm.reshape(-1)
            jaccard.append(jaccard_score(gt, pm))
    return dice, jaccard
I thought I was computing the two metrics wrong; however, when I evaluated a different model, I got the expected relationship: Mean Dice and Jaccard of different model
Am I computing the metrics wrong?
Update: Instead of using the jaccard_score function from sklearn.metrics, I wrote my own function that calculates the Jaccard index by modifying the Dice metric function.
def jaccard_metrics(inputs, targets, smooth=1):
    # flatten label and prediction tensors
    inputs = inputs.view(-1)
    targets = targets.view(-1)
    intersection = (inputs * targets).sum()
    jaccard = (intersection + smooth) / (inputs.sum() + targets.sum() - intersection + smooth)
    return jaccard
I'm not sure why this approach solves the problem, or why the sklearn.metrics.jaccard_score function doesn't work here.
I think there are several issues here.
First, when you convert your mask to uint8, gt will only contain zeros, because mask is between 0 and 0.9804 according to the info you provided.
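To see why (made-up values):

import numpy as np

vals = np.array([0.1, 0.5, 0.9804])
# astype(np.uint8) truncates toward zero, so every value in [0, 1) becomes 0
print(vals.astype(np.uint8))  # [0 0 0]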
Second, 1 is way, way too high a value for smooth. Its purpose is to prevent division by zero without affecting your calculations, but a value of 1 can easily be much higher than intersection. A more appropriate value would be 1e-8.
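A tiny numeric illustration (made-up counts):

intersection, pred_sum, target_sum = 2.0, 4.0, 4.0
# smooth = 1 noticeably inflates the score when the overlap is small...
print((2 * intersection + 1) / (pred_sum + target_sum + 1))        # ~0.556
# ...while smooth = 1e-8 leaves it essentially untouched
print((2 * intersection + 1e-8) / (pred_sum + target_sum + 1e-8))  # 0.5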
Instead of simply converting mask and test_outputs to uint8, you should first apply a binary threshold, and only then convert. Something like this:
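gt = (mask.detach().cpu().numpy() > 0.5).astype(np.uint8)   # assuming a 0.5 threshold
gt = gt.reshape(-1)

(The 0.5 cut-off is an assumption; use whatever threshold fits your problem.)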
And similarly for pm. Also, I'm pretty sure your implementation of the Jaccard score does not actually solve the issue; it just hides the values lost in the incorrect conversion to uint8 by using too high a value for smooth.
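Putting it together, a minimal sketch of one evaluation step with the threshold applied to both arrays (the 0.5 threshold and the evaluate_pair helper are placeholders; adapt them to your pipeline):

import numpy as np
from sklearn.metrics import jaccard_score

def evaluate_pair(test_outputs, mask, threshold=0.5):
    # binarize both the predicted probabilities and the ground-truth mask
    pm = (test_outputs.detach().cpu().numpy() > threshold).astype(np.uint8).reshape(-1)
    gt = (mask.detach().cpu().numpy() > threshold).astype(np.uint8).reshape(-1)

    intersection = np.logical_and(pm, gt).sum()
    smooth = 1e-8  # tiny value, only guards against empty masks
    dice = (2.0 * intersection + smooth) / (pm.sum() + gt.sum() + smooth)
    jac = jaccard_score(gt, pm)
    return dice, jac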