I want to implement a function that computes the Fisher information of each parameter of a YOLOv5 model. I took a pre-trained model and computed the Fisher information by iterating over a batch of data and accumulating the required gradients. I loaded the model and created a new loss function. This is the code:
```python
import numpy as np
import torch
import torch.nn.functional as F
import torch.optim as optim
from utils.general import non_max_suppression  # yolov5 utility

def compute_fisher_information_detection(model, dataloader, device):
    model.train()  # Set model to training mode
    for param in model.parameters():
        param.requires_grad = True
    optimizer = optim.SGD(model.parameters(), lr=1e-2)
    total_gradients = {name: 0 for name, param in model.named_parameters() if param.requires_grad}
    total_fisher_information = {name: 0 for name, layer in model.named_children() if hasattr(layer, 'weight')}
    for batch in dataloader:
        inputs = batch['img'].to(device)
        targets = []
        for label_list in batch['label']:
            label = [float(value[0]) for value in label_list]
            targets.append(torch.tensor(label).to(device))
        optimizer.zero_grad()
        outputs = model(inputs, augment=False)[0]
        conf_thres = 0.2
        iou_thres = 0.6
        pred = non_max_suppression(outputs, conf_thres, iou_thres)
        if len(pred[0]) == 0:
            continue
        boxes = []
        scores = []
        for i, det in enumerate(pred):
            if det is not None and len(det):
                for *xyxy, conf, cls in det:
                    boxes.append([int(xyxy[0]), int(xyxy[1]), int(xyxy[2]), int(xyxy[3])])
                    scores.append(cls)
        boxes = np.array(boxes)
        boxes[:, 2] -= boxes[:, 0]  # x2 -> width
        boxes[:, 3] -= boxes[:, 1]  # y2 -> height
        h_img, w_img = inputs.shape[2], inputs.shape[3]
        boxes_tensor = torch.tensor(boxes, dtype=torch.float32, requires_grad=True)
        # Build normalized [score, x, y, w, h]-style rows from the detections
        coords = torch.zeros((len(boxes), 5), dtype=torch.float32)
        coords[:, 0] = torch.tensor(scores)
        coords[:, 1] = (boxes_tensor[:, 0] + boxes_tensor[:, 2]) / (2.0 * w_img)
        coords[:, 2] = (boxes_tensor[:, 1] + boxes_tensor[:, 3]) / (2.0 * h_img)
        coords[:, 3] = boxes_tensor[:, 2] / w_img
        coords[:, 4] = boxes_tensor[:, 3] / h_img
        coords = coords.to(device)
        targets = torch.tensor(targets[0], dtype=torch.float32, requires_grad=True)
        pred_conf, pred_boxes = coords[:, 0], coords[:, 1:]
        target_conf, target_boxes = targets[0], targets[1:]
        box_loss = F.smooth_l1_loss(pred_boxes, target_boxes)
        box_loss.backward()  # Compute the loss and its gradients
        optimizer.step()  # Adjust learning weights
        # Accumulate gradients for each model parameter
        for name, param in model.named_parameters():
            if param.requires_grad:
                if param.grad is None:
                    print('Grad does not exist')
                total_gradients[name] += param.grad.data.clone()
    return 1
```
When I run this, I get "Grad does not exist" for every parameter, so the gradients are never computed.
My understanding is that if the model is in training mode and the parameters are being updated, the gradients should exist after `backward()`. How can I capture and accumulate them? I tried saving them, but it is not working.
You don't have a gradient chain between your model and your `box_loss`. The model outputs `pred`; `pred` is the variable that allows you to backprop back into your weights. When you create the bounding boxes (`boxes.append([int(xyxy[0]), int(xyxy[1]), int(xyxy[2]), int(xyxy[3])])`), the `int()` casts turn tensors into plain Python numbers, so you break the gradient chain and can no longer backprop.
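A quick way to see the breakage (a toy snippet, not from your code): rebuilding values through Python numbers produces a fresh leaf tensor with no autograd history, so `backward()` from anything downstream of it can never reach the weights behind it.

```python
import torch

w = torch.randn(3, requires_grad=True)
y = w * 2                                # still attached to the graph
z = torch.tensor([float(v) for v in y])  # rebuilt from Python floats: new leaf, no history

print(y.grad_fn)  # <MulBackward0 ...> -- backprop from y can reach w
print(z.grad_fn)  # None -- the chain is broken here
y.sum().backward()
print(w.grad)     # populated
# z.sum().backward() would raise a RuntimeError: z is fully detached from w
```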
I see the model output `cls` (which might have a gradient chain, I can't tell) is added to `scores`. But then you add the `scores` terms to `coords` (`coords[:, 0] = torch.tensor(scores)`), and `torch.tensor()` also breaks the gradient chain. As a result, the `pred_boxes` tensor you pass to your loss function has no way of backpropagating into the model.
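One way to restore the chain is to compute the loss directly on the raw, training-mode model output, skipping NMS and the `int()`/`torch.tensor()` round-trips entirely, and to accumulate squared per-parameter gradients (the diagonal empirical Fisher) without stepping the optimizer. Below is a minimal sketch, assuming you are working inside the ultralytics/yolov5 repo (so `utils.loss.ComputeLoss` is available) and that your dataloader yields yolov5-style `(imgs, targets, ...)` batches with targets as `[image_idx, class, x, y, w, h]` rows; adapt the data handling to your own loader.

```python
import torch
from utils.loss import ComputeLoss  # yolov5's training loss (assumes the yolov5 repo layout)

def compute_fisher_information_detection(model, dataloader, device):
    model.train()
    for p in model.parameters():
        p.requires_grad = True
    compute_loss = ComputeLoss(model)
    # Empirical Fisher: running sum of squared gradients per parameter
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    n_batches = 0
    for imgs, targets, *_ in dataloader:
        imgs = imgs.to(device).float() / 255.0
        targets = targets.to(device)
        model.zero_grad()
        preds = model(imgs)                     # raw head outputs: still on the graph
        loss, _ = compute_loss(preds, targets)  # differentiable; no NMS, no int() casts
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
        # Note: no optimizer.step() -- estimating Fisher information should not update weights
    return {n: f / max(n_batches, 1) for n, f in fisher.items()}
```

Because the loss is built from `preds` (which still has a `grad_fn`), `param.grad` is populated after `backward()`, and summing `grad ** 2` over batches gives the per-parameter diagonal Fisher estimate you were after.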