Computing Fisher Information in PyTorch models

I want to implement a function that computes the Fisher information of each parameter of a YOLOv5 model. I took a pre-trained model and tried to compute the Fisher information by iterating over batches of data and accumulating the gradients. I loaded the model and wrote a new loss function. This is the code:

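import numpy as np
import torch
import torch.nn.functional as F
import torch.optim as optim
# non_max_suppression is assumed to come from the YOLOv5 repository (utils/general.py)
from utils.general import non_max_suppression

def compute_fisher_information_detection(model, dataloader, device):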

    model.train()  # Set model to training mode 
    for param in model.parameters():
      param.requires_grad = True

    optimizer = optim.SGD(model.parameters(), lr=1e-2)
    total_gradients = {name: 0 for name, param in model.named_parameters() if param.requires_grad}
    total_fisher_information = {name: 0 for name, layer in model.named_children() if hasattr(layer, 'weight')}

    for batch in dataloader:
      inputs = batch['img'].to(device)

      targets = []
      for label_list in batch['label']:
        label = [float(value[0]) for value in label_list]
        targets.append(torch.tensor(label).to(device))

      optimizer.zero_grad()
      outputs = model(inputs, augment=False)[0]
      conf_thres = 0.2
      iou_thres = 0.6
      pred = non_max_suppression(outputs, conf_thres, iou_thres)
      if len(pred[0]) == 0:
        continue

      boxes = []
      scores = []
      for i, det in enumerate(pred):
        if det is not None and len(det):
          for *xyxy, conf, cls in det:
            boxes.append([int(xyxy[0]), int(xyxy[1]), int(xyxy[2]), int(xyxy[3])])
            scores.append(cls)

      boxes = np.array(boxes)
      boxes[:, 2] -= boxes[:, 0]
      boxes[:, 3] -= boxes[:, 1]

      h_img, w_img = inputs.shape[2], inputs.shape[3]

      boxes_tensor = torch.tensor(boxes, dtype=torch.float32, requires_grad=True)

      coords = torch.zeros((len(boxes), 5), dtype=torch.float32)
      coords[:, 0] = torch.tensor(scores)
      coords[:, 1] = (boxes_tensor[:, 0] + boxes_tensor[:, 2]) / (2.0 * w_img)
      coords[:, 2] = (boxes_tensor[:, 1] + boxes_tensor[:, 3]) / (2.0 * h_img)
      coords[:, 3] = boxes_tensor[:, 2] / w_img
      coords[:, 4] = boxes_tensor[:, 3] / h_img
      coords = coords.to(device)
      targets = torch.tensor(targets[0], dtype=torch.float32, requires_grad=True)
      pred_conf, pred_boxes = coords[:, 0], coords[:, 1:]
      target_conf, target_boxes = targets[0], targets[1:]
      box_loss = F.smooth_l1_loss(pred_boxes, target_boxes)

      box_loss.backward() # Compute the loss and its gradients
      optimizer.step() # Adjust learning weights

    # Accumulate gradients for each model parameter
    for name, param in model.named_parameters():
      if param.requires_grad:
        if param.grad is None:
          print('Grad does not exist')
        total_gradients[name] += param.grad.data.clone()


    return 1

I get "Grad does not exist", so the gradients are not being computed.

If the model is being trained and its parameters are being updated, the gradients must be computed at some point. How can I save them? My attempt above does not work.

Karl On

You don't have a gradient chain between your model and your box_loss.

The model produces outputs, and pred is derived from it. That model output is the tensor that lets you backpropagate into your weights.

When you create bounding boxes (boxes.append([int(xyxy[0]), int(xyxy[1]), int(xyxy[2]), int(xyxy[3])])), the int(...) calls turn the tensor values into plain Python numbers. That breaks the gradient chain, so you can no longer backpropagate through them.

I see the model output cls (which might have a gradient chain, I can't tell) is added to scores. But then you copy the scores into coords (coords[:, 0] = torch.tensor(scores)). torch.tensor(...) builds a new tensor with no history, which also breaks the gradient chain.
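The effect of both conversions can be reproduced on a toy example. The sketch below is only an illustration: `layer` is a hypothetical stand-in for the detector, not YOLOv5 itself. It shows that a loss built from a re-created tensor backpropagates without error but never reaches the layer's weights, while a loss built directly on the layer's output does.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
layer = torch.nn.Linear(4, 4)        # hypothetical stand-in for the detector
out = layer(torch.randn(3, 4))       # still attached to the autograd graph
target = torch.zeros(3, 4)

# Broken path: int() yields plain Python numbers, and torch.tensor() builds
# a new leaf tensor that has no connection to `layer`.
rebuilt = torch.tensor([[int(v) for v in row] for row in out],
                       dtype=torch.float32, requires_grad=True)
F.smooth_l1_loss(rebuilt, target).backward()   # runs without error
grad_after_broken = layer.weight.grad          # None: nothing reached the layer

# Differentiable path: keep working on the tensor the model returned.
F.smooth_l1_loss(out, target).backward()
grad_after_direct = layer.weight.grad          # now a tensor of shape (4, 4)
```

Passing requires_grad=True to the re-created tensor, as the question does for boxes_tensor, only makes that new tensor a leaf. It explains why backward() does not raise, but the gradient stops at that leaf.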

As a result, the pred_boxes tensor you pass to your loss function has no way of backpropagating into the model.
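Once the loss is computed from tensors that stay connected to the model, the gradients can be read from param.grad after each backward() call. The following is a minimal sketch of one common approximation, the empirical diagonal Fisher, which averages the squared gradients over batches. The helper name `diagonal_fisher`, the toy model, and the random data are all hypothetical. For YOLOv5 the loss would have to be one computed on the raw model outputs rather than on boxes rebuilt after NMS. Note also that squaring a batch-mean gradient is coarser than squaring per-sample gradients.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def diagonal_fisher(model, batches, loss_fn):
    """Average squared gradient per parameter (empirical diagonal Fisher)."""
    fisher = {name: torch.zeros_like(p)
              for name, p in model.named_parameters() if p.requires_grad}
    n_batches = 0
    for inputs, targets in batches:
        model.zero_grad()
        loss = loss_fn(model(inputs), targets)   # loss stays on the graph
        loss.backward()
        # Accumulate inside the batch loop, right after backward().
        for name, p in model.named_parameters():
            if p.grad is not None:
                fisher[name] += p.grad.detach() ** 2
        n_batches += 1
    return {name: f / max(n_batches, 1) for name, f in fisher.items()}

# Toy usage with random data (assumed shapes, for illustration only).
torch.manual_seed(0)
toy = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 4))
data = [(torch.randn(16, 4), torch.rand(16, 4)) for _ in range(3)]
fisher = diagonal_fisher(toy, data, F.smooth_l1_loss)
```

The sketch deliberately has no optimizer and no optimizer.step(): estimating Fisher information for a pre-trained model normally leaves the weights unchanged, and the accumulation happens inside the batch loop rather than once after it.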