ORIGINAL CODE
def get_theta(self):
    theta = self.parameters().detach().cpu
    return theta

def get_norm2Gradient(self):
    theta = get_theta(self)
    loss = loss(self, xb, yb)
    grad = loss.backward()
    for param in theta:
        grad.append(param.grad)
    # computes gradient norm
    norm2Gradient = torch.linalg.norm(grad)
    return norm2Gradient

def fit(self, loader, epochs=2000):
    norm2Gradient = 1
    while norm2Gradient < 10e-3 and epochs < 2000:
        for _, batch in enumerate(loader):
            x, y = batch['x'], batch['y']
            # computes f.cross_entropy loss of (xb, yb) on GPU
            loss = self.loss(x, y)
            # print("loss:", loss)
            loss = loss.mean()
            # print("loss mean:", loss)
            # clears out old gradients
            self.optimizer.zero_grad()
            # calculates new gradients
            grad = loss.backward()
            print("grad:", grad)
            # takes one step along new gradients to decrease the loss
            self.optimizer.step()
            # captures new parameters
            theta = self.parameters()
            print("theta:", theta)
            # collects gradient along new parameters
            for param in theta:
                grad.append(param.grad)
            # computes gradient norm
            norm2Gradient = torch.linalg.norm(grad)
    return grad
CURRENT QUESTION and CODE (corrected per Karl's 3/2/2024 feedback)
I am trying to extract values that are computed during my PyTorch fit function: the parameters themselves, and the L2 norm of the gradient. Here is my code for these objectives.
def get_theta(self):
    theta = self.parameters().detach().cpu
    return theta

def fit(self, loader, epochs=2000):
    norm2Gradient = 1
    while norm2Gradient > 10e-3 and epochs < 2000:
        for _, batch in enumerate(loader):
            x, y = batch['x'], batch['y']
            # computes f.cross_entropy loss of (xb, yb) on GPU
            loss = self.loss(x, y)
            # print("loss:", loss)
            loss = loss.mean()
            # print("loss mean:", loss)
            # clears out old gradients
            self.optimizer.zero_grad()
            # calculates new gradients
            grad = loss.backward()
            print("grad:", grad)
            # takes one step along new gradients to decrease the loss
            self.optimizer.step()
            # captures new parameters
            theta = self.parameters()
            print("theta:", theta)
            # collects gradient along new parameters
            for param in theta:
                grad.append(param.grad)
            # computes gradient norm
            norm2Gradient = torch.linalg.norm(grad)
            sumNorm2Gradient += norm2Gradient.detach().cpu
    return sumNorm2Gradient
Here is the recurring error message:
AttributeError: 'NoneType' object has no attribute 'append'
It occurs at this line of code:
grad.append(param.grad)
I printed the grad variable out, and it says "None."
My intention was to capture the gradient with the following line of code.
grad = loss.backward()
What is a better way to access the gradient that is being computed during the fit function?
Similarly: Does this line capture the parameters?
theta = self.parameters()
Thank you!
This is due to an error in your while condition. Since norm2Gradient = 1, the condition norm2Gradient < 10e-3 evaluates to False and the while loop never executes. The function then tries to return grad when the grad variable has not been assigned. This triggers the error.

That said, there is another issue with your approach. Your gradient tensors will be of different shapes, so you can't string them together in a list and compute the L2 norm of them. You probably want to compute the L2 norm of each gradient tensor individually, then compute the average.
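As a rough sketch of that idea (assuming your model exposes self.loss and self.optimizer and your loader yields dicts with 'x' and 'y' keys, as in your code; tol is a made-up name for the 10e-3 threshold), a fit step that records the gradient norm could look something like this. Note that loss.backward() always returns None; the gradients are written into each parameter's .grad attribute, which is why grad = loss.backward() can never be appended to.

import torch

def fit(self, loader, epochs=2000, tol=10e-3):
    for epoch in range(epochs):
        for batch in loader:
            x, y = batch['x'], batch['y']
            loss = self.loss(x, y).mean()

            # clear old gradients, then backpropagate
            self.optimizer.zero_grad()
            loss.backward()  # returns None; fills param.grad in place

            # L2 norm of each parameter's gradient, averaged over parameters
            per_param_norms = [torch.linalg.norm(p.grad.detach())
                               for p in self.parameters() if p.grad is not None]
            avg_norm2Gradient = torch.stack(per_param_norms).mean()

            # take one step along the new gradients to decrease the loss
            self.optimizer.step()

        # stop once the average gradient norm is small enough
        if avg_norm2Gradient.item() < tol:
            break
    return avg_norm2Gradient

If you would rather have a single global norm over all parameters instead of a per-tensor average, you can concatenate the flattened gradients first, e.g. torch.linalg.norm(torch.cat([p.grad.detach().flatten() for p in self.parameters()])).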
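On your second question: self.parameters() returns a generator of parameter tensors, not a single tensor, so calling .detach() on it fails, and .cpu without parentheses only references the method rather than calling it. A minimal get_theta sketch along those lines, assuming you want a detached CPU copy of each parameter:

def get_theta(self):
    # one detached CPU copy per parameter tensor; clone() so later
    # optimizer steps do not overwrite the captured values
    return [p.detach().cpu().clone() for p in self.parameters()]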