Computing the jacobian of a function f : R^d -> R^d is not too hard:
```python
import torch

def jacobian(y, x):
    k, d = x.shape
    jacobian = list()
    for i in range(d):
        # one-hot mask selecting the i-th output component of every sample
        v = torch.zeros_like(y)
        v[:, i] = 1.
        dy_dx = torch.autograd.grad(y, x, grad_outputs=v, retain_graph=True,
                                    create_graph=True, allow_unused=True)[0]  # shape [k, d]
        jacobian.append(dy_dx)
    jacobian = torch.stack(jacobian, dim=1).requires_grad_()
    return jacobian
```
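For example (the f below is just a made-up placeholder to illustrate the call):

```python
k, d = 5, 3
x = torch.randn(k, d, requires_grad=True)
y = x ** 2 + x                 # a toy f(x), shape [k, d]

J = jacobian(y, x)             # shape [k, d, d]
print(J.shape)                 # torch.Size([5, 3, 3])
```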
Above, jacobian is invoked with y = f(x). However, now I have a function g = g(t, x), where t is a torch.tensor of shape (k,) and x is a torch.tensor of shape (k, d1, d2, d3). The result of g is again a torch.tensor of shape (k, d1, d2, d3).
I tried to reuse my existing jacobian function. What I did was:

```python
y = g(t, x)
x = x.flatten(1)
y = y.flatten(1)
jacobian(y, x)
```
The problem is that dy_dx is always None. The only explanation I have is that the dependency graph is probably broken by the flatten(1) call.
So, what can I do here? I should remark that what I actually want to compute is the divergence, i.e. the trace of the jacobian. If there is a more performant solution for that specific case, I'd be interested in it.
You are correct: you are passing `x.flatten(1)` as an input even though `y` - let alone `y.flatten(1)` - was computed from `x`, not `x.flatten(1)`. Instead, you could avoid the flattening with something like this:
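A minimal sketch of what that could look like, assuming y = g(t, x) keeps the shape (k, d1, d2, d3) so that the grad_outputs mask can be built directly in that shape:

```python
import torch

def jacobian(y, x):
    # y and x keep their original shape [k, d1, d2, d3], so the autograd
    # graph between them stays intact
    k = x.shape[0]
    n = y[0].numel()                         # d1 * d2 * d3
    rows = []
    for i in range(n):
        v = torch.zeros(k, n, dtype=y.dtype, device=y.device)
        v[:, i] = 1.                         # select the i-th output entry of every sample
        dy_dx = torch.autograd.grad(y, x, grad_outputs=v.view_as(y),
                                    retain_graph=True, create_graph=True,
                                    allow_unused=True)[0]
        rows.append(dy_dx)                   # shape [k, d1, d2, d3]
    return torch.stack(rows, dim=1)          # shape [k, d1*d2*d3, d1, d2, d3]
```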
So after calling the function, you can flatten `d1`, `d2`, and `d3` together. Here is a minimal example:
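For instance, with a toy setup (the g, t, and x below are made up purely for illustration, and jacobian is the reworked function from the sketch above):

```python
import torch

def g(t, x):
    # toy per-sample function: scale x ** 2 by t
    return t.view(-1, 1, 1, 1) * x ** 2

k, d1, d2, d3 = 4, 2, 3, 5
t = torch.rand(k)
x = torch.rand(k, d1, d2, d3, requires_grad=True)

y = g(t, x)                    # shape [k, d1, d2, d3]
J = jacobian(y, x)             # shape [k, d1*d2*d3, d1, d2, d3]
J = J.flatten(start_dim=2)     # shape [k, 30, 30]
print(J.shape)                 # torch.Size([4, 30, 30])
```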
From here, I guess you can compute the trace you're looking for with `torch.trace`.

Keep in mind you can also use the built-in `jacobian` function:
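A sketch of calling `torch.autograd.functional.jacobian` on the toy example above (the import alias just avoids shadowing the custom `jacobian` defined earlier):

```python
from torch.autograd.functional import jacobian as torch_jacobian

# Jacobians w.r.t. both inputs are returned; we only need the one w.r.t. x
J_t, J_x = torch_jacobian(g, (t, x))
print(J_x.shape)   # torch.Size([4, 2, 3, 5, 4, 2, 3, 5]) = y.shape + x.shape
```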
However, you will need to reshape the result:
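For example, assuming g acts on each of the k samples independently, the (k, d1, d2, d3, k, d1, d2, d3) result can be collapsed into a per-sample (k, n, n) Jacobian, from which the divergence (the trace) follows:

```python
# continuing the snippet above
n = d1 * d2 * d3                                         # 30
J_x = J_x.reshape(k, n, k, n)                            # collapse output dims and input dims

# g is applied per sample, so only the "block diagonal" entries
# (sample b's outputs w.r.t. sample b's inputs) are of interest
idx = torch.arange(k)
J_per_sample = J_x[idx, :, idx, :]                       # shape [k, n, n]

# divergence = trace of each per-sample Jacobian
# (equivalent to torch.trace(J_per_sample[b]) for every b)
div = J_per_sample.diagonal(dim1=1, dim2=2).sum(dim=1)   # shape [k]
print(div.shape)                                         # torch.Size([4])
```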