I am interested in implementing a somewhat complex custom TensorFlow operation. Let's say (for the purpose of this question) that the operation is similar to performing convolution with stride=2, dilation=2, and padding. Now, to use this op during the training loop, I also have to implement a gradient op.
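
For reference, the Python-side hook where such a gradient would normally be registered looks roughly like this (assuming the op is compiled as a C++ kernel; the library path and op name below are placeholders, not a real kernel):

```python
import tensorflow as tf

# Hypothetical: assumes the custom op is compiled into a shared library.
# "my_conv_op.so" and "MyStridedDilatedConv" are placeholder names.
conv_module = tf.load_op_library("./my_conv_op.so")

@tf.RegisterGradient("MyStridedDilatedConv")
def _my_strided_dilated_conv_grad(op, grad):
    # A closed-form gradient would be computed here from op.inputs and grad;
    # the question below is what to do when no closed form is known.
    raise NotImplementedError("gradient not derived")
```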

But the problem is: I do not know how to represent the gradient of this op as a closed-form formula. And I can imagine that this would be an issue for most non-trivial custom operations, because deriving the gradient is a non-trivial task (at least for me).

Is it possible to implement the gradient op by using the fundamental definition of partial derivatives?

f'(x) ≈ [f(x + dx) - f(x)] / dx

(for every input parameter x to the custom TensorFlow operation)

This seems like a more generic approach (compared to deriving a closed-form gradient formula).
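
For what it's worth, here is a minimal sketch of that idea using tf.custom_gradient, assuming the op can be re-evaluated from Python. Here custom_op is a simple elementwise stand-in (not the real op) so the sketch is runnable, and eps is an arbitrarily chosen step size:

```python
import tensorflow as tf

# Stand-in for the real black-box op so the sketch is runnable end to end;
# replace custom_op with the actual operation.
def custom_op(x):
    return tf.sin(x) * x

@tf.custom_gradient
def op_with_fd_grad(x):
    eps = 1e-4  # arbitrarily chosen step size
    y = custom_op(x)
    y_flat = tf.reshape(y, [-1])

    def grad(upstream):
        x_flat = tf.reshape(x, [-1])
        n = tf.size(x_flat)

        def fd_column(i):
            # Perturb the i-th input element and re-run the op:
            # J[i, j] ~= [f(x + eps*e_i)_j - f(x)_j] / eps
            bumped = tf.tensor_scatter_nd_add(x_flat, tf.reshape(i, [1, 1]), [eps])
            y_i = tf.reshape(custom_op(tf.reshape(bumped, tf.shape(x))), [-1])
            return (y_i - y_flat) / eps

        # One extra forward evaluation of the op per input element.
        jac = tf.map_fn(fd_column, tf.range(n), fn_output_signature=tf.float32)
        # Chain rule: dL/dx_i = sum_j J[i, j] * dL/dy_j
        dx_flat = tf.linalg.matvec(jac, tf.reshape(upstream, [-1]))
        return tf.reshape(dx_flat, tf.shape(x))

    return y, grad
```

A quick sanity check against the analytic gradient of the stand-in op:

```python
x = tf.constant([[0.5, 1.0], [2.0, 3.0]])
with tf.GradientTape() as tape:
    tape.watch(x)
    loss = tf.reduce_sum(op_with_fd_grad(x))
print(tape.gradient(loss, x))  # ~= x*cos(x) + sin(x), the analytic gradient
```

This is only a prototype: it performs one forward pass per input element, which is exactly the cost concern raised below.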

I am wondering why this sort of implementation does not seem to be available anywhere on the internet. The possible reasons I can think of are:

  1. It might be computationally expensive to approximate the partial derivative for each input variable, since the op has to be re-evaluated once per input element (e.g. for a [1, 224, 224, 3] input that is over 150,000 extra forward passes per backward step).

  2. There may be numerical accuracy issues: the result depends on the choice of dx, and a forward difference suffers from both truncation error and floating-point cancellation.

Any insights would be helpful!