So I want to try out an adaptive activation function for my neural network. That is, I want a custom activation that is similar to a standard one (like tanh or relu), but with some added trainable parameters.
Currently, I am trying to add this trainable parameter by creating the activation function as a custom layer:
import tensorflow as tf
from tensorflow import keras


class AdaptiveActivation(keras.layers.Layer):
    """
    Adaptive activation function whose parameter a is adjusted during training.
    """
    def __init__(self, act="tanh", **kwargs):
        super(AdaptiveActivation, self).__init__(**kwargs)
        # trainable slope parameter and a fixed scaling constant
        self.a = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.n = tf.constant(10.0, dtype=tf.float32)
        self.act = act

    def call(self, x):
        if self.act == "tanh":
            return keras.activations.tanh(self.a * self.n * x)
        elif self.act == "relu":
            return keras.activations.relu(self.a * self.n * x)
However - if I understood my test outputs correctly - every time I use this activation layer, a new parameter a is created, so every hidden layer gets its own a. What I want is one single a shared by all my activation functions: instead of, say, 9 different values of a per epoch, always one a that can change between epochs.
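For example, if I build a quick test model like this (layer sizes are arbitrary, just for the check), I get a separate a per AdaptiveActivation instance in model.trainable_variables:

import tensorflow as tf
from tensorflow import keras

# Quick check: stack the layer twice and inspect the trainable variables.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(16),
    AdaptiveActivation("tanh"),
    keras.layers.Dense(16),
    AdaptiveActivation("tanh"),
    keras.layers.Dense(1),
])

# Two separate scalar variables show up, one per AdaptiveActivation instance.
for v in model.trainable_variables:
    print(v.name, v.shape)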
Furthermore, is there an easy way to obtain a from this layer so I can output it during training?
OK, the solution was stupidly easy: I can just create a trainable TensorFlow variable outside the layer, pass it in, and assign it to self.a there.
This also solves the "issue" of tracking it.
It does feel very unnecessary though; why couldn't I have done this without implementing a new layer first?
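In case it helps anyone, here is a minimal sketch of what that looks like (the name shared_a and the layer sizes are just for illustration):

import tensorflow as tf
from tensorflow import keras


class AdaptiveActivation(keras.layers.Layer):
    """Adaptive activation that uses an externally provided, shared parameter a."""
    def __init__(self, a, act="tanh", **kwargs):
        super(AdaptiveActivation, self).__init__(**kwargs)
        self.a = a  # shared tf.Variable, created once outside the layer
        self.n = tf.constant(10.0, dtype=tf.float32)
        self.act = act

    def call(self, x):
        if self.act == "tanh":
            return keras.activations.tanh(self.a * self.n * x)
        elif self.act == "relu":
            return keras.activations.relu(self.a * self.n * x)


# One variable shared by every activation layer in the model.
shared_a = tf.Variable(0.1, dtype=tf.float32, trainable=True)

model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(16),
    AdaptiveActivation(shared_a, "tanh"),
    keras.layers.Dense(16),
    AdaptiveActivation(shared_a, "tanh"),
    keras.layers.Dense(1),
])

# Tracking is now trivial: just read the variable directly.
print(shared_a.numpy())

# e.g. log it once per epoch during fit:
# model.fit(..., callbacks=[keras.callbacks.LambdaCallback(
#     on_epoch_end=lambda epoch, logs: print("a =", shared_a.numpy()))])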