I have been trying to implement a (heavily-inspired) Gradient-Accumulation wrapper for my Adam optimizer, as shown below:
class AccumOptimizer(tf.keras.optimizers.Optimizer):
    def __init__(self, optimizer, steps_per_update=1, **kwargs):
        super(AccumOptimizer, self).__init__(name="AccumOptimizer", **kwargs)
        self.optimizer = optimizer
        self.steps_per_update = steps_per_update
        self.iterations = tf.Variable(0, dtype='int64', name='iterations')
        self.cond = tf.equal(self.iterations % self.steps_per_update, 0)
        self.lr = self.optimizer.learning_rate
        self.optimizer.learning_rate = tf.cond(self.cond,
                                               lambda: self.optimizer.learning_rate,
                                               lambda: tf.constant(0, tf.float32))
        ...
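To clarify the behaviour I'm after: gradients should be summed over steps_per_update steps, with the optimizer applying a single (averaged) update at the end of each window. A rough plain-Python sketch, with no TensorFlow involved and the function name made up for illustration:

```python
def accumulated_updates(grads_per_step, steps_per_update):
    """Sum gradients over windows of steps_per_update steps and emit
    one averaged update per window (mimicking a larger effective batch)."""
    updates = []
    acc = 0.0
    for i, g in enumerate(grads_per_step, start=1):
        acc += g  # accumulate this step's gradient
        if i % steps_per_update == 0:
            updates.append(acc / steps_per_update)  # apply averaged gradient
            acc = 0.0  # reset the accumulator for the next window
    return updates
```

So with steps_per_update=2, four per-step gradients produce two updates.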
But I seem to run into the following TypeError when I try to train:

TypeError: __array__() takes 1 positional argument but 2 were given

which points to the part where self.optimizer.learning_rate is updated based on self.cond, although I'm sure that tf.cond should return a single value when the true/false functions return a singleton list.
I'm using TensorFlow 1.15.x (and am constrained to this version, sadly).
Any ideas how this could be circumvented?