I would like to update the learning rate for each weight matrix and each bias in PyTorch during training. The answers here and here, and many other answers I found online, talk about doing this using the model's param_groups, which, to the best of my knowledge, applies learning rates per group rather than per individual layer weight/bias. I also want to update the learning rates during training, not just pre-set them with torch.optim.
Any help is appreciated.
Updates to model parameters are handled by an optimizer in PyTorch. When you define the optimizer, you have the option of partitioning the model parameters into different groups, called param groups. Each param group can have different optimizer settings; for example, one group of parameters could have a learning rate of 0.1 and another a learning rate of 0.01.
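A quick sketch of what that looks like (the two-layer model here is made up purely to show the param-group syntax):

```python
import torch
import torch.nn as nn

# A throwaway two-layer model, only to illustrate param groups.
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))

# The first layer's parameters get lr=0.1, the second layer's get lr=0.01.
optimizer = torch.optim.SGD([
    {'params': model[0].parameters(), 'lr': 0.1},
    {'params': model[1].parameters(), 'lr': 0.01},
])
```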
To do what you're asking, you can just make every parameter belong to a different param group. You'll need some way to keep track of which param group corresponds to which parameter. Once you've defined the optimizer with different groups you can update the learning rate whenever you want, including at training time.
For example, say we have the following simple linear model
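A minimal sketch of such a model (the layer sizes are arbitrary; any model would do):

```python
import torch
import torch.nn as nn

# Two linear layers, so there are four trainable parameters:
# '0.weight', '0.bias', '1.weight', '1.bias'
model = nn.Sequential(
    nn.Linear(2, 3),
    nn.Linear(3, 1),
)
```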
and suppose we want learning rates for each trainable parameter initialized according to the following:
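For instance, one entry per named parameter; the specific values below are just placeholders:

```python
# Map each named parameter to its own initial learning rate.
learning_rates = {
    '0.weight': 0.1,
    '0.bias': 0.2,
    '1.weight': 0.01,
    '1.bias': 0.02,
}
```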
We can use this dictionary to define a different learning rate for each parameter when we initialize the optimizer.
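A sketch of that initialization, assuming the model and `learning_rates` dictionary above. The extra `'name'` key is optional bookkeeping so we can tell later which group belongs to which parameter, and the default `lr=10` only applies to groups that don't set their own `'lr'`:

```python
# One param group per parameter, each with its own learning rate.
param_groups = [
    {'params': [param], 'lr': learning_rates[name], 'name': name}
    for name, param in model.named_parameters()
]

# Groups that omit 'lr' fall back to the default lr=10.
optimizer = torch.optim.SGD(param_groups, lr=10)
```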
Alternatively, we could omit the `'lr'` entry and each param group would be initialized with the default learning rate (`lr=10` in this case).

At training time, if we want to update the learning rates, we can do so by iterating over `optimizer.param_groups` and updating the `'lr'` entry for each of them. For example, in the following simplified training loop, we update the learning rates before each step and print the new value for each group.
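A minimal sketch of such a loop, assuming the optimizer above; the dummy data and the halving schedule are made up for illustration:

```python
# Dummy data and loss for the sketch.
x = torch.randn(8, 2)
y = torch.randn(8, 1)
loss_fn = nn.MSELoss()

for step in range(3):
    # Update (here: halve) each group's learning rate before the step
    # and print it so the schedule is visible.
    for group in optimizer.param_groups:
        group['lr'] *= 0.5
        print(step, group['name'], group['lr'])

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```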