Apologies in advance if this is a basic question; I'm still fairly new to deep learning and PyTorch.
I'm looking for strategies to increase inference speed, and I came across pruning. If I understood correctly, it is an iterative process:
1. Train the model
2. Prune it
3. Reset the remaining (unpruned) weights of the original model to their initial values (rewinding)
4. Train the pruned model again
5. Repeat steps 2 to 4
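To make the steps concrete, this is roughly how I picture the loop. It is only my reading of the steps above, not working code from my project: the toy model, the random data, the `train` and `rewind` helpers, the 20% pruning amount and the three rounds are all made up for illustration, and I'm assuming `torch.nn.utils.prune` is the right tool for this.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Toy model and random data, purely illustrative.
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
x, y = torch.randn(64, 10), torch.randint(0, 2, (64,))

# Snapshot of the initial weights, taken before any training.
initial_state = copy.deepcopy(model.state_dict())

# The layers to prune, keyed by their name inside the model ("0" and "2").
layers = {name: m for name, m in model.named_modules() if isinstance(m, nn.Linear)}


def train(model, steps=20):
    """Placeholder training loop (hypothetical; a real one would go here)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(steps):
        optimizer.zero_grad()
        F.cross_entropy(model(x), y).backward()
        optimizer.step()


def rewind(layers, initial_state):
    """Copy the initial values back into the pruned layers; masks are kept."""
    with torch.no_grad():
        for name, layer in layers.items():
            # After pruning, the trainable tensor is stored as `weight_orig`.
            layer.weight_orig.copy_(initial_state[f"{name}.weight"])
            layer.bias.copy_(initial_state[f"{name}.bias"])


for _ in range(3):                      # number of rounds is an assumption
    train(model)                        # train
    for layer in layers.values():       # prune 20% of the still-unpruned weights
        prune.l1_unstructured(layer, name="weight", amount=0.2)
    rewind(layers, initial_state)       # rewind the surviving weights

train(model)                            # final training of the pruned model

for name, layer in layers.items():
    sparsity = 1.0 - layer.weight_mask.mean().item()
    print(f"layer {name}: {sparsity:.0%} of weights pruned")
```

The part I'm least sure about is the rewind step. I copy into `weight_orig` because, as far as I can tell, `prune` moves the original parameter there and keeps the mask in `weight_mask`, so the initial `state_dict` can no longer be loaded directly.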
Can anyone give an example, or guide me on how to perform these steps, in particular rewinding, in PyTorch?
I also have a second question: how can merely setting some weights to 0 increase inference speed if the model architecture itself is not changed?
I've been searching around, but I can't find any examples of how to perform rewinding.
Note that currently I have no minimal code example to share, since I'm still trying to figure out how to do this.
Thank you!