I am developing Python code with the PyTorch library to jointly train a generative model and a classification model, whose gradients I would then use to generate conditioned samples.
I compute the classification model loss (*cls_loss*) and the generative model loss (*score_loss*) and sum them to obtain *total_loss*. When I call total_loss.backward(), I get the following error:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
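For reference, here is a minimal sketch of the training step I am describing. This is not my real code: the models, data, and loss functions below are placeholders for illustration only.

```python
import torch
import torch.nn.functional as F

# Placeholder models standing in for my real ones
classifier = torch.nn.Linear(16, 4).cuda()    # stands in for the classification model
score_model = torch.nn.Linear(16, 16).cuda()  # stands in for the generative model

# Two separate optimizers, one per model (as in my actual setup)
opt_cls = torch.optim.Adam(classifier.parameters())
opt_score = torch.optim.Adam(score_model.parameters())

x = torch.randn(8, 16, device="cuda")         # inputs, on cuda
y = torch.randint(0, 4, (8,), device="cuda")  # labels, on cuda

cls_loss = F.cross_entropy(classifier(x), y)
score_loss = F.mse_loss(score_model(x), x)

total_loss = cls_loss + score_loss
total_loss.backward()  # <-- the RuntimeError is raised here in my actual code

opt_cls.step()
opt_score.step()
```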
I have already asserted that the losses, the models, and the inputs are all on the same device (cuda in this case). I also tried calling .backward() separately on the two losses, and noticed that the error is triggered on the second .backward() call, regardless of which loss it belongs to (I tried both orders: *cls_loss* first and then *score_loss*, and vice versa).
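Concretely, the two orderings I tried look roughly like this (continuing the placeholder sketch above):

```python
# Order 1: classification loss first
cls_loss.backward(retain_graph=True)  # retain_graph only matters if the two graphs overlap
score_loss.backward()                 # <-- error raised here

# Order 2 (in a fresh iteration): score loss first
score_loss.backward(retain_graph=True)
cls_loss.backward()                   # <-- error raised here as well
```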
I should also mention that I am using two **different** losses and two **different** optimizers to train the two models. In addition, I have already asserted that neither loss is NaN and that they have the same shape.
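For completeness, these are roughly the sanity checks I ran before calling .backward() (again using the placeholder names from the sketch above):

```python
# Losses are finite scalars of the same shape
assert not torch.isnan(cls_loss).any()
assert not torch.isnan(score_loss).any()
assert cls_loss.shape == score_loss.shape

# Losses, model parameters, and inputs all live on the GPU
assert cls_loss.device == score_loss.device == torch.device("cuda:0")
assert all(p.device.type == "cuda" for p in classifier.parameters())
assert all(p.device.type == "cuda" for p in score_model.parameters())
assert x.device.type == "cuda" and y.device.type == "cuda"
```

All of these assertions pass, yet the device-mismatch error is still raised on backward().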
Does anyone have any idea what the problem could be? Thanks!