RuntimeError: expected scalar type Float but found Half


I'm trying to run quantization-aware training (eager mode static quantization) on a CUDA device in PyTorch.

I am facing the following error:

RuntimeError: expected scalar type Float but found Half.

Quantization-aware training works fine on the CPU. On the GPU, however, the inputs are moved to the CUDA device and the training loop runs under torch.cuda.amp.autocast() with torch.cuda.amp.GradScaler(enabled=True), and in that setting I hit the error above.
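Roughly, the setup looks like the sketch below. The tiny model, the random data and the hyperparameters are only placeholders for illustration, not my actual code:

```python
import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub, get_default_qat_qconfig, prepare_qat

class TinyNet(nn.Module):
    """Placeholder model; the real network is larger but prepared the same way."""
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()        # marks where activations get fake-quantized
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = TinyNet().train()
model.qconfig = get_default_qat_qconfig("fbgemm")   # per-channel weight fake-quant
model = prepare_qat(model).cuda()

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=True)

for _ in range(3):
    inputs = torch.randn(8, 3, 32, 32, device="cuda")
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():      # forward pass runs under mixed precision
        loss = model(inputs).mean()      # dummy loss, just for illustration
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```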

Based on suggestions from https://github.com/NVIDIA/apex/issues/965, I have tried the following:

  1. Converting all of the model parameters to float32.
  2. Replacing x = conv(x) with x = conv(x.float()).

Neither of these resolves the error; a rough sketch of both attempts is shown below.
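This is approximately what the two attempts looked like (the module and names here are placeholders, not my actual code):

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Placeholder for one of the real model's layers."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)

    def forward(self, x):
        # Attempt 2: x = self.conv(x) was replaced with an explicit cast:
        return self.conv(x.float())

model = Block().cuda()

# Attempt 1: force every parameter back to float32.
for p in model.parameters():
    p.data = p.data.float()
```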

I also tried disabling AMP by wrapping the forward pass in torch.cuda.amp.autocast(enabled=False). That gets around the dtype error, but training then fails with a different error: RuntimeError: Unsupported qscheme: per_channel_affine.
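The AMP-disabled variant looked roughly like this (reusing the placeholder names from the first sketch above), and it is the version that raises the per_channel_affine error instead:

```python
import torch

# model, inputs and optimizer defined as in the first sketch above.
optimizer.zero_grad()
with torch.cuda.amp.autocast(enabled=False):   # run the forward pass in plain fp32
    loss = model(inputs.float()).mean()
loss.backward()    # no GradScaler needed once autocast is disabled
optimizer.step()
```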

Any pointers here would be of great help!
