I do not have access to a V100 GPU locally, so I cannot build and push the image that is needed to run in an ACI container.
How can I build the image that is to be pushed to ACR and then used by a GPU-enabled container? I was thinking of using ACI itself to build the image, but calling Docker inside a container is not easy.
One solution I found (for my PyTorch extension case) is to build the Docker image on a machine without a GPU. I match the CUDA version and hard-code the target GPU architecture, so the build does not need to detect a GPU.
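For illustration, here is a rough sketch of how the architecture can be hard-coded. It assumes the extension is built with `torch.utils.cpp_extension`, which reads the `TORCH_CUDA_ARCH_LIST` environment variable, and that the target is a V100 (compute capability 7.0). The helper name `build_env` is hypothetical, and the build command in the comment is only an example.

```python
import os

# Assumption: the target GPU is a V100, i.e. compute capability 7.0 (Volta).
V100_ARCH = "7.0"


def build_env(arch_list=V100_ARCH, base_env=None):
    """Return a copy of the environment with TORCH_CUDA_ARCH_LIST set.

    With this variable set, the extension build compiles for the listed
    architectures instead of querying a local GPU (there is none on the
    build machine).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TORCH_CUDA_ARCH_LIST"] = arch_list
    return env


# Example use (hypothetical build command, adjust to your project):
#   import subprocess, sys
#   subprocess.run([sys.executable, "setup.py", "install"],
#                  env=build_env(), check=True)
```

The same effect can be had by setting the variable directly in the Dockerfile before the build step; the Python helper is just one way to keep it next to the build script.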
Make sure that `torch.version.cuda` returns a version string (it is `None` for a CPU-only install); otherwise the CPU version of PyTorch is picked up during the build.
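A small check like the following could be run as a build step to fail early if the CPU-only PyTorch was installed. This is only a sketch: the function name `assert_cuda_build` and the optional `expected` argument are my own, and the value is passed in so the check does not need a GPU.

```python
def assert_cuda_build(cuda_version, expected=None):
    """Raise if PyTorch is a CPU-only build or has the wrong CUDA version.

    cuda_version: the value of torch.version.cuda (None on CPU-only builds).
    expected: optional "major.minor" string to match, e.g. "11.8".
    """
    if cuda_version is None:
        raise RuntimeError("CPU-only PyTorch detected: torch.version.cuda is None")
    if expected is not None:
        # Compare only major.minor so "11.8" matches "11.8" exactly.
        if cuda_version.split(".")[:2] != expected.split(".")[:2]:
            raise RuntimeError(
                f"PyTorch was built for CUDA {cuda_version}, expected {expected}"
            )
    return cuda_version


# Example use inside the image build (requires torch to be installed):
#   import torch
#   assert_cuda_build(torch.version.cuda)
```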