I have a weird problem in my GitHub workflow that only started happening today. These are the relevant commands:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
pip3 install mmengine==0.6.0 mmcv==2.0.0rc3 mmdet==3.0.0rc5 mmaction2==1.0rc3
The former succeeds. The latter stops with the following error:
Collecting mmengine==0.6.0
  Using cached mmengine-0.6.0-py3-none-any.whl (360 kB)
Collecting mmcv==2.0.0rc3
  Using cached mmcv-2.0.0rc3.tar.gz (424 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-uml22xq3/mmcv_89a43e000b91495e88399ffe3c493514/setup.py", line 329, in <module>
          ext_modules=get_extensions(),
                      ^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-uml22xq3/mmcv_89a43e000b91495e88399ffe3c493514/setup.py", line 290, in get_extensions
          ext_ops = extension(
                    ^^^^^^^^^^
        File "/home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/envs/heavi-analytic/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1048, in CUDAExtension
          library_dirs += library_paths(cuda=True)
                          ^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/envs/heavi-analytic/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1179, in library_paths
          if (not os.path.exists(_join_cuda_home(lib_dir)) and
                                 ^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/github/.pyenv/versions/miniconda3-3.10-22.11.1-1/envs/heavi-analytic/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 2223, in _join_cuda_home
          raise EnvironmentError('CUDA_HOME environment variable is not set. '
      OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Any idea?
UPDATE 1: It turns out that the installed PyTorch version is 2.0.0, which is not the version I want.
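One way to catch this kind of surprise early is to check the installed version right after the torch install step. The following is only a sketch: the function names `is_torch_1_13` and `check_installed_torch` are hypothetical, and the expected `1.13.x` series is an assumption based on the version discussed here.

```shell
# Hypothetical helper (not from the original post):
# succeeds only for version strings in the 1.13.x series.
is_torch_1_13() {
  case "$1" in
    1.13.*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Hypothetical CI guard: reads the installed torch version and
# returns non-zero if it is not a 1.13.x release.
check_installed_torch() {
  installed="$(python3 -c 'import torch; print(torch.__version__)')" || return 1
  if ! is_torch_1_13 "$installed"; then
    echo "Unexpected torch version: $installed" >&2
    return 1
  fi
}
```

Calling `check_installed_torch` between the two install commands would make the job fail with a clear message instead of failing later inside the mmcv build.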
 
                        
Torch 2.0 was released yesterday (March 15), and since the install command does not pin a version, the continuous build now automatically pulls the latest torch.
Hardcoding the torch version fixes everything: pin the install to torch 1.13 with CUDA 11.7 instead of letting pip pick the latest release.
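A pinned install along those lines might look like the sketch below. The exact patch versions (`torch==1.13.1+cu117`, `torchvision==0.14.1+cu117`, `torchaudio==0.13.1`) are assumptions based on the wheel set published for torch 1.13.1, not the command from the original answer, and `install_pinned_torch` is a hypothetical helper name.

```shell
# Assumed pins: the torch 1.13.1 / CUDA 11.7 wheel set (verify for your setup).
TORCH_PINS="torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1"
TORCH_INDEX="https://download.pytorch.org/whl/cu117"

# Hypothetical helper: call it in the workflow in place of the unpinned install.
install_pinned_torch() {
  # $TORCH_PINS is intentionally unquoted so it splits into three requirements.
  pip3 install $TORCH_PINS --index-url "$TORCH_INDEX"
}
```

With the versions spelled out, a new torch release no longer changes what the workflow installs; upgrading becomes an explicit edit to `TORCH_PINS`.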
EDIT 1:
Sometimes the pip3 install does not succeed. In that case, use conda instead.
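For the conda route, a roughly equivalent pinned install could be sketched as follows. The package versions and the `pytorch` and `nvidia` channels are assumptions matching the torch 1.13.1 / CUDA 11.7 combination, and `install_pinned_torch_conda` is a hypothetical helper name.

```shell
# Hypothetical helper: conda counterpart of the pinned pip install.
# Versions and channels are assumptions for torch 1.13.1 with CUDA 11.7.
install_pinned_torch_conda() {
  conda install -y pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 \
    pytorch-cuda=11.7 -c pytorch -c nvidia
}
```

The `-y` flag skips the interactive confirmation, which a non-interactive CI job would otherwise wait on.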