llama-cpp-python with metal acceleration on Apple silicon failing


I am following the instructions from the official documentation on how to install llama-cpp-python with GPU (Metal) support on an Apple silicon Mac.

Here is my Dockerfile:

FROM python:3.11-slim

WORKDIR /code

RUN pip uninstall llama-cpp-python -y

ENV CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1

RUN pip install -U llama-cpp-python --no-cache-dir

RUN pip install 'llama-cpp-python[server]'

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

COPY . .

EXPOSE 8000

CMD ["panel", "serve", "--port", "8000", "chat.py", "--address", "0.0.0.0", "--allow-websocket-origin", "*"]

I am getting the following error:

[+] Building 6.1s (9/13)                                                               docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                   0.0s
 => => transferring dockerfile: 508B                                                                   0.0s
 => [internal] load metadata for docker.io/library/python:3.11-slim                                    0.9s
 => [auth] library/python:pull token for registry-1.docker.io                                          0.0s
 => [internal] load .dockerignore                                                                      0.0s
 => => transferring context: 2B                                                                        0.0s
 => [1/8] FROM docker.io/library/python:3.11-slim@sha256:90f8795536170fd08236d2ceb74fe7065dbf74f738d8  0.0s
 => => resolve docker.io/library/python:3.11-slim@sha256:90f8795536170fd08236d2ceb74fe7065dbf74f738d8  0.0s
 => [internal] load build context                                                                      0.0s
 => => transferring context: 2.19kB                                                                    0.0s
 => CACHED [2/8] WORKDIR /code                                                                         0.0s
 => CACHED [3/8] RUN pip uninstall llama-cpp-python -y                                                 0.0s
 => ERROR [4/8] RUN pip install -U llama-cpp-python --no-cache-dir                                     5.2s
------                                                                                                      
 > [4/8] RUN pip install -U llama-cpp-python --no-cache-dir:                                                
0.410 Collecting llama-cpp-python                                                                           
0.516   Downloading llama_cpp_python-0.2.57.tar.gz (36.9 MB)                                                
1.023      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.9/36.9 MB 99.0 MB/s eta 0:00:00                      
1.325   Installing build dependencies: started                                                              
2.285   Installing build dependencies: finished with status 'done'
2.285   Getting requirements to build wheel: started
2.336   Getting requirements to build wheel: finished with status 'done'
2.340   Installing backend dependencies: started
3.863   Installing backend dependencies: finished with status 'done'
3.864   Preparing metadata (pyproject.toml): started
3.955   Preparing metadata (pyproject.toml): finished with status 'done'
3.996 Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
4.014   Downloading typing_extensions-4.10.0-py3-none-any.whl.metadata (3.0 kB)
4.181 Collecting numpy>=1.20.0 (from llama-cpp-python)
4.201   Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (62 kB)
4.202      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.3/62.3 kB 96.9 MB/s eta 0:00:00
4.242 Collecting diskcache>=5.6.1 (from llama-cpp-python)
4.261   Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
4.298 Collecting jinja2>=2.11.3 (from llama-cpp-python)
4.317   Downloading Jinja2-3.1.3-py3-none-any.whl.metadata (3.3 kB)
4.372 Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
4.393   Downloading MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (3.0 kB)
4.416 Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
4.418    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB 412.0 MB/s eta 0:00:00
4.440 Downloading Jinja2-3.1.3-py3-none-any.whl (133 kB)
4.444    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.2/133.2 kB 63.9 MB/s eta 0:00:00
4.472 Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (14.2 MB)
4.627    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.2/14.2 MB 94.7 MB/s eta 0:00:00
4.648 Downloading typing_extensions-4.10.0-py3-none-any.whl (33 kB)
4.671 Downloading MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (29 kB)
4.713 Building wheels for collected packages: llama-cpp-python
4.714   Building wheel for llama-cpp-python (pyproject.toml): started
4.910   Building wheel for llama-cpp-python (pyproject.toml): finished with status 'error'
4.912   error: subprocess-exited-with-error
4.912   
4.912   × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
4.912   │ exit code: 1
4.912   ╰─> [24 lines of output]
4.912       *** scikit-build-core 0.8.2 using CMake 3.29.0 (wheel)
4.912       *** Configuring CMake...
4.912       loading initial cache file /tmp/tmpk4ft3wii/build/CMakeInit.txt
4.912       -- The C compiler identification is unknown
4.912       -- The CXX compiler identification is unknown
4.912       CMake Error at CMakeLists.txt:3 (project):
4.912         No CMAKE_C_COMPILER could be found.
4.912       
4.912         Tell CMake where to find the compiler by setting either the environment
4.912         variable "CC" or the CMake cache entry CMAKE_C_COMPILER to the full path to
4.912         the compiler, or to the compiler name if it is in the PATH.
4.912       
4.912       
4.912       CMake Error at CMakeLists.txt:3 (project):
4.912         No CMAKE_CXX_COMPILER could be found.
4.912       
4.912         Tell CMake where to find the compiler by setting either the environment
4.912         variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
4.912         to the compiler, or to the compiler name if it is in the PATH.
4.912       
4.912       
4.912       -- Configuring incomplete, errors occurred!
4.912       
4.912       *** CMake configuration failed
4.912       [end of output]
4.912   
4.912   note: This error originates from a subprocess, and is likely not a problem with pip.
4.913   ERROR: Failed building wheel for llama-cpp-python
4.913 Failed to build llama-cpp-python
4.913 ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
------
Dockerfile:9
--------------------
   7 |     ENV CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1
   8 |     
   9 | >>> RUN pip install -U llama-cpp-python --no-cache-dir
  10 |     
  11 |     RUN pip install 'llama-cpp-python[server]'
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install -U llama-cpp-python --no-cache-dir" did not complete successfully: exit code: 1

I have tried different variations of the Dockerfile, but the build always fails on the same line, i.e., RUN pip install -U llama-cpp-python.

Why? And how do I fix it?


UPDATE: Based on the comments, I modified my Dockerfile like so:

FROM python:3.11-slim

RUN apt-get update && apt-get install -y --no-install-recommends gcc

WORKDIR /code

RUN pip uninstall llama-cpp-python -y

ENV CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1

RUN pip install -U llama-cpp-python --no-cache-dir

RUN pip install 'llama-cpp-python[server]'

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

COPY . .

EXPOSE 8000

CMD ["panel", "serve", "--port", "8000", "chat.py", "--address", "0.0.0.0", "--allow-websocket-origin", "*"]

And I am still not able to install llama-cpp-python:

[+] Building 12.3s (10/14)                                                             docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                   0.0s
 => => transferring dockerfile: 578B                                                                   0.0s
 => [internal] load metadata for docker.io/library/python:3.11-slim                                    0.9s
 => [auth] library/python:pull token for registry-1.docker.io                                          0.0s
 => [internal] load .dockerignore                                                                      0.0s
 => => transferring context: 2B                                                                        0.0s
 => CACHED [1/9] FROM docker.io/library/python:3.11-slim@sha256:90f8795536170fd08236d2ceb74fe7065dbf7  0.0s
 => => resolve docker.io/library/python:3.11-slim@sha256:90f8795536170fd08236d2ceb74fe7065dbf74f738d8  0.0s
 => [internal] load build context                                                                      0.0s
 => => transferring context: 2.26kB                                                                    0.0s
 => [2/9] RUN apt-get update && apt-get install -y --no-install-recommends gcc                         5.2s
 => [3/9] WORKDIR /code                                                                                0.0s 
 => [4/9] RUN pip uninstall llama-cpp-python -y                                                        1.0s 
 => ERROR [5/9] RUN pip install -U llama-cpp-python --no-cache-dir                                     5.2s 
------                                                                                                      
 > [5/9] RUN pip install -U llama-cpp-python --no-cache-dir:                                                
0.500 Collecting llama-cpp-python                                                                           
0.600   Downloading llama_cpp_python-0.2.57.tar.gz (36.9 MB)                                                
1.093      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.9/36.9 MB 88.6 MB/s eta 0:00:00                      
1.373   Installing build dependencies: started                                                              
2.344   Installing build dependencies: finished with status 'done'
2.345   Getting requirements to build wheel: started
2.395   Getting requirements to build wheel: finished with status 'done'
2.399   Installing backend dependencies: started
3.882   Installing backend dependencies: finished with status 'done'
3.882   Preparing metadata (pyproject.toml): started
3.972   Preparing metadata (pyproject.toml): finished with status 'done'
4.014 Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
4.034   Downloading typing_extensions-4.10.0-py3-none-any.whl.metadata (3.0 kB)
4.200 Collecting numpy>=1.20.0 (from llama-cpp-python)
4.218   Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (62 kB)
4.219      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.3/62.3 kB 95.0 MB/s eta 0:00:00
4.258 Collecting diskcache>=5.6.1 (from llama-cpp-python)
4.282   Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
4.316 Collecting jinja2>=2.11.3 (from llama-cpp-python)
4.335   Downloading Jinja2-3.1.3-py3-none-any.whl.metadata (3.3 kB)
4.392 Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
4.410   Downloading MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (3.0 kB)
4.434 Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
4.435    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB 345.0 MB/s eta 0:00:00
4.454 Downloading Jinja2-3.1.3-py3-none-any.whl (133 kB)
4.458    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.2/133.2 kB 103.0 MB/s eta 0:00:00
4.482 Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (14.2 MB)
4.620    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.2/14.2 MB 106.0 MB/s eta 0:00:00
4.641 Downloading typing_extensions-4.10.0-py3-none-any.whl (33 kB)
4.665 Downloading MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (29 kB)
4.703 Building wheels for collected packages: llama-cpp-python
4.704   Building wheel for llama-cpp-python (pyproject.toml): started
4.938   Building wheel for llama-cpp-python (pyproject.toml): finished with status 'error'
4.941   error: subprocess-exited-with-error
4.941   
4.941   × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
4.941   │ exit code: 1
4.941   ╰─> [50 lines of output]
4.941       *** scikit-build-core 0.8.2 using CMake 3.29.0 (wheel)
4.941       *** Configuring CMake...
4.941       loading initial cache file /tmp/tmp839s_tl2/build/CMakeInit.txt
4.941       -- The C compiler identification is GNU 12.2.0
4.941       -- The CXX compiler identification is unknown
4.941       -- Detecting C compiler ABI info
4.941       -- Detecting C compiler ABI info - failed
4.941       -- Check for working C compiler: /usr/bin/cc
4.941       -- Check for working C compiler: /usr/bin/cc - broken
4.941       CMake Error at /tmp/pip-build-env-qdg4zwxu/normal/lib/python3.11/site-packages/cmake/data/share/cmake-3.29/Modules/CMakeTestCCompiler.cmake:67 (message):
4.941         The C compiler
4.941       
4.941           "/usr/bin/cc"
4.941       
4.941         is not able to compile a simple test program.
4.941       
4.941         It fails with the following output:
4.941       
4.941           Change Dir: '/tmp/tmp839s_tl2/build/CMakeFiles/CMakeScratch/TryCompile-UGSEbu'
4.941       
4.941           Run Build Command(s): /tmp/pip-build-env-qdg4zwxu/normal/lib/python3.11/site-packages/ninja/data/bin/ninja -v cmTC_b1c9c
4.941           [1/2] /usr/bin/cc    -o CMakeFiles/cmTC_b1c9c.dir/testCCompiler.c.o -c /tmp/tmp839s_tl2/build/CMakeFiles/CMakeScratch/TryCompile-UGSEbu/testCCompiler.c
4.941           [2/2] : && /usr/bin/cc   CMakeFiles/cmTC_b1c9c.dir/testCCompiler.c.o -o cmTC_b1c9c   && :
4.941           FAILED: cmTC_b1c9c
4.941           : && /usr/bin/cc   CMakeFiles/cmTC_b1c9c.dir/testCCompiler.c.o -o cmTC_b1c9c   && :
4.941           /usr/bin/ld: cannot find Scrt1.o: No such file or directory
4.941           /usr/bin/ld: cannot find crti.o: No such file or directory
4.941           collect2: error: ld returned 1 exit status
4.941           ninja: build stopped: subcommand failed.
4.941       
4.941       
4.941       
4.941       
4.941       
4.941         CMake will not be able to correctly generate this project.
4.941       Call Stack (most recent call first):
4.941         CMakeLists.txt:3 (project)
4.941       
4.941       
4.941       CMake Error at CMakeLists.txt:3 (project):
4.941         No CMAKE_CXX_COMPILER could be found.
4.941       
4.941         Tell CMake where to find the compiler by setting either the environment
4.941         variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
4.941         to the compiler, or to the compiler name if it is in the PATH.
4.941       
4.941       
4.941       -- Configuring incomplete, errors occurred!
4.941       
4.941       *** CMake configuration failed
4.941       [end of output]
4.941   
4.941   note: This error originates from a subprocess, and is likely not a problem with pip.
4.941   ERROR: Failed building wheel for llama-cpp-python
4.941 Failed to build llama-cpp-python
4.941 ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
------
Dockerfile:11
--------------------
   9 |     ENV CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1
  10 |     
  11 | >>> RUN pip install -U llama-cpp-python --no-cache-dir
  12 |     
  13 |     RUN pip install 'llama-cpp-python[server]'
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install -U llama-cpp-python --no-cache-dir" did not complete successfully: exit code: 1

UPDATE 2: After changing the apt line in my Dockerfile to:

RUN apt-get update && apt-get install -y build-essential

this is the error trace when building the image:

[+] Building 17.5s (10/14)                                                             docker:desktop-linux
 => [internal] load build definition from Dockerfile                                                   0.0s
 => => transferring dockerfile: 566B                                                                   0.0s
 => [internal] load metadata for docker.io/library/python:3.11-slim                                    0.9s
 => [auth] library/python:pull token for registry-1.docker.io                                          0.0s
 => [internal] load .dockerignore                                                                      0.0s
 => => transferring context: 2B                                                                        0.0s
 => CACHED [1/9] FROM docker.io/library/python:3.11-slim@sha256:90f8795536170fd08236d2ceb74fe7065dbf7  0.0s
 => => resolve docker.io/library/python:3.11-slim@sha256:90f8795536170fd08236d2ceb74fe7065dbf74f738d8  0.0s
 => [internal] load build context                                                                      0.0s
 => => transferring context: 2.25kB                                                                    0.0s
 => [2/9] RUN apt-get update && apt-get install -y build-essential                                    10.1s
 => [3/9] WORKDIR /code                                                                                0.0s 
 => [4/9] RUN pip uninstall llama-cpp-python -y                                                        0.9s 
 => ERROR [5/9] RUN pip install -U llama-cpp-python --no-cache-dir                                     5.5s 
------                                                                                                      
 > [5/9] RUN pip install -U llama-cpp-python --no-cache-dir:                                                
0.633 Collecting llama-cpp-python                                                                           
0.765   Downloading llama_cpp_python-0.2.57.tar.gz (36.9 MB)                                                
1.231      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.9/36.9 MB 109.6 MB/s eta 0:00:00                      
1.514   Installing build dependencies: started                                                              
2.479   Installing build dependencies: finished with status 'done'
2.479   Getting requirements to build wheel: started
2.530   Getting requirements to build wheel: finished with status 'done'
2.534   Installing backend dependencies: started
4.027   Installing backend dependencies: finished with status 'done'
4.028   Preparing metadata (pyproject.toml): started
4.119   Preparing metadata (pyproject.toml): finished with status 'done'
4.161 Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
4.182   Downloading typing_extensions-4.10.0-py3-none-any.whl.metadata (3.0 kB)
4.355 Collecting numpy>=1.20.0 (from llama-cpp-python)
4.374   Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (62 kB)
4.376      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.3/62.3 kB 121.2 MB/s eta 0:00:00
4.416 Collecting diskcache>=5.6.1 (from llama-cpp-python)
4.436   Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
4.472 Collecting jinja2>=2.11.3 (from llama-cpp-python)
4.491   Downloading Jinja2-3.1.3-py3-none-any.whl.metadata (3.3 kB)
4.549 Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
4.569   Downloading MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (3.0 kB)
4.594 Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
4.596    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.5/45.5 kB 295.7 MB/s eta 0:00:00
4.618 Downloading Jinja2-3.1.3-py3-none-any.whl (133 kB)
4.621    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.2/133.2 kB 138.4 MB/s eta 0:00:00
4.646 Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (14.2 MB)
4.820    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.2/14.2 MB 76.3 MB/s eta 0:00:00
4.843 Downloading typing_extensions-4.10.0-py3-none-any.whl (33 kB)
4.868 Downloading MarkupSafe-2.1.5-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (29 kB)
4.910 Building wheels for collected packages: llama-cpp-python
4.911   Building wheel for llama-cpp-python (pyproject.toml): started
5.212   Building wheel for llama-cpp-python (pyproject.toml): finished with status 'error'
5.214   error: subprocess-exited-with-error
5.214   
5.214   × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
5.214   │ exit code: 1
5.214   ╰─> [32 lines of output]
5.214       *** scikit-build-core 0.8.2 using CMake 3.29.0 (wheel)
5.214       *** Configuring CMake...
5.214       loading initial cache file /tmp/tmp5q9e9my4/build/CMakeInit.txt
5.214       -- The C compiler identification is GNU 12.2.0
5.214       -- The CXX compiler identification is GNU 12.2.0
5.214       -- Detecting C compiler ABI info
5.214       -- Detecting C compiler ABI info - done
5.214       -- Check for working C compiler: /usr/bin/cc - skipped
5.214       -- Detecting C compile features
5.214       -- Detecting C compile features - done
5.214       -- Detecting CXX compiler ABI info
5.214       -- Detecting CXX compiler ABI info - done
5.214       -- Check for working CXX compiler: /usr/bin/c++ - skipped
5.214       -- Detecting CXX compile features
5.214       -- Detecting CXX compile features - done
5.214       -- Could NOT find Git (missing: GIT_EXECUTABLE)
5.214       CMake Warning at vendor/llama.cpp/scripts/build-info.cmake:14 (message):
5.214         Git not found.  Build info will not be accurate.
5.214       Call Stack (most recent call first):
5.214         vendor/llama.cpp/CMakeLists.txt:132 (include)
5.214       
5.214       
5.214       -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
5.214       -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
5.214       -- Found Threads: TRUE
5.214       CMake Error at vendor/llama.cpp/CMakeLists.txt:191 (find_library):
5.214         Could not find FOUNDATION_LIBRARY using the following names: Foundation
5.214       
5.214       
5.214       -- Configuring incomplete, errors occurred!
5.214       
5.214       *** CMake configuration failed
5.214       [end of output]
5.214   
5.214   note: This error originates from a subprocess, and is likely not a problem with pip.
5.215   ERROR: Failed building wheel for llama-cpp-python
5.215 Failed to build llama-cpp-python
5.215 ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
------
Dockerfile:11
--------------------
   9 |     ENV CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1
  10 |     
  11 | >>> RUN pip install -U llama-cpp-python --no-cache-dir
  12 |     
  13 |     RUN pip install 'llama-cpp-python[server]'
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install -U llama-cpp-python --no-cache-dir" did not complete successfully: exit code: 1
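For reference, the top of the Dockerfile at this point looks roughly like the sketch below (the --no-install-recommends flag and the apt-list cleanup are my additions, not part of the build above; build-essential supplies gcc, g++, make, and the libc development files whose absence caused the earlier missing-CXX-compiler and Scrt1.o/crti.o linker errors):

```dockerfile
FROM python:3.11-slim

# build-essential pulls in gcc, g++, make, and libc dev files. Without g++,
# CMake reported "No CMAKE_CXX_COMPILER could be found"; without the libc
# dev files, the linker could not find Scrt1.o / crti.o.
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /code

ENV CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1

RUN pip install -U llama-cpp-python --no-cache-dir
```

With the compilers in place, the build now gets further but fails on `find_library` for the Foundation framework, as shown in the trace above.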
