llama.cpp build fails: nvcc not found / CUDA toolkit missing
Cause
Building llama.cpp with GGML_CUDA=1 requires the full CUDA toolkit (the nvcc compiler, CUDA headers, and cuBLAS), not just the NVIDIA driver. A working nvidia-smi does not mean the toolkit is installed: nvidia-smi ships with the driver alone, while nvcc ships only with the toolkit.
On Ubuntu, apt install nvidia-driver-XXX does NOT install the toolkit. On Windows, the GeForce Experience driver bundle does NOT include nvcc.
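A quick way to see which half is missing (a sketch; each check only looks for the binary on PATH):

```shell
# nvidia-smi ships with the driver; nvcc ships only with the CUDA toolkit.
# Each line prints "yes" or "no" depending on what is on PATH.
command -v nvidia-smi >/dev/null 2>&1 && echo "driver tools: yes" || echo "driver tools: no"
command -v nvcc >/dev/null 2>&1 && echo "cuda toolkit: yes" || echo "cuda toolkit: no"
```

If the first line says yes and the second says no, you are in exactly the situation this page describes.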
Solution
Linux (Ubuntu/Debian):
# Option 1: the distro-packaged toolkit (version is whatever your Ubuntu release ships)
sudo apt install nvidia-cuda-toolkit
# Option 2: NVIDIA's own repo for a specific release (12.4 shown; install a toolkit
# no newer than the CUDA version nvidia-smi reports):
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update && sudo apt install cuda-toolkit-12-4
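Whichever route you take, the toolkit release must not be newer than the CUDA version the driver reports (the "CUDA Version:" field in the nvidia-smi header is the driver's maximum supported version, not an installed toolkit). A minimal sketch of that comparison, using made-up sample versions:

```shell
# Sample values for illustration; substitute what your tools actually report.
driver_max="12.4"   # from the nvidia-smi header, e.g. "CUDA Version: 12.4"
toolkit="12.6"      # from nvcc --version, e.g. "release 12.6"
# sort -V orders version strings numerically; if the toolkit sorts last
# (and differs from the driver max), it is too new for this driver.
newest=$(printf '%s\n%s\n' "$driver_max" "$toolkit" | sort -V | tail -n 1)
if [ "$newest" = "$toolkit" ] && [ "$toolkit" != "$driver_max" ]; then
  echo "toolkit $toolkit is newer than the driver supports ($driver_max)"
else
  echo "ok: toolkit $toolkit is within driver support ($driver_max)"
fi
```

With the sample values above, this prints the "newer than the driver supports" branch; with a matching or older toolkit it prints "ok".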
Then ensure nvcc is on PATH:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
nvcc --version # should print version info
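If you need the release number in a script (for example, to pick the matching cuda-toolkit-X-Y package name), it can be pulled out of the nvcc banner; a sketch against a hypothetical sample of that output:

```shell
# Hypothetical sample of the relevant `nvcc --version` line:
sample='Cuda compilation tools, release 12.4, V12.4.131'
# Extract "major.minor" from the "release X.Y" part:
ver=$(printf '%s\n' "$sample" | sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')
echo "$ver"   # 12.4
```

On a real system, pipe `nvcc --version` into the same sed expression instead of the sample string.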
Windows: Install the CUDA Toolkit from developer.nvidia.com/cuda-downloads (separate download from the driver). Use Visual Studio's "x64 Native Tools Command Prompt" so cl.exe is on PATH too. Then:
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
Fallback if the toolkit won't install (older distro, locked-down system): build the Vulkan backend instead. It is cross-vendor and needs no CUDA toolkit, though it does need the Vulkan SDK for its shader compiler. Recent llama.cpp builds with CMake only (the old standalone Makefile has been removed):
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j
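The two build paths can be tied together with a small helper that picks a backend flag based on what is installed (a sketch; it only echoes the commands so you can review them, and assumes a CMake-based llama.cpp checkout):

```shell
# Prefer the CUDA backend when nvcc is available, otherwise fall back to Vulkan.
if command -v nvcc >/dev/null 2>&1; then
  backend="-DGGML_CUDA=ON"
else
  backend="-DGGML_VULKAN=ON"
fi
echo "cmake -B build $backend"
echo "cmake --build build --config Release -j"
```

Drop the echoes to run the build directly; the selection logic stays the same.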
Did this fix it?
If your case was different, email support@runlocalai.co with what you saw and we'll update the page. If it worked but took different commands on your platform, we want to know that too.