exllamav2 ImportError: cannot import name 'ExLlamaV2' / undefined symbol
Cause
exllamav2 ships pre-built wheels for specific CUDA + PyTorch + Python combinations. If any of the three does not match your environment, the install falls back to compiling from source, which in many environments either fails outright or produces a broken extension.
Common forms: an "undefined symbol" error means the compiled extension (.so) was built against a different PyTorch ABI than the one installed; "cannot import name 'ExLlamaV2'" means the C extension didn't build at all and the Python package is half-installed.
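To tell the two forms apart without reading a full traceback, the import can be attempted and the error message classified. This is a minimal sketch: the function names and the category labels are hypothetical, and the substring matching only covers the messages named above.

```python
import importlib


def classify_import_failure(message: str) -> str:
    """Map an import error message to the likely cause (labels are illustrative)."""
    text = message.lower()
    if "undefined symbol" in text:
        # Extension loaded but was compiled against a different PyTorch ABI.
        return "abi-mismatch"
    if "cannot import name" in text:
        # Package directory exists but the compiled extension is missing.
        return "extension-not-built"
    if "no module named" in text:
        return "not-installed"
    return "unknown"


def diagnose(module_name: str = "exllamav2") -> str:
    """Try the import and report which failure mode, if any, was hit."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        return classify_import_failure(str(exc))
    return "ok"


if __name__ == "__main__":
    print(diagnose())
```

An "abi-mismatch" result points to step 1 or 2 below; "extension-not-built" usually means a clean uninstall and reinstall is needed first.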
Solution
1. Match wheel to environment. The exllamav2 release page lists wheels by CUDA/torch/Python:
# Identify your stack
python -c "import torch; print(torch.version.cuda, torch.__version__)"
python --version
# Pick the matching wheel from https://github.com/turboderp-org/exllamav2/releases
# Install the wheel directly by URL (--extra-index-url expects a package index, not a wheel file)
pip install "https://github.com/turboderp-org/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu124.torch2.4.0-cp311-cp311-linux_x86_64.whl"
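The wheel filename in the URL above encodes the CUDA, torch, and Python versions. The sketch below rebuilds that filename from the values printed in the first commands, so it can be compared against the release page. The naming pattern is inferred from the single filename shown here and is an assumption; other releases or non-CUDA builds may be named differently, and wheel_filename is a hypothetical helper.

```python
import sys


def wheel_filename(pkg_version, cuda_version, torch_version,
                   python_version=None, platform_tag="linux_x86_64"):
    """Build the expected wheel filename (pattern assumed from the example URL)."""
    if cuda_version is None:
        # torch.version.cuda is None on CPU-only PyTorch builds.
        raise ValueError("PyTorch reports no CUDA version; a CUDA wheel will not match")
    if python_version is None:
        python_version = sys.version_info[:2]
    cuda_tag = "cu" + cuda_version.replace(".", "")      # "12.4" -> "cu124"
    torch_tag = "torch" + torch_version.split("+")[0]    # "2.4.0+cu124" -> "torch2.4.0"
    py_tag = f"cp{python_version[0]}{python_version[1]}"  # (3, 11) -> "cp311"
    return (f"exllamav2-{pkg_version}+{cuda_tag}.{torch_tag}"
            f"-{py_tag}-{py_tag}-{platform_tag}.whl")
```

With PyTorch installed, calling wheel_filename("0.2.4", torch.version.cuda, torch.__version__) gives the name to look for. If no file with that name exists on the release page, there is no pre-built wheel for that combination and step 2 applies.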
2. Or build from source against your exact PyTorch:
pip uninstall exllamav2 -y
pip install exllamav2 --no-binary exllamav2
Requires the CUDA toolkit (nvcc) installed and on PATH, at a version compatible with the CUDA version your PyTorch build reports.
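Before starting a source build, it can help to check that nvcc is actually reachable and which release it reports. The following is a rough sketch; the helper names are hypothetical, and the parser relies on the "release X.Y" text that nvcc --version prints.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output: str):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def nvcc_release():
    """Return the nvcc release found on PATH, or None if nvcc is not installed."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"],
                            capture_output=True, text=True, check=False)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("nvcc release:", nvcc_release())
```

A result of None means the build in this step cannot proceed until the toolkit is installed or added to PATH. Otherwise, compare the result with torch.version.cuda from step 1.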
3. Use TabbyAPI if you just want an OpenAI-compatible server. It bundles exllamav2 with the right wheels and avoids the manual matching:
git clone https://github.com/theroyallab/tabbyAPI && cd tabbyAPI
./start.sh
4. Confirm the import works after install:
python -c "from exllamav2 import ExLlamaV2; print('ok')"
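The import check can be complemented by reading the installed package metadata, which works even when the import itself fails. This is a small sketch using the standard library's importlib.metadata; the function names are hypothetical. For a pre-built wheel, the reported version is expected to include the build tag after the "+" (for example the cu124.torch2.4.0 part of the filename in step 1), though that depends on how the wheel was published.

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(names=("exllamav2", "torch")):
    """Collect installed versions for the packages that must agree."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```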
Did this fix it?
If your case was different, email support@runlocalai.co with what you saw and we'll update the page. If it worked but took different commands on your platform, we want to know that too.