
exllamav2 ImportError: cannot import name 'ExLlamaV2' / undefined symbol


ImportError: cannot import name 'ExLlamaV2' from 'exllamav2'

By Fredoline Eruo · Last verified May 8, 2026

Cause

exllamav2 ships pre-built wheels for specific combinations of CUDA, PyTorch, and Python versions. If any of the three does not match your environment, pip falls back to compiling from source, and in many environments that build either fails outright or produces an extension that cannot be imported.

Common forms: undefined symbol errors mean the .so was built against a different PyTorch ABI; ImportError means the C extension didn't build at all and the Python package is half-installed.
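
As a rough illustration of the two forms described above, the following sketch maps an error message to its likely cause. The function name and the returned labels are hypothetical, chosen only for this example; the matching is plain substring search and is not part of exllamav2.

```python
def classify_exllamav2_error(message: str) -> str:
    """Guess the likely cause of an exllamav2 import failure from its message.

    The labels returned here are illustrative, not an official taxonomy.
    """
    text = message.lower()
    if "undefined symbol" in text:
        # The compiled extension loaded but was built against another PyTorch ABI.
        return "abi-mismatch"
    if "cannot import name" in text:
        # The C extension was never built, so the package is only half-installed.
        return "extension-not-built"
    return "unknown"
```

Either result points to the same remedy: install a wheel that matches the local stack, or rebuild against the installed PyTorch.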

Solution

1. Match wheel to environment. The exllamav2 release page lists wheels by CUDA/torch/Python:

# Identify your stack
python -c "import torch; print(torch.version.cuda, torch.__version__)"
python --version

# Pick the matching wheel from https://github.com/turboderp-org/exllamav2/releases
# and install it by passing the wheel URL directly to pip
pip install "https://github.com/turboderp-org/exllamav2/releases/download/v0.2.4/exllamav2-0.2.4+cu124.torch2.4.0-cp311-cp311-linux_x86_64.whl"
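
To make the matching less error-prone, here is a minimal sketch that assembles the wheel filename to look for on the release page. It assumes every release follows the naming pattern of the single v0.2.4 example above, which may not hold for all versions or platforms. The function name `expected_wheel_name` and its parameters are hypothetical.

```python
def expected_wheel_name(pkg_version, cuda_version, torch_version,
                        python_version, platform_tag="linux_x86_64"):
    """Build the wheel filename expected for a given stack.

    Assumption: the pattern is inferred from one release asset name,
    exllamav2-0.2.4+cu124.torch2.4.0-cp311-cp311-linux_x86_64.whl.
    """
    if cuda_version is None:
        raise ValueError("PyTorch reports no CUDA version (CPU-only build?)")
    cuda_tag = "cu" + cuda_version.replace(".", "")    # "12.4" -> "cu124"
    torch_tag = torch_version.split("+")[0]            # "2.4.0+cu124" -> "2.4.0"
    py_tag = "cp{}{}".format(*python_version[:2])      # (3, 11) -> "cp311"
    return (f"exllamav2-{pkg_version}+{cuda_tag}.torch{torch_tag}"
            f"-{py_tag}-{py_tag}-{platform_tag}.whl")

# Usage, with torch installed:
#   import sys, torch
#   print(expected_wheel_name("0.2.4", torch.version.cuda,
#                             torch.__version__, sys.version_info))
```

If no asset with the resulting name exists on the release page, no pre-built wheel matches that stack, and building from source is the remaining option.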

2. Or build from source against your exact PyTorch:

pip uninstall exllamav2 -y
pip install exllamav2 --no-binary exllamav2

Requires CUDA toolkit (nvcc) installed and on PATH.
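
Before starting a source build, it can help to confirm that `nvcc` is reachable and which CUDA release it reports. The sketch below uses only the standard library. The helper names are hypothetical, and the parsing assumes the usual `release X.Y` text in `nvcc --version` output.

```python
import re
import shutil
import subprocess

def parse_nvcc_release(version_output: str):
    """Extract the 'X.Y' release number from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", version_output)
    return match.group(1) if match else None

def installed_nvcc_release():
    """Return the CUDA release reported by nvcc on PATH, or None if absent."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # CUDA toolkit missing, or its bin directory is not on PATH
    result = subprocess.run([nvcc, "--version"],
                            capture_output=True, text=True, check=False)
    return parse_nvcc_release(result.stdout)
```

Comparing this value with `torch.version.cuda` from step 1 shows whether the toolkit and PyTorch agree on the CUDA version before the compile begins.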

3. Use TabbyAPI if you just want an OpenAI-compatible server. It bundles exllamav2 with the right wheels and avoids the manual matching:

git clone https://github.com/theroyallab/tabbyAPI && cd tabbyAPI
./start.sh

4. Confirm the import works after install:

python -c "from exllamav2 import ExLlamaV2; print('ok')"
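
The one-liner above only proves that the import succeeds. For a slightly fuller check, the sketch below also reports the installed package version through the standard library's `importlib.metadata`. The function name `check_install` and the shape of the returned dictionary are hypothetical.

```python
import importlib
from importlib import metadata

def check_install(package: str, symbol: str) -> dict:
    """Report the installed version and whether `symbol` can be found in `package`."""
    report = {"package": package, "version": None,
              "symbol_ok": False, "error": None}
    try:
        report["version"] = metadata.version(package)
    except metadata.PackageNotFoundError:
        pass  # not installed as a distribution (or installed under another name)
    try:
        module = importlib.import_module(package)
        report["symbol_ok"] = hasattr(module, symbol)
    except Exception as exc:  # e.g. ImportError raised by a broken extension
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report

# Usage: print(check_install("exllamav2", "ExLlamaV2"))
```

A report that shows a version but has `symbol_ok` set to `False` or a non-empty `error` suggests the half-installed state described under Cause.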


Did this fix it?

If your case was different, email support@runlocalai.co with what you saw and we'll update the page. If it worked but took different commands on your platform, we want to know that too.