NVIDIA GB200 NVL72
72-GPU Blackwell rack with 36 Grace CPUs. Hyperscale-only — relevant context here for understanding 'what frontier training runs on'.
Released 2024
Overview
The GB200 NVL72 packs 72 Blackwell GPUs and 36 Grace CPUs into a single rack, pooling 13,824 GB of GPU memory and drawing roughly 120 kW at full load. This is hyperscale infrastructure, not something you run at home — it appears here as context for understanding what frontier training actually runs on.
Specs
| Spec | Value |
| VRAM | 13,824 GB |
| Power draw | 120,000 W (120 kW) |
| Released | 2024 |
| Backends | CUDA |
Models that fit
Open-weight models small enough to run on NVIDIA GB200 NVL72 with usable context.
Frequently asked
What models can NVIDIA GB200 NVL72 run?
With 13,824 GB of VRAM, the NVIDIA GB200 NVL72 runs even the largest open-weight models at full 16-bit precision — a 70B model at FP16 needs only around 140 GB of weights, a small fraction of the rack's pooled memory — so quantization is a choice here, not a requirement. See the model list below for tested combinations.
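The fit question above comes down to simple arithmetic: parameter count times bits per weight, plus headroom for KV cache and activations. A minimal sketch of that estimate — the 20% overhead factor is an assumption, a rough rule of thumb rather than a measured figure:

```python
def vram_needed_gb(params_billions: float, bits_per_weight: int,
                   overhead: float = 1.2) -> float:
    """Rough VRAM estimate for serving an LLM.

    Weights: params (in billions) * bits / 8 gives gigabytes,
    since 1B params at 8 bits is ~1 GB. The overhead factor
    (assumed ~20%) covers KV cache and activations.
    """
    weight_gb = params_billions * bits_per_weight / 8
    return weight_gb * overhead

# 70B at 4-bit: 70 * 4/8 * 1.2 = 42 GB — fits on a single high-end card.
# 70B at FP16: 70 * 16/8 * 1.2 = 168 GB — trivial for 13,824 GB.
# Even a hypothetical 405B model at FP16 (~972 GB) uses under 10% of the rack.
for params, bits in [(70, 4), (70, 16), (405, 16)]:
    print(f"{params}B @ {bits}-bit: ~{vram_needed_gb(params, bits):.0f} GB")
```

The same function explains the page's sizing claims for smaller GPUs: a 24 GB card lands right at the 4-bit 13B/34B boundary, which is why quantization matters so much everywhere below rack scale.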
Does NVIDIA GB200 NVL72 support CUDA?
Yes — the NVIDIA GB200 NVL72 is NVIDIA hardware with full CUDA support, the most mature local-AI backend. llama.cpp, Ollama, vLLM, and ExLlamaV2 all run natively. Note that it is a 72-GPU rack system rather than a single card, so multi-GPU-aware serving stacks like vLLM are the natural fit.
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify hardware specifications.