Won't fit

Running Mixtral 8x7B Instruct on NVIDIA GeForce RTX 4090

Mixtral 8x7B Instruct needs more VRAM than the NVIDIA GeForce RTX 4090's 24 GB. The smallest listed quantization, Q4_K_M, requires 32 GB.

By Fredoline Eruo · Last verified May 6, 2026

Memory available: 24 GB

Recommended quant (highest quality that fits): none

Variants and what fits

| Quantization | File size | VRAM required | Fits on NVIDIA GeForce RTX 4090? |
| Q4_K_M | 28.0 GB | 32 GB | No |
| Q5_K_M | 33.0 GB | 38 GB | No |
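The fit column above comes down to comparing each variant's VRAM requirement with the card's 24 GB. A minimal sketch of that comparison follows, using only the figures from the table. The names `Variant`, `fits`, and `shortfall_gb` are hypothetical, chosen for illustration, and are not part of any tool mentioned on this page.

```python
from dataclasses import dataclass

# VRAM on the NVIDIA GeForce RTX 4090, as stated on this page.
GPU_VRAM_GB = 24.0


@dataclass(frozen=True)
class Variant:
    """One row of the variants table (hypothetical helper type)."""
    quant: str
    file_size_gb: float
    vram_required_gb: float


# Figures copied from the table above.
VARIANTS = [
    Variant("Q4_K_M", 28.0, 32.0),
    Variant("Q5_K_M", 33.0, 38.0),
]


def fits(variant: Variant, available_gb: float = GPU_VRAM_GB) -> bool:
    """True when the listed VRAM requirement is within the available VRAM."""
    return variant.vram_required_gb <= available_gb


def shortfall_gb(variant: Variant, available_gb: float = GPU_VRAM_GB) -> float:
    """How many GB the requirement exceeds the available VRAM (0 if it fits)."""
    return max(0.0, variant.vram_required_gb - available_gb)


if __name__ == "__main__":
    for v in VARIANTS:
        print(f"{v.quant}: fits={fits(v)}, short by {shortfall_gb(v):.1f} GB")
```

With the table's numbers, Q4_K_M is 8 GB over the limit and Q5_K_M is 14 GB over, so neither passes the check.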

Real benchmarks

| Tool | Quant | Context | tok/s | VRAM used | Source |
| Ollama | Q4_K_M | 8,192 | 31.4 | 23.1 GB | owner |

Frequently asked

Can NVIDIA GeForce RTX 4090 run Mixtral 8x7B Instruct?

Not entirely in GPU memory. The NVIDIA GeForce RTX 4090 has 24 GB of VRAM, and the smallest listed quantization of Mixtral 8x7B Instruct, Q4_K_M, requires 32 GB.

What quantization should I use?

Neither listed quantization of Mixtral 8x7B Instruct fits in the NVIDIA GeForce RTX 4090's 24 GB: Q4_K_M requires 32 GB and Q5_K_M requires 38 GB. Pick a smaller model.

How fast will it be?

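We measured 31.4 tok/s on this combination using Ollama with Q4_K_M at an 8,192-token context. That run reported 23.1 GB of VRAM in use, below the 32 GB listed for Q4_K_M, which suggests the model was not fully resident in GPU memory.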
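To put the measured rate in practical terms, the arithmetic below converts 31.4 tok/s into per-token latency and an approximate time for a response of a given length. This is a rough sketch that assumes a constant generation rate and ignores prompt processing time, so real runs will differ. The function names are hypothetical.

```python
# Throughput measured on this page (Ollama, Q4_K_M, 8,192-token context).
MEASURED_TOK_PER_S = 31.4


def ms_per_token(tok_per_s: float = MEASURED_TOK_PER_S) -> float:
    """Average milliseconds spent per generated token."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return 1000.0 / tok_per_s


def generation_seconds(num_tokens: int, tok_per_s: float = MEASURED_TOK_PER_S) -> float:
    """Estimated seconds to generate num_tokens at a constant rate.

    Assumption: the rate stays constant and prompt processing is excluded.
    """
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return num_tokens / tok_per_s


if __name__ == "__main__":
    print(f"{ms_per_token():.1f} ms per token")
    print(f"{generation_seconds(500):.1f} s for a 500-token response")
```

At this rate a 500-token response takes roughly 16 seconds of generation time.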

See also: Mixtral 8x7B Instruct, NVIDIA GeForce RTX 4090, all benchmarks.

Reviewed by RunLocalAI Editorial. See our editorial policy.