Won't fit

Running Mixtral 8x7B Instruct on NVIDIA GeForce RTX 4090

Mixtral 8x7B Instruct requires more memory than the NVIDIA GeForce RTX 4090 provides (24 GB of VRAM available).

By Fredoline Eruo · Last verified May 6, 2026

Parameters: 47B

VRAM available: 24 GB

Highest quality that fits: none

Variants and what fits

Quantization	File size	VRAM required	Fits on NVIDIA GeForce RTX 4090?
Q4_K_M	28.0 GB	32 GB	No
Q5_K_M	33.0 GB	38 GB	No
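
The fit decision in the table above comes down to comparing each variant's required VRAM with the 24 GB on the card. A minimal Python sketch of that comparison follows. The numbers are copied from the table; the names `VARIANTS`, `GPU_VRAM_GB`, and `fit_report` are hypothetical and used only for illustration.

```python
# Sizes in GB, copied from the "Variants and what fits" table.
GPU_VRAM_GB = 24.0  # NVIDIA GeForce RTX 4090

VARIANTS = {
    "Q4_K_M": {"file_gb": 28.0, "vram_required_gb": 32.0},
    "Q5_K_M": {"file_gb": 33.0, "vram_required_gb": 38.0},
}


def fit_report(variants, gpu_vram_gb):
    """Return {quant: (fits, shortfall_gb)} for each variant."""
    report = {}
    for name, v in variants.items():
        # Shortfall is how much more VRAM the variant needs than the GPU has.
        shortfall = max(0.0, v["vram_required_gb"] - gpu_vram_gb)
        report[name] = (shortfall == 0.0, shortfall)
    return report


for name, (fits, shortfall) in fit_report(VARIANTS, GPU_VRAM_GB).items():
    status = "fits" if fits else f"does not fit, short by {shortfall:.1f} GB"
    print(f"{name}: {status}")
```

With the table's figures, Q4_K_M is short by 8 GB and Q5_K_M by 14 GB.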

Tool	Quant	Context	tok/s	VRAM used	Source
Ollama	Q4_K_M	8,192	31.4 tok/s	23.1 GB	owner


Neither listed quantization of Mixtral 8x7B Instruct (Q4_K_M or Q5_K_M) fits in the 24 GB of VRAM on the NVIDIA GeForce RTX 4090. Pick a smaller model.
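
To get a rough sense of how much smaller a model would need to be, the Q4_K_M figures on this page can be scaled down to 24 GB. The sketch below is a back-of-the-envelope estimate, not a guarantee. It assumes 1 GB = 10^9 bytes, that a different model would have the same effective bits per weight as this Q4_K_M file, and that required VRAM scales in proportion to file size. None of these assumptions is stated by the source, and the function names are hypothetical.

```python
PARAMS = 47e9        # "47B params" from this page
GPU_VRAM_GB = 24.0   # NVIDIA GeForce RTX 4090
Q4_FILE_GB = 28.0    # Q4_K_M file size from the table
Q4_VRAM_GB = 32.0    # Q4_K_M required VRAM from the table


def bits_per_weight(file_gb, params):
    """Effective bits stored per parameter (assumes 1 GB = 1e9 bytes)."""
    return file_gb * 1e9 * 8 / params


def max_params_that_fit(gpu_vram_gb, file_gb, vram_required_gb, params):
    """Rough parameter budget at the same quantization (assumption-laden)."""
    overhead = vram_required_gb / file_gb   # assumed constant ratio
    max_file_gb = gpu_vram_gb / overhead    # largest file that would fit
    return max_file_gb * 1e9 * 8 / bits_per_weight(file_gb, params)


bpw = bits_per_weight(Q4_FILE_GB, PARAMS)
budget = max_params_that_fit(GPU_VRAM_GB, Q4_FILE_GB, Q4_VRAM_GB, PARAMS)
print(f"Effective bits per weight at Q4_K_M: {bpw:.2f}")
print(f"Rough parameter budget for 24 GB: {budget / 1e9:.1f}B")
```

Under these assumptions the budget is about 35B parameters at Q4_K_M. Real requirements also depend on context length and runtime, so treat the result as a starting point for choosing candidates.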

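In our testing, this combination measured 31.4 tok/s with Ollama (Q4_K_M, 8,192-token context) while using 23.1 GB of VRAM. Because the Q4_K_M file alone is 28.0 GB, that run presumably kept part of the model outside VRAM.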

Reviewed by RunLocalAI Editorial. See our editorial policy.