Yi 1.5 34B
01.AI's 34B model. Solid bilingual EN/ZH performance, Apache 2.0.
A capable 34B-class generalist with good multilingual support, especially Chinese. Right pick for users who specifically want 01.AI's training distribution, or as a known-quantity baseline.
Strengths
- 34B fits in 24 GB at Q4_K_M: full GPU offload on a 4090.
- Apache 2.0 license with clean commercial terms.
- Strong Chinese-English performance; better than Qwen 2.5 32B on Chinese-specific tasks.
Weaknesses
- Beaten by Qwen 3 32B on most general benchmarks.
- Long-context recall is weaker than the spec suggests.
- Dated knowledge cutoff (2024-era data).
Performance on an RTX 4090
- Q4_K_M (20.7 GB): 65–80 tok/s decode, full GPU offload
- Q5_K_M (24.4 GB): 22–30 tok/s, partial offload
- Q8_0 (37 GB): workstation territory
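A rough way to sanity-check these tiers is to add a KV-cache estimate to the file size. The architecture numbers below (60 layers, 8 KV heads, head dim 128) are assumptions about Yi-1.5-34B, not figures from this review; verify them against the model card.

```python
# Rough VRAM estimate for a GGUF quant: weights file + KV cache + fixed overhead.
# Architecture constants are assumptions for Yi-1.5-34B; check the model card.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 60, 8, 128
KV_BYTES = 2  # fp16 per element

def vram_needed_gb(file_size_gb: float, ctx: int, overhead_gb: float = 1.0) -> float:
    # KV cache stores 2 tensors (K and V) per layer, per token
    kv_gb = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES * ctx / 1024**3
    return file_size_gb + kv_gb + overhead_gb

for name, size in [("Q4_K_M", 20.7), ("Q5_K_M", 24.4), ("Q8_0", 37.0)]:
    need = vram_needed_gb(size, ctx=8192)
    verdict = "fits" if need <= 24 else "does not fit"
    print(f"{name}: ~{need:.1f} GB needed -> {verdict} in 24 GB")
```

Under these assumptions only Q4_K_M stays fully resident on a 24 GB card at 8K context, which matches the full-offload / partial-offload split above.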
Should you run it?
Yes for bilingual Chinese-English work, or as an Apache-licensed alternative to Qwen. No for general English work, where Qwen 3 32B is stronger at the same VRAM.
How it compares
- vs Qwen 3 32B → Qwen wins on general capability; Yi has the cleaner license.
- vs Mistral Small 3 24B → Mistral wins on instruction polish; Yi has slight edge on Chinese.
- vs Llama 3.3 70B → Llama 3.3 70B is much smarter; Yi 34B is the full-GPU pick.
ollama pull yi:34b-chat-v1.5-q4_K_M
ollama run yi:34b-chat-v1.5-q4_K_M
Settings: Q4_K_M GGUF, 8192 ctx, full GPU on RTX 4090
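The same settings can also be applied per request through Ollama's local HTTP API. A minimal sketch, assuming the default server on localhost:11434; num_ctx and num_gpu are standard Ollama options, and the prompt is just a placeholder:

```python
import json

# Request body for Ollama's /api/generate endpoint, mirroring the settings above:
# 8192-token context, all layers offloaded to GPU.
payload = {
    "model": "yi:34b-chat-v1.5-q4_K_M",
    "prompt": "Translate to Chinese: The weather is nice today.",
    "stream": False,
    "options": {
        "num_ctx": 8192,  # context window from the settings above
        "num_gpu": 99,    # offload all layers (Ollama caps this at the model's layer count)
    },
}

# Send with: curl http://localhost:11434/api/generate -d @payload.json
print(json.dumps(payload, indent=2))
```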
Why this rating
7.4/10 — 01.AI's 34B that fits 24 GB at Q4. Solid, but mostly eclipsed by Qwen 3 32B and Mistral Small 3 24B. Loses points by sitting in an awkward middle without a clear differentiator.
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 20.7 GB | 24 GB |
| Q5_K_M | 24.4 GB | 24 GB (partial offload) |
| Q8_0 | 37 GB | 40 GB+ |
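To turn the sizing information into a decision, a tiny helper can pick the largest quant whose file fully fits a given card. File sizes are the ones quoted in this review's performance notes; the 2 GB headroom is a crude stand-in for KV cache and runtime overhead, not a measured figure:

```python
# Pick the largest quantization whose weights file fits fully in VRAM.
# Sizes come from this review; real requirements grow with context length.
QUANTS = [("Q8_0", 37.0), ("Q5_K_M", 24.4), ("Q4_K_M", 20.7)]  # sorted big -> small

def best_quant(vram_gb: float, headroom_gb: float = 2.0):
    for name, size in QUANTS:
        if size + headroom_gb <= vram_gb:
            return name
    return None  # nothing fits fully; fall back to partial CPU offload

print(best_quant(24))  # 24 GB card, e.g. RTX 4090
print(best_quant(48))  # 48 GB workstation card
```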
Get the model
Ollama
One-line install
ollama run yi:34b
HuggingFace
Original weights
Source repository with the original weights; you'll need to quantize them yourself for local GGUF use.
Hardware that runs this
Cards with enough VRAM for at least one quantization of Yi 1.5 34B.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run Yi 1.5 34B?
24 GB, for the Q4_K_M quant (20.7 GB file); smaller cards need partial CPU offload.
Can I use Yi 1.5 34B commercially?
Yes. It is released under Apache 2.0, which permits commercial use.
What's the context length of Yi 1.5 34B?
The configuration tested here uses an 8,192-token context; check the model card for the maximum your variant supports.
How do I install Yi 1.5 34B with Ollama?
Run ollama pull yi:34b-chat-v1.5-q4_K_M, then ollama run yi:34b-chat-v1.5-q4_K_M.
Source: huggingface.co/01-ai/Yi-1.5-34B-Chat
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.