CodeGemma 7B
Coding-specialist Gemma. Decent FIM completion. Now mostly historical with Qwen 2.5 Coder dominating.
A small, fast coder for the 8 GB VRAM tier. CodeGemma 7B was good when it shipped; today it's the right choice only when license terms or the Gemma toolchain matter, since Qwen 2.5 Coder 7B and DeepSeek Coder V2 Lite both outperform it.
Strengths
- 5 GB at Q4_K_M — runs on 6 GB cards.
- Fast fill-in-the-middle — good for editor autocomplete latency (see the sketch after this list).
- Stable runner support.
Weaknesses
- Beaten by Qwen 2.5 Coder 7B on capability.
- Restrictive Gemma license terms.
- Repo-context handling weaker than newer coders.
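To see the FIM behavior directly, you can send a raw fill-in-the-middle prompt through Ollama's generate API. CodeGemma's documented FIM sentinels are <|fim_prefix|>, <|fim_suffix|>, and <|fim_middle|>. A minimal sketch, assuming the base code variant is published under the codegemma:7b-code tag (the instruct variant is tuned for chat, not FIM):

# ask the model to fill in a function body between the signature and the return
curl http://localhost:11434/api/generate -d '{
  "model": "codegemma:7b-code",
  "prompt": "<|fim_prefix|>def reverse_words(s):\n    <|fim_suffix|>\n    return result<|fim_middle|>",
  "raw": true,
  "stream": false
}'

The "raw": true flag bypasses Ollama's chat template so the sentinel tokens reach the model untouched.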
Performance (8192 ctx, llama.cpp/CUDA, RTX 4090)
- Q4_K_M (5.0 GB): 95–115 tok/s decode, TTFT under 70 ms
- Q5_K_M (5.9 GB): 84–100 tok/s
- Q8_0 (8.7 GB): 65–80 tok/s
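Throughput depends on build flags and driver versions, so treat these as ballpark figures. To measure on your own card, llama.cpp ships a llama-bench tool; a sketch, assuming a local Q4_K_M GGUF (the file path is illustrative):

# 512-token prompt processing plus 128-token decode, all layers offloaded to GPU;
# reports tok/s for both phases
./llama-bench -m ./codegemma-7b-it-q4_K_M.gguf -p 512 -n 128 -ngl 99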
Should you run it?
Yes, for 6–8 GB VRAM coding setups where Qwen's license is a problem. No, for new deployments — pick Qwen 2.5 Coder 7B or DeepSeek Coder V2 Lite.
How it compares
- vs Qwen 2.5 Coder 7B → Qwen wins on capability; CodeGemma's license terms read cleaner for some deployments (still Gemma-restricted, but closer to Apache-style terms than Qwen's).
- vs DeepSeek Coder V2 Lite (16B) → DeepSeek is much more capable; CodeGemma wins on VRAM (5 GB vs ~10 GB).
- vs Codestral 22B → Codestral is dramatically more capable; CodeGemma is the constrained-VRAM pick.
ollama pull codegemma:7b-instruct-q4_K_M
ollama run codegemma:7b-instruct-q4_K_M
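Once pulled, you can also pass a one-shot prompt on the command line instead of entering the interactive session:

ollama run codegemma:7b-instruct-q4_K_M "Write a Python function that parses an ISO 8601 date."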
Why this rating
6.8/10 — Google's coding model in a 7B body. Decent for autocomplete, but Qwen 2.5 Coder 7B exists and is stronger. Loses points for being eclipsed by every modern coder model at a similar VRAM footprint.
Quantization variants
Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 5.0 GB | 6 GB |
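If you have headroom beyond 6 GB, heavier quantizations can be pulled by tag. Exact tag names on the Ollama registry vary by model, so treat these as illustrative:

ollama pull codegemma:7b-instruct-q5_K_M
ollama pull codegemma:7b-instruct-q8_0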
Get the model
Ollama
One-line install
ollama run codegemma:7b
Read our Ollama review →
HuggingFace
Original weights
Source repository — you'll need to quantize the weights yourself (e.g., convert to GGUF; see the sketch below).
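Quantizing yourself means converting the Hugging Face checkpoint to GGUF and then quantizing it with llama.cpp. A minimal sketch, assuming a recent llama.cpp checkout built with CMake (script and binary names match current llama.cpp; the checkpoint path is illustrative):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build --config Release
pip install -r requirements.txt
# HF checkpoint -> f16 GGUF
python convert_hf_to_gguf.py /path/to/codegemma-7b-it --outfile codegemma-7b-it-f16.gguf
# f16 GGUF -> Q4_K_M
./build/bin/llama-quantize codegemma-7b-it-f16.gguf codegemma-7b-it-q4_K_M.gguf Q4_K_M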
Hardware that runs this
Cards with enough VRAM for at least one quantization of CodeGemma 7B.
Models worth comparing
Same parameter band, plus what's one tier above and below — so you can decide what actually fits your hardware.
Frequently asked
What's the minimum VRAM to run CodeGemma 7B?
6 GB: the Q4_K_M quantization is about 5 GB and fits on 6 GB cards.
Can I use CodeGemma 7B commercially?
Yes, subject to the Gemma Terms of Use, which carry use restrictions beyond a standard Apache 2.0 license.
What's the context length of CodeGemma 7B?
8,192 tokens.
How do I install CodeGemma 7B with Ollama?
ollama pull codegemma:7b-instruct-q4_K_M, then ollama run codegemma:7b-instruct-q4_K_M.
Source: huggingface.co/google/codegemma-7b-it
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.