RUNLOCALAI

Operator-grade instrument for local-AI hardware intelligence. Hand-written verdicts. Real benchmarks. Reproducible commands.

OP·Fredoline Eruo
Coding
Open-weight
BigCode OpenRAIL-M

StarCoder

by BigCode (HuggingFace + ServiceNow)

The BigCode community's open-weight code family, spanning StarCoder2 and the earlier StarCoder. A permissive license plus fully documented training data make it the canonical pick for a reproducible code model.

Best entry point for local use

Start with StarCoder2 15B via vLLM — the best open-weight model for fill-in-the-middle (FIM) code completion, trained on The Stack v2 (619 programming languages, ~3.3 TB of source code). Note that the 15B variant needs roughly 30 GB for FP16 weights, so on a 24 GB RTX 4090 run it quantized (Q8 or 4-bit AWQ/GPTQ); it delivers HumanEval pass@1 62.4%. For lower-VRAM cards (<16 GB), StarCoder2 7B needs ~14 GB at FP16, which overflows an RTX 3060 12 GB, so again quantize (Q8 fits in roughly 8 GB); it scores HumanEval pass@1 54.2%. For completion-only workloads (no FIM), StarCoder2 3B runs on nearly any GPU at ~6 GB FP16. Skip StarCoder 1: the v2 architecture adds grouped-query attention (GQA), doubles context to 16K tokens, and trains on roughly 4× the data. StarCoder2 ships under the BigCode OpenRAIL-M license, which permits commercial use with ethical-use restrictions, and its training dataset (The Stack v2) is fully public: best-in-class data transparency.
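The VRAM arithmetic behind those size calls is easy to check yourself. A minimal sketch, weights only — KV cache and activations add several more GB depending on context length, and the ~4.5-bits-per-weight figure for Q4_K_M is an approximation, not a spec:

```python
# Rough weights-only VRAM footprint for the StarCoder2 sizes discussed
# above. Ignores KV cache, activations, and runtime overhead.

def weight_vram_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params_billions * bytes_per_param

for name, params in [("StarCoder2-3B", 3.0),
                     ("StarCoder2-7B", 7.0),
                     ("StarCoder2-15B", 15.0)]:
    fp16 = weight_vram_gb(params, 2.0)    # FP16/BF16: 2 bytes per weight
    q4 = weight_vram_gb(params, 0.5625)   # ~4.5 bits/weight, approximate Q4_K_M
    print(f"{name}: FP16 ~{fp16:.0f} GB, Q4_K_M ~{q4:.1f} GB")
```

The 15B row lands at ~30 GB FP16 and ~8–10 GB quantized, which is why the 24 GB RTX 4090 needs a quantized build while Q4_K_M fits a 12 GB card.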

Deployment guidance

  • Single-user code completion: llama.cpp server mode with StarCoder2 15B Q4_K_M (~10 GB) on an RTX 3060 12 GB. FIM requires the <fim_prefix> / <fim_suffix> / <fim_middle> token format.
  • VS Code integration: Continue + Ollama with starcoder2:15b Q4_K_M; expect roughly 15 tok/s tab-completion throughput.
  • Multi-user FIM serving: vLLM 0.6.0+ on 2× L4 24 GB. FIM processing splits prefix and suffix into separate prefill passes; enable prefix caching to reuse overlapping prefixes across developer sessions.
  • Datacenter: TensorRT-LLM FP8 on L40S 48 GB; ~8,000 tok/s at batch 64 with FIM completion latency under 100 ms.
  • Training/fine-tuning: StarCoder2 uses the SantaCoder tokenizer (49K vocab, code-optimized). Set the FIM rate to 50% when tuning for code completion, 0% for instruction-following.
  • If you don't need FIM: DeepSeek-Coder-V2 outperforms StarCoder2 on instruction-following code generation but lacks FIM capability at comparable quality.
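The FIM prompt layout referenced above can be assembled by hand when driving a raw completion endpoint. A minimal sketch using the <fim_prefix> / <fim_suffix> / <fim_middle> tokens named in the text (the helper function and example snippet are mine; check your runtime's docs for stop-token handling):

```python
# StarCoder2 fill-in-the-middle, prefix-suffix-middle (PSM) order:
# the model generates the code that belongs between prefix and suffix.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a PSM prompt; the completion is what follows <fim_middle>."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

# Cursor sits after "return " — everything before is prefix, after is suffix.
before_cursor = "def add(a, b):\n    return "
after_cursor = "\n\nprint(add(1, 2))\n"
prompt = fim_prompt(before_cursor, after_cursor)
```

Send `prompt` as the raw prompt string to your server's completion endpoint; the tokens must survive untokenized-template paths, so use a raw/completion API rather than a chat template.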

Recommended runtimes

vLLM

Related families

Mistral

Related — keep moving

Compare hardware
  • RTX 3090 vs RTX 4090 →
  • RTX 4090 vs RTX 5090 →
Buyer guides
  • Best GPU for Ollama (coding) →
  • Best GPU for local AI →
  • Best laptop for local AI →
  • Best Mac for local AI →
  • Best used GPU for local AI →
When it doesn't work
  • CUDA out of memory →
  • Ollama running slowly →
  • ROCm not detected →
  • Model keeps crashing →
Runtimes that fit
  • vLLM →
Alternatives
Mistral
Before you buy

Verify StarCoder runs on your specific hardware before committing money.

  • Will it run on my hardware? →
  • Custom hardware comparison →
  • GPU recommender (4 questions) →