Multi-step autonomous coding agents: Aider, Cline, OpenHands, Continue.dev, and Claude Code.
Setup:
1. ollama pull qwen2.5-coder:14b (~9 GB; the minimum viable size for coding agents).
2. pip install aider-chat (Aider, the leading open-source local coding agent).
3. cd /path/to/your/repo && git init (Aider needs a git repo to track changes).
4. aider --model ollama_chat/qwen2.5-coder:14b (opens the Aider TUI).
A used RTX 3060 12 GB (~$200-250, see /hardware/rtx-3060-12gb) runs Aider with Qwen 2.5 Coder 14B at 25-35 tok/s and handles multi-file tasks (CRUD endpoints, refactors, test writing) on repos up to 50K lines. Each agent step (read file, think, edit, run test) takes 5-15 seconds, so a 5-step task completes in 1-2 minutes. Pair it with a Ryzen 5 5600, 32 GB DDR4, and a 1 TB NVMe drive: ~$400-480 total. For coding agents specifically, 14B is the minimum for multi-file edits; 7B models get lost across files. $400 gets you a capable coding agent for small-to-medium projects.
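The setup steps above assume ollama, git, and aider are all on your PATH. A minimal preflight sketch (the check_tools helper is hypothetical, not part of Aider or Ollama):

```shell
#!/usr/bin/env sh
# Hypothetical preflight helper: report which tools from the setup
# steps are installed before launching the agent.
check_tools() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      printf 'ok: %s\n' "$cmd"
    else
      printf 'missing: %s\n' "$cmd"
    fi
  done
}

check_tools ollama git aider
```

If any line prints "missing", go back to the corresponding setup step before starting the agent.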
A used RTX 3090 24 GB ($700-900, see /hardware/rtx-3090) runs Aider or Cline with DeepSeek Coder V3 at 15-20 tok/s or Qwen 2.5 Coder 32B at 35-50 tok/s. These models handle complex multi-file architectures, database migrations, and integration tests across repos up to 500K lines. For professional developers who pair program with AI daily, the 32B class roughly halves the "fix my own fix" cycle compared to 14B. Total: ~$1,800-2,200. For the fastest iteration: an RTX 4090 ($2,000) runs Qwen Coder 32B at 60-80 tok/s, near-instant code generation.
The mistake: letting the coding agent run autonomously for 20 minutes on a complex refactor, then discovering it deleted features, duplicated code, and introduced circular imports. A simple git reset won't help, because the agent committed 15 times along the way. Why it fails: coding agents compound errors. Step 1 makes a small mistake. Step 2 "fixes" it with a workaround that breaks something else. Steps 3-15 build on broken foundations. After 20 minutes, the codebase is a house of cards. The agent is a junior dev without supervision; it needs review at each step. The fix: use coding agents incrementally. Ask for one small change at a time. Review the diff. Run the tests. Commit. Then ask for the next change. This is slower per step but roughly 10× faster overall, because you avoid the compounding-error cleanup. Aider's /undo command helps, but only within a session. With Cline, review every file edit before approving. Coding agents are pair programmers, not autonomous contractors.
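The incremental loop (small change, review diff, run tests, commit) can be sketched as a shell guard. This is a hedged example, not part of any agent's tooling: commit_if_green is a hypothetical name, and pytest stands in for whatever test runner the repo actually uses:

```shell
#!/usr/bin/env sh
# Hypothetical guard: commit the agent's latest edit only when the
# test command passes, so broken steps never enter history.
commit_if_green() {
  if "$@"; then
    git add -A && git commit -m "agent step: tests green"
  else
    echo "tests failed; fix before committing"
    return 1
  fi
}

# Usage, after reviewing the agent's edit with: git diff
#   commit_if_green pytest -q
```

Because each commit is gated on a green test run, a bad agent step is a one-commit revert instead of a 15-commit archaeology dig.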
Browse all tools for runtimes that fit this workload.
Local AI workloads have real hardware constraints that vary by task type. VRAM ceiling decides what model fits; bandwidth decides decode speed; compute decides prefill speed. Pick the GPU tier that fits your actual workload, not the spec sheet.
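The bandwidth-decides-decode-speed rule can be made concrete: each generated token streams all model weights from VRAM once, so the theoretical ceiling is roughly memory bandwidth divided by model size; real-world throughput lands well below that ceiling once KV-cache reads and kernel overhead are counted. A sketch with assumed round numbers (~9 GB for a Q4-quantized 14B model, ~360 GB/s for an RTX 3060, ~936 GB/s for an RTX 3090):

```shell
#!/usr/bin/env sh
# Rough decode-speed ceiling: tok/s ~= bandwidth (GB/s) / model size (GB),
# since every generated token reads all weights from VRAM once.
est_decode_toks() {
  bw_gbs=$1; model_gb=$2
  echo $(( bw_gbs / model_gb ))
}

est_decode_toks 360 9    # RTX 3060 + 14B Q4: ~40 tok/s ceiling
est_decode_toks 936 9    # RTX 3090 + 14B Q4: ~104 tok/s ceiling
```

The 3060's ~40 tok/s ceiling is consistent with the 25-35 tok/s real-world figure quoted above for Qwen 2.5 Coder 14B.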
The errors most operators hit when running coding agents locally. Each links to a diagnose+fix walkthrough.
Verify your specific hardware can handle coding agents before committing money.
Local coding workflows live or die on time-to-first-token and 32K+ context. The guides below cover the developer-specific hardware decision.