Pick the path that matches your situation.
Eight ordered tracks from zero to running. Each path names the operator it's for, then walks ordered milestones with concrete outcomes per step. No gamification, no badges, no fake levels — just the things you do, in the order you do them, until your local AI stack is running.
Finish a path when its end state matches yours. Skip paths that aren't yours. Cross-reference benchmarks for what to expect on real hardware.
Beginner: run your first local model
Zero to a working local LLM on a laptop with no dedicated GPU. Ollama first, then llama.cpp, then your first quantization choice.
Anyone with a Mac or Windows laptop and zero local-AI experience.
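The milestones on this path reduce to a handful of commands. A minimal sketch, assuming current Ollama and llama.cpp builds; the model tag and GGUF filenames are illustrative, not prescriptions:

```shell
# Milestone 1: Ollama — one command pulls a small model and drops into a chat.
ollama run llama3.2:3b

# Milestone 2: llama.cpp — same model family as a GGUF file, more knobs.
llama-cli -m llama-3.2-3b-instruct-f16.gguf -p "Explain quantization in one paragraph."

# Milestone 3: your first quantization choice — shrink a higher-precision GGUF.
# Q4_K_M is a common quality/size middle ground for laptop-class hardware.
llama-quantize llama-3.2-3b-instruct-f16.gguf llama-3.2-3b-instruct-Q4_K_M.gguf Q4_K_M
```

The path itself covers when Q4_K_M is the wrong answer; this is only the shape of the commands.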
Privacy-first: keep every prompt local
Airgap the inference layer, kill the telemetry, route everything local, and verify you actually achieved it.
Operators with privacy or compliance reasons not to send prompts off-machine.
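"Verify you actually achieved it" is the milestone people skip. A hedged sketch of what verification can look like with Ollama (OLLAMA_HOST is its documented bind-address variable; substitute your runtime's equivalent, and use lsof instead of ss on macOS):

```shell
# Bind the inference server to loopback only.
export OLLAMA_HOST=127.0.0.1:11434
ollama serve &

# Verify: the port should be listening on 127.0.0.1, never 0.0.0.0.
ss -tlnp | grep 11434

# Verify no off-machine traffic during inference: watch for anything that
# isn't loopback while you send prompts from another terminal.
sudo tcpdump -i any -n 'not host 127.0.0.1'
```

If tcpdump shows outbound packets while you prompt, something in the stack is phoning home and the path's telemetry milestone isn't done.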
Homelab: 24/7 local AI you don't babysit
Power, thermals, restarts, observability, and remote access — the operational discipline an always-on box demands.
Owners of a homelab box (Proxmox, bare metal, or NAS-adjacent) who want a stable always-on inference service.
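The restart discipline usually lands as a systemd unit. A sketch, assuming Ollama as the runtime; the binary path, user, and bind address are placeholders to adapt (Ollama's own Linux installer ships a similar unit):

```ini
[Unit]
Description=Local LLM inference service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
Restart=always
RestartSec=3
# Homelab boxes usually serve the LAN; the privacy path binds loopback instead.
Environment="OLLAMA_HOST=0.0.0.0:11434"
User=ollama

[Install]
WantedBy=multi-user.target
```

Restart=always is the line that earns "you don't babysit" — the service comes back after crashes, driver resets, and power events without you noticing.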
Local coding agent: ship code with a local model
Aider vs Cline vs Continue.dev, model picks per repo size, context budgets, and the failure modes that ruin a Saturday.
Developers who want autonomous or surgical-edit coding agents driven by a local model.
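Wiring an agent to a local endpoint is a two-line affair once the server runs. A sketch for Aider against a local Ollama server, using Aider's documented OLLAMA_API_BASE variable; the model tag is an illustrative pick, not the path's recommendation:

```shell
# Point Aider at the local Ollama server.
export OLLAMA_API_BASE=http://127.0.0.1:11434

# ollama_chat/ is Aider's provider prefix for Ollama models.
aider --model ollama_chat/qwen2.5-coder:14b
```

The path's real work is the model picks per repo size and the context budgets — the wiring above is the easy part.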
Multi-GPU: scale beyond a single card
Tensor parallel vs pipeline parallel, vLLM vs SGLang, NVLink vs PCIe, and what to actually set in config.
Operators with two or more GPUs (matched or mixed) running larger models or higher concurrency.
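The parallelism choice ultimately lands as one launch flag. A sketch with vLLM's documented flags; the model name is illustrative, and real deployments layer quantization and memory settings on top:

```shell
# Tensor parallel: every layer is split across both GPUs.
# Wants NVLink or fast PCIe — the GPUs exchange activations every layer.
vllm serve meta-llama/Llama-3.1-70B-Instruct --tensor-parallel-size 2

# Pipeline parallel: the layer stack is split across GPUs.
# Tolerates slower interconnects — traffic only crosses at the stage boundary.
vllm serve meta-llama/Llama-3.1-70B-Instruct --pipeline-parallel-size 2
```

The path covers when each wins and what the mixed-card caveats are; this is only what the decision looks like in a command line.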
Apple Silicon: M2 / M3 / M4 local AI
Ollama's Metal backend as the on-ramp, MLX-LM as the destination, and what unified memory really gives you vs a discrete GPU.
Mac owners on M2, M3, or M4 silicon with 16GB or more unified memory.
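A sketch of the two-stage Mac path; model identifiers are examples:

```shell
# On-ramp: Ollama uses Metal automatically on Apple Silicon — nothing to configure.
ollama run llama3.2:3b

# Destination: MLX-LM, built on Apple's MLX framework, for unified-memory-native inference.
pip install mlx-lm
mlx_lm.generate --model mlx-community/Llama-3.2-3B-Instruct-4bit \
  --prompt "Why does unified memory matter for local inference?"
```

The mlx-community namespace on Hugging Face hosts pre-converted 4-bit models, which is why the destination stage needs no conversion step of its own.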
AMD ROCm: running local AI on Radeon
Linux-only stability, kernel + driver pinning, and the runtime choices that actually work today on consumer Radeon.
Owners of RX 7900 XTX, 7900 XT, or RX 9070 cards willing to commit to Linux.
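The path's first milestone is proving the runtime sees the card at all. A sketch of the verification commands, plus one commonly used escape hatch; the gfx target shown is for RDNA3-class cards and must be confirmed against your own hardware before setting:

```shell
# Verify the card is visible to the ROCm runtime.
rocm-smi
rocminfo | grep gfx

# Escape hatch for consumer Radeons the ROCm build doesn't target natively:
# report a supported gfx version. 11.0.0 maps to gfx1100 (RX 7900-class);
# cards ROCm already supports should NOT need this.
export HSA_OVERRIDE_GFX_VERSION=11.0.0
```

If rocminfo shows no gfx agent, the kernel + driver pinning milestone comes first — no runtime choice matters until the agent enumerates.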
Budget laptop: what's actually usable
Honest expectations for integrated GPU + CPU inference, the small-model shortlist that earns its keep, and what to skip.
Owners of laptops with integrated GPU (or 4-8GB discrete) and 16GB system RAM.
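A sketch of what "usable" concretely means on this hardware — CPU-only llama.cpp with a small quantized model; the filename, thread count, and prompt are examples to adapt:

```shell
# Small-model shortlist territory: 1-4B parameters at Q4 quantization.
# -t: match your physical core count, not logical threads.
# -c: keep the context modest — system RAM is the constraint, not compute.
llama-cli -m qwen2.5-3b-instruct-q4_k_m.gguf -t 8 -c 4096 \
  -p "Summarize the tradeoffs of 4-bit quantization."
```

The path's honest-expectations milestone is about tokens per second at exactly this configuration — not about what a bigger model would do if you had the RAM for it.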
Why a path instead of a checklist?
Most local-AI guidance is unsequenced. Ten things you should know, fifty tools to consider, infinite hardware tradeoffs. The result is the operator-onboarding pattern that doesn't work: you open tab after tab, learn isolated facts, never assemble them into a working stack. Two months later the project is dead and the hardware sits idle.
A path is the opposite. Ordered milestones, each with a concrete end state. You finish milestone 1 before reading milestone 2. When the path ends, the stack runs. The path doesn't try to teach you everything; it teaches you the next thing, in the order that unblocks the thing after it. The hardware-specific decisions (which GPU, how much VRAM, used vs new, NVIDIA vs AMD vs Apple) live in the buyer-guide cluster — paths assume you've already made the hardware decision, and walk you through what to do with it.
The eight paths above cover the operator situations that most local-AI builders fall into. They are not exhaustive; they are the patterns that produce a running stack within 1-3 weekends of focused work. If your situation doesn't match any of them cleanly, pick the closest path and adapt — the milestones generalize.
Not sure which path is yours?
Start with the hardware you already have. Run the custom build engine or the GPU chooser first. The right path follows from what's on your desk: an M-series Mac → Apple Silicon path; a 7900 XTX → AMD ROCm path; two 3090s → multi-GPU path; a Lenovo laptop with integrated graphics → budget laptop path.
If you don't have hardware yet, the buyer-guide cluster is the place to start instead. The best GPU for local AI guide is the canonical pillar; the tier-by-tier hardware guide covers $0 (CPU-only) through $4,000+ (dual 3090). Buy hardware first; pick a path second.
What paths do not promise
A path will not turn an underpowered laptop into a 70B-class inference rig. It will not replace operator judgment about whether local AI is actually the right answer for your workload. And it will not stay current forever — the local-AI ecosystem moves fast enough that any specific tool recommendation is best-of-2026 and revisable. Path milestones stay durable because they describe operator outcomes, not version-pinned tool names.