The local AI agent ecosystem
Six zones covering the surfaces a developer touches when building or deploying agents that run partly or wholly on local hardware. Catalog entries are linked from each card; deeper architecture references are in /systems.
Coding agents
Tools that take a problem statement and produce code changes — branches, edits, PRs. The 2026 lineup splits into closed leaders (Claude Code, Cursor, GitHub Copilot) and open challengers (OpenHands, Aider, Cline, Goose). Local-LLM support varies sharply.
Claude Code
Anthropic's terminal-native coding agent. Tops SWE-bench Verified at 87.6% and SWE-bench Pro at 64.3% in 2026. Deep MCP integration, agentic file editing, and a $20/mo Pro tier.
Cursor
Anysphere's AI-native IDE. Forks VS Code with Cursor Tab inline completion, agentic chat, and background agents. Best 'flow' for inline completion in 2026.
OpenHands
AI-driven development agent that completes engineering tasks end-to-end — branches, code, PRs. v1.6 added a Planning Mode that drafts a plan before executing. Local-LLM-friendly.
Aider
Terminal-based AI pair programmer. Run it in your project directory, describe a change, and it edits files and creates meaningful git commits. Works with any LLM, from local Ollama models to hosted Anthropic APIs.
Cline
VS Code extension agent — ~4M installs in 2026. Plan/Act mode, autonomous file edits with diff approval, terminal access. The leading open-source IDE agent.
Continue
Open-source VS Code and JetBrains assistant. Configurable autocomplete + chat + agent modes. Strong with local Ollama backends.
Goose
Open-source extensible AI agent now governed by the Agentic AI Foundation (AAIF) at the Linux Foundation. Started inside Block (formerly Square). 25+ provider support, including Ollama.
Roo Code (sunsetting May 15, 2026)
Open-source AI dev-team extension for VS Code (1.55M installs, 23.8k GitHub stars). **Discontinued: all Roo Code products — Extension, Cloud, and Router — shut down on May 15, 2026.**
Personal AI agents
The non-coding side: assistants that connect models to messaging surfaces, productivity apps, and long-running task workflows. OpenClaw is the runaway 2026 release here.
OpenClaw
Personal AI agent with a local-first gateway architecture. Connects your local LLMs (Ollama, llama.cpp) to the messaging surfaces you already use — WhatsApp, Telegram, Slack, and Discord.
Claude Desktop
Anthropic's official desktop app for Claude. Native MCP server support means you can plug in local file access, GitHub, and custom tools. Distinct from the Claude Code CLI.
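As a sketch of what "plugging in local file access" looks like: Claude Desktop discovers MCP servers from a `claude_desktop_config.json` file. The entry below wires up the reference filesystem server; the directory path is a placeholder you would replace with your own.

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/path/to/your/projects"
      ]
    }
  }
}
```

On restart, the app launches each configured server as a subprocess and exposes its tools in the chat UI.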
AnythingLLM
Document-oriented LLM frontend with workspaces. Connects to Ollama, LM Studio, OpenAI, Anthropic, etc. Strong document RAG.
Memory frameworks
Agents that remember across sessions need a memory layer. The 2026 split is between drop-in APIs (Mem0), OS-style explicit management (Letta), and graph-based reasoning (Mem0g, Zep / Graphiti).
Mem0 (agent memory API)
Drop-in memory layer for LLM agents. Vector + graph memory variants (Mem0g) — the graph variant builds a directed labeled knowledge graph alongside the vector store, with conflict resolution when new facts contradict stored ones.
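To make the vector-plus-graph split concrete, here is a conceptual sketch — not Mem0's actual API — of a memory layer that keeps a flat fact store alongside labeled `(subject, relation, object)` edges, with replace-on-conflict semantics for contradictory edges. All names (`HybridMemory`, `add`, `neighbors`) are invented for illustration.

```python
class HybridMemory:
    """Toy hybrid memory: flat fact list + labeled knowledge graph."""

    def __init__(self):
        self.facts = []   # the "vector store" side, here just raw text
        self.graph = []   # (subject, relation, object) triples

    def add(self, fact, triple=None):
        self.facts.append(fact)
        if triple:
            s, r, _ = triple
            # Conflict resolution: a new value for the same
            # (subject, relation) pair replaces the old edge
            # rather than accumulating contradictions.
            self.graph = [(a, b, c) for a, b, c in self.graph
                          if not (a == s and b == r)]
            self.graph.append(triple)

    def neighbors(self, subject):
        """All outgoing (relation, object) edges for a subject."""
        return [(r, o) for s, r, o in self.graph if s == subject]


m = HybridMemory()
m.add("Alice works at Acme", ("Alice", "works_at", "Acme"))
m.add("Alice moved to Beta", ("Alice", "works_at", "Beta"))
print(m.neighbors("Alice"))  # old edge replaced: [('works_at', 'Beta')]
```

A production system would embed the facts for similarity search and extract triples with an LLM; the replace-on-conflict step is the part that keeps the graph internally consistent.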
Letta (memory framework)
Agent memory framework that models memory like an operating system. Main context = RAM, archival storage = disk; the agent itself decides when to page. Originally MemGPT, now Letta.
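The RAM/disk analogy above can be sketched in a few lines. This is a conceptual illustration of OS-style paging for agent memory, not Letta's real API: a bounded "main context" that evicts its oldest facts to an archival store, with recall falling back to the archive when the context misses.

```python
from collections import deque


class PagedMemory:
    """Toy OS-style agent memory: bounded context + archival overflow."""

    def __init__(self, context_limit=3):
        self.context = deque()  # main context ("RAM"), size-limited
        self.archive = []       # archival storage ("disk"), unbounded
        self.limit = context_limit

    def remember(self, fact):
        self.context.append(fact)
        while len(self.context) > self.limit:
            # Page the oldest fact out of the context window.
            self.archive.append(self.context.popleft())

    def recall(self, keyword):
        # Check the in-context facts first, then fall back to disk.
        hits = [f for f in self.context if keyword in f]
        return hits or [f for f in self.archive if keyword in f]


mem = PagedMemory(context_limit=2)
for fact in ["user likes Rust", "user lives in Oslo", "user owns a GPU"]:
    mem.remember(fact)
print(mem.recall("Rust"))  # paged out to archive, still recallable
```

The key difference in Letta's design is *who* decides: the agent itself issues the paging operations via tool calls, rather than a fixed eviction policy like the FIFO one sketched here.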
MCP protocol layer
The open standard that ties LLM clients to external tools. 500+ public servers. Dive into the protocol details before deploying — see our MCP system guide for architecture, lifecycle, and security.
Local inference runtimes
The runtime that hosts the model weights. llama.cpp / Ollama for accessibility, vLLM for throughput, MLX on Apple, ExLlamaV2 for ExLlama-quant speed. The choice you make here constrains which agents and memory frameworks pair cleanly.
Ollama
The default first-pull tool for local AI. One-line model installs (`ollama run llama3.1`), an OpenAI-compatible HTTP API, good defaults out of the box. Built on llama.cpp.
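Because the API is OpenAI-compatible, any OpenAI-style client works against Ollama by pointing it at the local endpoint. A minimal sketch of the request shape, assuming the default port (11434) and a pulled `llama3.1` model; actually sending it requires a running Ollama daemon, so the network call is left commented out.

```python
import json

# Ollama's OpenAI-compatible chat endpoint (default port 11434).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

payload = {
    "model": "llama3.1",  # any model pulled via `ollama pull`
    "messages": [
        {"role": "user", "content": "Say hello in one word."}
    ],
    "stream": False,
}

body = json.dumps(payload)
# With a running daemon, POST `body` to OLLAMA_URL with
# Content-Type: application/json and read back an
# OpenAI-style chat completion response.
print(body)
```

The same shape works with official OpenAI SDKs by overriding the client's base URL to `http://localhost:11434/v1`.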
llama.cpp
The bedrock of local LLM inference. Most other tools wrap or embed it. Maximum control, maximum platform support, sharpest learning curve.
vLLM
High-throughput serving engine. PagedAttention, continuous batching, prefix caching. Production default for self-hosted LLM APIs at scale.
LM Studio
Polished desktop GUI for local LLMs. Built-in HuggingFace search, OpenAI-compatible local server, side-by-side conversations.
Distributed + P2P inference
The newest zone. Hyperspace pioneered consumer-device P2P inference; vLLM remains the production multi-node standard. Watch this category — it's where the next competitive moat is likely to form.
How this map updates
This page reads its zones live from the catalog. When a new tool ships and lands in our scripts/seed/agents.ts or scripts/seed/tools.ts file, it shows up here automatically. The editorial framing — zone titles, blurbs, "what changed" — is hand-written and refreshed on the first business day of each month. If the ecosystem shifts mid-cycle (a major release, a deprecation, a new zone emerging), we update sooner.
Going deeper
- What MCP is really solving — protocol-engineering depth on the integration layer that ties this whole map together.
- Will-it-run methodology — the math behind our hardware compatibility predictions, including confidence grading.
- Benchmark dataset — measured tokens-per-second across the runtimes mapped above.