What plugs into your local AI runtime. 37 curated apps across 12 categories — chat UIs, coding agents, RAG pipelines, voice, image, browser extensions, editor plugins, mobile + desktop, agent frameworks, productivity, SDK wrappers.
Each entry carries an honest editorial verdict — pros, cons, the runtime it works against, the minimum VRAM, and the privacy posture. Filter to your stack, jump to the detail page, ship.
URL updates as you change filters — share or bookmark a result. All filters are server-rendered, so the page works without JS.
The default chat UI for solo Ollama users. Multi-model, built-in RAG, web search, Docker-friendly.
“Best default chat UI for solo Ollama users. Pick this first; switch only if you outgrow it.”
The default agent framework. Pipelines, retrievers, tool-calling — works against any local backend.
“The default agent framework. Heavy on abstractions, deep ecosystem — pick this if you want defaults.”
Terminal coding agent that edits files via your local model. Git-aware, surgical, fast.
“Best terminal-native coding agent for local models. Qwen 2.5 Coder 32B is its sweet spot.”
Desktop app that bundles model download + chat + OpenAI-compatible local server. Closed-source but free.
“Best 'first install' desktop app for newcomers. Closed-source but the easiest first-run experience.”
VS Code extension that runs a full agent loop locally — reads, writes, runs commands, asks first.
“Best IDE-integrated agent that fully respects 'all local' as a first-class option.”
Privacy-first desktop chat with a curated model catalog. Llama / Mistral / Qwen one click from the app.
“Best one-binary desktop chat. Curated catalog removes 'which model?' decision paralysis.”
Docs-aware chat with workspaces. Drop a folder of PDFs, get a working RAG chatbot in 5 minutes.
“Best fast-RAG app. Workspace model is the right abstraction for doc-corpora chat.”
RAG-first agent framework. Better defaults than LangChain for doc-corpora work; same local-runtime story.
“Best agent framework for RAG-first workloads. Less abstraction than LangChain.”
Open-source autocomplete + chat for VS Code and JetBrains. Local-model-first.
“Best Copilot replacement that defaults to local. Configurable; pair with Qwen 2.5 Coder.”
Air-gappable RAG over your docs. The OG offline-RAG project, now mature and team-friendly.
“Best when air-gap compliance is the requirement. Less polished than AnythingLLM, more configurable.”
Drop-in OpenAI-compatible proxy across 100+ providers. Route to local Ollama or cloud, same code.
“Best universal LLM proxy. Foundational layer for multi-provider deployments.”
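The "same code" claim comes down to the OpenAI wire format: clients speak one HTTP shape and the proxy decides where it lands. A minimal stdlib sketch; the port (4000) and the `ollama/` model-name prefix for local routing are assumptions that depend on how the proxy is deployed:

```python
import json
import urllib.request

def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style /chat/completions request aimed at the proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer local-dummy-key"},
    )

# Swapping "ollama/llama3" for a cloud model name is the only change needed
# to reroute; the calling code stays identical.
req = chat_request("http://localhost:4000/v1", "ollama/llama3", "Say hello.")
# urllib.request.urlopen(req)  # uncomment once the proxy is running
```

Because the request shape never changes, the same snippet works whether the proxy forwards to a local Ollama or a hosted provider.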
Free, native macOS / iOS Stable Diffusion app. Runs SD3, Flux on a phone (yes, really).
“Best mobile + macOS SD app. Free, native, no Python — runs Flux on Apple Silicon impressively well.”
Nomic's free desktop AI with model catalog + chat + Python SDK. Long-standing, open-source.
“Best fully-open-source desktop AI bundler. Less polished than LM Studio, fully MIT.”
Open-source clone of the ChatGPT UI with multi-provider routing. Local + cloud in one interface.
“Best if you mix local + cloud models in the same workflow. Strong team features.”
Official Python SDK for Ollama. Async, streaming, typed — the right primitive for scripts.
“Foundational primitive for Python scripts against Ollama. Official, maintained, typed.”
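The streaming surface is the part worth a sketch. The helper below is pure Python; the commented usage assumes `pip install ollama`, a server on the default port, and a model name ("llama3") that you would swap for whatever you have pulled:

```python
from typing import Iterable

def collect_stream(chunks: Iterable[dict]) -> str:
    """Join the partial-message chunks that a stream=True chat call yields."""
    return "".join(chunk["message"]["content"] for chunk in chunks)

# With the SDK installed and a local server running, usage looks roughly like:
#
#   import ollama
#   stream = ollama.chat(
#       model="llama3",  # assumption -- use a model you've pulled
#       messages=[{"role": "user", "content": "One-word greeting, please."}],
#       stream=True,
#   )
#   print(collect_stream(stream))
```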
Self-hosted coding agent server with team SSO, audit logs, and dashboards. Enterprise-grade.
“Best self-hosted server for teams. SSO + audit logs make it the IT-friendly pick.”
Production-leaning multi-agent framework. Role + goal + task — opinionated and ergonomic.
“Best ergonomic multi-agent framework. Picks defaults you'd otherwise have to argue about.”
Obsidian plugin that wires Ollama / OpenAI into your notes. Inline chat, summarize, prompt-templates.
“Best Obsidian plugin for local LLM in your notes. Pair with Smart Connections for RAG.”
Self-hosted AI assistant for your notes, emails, docs. Web + mobile + desktop, all local-first.
“Best 'AI second brain' app. Self-hosted, local-first, works against Obsidian.”
Native iOS / macOS Ollama client. Beautiful SwiftUI, talks to your home Ollama server.
“Best mobile Ollama client. Native SwiftUI; works against your home Ollama server.”
Microsoft's multi-agent framework. Conversation-first orchestration of role-played agents.
“Best for multi-agent role-played workflows. Niche; not the default agent framework.”
Native macOS app for Whisper transcription. Drag a file in, get a transcript out.
“Best Whisper desktop app on macOS. Pay once, transcribe locally forever.”
Local semantic search across all your Obsidian notes. Embed-once, query-fast, fully offline.
“Best local semantic search for personal notes. Foundational layer for Obsidian RAG.”
Open-source Whisper transcription with mic + file modes. Cross-platform Qt app.
“Best open-source Whisper desktop app. Cross-platform, free, less polish than MacWhisper.”
Weaviate's open-source RAG demo turned production. Strong defaults, opinionated stack.
“Best for 'don't make me choose chunking strategy' teams. Opinionated stack works.”
Krita plugin that wires ComfyUI into a real digital-art workflow. Inpaint, outpaint, upscale.
“Best 'SD as digital-art tool' integration. Real Krita workflow, not a wrapper UI.”
Official Node + browser SDK for Ollama. ESM-first, typed, streaming.
“Foundational primitive for Node + browser apps against Ollama. ESM-native, typed.”
Browser sidebar that talks to your local Ollama. Summarize pages, chat, vision support.
“Best 'sidebar AI' browser extension that's truly local-first.”
Desktop app focused on side-by-side multi-model chat. Compare local vs cloud answers in one view.
“Best 'compare local vs cloud answers' workflow. Niche but well-designed.”
AI note-taking app that builds connections between your notes automatically. Local, open-source.
“Best AI-first note app that's actually local. Niche but well-executed.”
Free, lightweight VS Code copilot that runs entirely on Ollama. Strong on autocomplete.
“Best minimal-surface Copilot-replacement that's been Ollama-native since day one.”
One-click Stable Diffusion app for macOS. No setup, just run.
“Easiest macOS SD app — picks defaults so you don't have to.”
Drop-in OpenAI TTS-compatible server. Self-hosted, talks to local voice models.
“Best 'drop-in local TTS for OpenAI clients'. Bridge solution for existing pipelines.”
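"OpenAI TTS-compatible" means speaking the `/v1/audio/speech` wire format, so existing clients only need a new base URL. A minimal stdlib sketch; the port, model name, and voice below are assumptions that depend on how the server is configured:

```python
import json
import urllib.request

def speech_request(base_url: str, text: str,
                   voice: str = "alloy") -> urllib.request.Request:
    """Build an OpenAI-style /v1/audio/speech request for a self-hosted server."""
    body = json.dumps({"model": "tts-1", "input": text, "voice": voice}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = speech_request("http://localhost:8000", "Hello from a local model.")
# with urllib.request.urlopen(req) as resp:  # uncomment with the server running
#     open("hello.mp3", "wb").write(resp.read())
```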
Brave's built-in AI assistant. Can be pointed at a local Ollama out of the box.
“Best built-in browser AI for Brave users. Local mode is a checkbox, not a hack.”
Android Ollama client + on-device fallback for small models. Cross-platform Flutter.
“Best cross-platform Android-friendly Ollama client. Falls back to on-device for tiny models.”
Codeium's self-hosted enterprise backend lets the popular IDE plugin run entirely on your hardware.
“Best 'enterprise Copilot' replacement when self-hosting is mandatory. Paid tier.”
Terminal entry into Khoj's local AI assistant. Query it like grep, get answers, never leave the shell.
“Best terminal companion for note-summarization workflows. Pipe-friendly.”
We curate this directory editorially — same review queue as the benchmarks feed. Open an issue with the project link and your one-line pitch for why it belongs.
Editorial review applies — same standards as the rest of the site. We won't list apps that don't actually work against a local runtime, regardless of marketing claims.
The runtime layer: Ollama, vLLM, llama.cpp, MLX, LM Studio server, ComfyUI. What the apps in this directory talk to.
Tell us your use case, get a full rig recipe — runtime + models + the apps from this directory that fit your stack.
Match an app's minimum-VRAM requirement to real hardware with our price/perf comparison.
Real operator submissions on the model × hardware × app combos that work. The proof behind the editorial picks.