What plugs into your local AI runtime. 37 curated apps across 12 categories — chat UIs, coding agents, RAG pipelines, voice, image, browser extensions, editor plugins, mobile + desktop, agent frameworks, productivity, SDK wrappers.
Each entry carries an honest editorial verdict — pros, cons, the runtime it works against, the minimum VRAM, and the privacy posture. Filter to your stack, jump to the detail page, ship.
URL updates as you change filters — share or bookmark a result. All filters are server-rendered, so the page works without JS.
The default chat UI for solo Ollama users. Multi-model, built-in RAG, web search, Docker-friendly.
“Best default chat UI for solo Ollama users. Pick this first; switch only if you outgrow it.”
The default agent framework. Pipelines, retrievers, tool-calling — works against any local backend.
“The default agent framework. Heavy on abstractions, deep ecosystem — pick this if you want defaults.”
Terminal coding agent that edits files via your local model. Git-aware, surgical, fast.
“Best terminal-native coding agent for local models. Qwen 2.5 Coder 32B is its sweet spot.”
Desktop app that bundles model download + chat + OpenAI-compatible local server. Closed-source but free.
“Best 'first install' desktop app for newcomers. Closed-source but the easiest first-run experience.”
VS Code extension that runs a full agent loop locally — reads, writes, runs commands, asks first.
“Best IDE-integrated agent that fully respects 'all local' as a first-class option.”
Privacy-first desktop chat with a curated model catalog. Llama / Mistral / Qwen one click from the app.
“Best one-binary desktop chat. Curated catalog removes 'which model?' decision paralysis.”
Docs-aware chat with workspaces. Drop a folder of PDFs, get a working RAG chatbot in 5 minutes.
“Best fast-RAG app. Workspace model is the right abstraction for doc-corpora chat.”
RAG-first agent framework. Better defaults than LangChain for doc-corpora work; same local-runtime story.
“Best agent framework for RAG-first workloads. Less abstraction than LangChain.”
Open-source autocomplete + chat for VS Code and JetBrains. Local-model-first.
“Best Copilot replacement that defaults to local. Configurable; pair with Qwen 2.5 Coder.”
Air-gappable RAG over your docs. The OG offline-RAG project, now mature and team-friendly.
“Best when air-gap compliance is the requirement. Less polished than AnythingLLM, more configurable.”
Drop-in OpenAI-compatible proxy across 100+ providers. Route to local Ollama or cloud, same code.
“Best universal LLM proxy. Foundational layer for multi-provider deployments.”
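The "same code" claim comes down to the OpenAI wire format: clients speak one HTTP shape and the proxy decides where it lands. A minimal stdlib sketch; the port (4000) and the `ollama/` model-name prefix for local routing are assumptions that depend on how the proxy is deployed:

```python
import json
import urllib.request

def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style /chat/completions request aimed at the proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer local-dummy-key"},
    )

# Swapping "ollama/llama3" for a cloud model name is the only change needed
# to reroute; the calling code stays identical.
req = chat_request("http://localhost:4000/v1", "ollama/llama3", "Say hello.")
# urllib.request.urlopen(req)  # uncomment once the proxy is running
```

Because the request shape never changes, the same snippet works whether the proxy forwards to a local Ollama or a hosted provider.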
Free, native macOS / iOS Stable Diffusion app. Runs SD3, Flux on a phone (yes, really).
“Best mobile + macOS SD app. Free, native, no Python — runs Flux on Apple Silicon impressively well.”
Nomic's free desktop AI with model catalog + chat + Python SDK. Long-standing, open-source.
“Best fully-open-source desktop AI bundler. Less polished than LM Studio, fully MIT.”
Open-source clone of the ChatGPT UI with multi-provider routing. Local + cloud in one interface.
“Best if you mix local + cloud models in the same workflow. Strong team features.”
Official Python SDK for Ollama. Async, streaming, typed — the right primitive for scripts.
“Foundational primitive for Python scripts against Ollama. Official, maintained, typed.”
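The streaming surface is the part worth a sketch. The helper below is pure Python; the commented usage assumes `pip install ollama`, a server on the default port, and a model name ("llama3") that you would swap for whatever you have pulled:

```python
from typing import Iterable

def collect_stream(chunks: Iterable[dict]) -> str:
    """Join the partial-message chunks that a stream=True chat call yields."""
    return "".join(chunk["message"]["content"] for chunk in chunks)

# With the SDK installed and a local server running, usage looks roughly like:
#
#   import ollama
#   stream = ollama.chat(
#       model="llama3",  # assumption -- use a model you've pulled
#       messages=[{"role": "user", "content": "One-word greeting, please."}],
#       stream=True,
#   )
#   print(collect_stream(stream))
```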
Self-hosted coding agent server with team SSO, audit logs, and dashboards. Enterprise-grade.
“Best self-hosted server for teams. SSO + audit logs make it the IT-friendly pick.”
Production-leaning multi-agent framework. Role + goal + task — opinionated and ergonomic.
“Best ergonomic multi-agent framework. Picks defaults you'd otherwise have to argue about.”
Obsidian plugin that wires Ollama / OpenAI into your notes. Inline chat, summarize, prompt-templates.
“Best Obsidian plugin for local LLM in your notes. Pair with Smart Connections for RAG.”
Self-hosted AI assistant for your notes, emails, docs. Web + mobile + desktop, all local-first.
“Best 'AI second brain' app. Self-hosted, local-first, works against Obsidian.”
Native iOS / macOS Ollama client. Beautiful SwiftUI, talks to your home Ollama server.
“Best mobile Ollama client. Native SwiftUI; works against your home Ollama server.”
Microsoft's multi-agent framework. Conversation-first orchestration of role-played agents.
“Best for multi-agent role-played workflows. Niche; not the default agent framework.”
Native macOS app for Whisper transcription. Drag a file in, get a transcript out.
“Best Whisper desktop app on macOS. Pay once, transcribe locally forever.”
Local semantic search across all your Obsidian notes. Embed-once, query-fast, fully offline.
“Best local semantic search for personal notes. Foundational layer for Obsidian RAG.”
Open-source Whisper transcription with mic + file modes. Cross-platform Qt app.
“Best open-source Whisper desktop app. Cross-platform, free, less polish than MacWhisper.”
Weaviate's open-source RAG demo turned production. Strong defaults, opinionated stack.
“Best for 'don't make me choose chunking strategy' teams. Opinionated stack works.”
Krita plugin that wires ComfyUI into a real digital-art workflow. Inpaint, outpaint, upscale.
“Best 'SD as digital-art tool' integration. Real Krita workflow, not a wrapper UI.”
Official Node + browser SDK for Ollama. ESM-first, typed, streaming.
“Foundational primitive for Node + browser apps against Ollama. ESM-native, typed.”
Browser sidebar that talks to your local Ollama. Summarize pages, chat, vision support.
“Best 'sidebar AI' browser extension that's truly local-first.”
Desktop app focused on side-by-side multi-model chat. Compare local vs cloud answers in one view.
“Best 'compare local vs cloud answers' workflow. Niche but well-designed.”
AI note-taking app that builds connections between your notes automatically. Local, open-source.
“Best AI-first note app that's actually local. Niche but well-executed.”
Free, lightweight VS Code copilot that runs entirely on Ollama. Strong on autocomplete.
“Best minimal-surface Copilot-replacement that's been Ollama-native since day one.”
One-click Stable Diffusion app for macOS. No setup, just run.
“Easiest macOS SD app — picks defaults so you don't have to.”
Drop-in OpenAI TTS-compatible server. Self-hosted, talks to local voice models.
“Best 'drop-in local TTS for OpenAI clients'. Bridge solution for existing pipelines.”
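"OpenAI TTS-compatible" means speaking the `/v1/audio/speech` wire format, so existing clients only need a new base URL. A minimal stdlib sketch; the port, model name, and voice below are assumptions that depend on how the server is configured:

```python
import json
import urllib.request

def speech_request(base_url: str, text: str,
                   voice: str = "alloy") -> urllib.request.Request:
    """Build an OpenAI-style /v1/audio/speech request for a self-hosted server."""
    body = json.dumps({"model": "tts-1", "input": text, "voice": voice}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = speech_request("http://localhost:8000", "Hello from a local model.")
# with urllib.request.urlopen(req) as resp:  # uncomment with the server running
#     open("hello.mp3", "wb").write(resp.read())
```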
Brave's built-in AI assistant. Can be pointed at a local Ollama out of the box.
“Best built-in browser AI for Brave users. Local mode is a checkbox, not a hack.”
Android Ollama client + on-device fallback for small models. Cross-platform Flutter.
“Best cross-platform Android-friendly Ollama client. Falls back to on-device for tiny models.”
Codeium's self-hosted enterprise backend lets the popular IDE plugin run entirely on your hardware.
“Best 'enterprise Copilot' replacement when self-hosting is mandatory. Paid tier.”
Terminal entry into Khoj's local AI assistant. Query it like grep, get answers, never leave the shell.
“Best terminal companion for note-summarization workflows. Pipe-friendly.”
We curate this directory editorially — same review queue as the benchmarks feed. Open an issue with the project link and your one-line pitch for why it belongs.
Editorial review applies — same standards as the rest of the site. We won't list apps that don't actually work against a local runtime, regardless of marketing claims.
The runtime layer: Ollama, vLLM, llama.cpp, MLX, LM Studio server, ComfyUI. What the apps in this directory talk to.
Tell us your use case, get a full rig recipe — runtime + models + the apps from this directory that fit your stack.
Match an app's minimum-VRAM requirement to real hardware with our price/perf comparison.
Real operator submissions on the model × hardware × app combos that work. The proof behind the editorial picks.