RUNLOCALAI · v38

Operator-grade instrument for local-AI hardware intelligence. Hand-written verdicts. Real benchmarks. Reproducible commands.

OP · Fredoline Eruo


Model Families

Ecosystem hubs for the major model families — 26 families across 7 modalities. For each family: architecture evolution, local viability, runtime + quantization support, finetune ecosystem maturity, deployment caveats.

26 families · 7 modalities · Open-weight focus
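Local viability across every family below comes down to the same arithmetic: quantized weight size plus runtime overhead versus available VRAM. A minimal sketch of that estimate (the 20% overhead factor for KV cache, activations, and runtime buffers is an illustrative assumption, not a measured constant):

```python
def est_vram_gb(params_b: float, bits_per_weight: float, overhead: float = 0.2) -> float:
    """Rough VRAM estimate for running a quantized model.

    params_b        -- parameter count in billions
    bits_per_weight -- e.g. 16 (FP16), 8 (Q8), ~4.5 (Q4_K_M-style)
    overhead        -- illustrative fudge factor for KV cache and
                       runtime buffers (assumption, varies by context size)
    """
    weight_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits = 1 GB
    return weight_gb * (1 + overhead)

# An 8B model at ~4.5 bits fits comfortably in 8 GB of VRAM:
print(round(est_vram_gb(8, 4.5), 1))   # → 5.4
# A 70B model at the same quant needs a 48 GB-class card or multi-GPU:
print(round(est_vram_gb(70, 4.5), 1))  # → 47.2
```

Real-world numbers shift with context length and runtime, so treat this as a first-pass filter, not a verdict.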

📝 Text & Reasoning · 16

Llama

Open-weight

Meta

Meta's open-weight LLM family — the dominant baseline for self-hosted text generation. Llama 3.3 70B is the canonical 70B-class chat model in 2026; Llama 3.1 8B remains the most-deployed sub-13B production model.
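Llama's baseline status shows in how little it takes to deploy one. A sketch of a minimal Ollama Modelfile, assuming Ollama is installed and the `llama3.1:8b` tag is available; the system prompt and context size are placeholder choices, not recommendations:

```
# Modelfile — build with: ollama create my-llama -f Modelfile
FROM llama3.1:8b
PARAMETER num_ctx 8192
PARAMETER temperature 0.7
SYSTEM "You are a concise technical assistant."
```

After `ollama create`, serve it with `ollama run my-llama`.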

Qwen

Open-weight

Alibaba (Qwen Team)

Alibaba's flagship open-weight family with permissive licensing across most variants. Qwen 3 235B-A22B is the leading open-weight MoE for production reasoning; Qwen 3 32B dense is the strongest 32B-class chat model.

DeepSeek

Open-weight

DeepSeek AI

DeepSeek's frontier reasoning + code MoE family. DeepSeek V3 (671B MoE) and V4 are the leading open-weight reasoning models in 2026; DeepSeek Coder V3 is the canonical open-weight code model. All ship under DeepSeek's permissive commercial-friendly license.

Mistral

Mixed

Mistral AI

Mistral's mixed open + closed family. Mistral 7B + Mixtral 8x22B are the open-weight standards; Mistral Large + Codestral are commercial. Codestral Mamba 7B introduced state-space models to production code workflows.

Gemma

Open-weight

Google DeepMind

Google's open-weight derivative of Gemini research. Gemma 2 + Gemma 3 cover sub-30B chat; CodeGemma adds code-specialized variants. Tight integration with Vertex AI / Google Cloud / Android Studio.

Phi

Open-weight

Microsoft Research

Microsoft's small-but-strong family — Phi-4, Phi-3.5 lineage. Trained on synthetic data for reasoning. The canonical 'punching above weight class' family — Phi-4 14B competes with 70B-class models on reasoning benchmarks.

Command R

Open-weight

Cohere

Cohere's RAG-tuned open-weight family with first-class document citation + multilingual coverage. The non-commercial license is the practical limit versus Llama / Qwen for production use.

Aya

Open-weight

Cohere For AI

Cohere For AI's multilingual research family. Aya 23 + Aya Expanse cover 23+ languages with explicit balance — the strongest open-weight multilingual chat models for underserved languages (Arabic, Korean, Hebrew, Vietnamese).

Yi

Open-weight

01.AI

01.AI's open-weight family. Yi 1.5 34B was a strong dense model; Yi-Lightning shifted to MoE. Permissive Apache 2.0 license; strong Chinese-English capability.

Nemotron

Open-weight

NVIDIA

NVIDIA's reasoning-tuned family. Nemotron-3, Nemotron-4 lineage. NVIDIA-aligned tooling integration (NeMo, TensorRT-LLM); strong on agentic + reasoning workloads.

DBRX

Open-weight

Databricks (Mosaic)

Databricks' MoE family — DBRX Base + DBRX Instruct. 132B total / 36B active MoE. Surpassed by 2026 MoE leaders but remains relevant for Databricks platform integration.
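DBRX's 132B-total / 36B-active split illustrates the MoE trade-off that also applies to Qwen 3 235B-A22B and DeepSeek V3 above: weight memory scales with total parameters, while per-token compute and bandwidth scale with active parameters. A rough sketch, with 4.5 bits/weight as an illustrative quantization assumption:

```python
def moe_profile(total_b: float, active_b: float, bits: float = 4.5):
    """Contrast memory footprint (total params) with per-token
    work (active params) for a mixture-of-experts model."""
    weight_gb = total_b * bits / 8   # all experts must stay resident
    active_gb = active_b * bits / 8  # weights actually read per token
    return weight_gb, active_gb

print(moe_profile(132, 36))  # DBRX:  (74.25, 20.25) — big footprint, light reads
print(moe_profile(70, 70))   # dense 70B: (39.375, 39.375) — smaller, heavier
```

This is why MoE models want lots of VRAM but generate faster than dense models of comparable quality once loaded.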

ERNIE

Mixed

Baidu

Baidu's family — ERNIE 4.0, ERNIE Turbo. Strong Chinese-language capability; mixed licensing. Lower deployment in Western open-weight ecosystems; relevant for Chinese-market deployments.

Baichuan

Open-weight

Baichuan Intelligence

Chinese-market focused open-weight family. Baichuan 4 + earlier versions; strong Chinese capability with permissive licensing.

GPT (OpenAI)

Closed-weight

OpenAI

OpenAI's closed-weight family — GPT-4, GPT-4o, GPT-5 series. Reference baseline for capability comparisons; not self-hostable. RunLocalAI covers GPT family for benchmark context only.

Claude (Anthropic)

Closed-weight

Anthropic

Anthropic's closed-weight family — Claude 3 Haiku/Sonnet/Opus, Claude 3.5/3.7 Sonnet. Reference baseline for reasoning + tool-use comparisons; not self-hostable. RunLocalAI covers Claude for benchmark context.

Gemini (Google)

Closed-weight

Google DeepMind

Google's closed-weight family — Gemini Pro, Gemini Ultra, Gemini Flash. Open-weight derivatives ship as Gemma. Reference baseline for capability comparisons.

👁️ Vision-Language · 3

Qwen-VL

Open-weight

Alibaba (Qwen Team)

Alibaba's multimodal vision-language family. Qwen2.5-VL, Qwen2-VL — the canonical open-weight VLMs for OCR + document understanding + chart reading + UI analysis. Qwen2.5-VL beats specialized OCR engines on complex layouts.

InternVL

Open-weight

OpenGVLab (Shanghai AI Lab)

OpenGVLab's open VLM family. InternVL 2.5 series spans 1B to 78B. Strong multilingual vision capability; dense alternative to Qwen-VL.

LLaVA

Open-weight

Microsoft Research + community

The pioneering open-weight VLM family. LLaVA-1.5, LLaVA-Next, LLaVA-OneVision. Established the VLM training recipe that Qwen-VL + InternVL refined.

🎨 Image Generation · 2

Flux

Mixed

Black Forest Labs

Black Forest Labs' rectified-flow image-gen family. Flux Dev + Flux Schnell are the open-weight leaders for text-to-image in 2026; text rendering quality is meaningfully better than SDXL/SD3.5.

Stable Diffusion

Mixed

Stability AI

The pioneering open-weight image-gen family. SDXL remains widely deployed; SD 3.5 Large is the architectural successor. Massive finetune ecosystem (Pony, Illustrious, NoobAI, dozens of community models).

🎬 Video · 2

Wan

Open-weight

Alibaba (Wan AI)

Alibaba's open-weight video generation family. Wan 2.1 covers text-to-video and image-to-video at frontier-tier open-weight quality.

Hunyuan

Open-weight

Tencent

Tencent's HunyuanVideo + Hunyuan3D family. HunyuanVideo is a 13B-parameter open-weight video model; Hunyuan3D-2 is a leading text-to-3D + image-to-3D system.

🔊 Audio · 1

Whisper

Open-weight

OpenAI

OpenAI's open-weight speech-to-text family. Whisper Large v3 + Whisper Turbo are the canonical open-weight STT models. faster-whisper + WhisperX deliver production-grade speed via CTranslate2 backends.

🔍 Embeddings & Retrieval · 1

BGE (BAAI General Embedding)

Open-weight

BAAI (Beijing Academy of AI)

BAAI's open-weight embedding family. BGE-M3 is the canonical multilingual embedding model in 2026 (100+ languages, 8K context); BGE Reranker V2 M3 is the canonical companion cross-encoder reranker. The default open-weight RAG retrieval stack.
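The retrieve-then-rerank stack described above reduces to a simple scoring loop. A toy sketch with hand-written 3-d vectors standing in for real BGE-M3 embeddings (1024-d in practice); an actual deployment would embed with BGE-M3 and rescore the top-k with the BGE reranker:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity — the standard score for embedding retrieval."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy vectors standing in for real document embeddings.
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.1],
}
query = [1.0, 0.0, 0.1]

# First-stage retrieval: rank all docs by cosine score. The cross-encoder
# reranker would then rescore only the top-k of this ranking.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked[0])  # → doc_a
```

The two-stage split exists because bi-encoder scoring is cheap enough to run over millions of vectors, while the slower cross-encoder only sees a handful of candidates.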

💻 Coding · 1

StarCoder

Open-weight

BigCode (HuggingFace + ServiceNow)

The BigCode community's open-weight code family. StarCoder 2 + earlier StarCoder. Permissive license + transparent training data — the 'reproducible code model' canonical pick.