The frontier of open-weight model releases

Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distill chains, multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.

By Fredoline Eruo · Refreshed continuously from catalog seed

Recent releases (12 newest)

Catalog entries with the most recent release dates. Use the authority badges to spot which have full editorial coverage (L1.25 enriched + benchmark) and which are catalog-only.

New reasoning models

Models that emit explicit thinking blocks: the DeepSeek R1 family, QwQ, Kimi, Magistral, and Qwen 3 in reasoning mode. See /stacks/local-reasoning-model for the canonical deployment recipe.

New coding models

Coding-specialized fine-tunes. The Qwen Coder lineage is the current open-weight benchmark leader; DeepSeek Coder V3, Codestral, Devstral, and OpenCoder are the credible alternatives. See /stacks/local-coding-agent for the canonical deployment recipe.

New multimodal models

Vision-language models. The 2025-2026 wave: Llama 4 Scout / Maverick, Qwen 2.5-VL, Pixtral, Janus-Pro, and Phi-4 Multimodal. See /stacks/local-vision-model for the canonical deployment recipe.

New MoE models

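Mixture-of-Experts releases. Only a subset of experts runs per token, so compute cost tracks active parameters while memory must still hold the full weight set; that split shapes the deployment economics. See /systems/distributed-inference for the architectural depth.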
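The active-versus-total split can be made concrete with back-of-envelope arithmetic. The sketch below is illustrative only: the model sizes are hypothetical, the function names are invented for this example, and the estimate covers weights alone (no KV cache, activations, or runtime overhead). It uses the common approximation of about 2 FLOPs per active parameter per generated token.

```python
def weight_memory_gb(total_params_b: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB. All experts must be resident,
    so this depends on TOTAL parameters, not active ones."""
    return total_params_b * bits_per_weight / 8


def flops_per_token(active_params_b: float) -> float:
    """Rough forward-pass cost: ~2 FLOPs per ACTIVE parameter per token."""
    return 2 * active_params_b * 1e9


# Hypothetical MoE: 100B total parameters, 10B active per token, 4-bit weights.
moe_memory = weight_memory_gb(total_params_b=100, bits_per_weight=4)
moe_flops = flops_per_token(active_params_b=10)

# Hypothetical dense model of the same total size, for comparison.
dense_flops = flops_per_token(active_params_b=100)

print(f"MoE weight memory: {moe_memory:.1f} GB")
print(f"Dense / MoE compute per token: {dense_flops / moe_flops:.1f}x")
```

Under these assumptions the MoE needs the same weight memory as a dense model of equal total size but roughly a tenth of the per-token compute, which is why memory capacity, not raw throughput, tends to set the hardware floor.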

New edge / phone-tier models

Models of roughly 4B parameters or fewer for phone / Pi / embedded deployment: Phi-4 Mini, Gemma 3 1B, MiniCPM 3 4B, SmolLM 3, Hermes 3 3B, Dolphin 3 3B, and RWKV 7 Goose 1.5B.

Enrichment gaps — OPERATOR queue

High-relevance catalog entries (7B-100B) that lack all three authority signals: L1.25 enrichment, a verdict, and a benchmark. These render noindex today and form the next sprint's editorial queue. Surfacing them here keeps the gap visible.
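The queue rule described here amounts to a simple predicate. The following is a minimal sketch, not the site's actual implementation: the `CatalogEntry` shape, its field names, the example slugs, and the inclusive 7B and 100B bounds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    # Hypothetical fields; the real catalog schema is not shown on this page.
    slug: str
    params_b: float       # total parameter count, in billions
    enriched: bool        # has L1.25 enrichment
    has_verdict: bool
    has_benchmark: bool


def in_enrichment_gap(entry: CatalogEntry) -> bool:
    """True when a 7B-100B entry is missing all three authority signals."""
    in_size_band = 7 <= entry.params_b <= 100  # inclusive bounds are assumed
    has_any_signal = entry.enriched or entry.has_verdict or entry.has_benchmark
    return in_size_band and not has_any_signal


entries = [
    CatalogEntry("example-8b", 8, False, False, False),   # in band, no signals
    CatalogEntry("example-32b", 32, True, False, False),  # in band, enriched
    CatalogEntry("example-3b", 3, False, False, False),   # below the size band
]

gap_queue = [e.slug for e in entries if in_enrichment_gap(e)]
print(gap_queue)  # ['example-8b']
```

An entry with even one signal drops out of the queue, which matches the "AND" in the rule: only entries missing enrichment, verdict, and benchmark together are surfaced.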

Going deeper

  • Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
  • Execution stacks — recipes that combine models with runtimes + hardware.
  • Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
  • Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.