The frontier of open-weight model releases
Open-weight model releases tracked by RunLocalAI — recent additions, rising families, distillation chains, and the multimodal and reasoning waves. Each card links into the catalog with authority badges (L1.25 enriched · benchmark-backed · verdict) so you can scan editorial coverage at a glance.
Recent releases (12 newest)
Catalog entries with the most recent release dates. Use the authority badges to spot which have full editorial coverage (L1.25 enriched + benchmark) and which are catalog-only.
Qwen 3.5 235B-A17B
frontier-tier permissively-licensed serving on cluster hardware
Mistral Medium 3.5 (675B MoE)
Mistral Medium 3 24B (dense)
research / non-commercial workstation deployments
DeepSeek V4 Pro
frontier-tier reasoning research
DeepSeek V4 Flash
workstation-cluster V4-class without frontier hardware
OLMo 2 32B
academic / regulatory-sensitive 32B research
Phi-4 Reasoning Mini 4B
edge-tier reasoning
Llama 4 Maverick
frontier-tier reasoning + multimodal
Llama 4 Scout
production multimodal serving — image + text at workstation-cluster scale
Gemma 4 31B
workstation-tier multilingual + vision
Gemma 4 26B MoE
workstation MoE — first of its kind in the Gemma family
Gemma 4 E4B (Effective 4B)
New reasoning models
Models with explicit thinking-block emission — DeepSeek R1 family, QwQ, Kimi, Magistral, Qwen 3 reasoning-mode. See /stacks/local-reasoning-model for the canonical deployment recipe.
Kimi K2.6
frontier-tier reasoning research
Magistral 32B
research / non-commercial reasoning at 32B scale
Kimi K1.5
deep math + reasoning research
Qwen 3 Coder 32B
coding-specialized agent workloads
DeepSeek R1 Distill Qwen 3 32B
workstation reasoning with Qwen 3 base improvements
Qwen 3 72B
frontier-tier general reasoning at workstation scale
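The cards above share one runtime detail: the model interleaves a visible thinking block with the final answer, and a local deployment usually wants to separate the two before display or logging. A minimal sketch, assuming the DeepSeek-R1-style `<think>…</think>` convention (the delimiter varies by family, so treat the tag as configurable):

```python
import re

def split_reasoning(output: str, tag: str = "think") -> tuple[str, str]:
    """Split a reasoning model's raw output into (thinking, answer).

    Assumes a DeepSeek-R1-style convention where the chain of thought
    is wrapped in <think>...</think>; other families use different
    delimiters, hence the configurable tag name.
    """
    pattern = rf"<{tag}>(.*?)</{tag}>"
    match = re.search(pattern, output, flags=re.DOTALL)
    if match is None:
        # No thinking block emitted (or a non-reasoning model).
        return "", output.strip()
    thinking = match.group(1).strip()
    answer = output[match.end():].strip()  # everything after the block
    return thinking, answer

raw = "<think>2 + 2: add the units.</think>The answer is 4."
thinking, answer = split_reasoning(raw)
```

In practice the split is also where you would decide whether thinking tokens count against context reuse or get stripped before the next turn — the canonical recipe linked above covers that trade-off.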
New coding models
Coding-specialized fine-tunes. The Qwen Coder lineage is the current open-weight benchmark leader; DeepSeek Coder V3, Codestral, Devstral, and OpenCoder are the credible alternatives. See /stacks/local-coding-agent for the canonical deployment recipe.
DeepSeek Coder V3
workstation coding alternative to Qwen 2.5 Coder
Devstral Small 2 24B
Apache 2.0 coding alternative to Qwen 2.5 Coder
Yi Coder 9B
8GB-VRAM coding
Qwen 2.5 Coder 32B Instruct
single-user autonomous coding agents on RTX 4090 / 5090 / dual-A100 hardware
Qwen 2.5 Coder 14B Instruct
16GB-VRAM coding
OpenCoder 8B
academic / reproducibility-sensitive coding research
New multimodal models
Vision-language models. The 2025-2026 wave: Llama 4 Scout / Maverick, Qwen 2.5-VL, Pixtral, Janus-Pro, Phi-4 Multimodal. See /stacks/local-vision-model for the canonical deployment recipe.
Llama 4 Maverick
frontier-tier reasoning + multimodal
Gemma 4 31B
workstation-tier multilingual + vision
Gemma 4 26B MoE
workstation MoE — first of its kind in the Gemma family
Gemma 4 E4B (Effective 4B)
Gemma 4 E2B (Effective 2B)
Phi-4 Multimodal
16GB-consumer multimodal Q&A
New MoE models
Mixture-of-Experts releases. Active-parameter efficiency shapes the deployment economics — see /systems/distributed-inference for the architectural depth.
Qwen 3.5 235B-A17B
frontier-tier permissively-licensed serving on cluster hardware
Mistral Medium 3.5 (675B MoE)
DeepSeek V4 Pro
frontier-tier reasoning research
DeepSeek V4 Flash
workstation-cluster V4-class without frontier hardware
DeepSeek V4
frontier-tier reasoning on multi-machine clusters
GLM-5 Pro
Chinese-language enterprise serving
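The deployment economics mentioned in the section intro come down to one split: total parameters set the weight memory you must hold, while active parameters set per-token compute. A back-of-envelope sketch with illustrative numbers (real runtimes add KV cache, activations, and routing overhead on top):

```python
def moe_footprint(total_params_b: float, active_params_b: float,
                  bytes_per_param: float = 0.5) -> dict:
    """Back-of-envelope MoE sizing.

    total_params_b / active_params_b are in billions of parameters;
    bytes_per_param is roughly 0.5 for 4-bit quantization, 2.0 for
    FP16. Illustrative arithmetic only, not a measured benchmark.
    """
    weight_gb = total_params_b * bytes_per_param   # all experts resident
    flops_per_token = 2 * active_params_b * 1e9    # ~2 FLOPs/param/token
    return {
        "weight_gb": weight_gb,
        "gflops_per_token": flops_per_token / 1e9,
        "active_ratio": active_params_b / total_params_b,
    }

# A 235B-A17B release at 4-bit: all 235B parameters must sit in
# memory (~118 GB of weights), but per-token compute is comparable
# to a dense 17B model.
sizing = moe_footprint(235, 17)
```

This is why a 235B-A17B card lands on "cluster hardware" for memory while delivering workstation-class per-token latency once the weights fit — the /systems/distributed-inference page linked above develops the full picture.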
New edge / phone-tier models
Sub-4B models for phone / Pi / embedded deployment. Phi-4 Mini, Gemma 3 1B, MiniCPM 3 4B, SmolLM 3, Hermes 3 3B, Dolphin 3 3B, RWKV 7 Goose 1.5B.
Phi-4 Reasoning Mini 4B
edge-tier reasoning
Phi-4 Mini 4B
edge / embedded reasoning
SmolLM 3 3B
edge-tier reasoning
Gemma 3 4B
edge-tier chat — Apple Silicon laptop friendly
Gemma 3 1B
edge / embedded chat
RWKV 7 'Goose' 1.5B
long-context edge inference where memory matters more than quality
Enrichment gaps — OPERATOR queue
High-relevance catalog entries (7B-100B) that lack L1.25 enrichment, verdict, AND benchmark. These render noindex today — the next sprint's editorial queue. Surfacing them here keeps the gap visible.
Qwen 3 30B-A3B
workstation MoE inference — efficient consumer-tier alternative to dense 32B
Gemma 4 31B
workstation-tier multilingual + vision
Gemma 4 26B MoE
workstation MoE — first of its kind in the Gemma family
Nemotron 3 Nano (30B-A3B)
DeepSeek R1 Distill Qwen 7B
consumer-tier reasoning at the 8GB tier
DeepSeek R1 Distill Qwen 14B
16GB-VRAM reasoning
Llama 3.1 Nemotron 70B Instruct
Llama 3.2 11B Vision Instruct
Going deeper
- Ecosystem maps — structured-landscape views (memory frameworks, inference runtimes, MCP, coding agents).
- Execution stacks — recipes that combine models with runtimes + hardware.
- Frontier index — broader ecosystem-momentum view across coding agents, inference runtimes, memory systems, MCP.
- Benchmarks — measured tokens-per-second + topology fields across hardware/model/runtime triples.