BLK · MODEL RELEASE TRACKER

What just dropped.

Newest local AI models, date-sorted. Every row carries quick fit verdicts for the five VRAM classes operators ask about (8 GB through 96 GB+), so you can tell at a glance whether a model is worth downloading for your rig. 60 models indexed.

For broader ecosystem news see /pulse. For the recommendation engine see /choose-my-gpu.

60 models shown · newest first

| Added | Model | Vendor | Released | Params |
|---|---|---|---|---|
| 4d ago | Qwen 2.5 Coder 7B Instruct | qwen | 2024-11-12 | 7B |
| 1w ago | Qwen 3.5 235B-A17B (MoE) | qwen | 2026-05-01 | 397B |
| 1w ago | DeepSeek V4 Pro (1.6T MoE) | deepseek | 2026-04-24 | 1600B |
| 1w ago | Qwen 2.5-VL 3B | qwen | 2025-01-26 | 3B |
| 1w ago | DeepSeek Coder V2 236B | deepseek | 2024-06-17 | 236B |
| 1w ago | Qwen 2.5 3B Instruct | qwen | 2024-09-19 | 3B |
| 1w ago | Qwen 2.5 1.5B Instruct | qwen | 2024-09-19 | 1.5B |
| 1w ago | Qwen 2.5 0.5B Instruct | qwen | 2024-09-19 | 500M |
| 1w ago | DeepSeek V2.5 236B | deepseek | 2024-09-05 | 236B |
| 1w ago | BGE Reranker v2 M3 | other | 2024-04-15 | 570M |
| 1w ago | Qwen 3 Embedding 8B | qwen | 2025-06-05 | 8B |
| 1w ago | NV-Embed v2 | other | 2024-09-09 | 7.85B |
| 1w ago | BGE M3 | other | 2024-01-30 | 570M |
| 1w ago | DBRX Base | dbrx | 2024-03-27 | 132B |
| 1w ago | EXAONE 3.5 2.4B | exaone | 2024-12-09 | 2.4B |
| 1w ago | InternLM 2.5 7B Chat | internlm | 2024-07-03 | 7B |
| 1w ago | Falcon 3 7B Instruct | falcon | 2024-12-17 | 7B |
| 1w ago | PaliGemma 2 10B | gemma | 2024-12-05 | 10B |
| 1w ago | PaliGemma 2 3B | gemma | 2024-12-05 | 3B |
| 1w ago | GLM-4V 9B | glm | 2024-06-04 | 13.9B |
| 1w ago | Nemotron Mini 4B Instruct | other | 2024-09-13 | 4B |
| 1w ago | SmolLM 2 1.7B Instruct | other | 2024-11-01 | 1.7B |
| 1w ago | SmolLM 2 360M Instruct | other | 2024-11-01 | 360M |
| 1w ago | Tulu 3 70B | other | 2024-11-21 | 70B |
| 1w ago | Tulu 3 8B | other | 2024-11-21 | 8B |
| 1w ago | Granite 3.0 8B Instruct | granite | 2024-10-21 | 8B |
| 1w ago | Granite 3.0 2B Instruct | granite | 2024-10-21 | 2B |
| 1w ago | Jamba 1.5 Large | other | 2024-08-22 | 398B |
| 1w ago | Jamba 1.5 Mini | other | 2024-08-22 | 52B |
| 1w ago | Molmo 72B | other | 2024-09-25 | 72B |
| 1w ago | Molmo 7B-D | other | 2024-09-25 | 8B |
| 1w ago | InternVL 2.5 78B | other | 2024-12-05 | 78B |
| 1w ago | InternVL 2.5 26B | other | 2024-12-05 | 26B |
| 1w ago | LLaVA-OneVision 7B | other | 2024-08-06 | 7B |
| 1w ago | LLaVA 1.6 Mistral 7B | other | 2024-01-30 | 7B |
| 1w ago | Whisper Large v3 Turbo | other | 2024-10-01 | 810M |
| 1w ago | Whisper Large v3 | other | 2023-11-06 | 1.55B |
| 1w ago | CodeQwen 1.5 7B | qwen | 2024-04-16 | 7B |
| 1w ago | StarCoder 2 15B | other | 2024-02-28 | 15B |
| 1w ago | StarCoder 2 7B | other | 2024-02-28 | 7B |
| 1w ago | StarCoder 2 3B | other | 2024-02-28 | 3B |
| 1w ago | Aya 23 35B | other | 2024-05-23 | 35B |
| 1w ago | Aya 23 8B | other | 2024-05-23 | 8B |
| 1w ago | Mistral Saba 24B | mistral | 2025-02-17 | 24B |
| 1w ago | Ministral 8B Instruct | mistral | 2024-10-16 | 8B |
| 1w ago | Ministral 3B Instruct | mistral | 2024-10-16 | 3B |
| 1w ago | Qwen 2-VL 7B | qwen | 2024-08-29 | 7B |
| 1w ago | Qwen 2.5 Math 72B | qwen | 2024-09-19 | 72B |
| 1w ago | Qwen 2.5 Math 7B | qwen | 2024-09-19 | 7B |
| 1w ago | Qwen 2.5 Coder 3B | qwen | 2024-11-12 | 3B |
| 1w ago | Qwen 2.5 Coder 1.5B | qwen | 2024-11-12 | 1.5B |
| 1w ago | Llama 3.3 8B Instruct | llama | 2025-04-12 | 8B |
| 1w ago | Devstral Small 2 24B | mistral | 2025-09-25 | 24B |
| 1w ago | Phind CodeLlama 34B v2 | llama | 2023-09-01 | 34B |
| 1w ago | DeepSeek MoE 16B Base | deepseek | 2024-01-15 | 16B |
| 1w ago | Baichuan 4 13B | baichuan | 2024-10-30 | 13B |
| 1w ago | Phi-4 Reasoning Mini 4B | phi | 2026-04-08 | 3.8B |
| 1w ago | Aya Expanse 32B | command-r | 2024-10-22 | 32B |
| 1w ago | DeepSeek R1 Distill Mistral 24B | deepseek | 2025-03-18 | 24B |
| 1w ago | Command R+ (Aug 2024) | command-r | 2024-08-30 | 104B |
Legend:
Comfortable · Q4 fits with KV headroom
~ Tight · Q4 fits, small context
Marginal · IQ3 only, expect degradation
Doesn't fit · won't run usefully
HOW FIT VERDICTS ARE DERIVED

Quick footprint estimate at Q4_K_M: params × 0.6 GB + 1.5 GB runtime overhead. Comfortable means the rig has ≥1.4× headroom for KV cache and multi-turn context. Tight means it fits but you'll bump the ceiling on long context. Marginal means only aggressive (IQ3 / IQ2) quants work, with quality degradation. Doesn't fit means weights alone won't load without RAM offload, which crushes tok/s. Frontier-class models (400B+) render as no-fit on every single-rig VRAM class — they need multi-GPU or cloud.
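The rule above can be sketched in a few lines. The 0.6 GB-per-billion-params factor, the 1.5 GB overhead, and the 1.4× headroom threshold all come straight from the description; the cutoff below which even IQ3 stops being worthwhile is not stated on this page, so the 0.75× factor here is an illustrative assumption, not the site's actual boundary.

```python
def q4_footprint_gb(params_b: float) -> float:
    """Estimated Q4_K_M footprint: params (in billions) x 0.6 GB + 1.5 GB overhead."""
    return params_b * 0.6 + 1.5

def fit_verdict(params_b: float, vram_gb: float) -> str:
    """Map a model size and a VRAM class to the tracker's four verdicts."""
    need = q4_footprint_gb(params_b)
    if vram_gb >= 1.4 * need:      # >=1.4x headroom for KV cache / multi-turn context
        return "Comfortable"
    if vram_gb >= need:            # Q4 fits, but long context bumps the ceiling
        return "Tight"
    if vram_gb >= 0.75 * need:     # assumed IQ3 cutoff -- the page gives no number
        return "Marginal"
    return "Doesn't fit"           # weights alone need RAM offload

# 7B on 8 GB: footprint 5.7 GB, 8 >= 1.4 * 5.7 -> Comfortable
print(fit_verdict(7, 8))
# 8B on 8 GB: footprint 6.3 GB, fits but without 1.4x headroom -> Tight
print(fit_verdict(8, 8))
# 400B-class on 96 GB: footprint 241.5 GB -> Doesn't fit on any single rig
print(fit_verdict(400, 96))
```

Note the 7B-vs-8B boundary on 8 GB: 7 × 0.6 + 1.5 = 5.7 GB, and 8 / 5.7 ≈ 1.40, so 7B is the largest size that still rates Comfortable on an 8 GB card under this formula.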