Compare models
Nine curated head-to-head editorial verdicts. Each page renders a 10-dimension comparison matrix from the live catalog plus a use-case-weighted overall verdict.
These pages target high-search-intent “X vs Y” queries with operator-grade comparisons. For arbitrary pairings, pick any two models and see their fit across every common hardware tier at /compare/models/custom (URL-shareable, with a will-it-run table built in), or use the interactive, form-driven /model-battle. For hardware comparisons, see /compare/hardware.
The standout: a will-it-run table that shows where each model fits across eight common hardware tiers, from the RTX 3060 12 GB to the Mac Studio M3 Ultra 192 GB.
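The tier-fit idea behind a will-it-run table reduces to simple arithmetic: weight memory at a given quantization, plus runtime overhead, compared against available VRAM. A minimal sketch, where the 1.2x overhead factor, the effective bits-per-weight figure, and the tier list are illustrative assumptions, not data from the catalog:

```python
# Rough will-it-run estimate. Assumptions (not catalog data): ~4.5 effective
# bits per weight for a Q4-class quant, and a flat 1.2x factor for KV cache
# and activation overhead.

TIERS_GB = {
    "RTX 3060 12 GB": 12,
    "RTX 4090 24 GB": 24,
    "Mac Studio M3 Ultra 192 GB": 192,
}

def weight_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB for params_b billion parameters."""
    return params_b * bits_per_weight / 8  # 1B params at 8 bits is ~1 GB

def fits(params_b: float, bits_per_weight: float, vram_gb: float,
         overhead: float = 1.2) -> bool:
    """True if weights plus overhead fit in the tier's memory."""
    return weight_gb(params_b, bits_per_weight) * overhead <= vram_gb

# Example: a 70B model at a Q4-class quant (~39 GB of weights)
for tier, gb in TIERS_GB.items():
    print(f"{tier}: {'fits' if fits(70, 4.5, gb) else 'does not fit'}")
```

Real tables also have to account for context length (KV cache grows with it) and per-runtime overhead, which is why a flat factor is only a first approximation.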
Chat / daily driver
- Llama 3.3 70B vs Qwen 3 32B — the size-vs-architecture tradeoff
Llama 3.3 70B Instruct vs Qwen 3 32B: when the larger model wins on quality vs when the smaller, faster, newer model wins on throughput-per-VRAM.
- Qwen 3 30B-A3B vs Qwen 3 32B — MoE speed vs dense quality at the same size
Qwen 3 30B-A3B (MoE, ~3B active) vs Qwen 3 32B (dense) — the size class is identical, the architecture isn't. When MoE wins on speed, when dense wins on quality.
- Llama 3.1 8B vs Qwen 3 8B — the consumer-GPU default question
Llama 3.1 8B Instruct vs Qwen 3 8B for local AI on consumer GPUs (8-16 GB VRAM). Quality, speed, license, multimodal, and which one to run as your daily driver.
- Llama 3.2 3B vs Qwen 2.5 7B — the 8 GB VRAM ceiling question
Llama 3.2 3B Instruct vs Qwen 2.5 7B Instruct for 8 GB / Jetson-class hardware. When 3B is enough, when 7B is worth the squeeze.
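The MoE-vs-dense tradeoff in the 30B-A3B vs 32B pairing above comes down to which parameter count governs which cost: all experts must sit in memory (memory scales with total parameters), but each token only activates a few (per-token compute scales with active parameters). A rough arithmetic sketch, with an assumed ~4.5 effective bits per weight that is illustrative, not catalog data:

```python
# MoE vs dense, back-of-envelope. Memory tracks TOTAL params because every
# expert must be resident; per-token compute tracks ACTIVE params because
# only a few experts fire per token. Bits-per-weight is an assumption.

def weight_gb(params_b: float, bits: float = 4.5) -> float:
    """Approximate weight footprint in GB at the given quantization."""
    return params_b * bits / 8

moe_total, moe_active = 30, 3   # Qwen 3 30B-A3B: 30B total, ~3B active
dense = 32                      # Qwen 3 32B: dense, all 32B active

print(f"MoE memory  ~{weight_gb(moe_total):.0f} GB")
print(f"Dense memory ~{weight_gb(dense):.0f} GB")
# Per-token FLOPs ratio: the MoE does roughly 3/32 of the dense model's work
print(f"Per-token compute ratio ~{moe_active / dense:.2f}x")
```

So the two models occupy nearly the same VRAM footprint, while the MoE generates tokens at something closer to a 3B model's speed; the open question each verdict weighs is whether ~3B of active capacity matches 32B of dense capacity on quality.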
Coding agents
- Qwen 2.5 Coder 32B vs DeepSeek R1 Distill Qwen 32B — which 32B for local coding?
Qwen 2.5 Coder 32B vs DeepSeek R1 Distill Qwen 32B for local coding: editorial rating, VRAM fit, reasoning style, license, and which one to run on a 24 GB card.
- Qwen 2.5 Coder 32B vs Qwen 3 32B — should you switch to the new generation?
Qwen 2.5 Coder 32B vs Qwen 3 32B for local coding: when the newer general-purpose model beats the older code-specialized one, and when it doesn't.
Reasoning + math
- DeepSeek R1 Distill Llama 70B vs Llama 3.3 70B — reasoning vs instruction following
DeepSeek R1 Distill Llama 70B vs Llama 3.3 70B Instruct: the reasoning fine-tune vs the strong-instruction-following base. When each one wins, and what runs on 48 GB.
- DeepSeek V3 vs Qwen 3 235B-A22B — flagship MoE showdown
DeepSeek V3 (671B total, 37B active) vs Qwen 3 235B-A22B (235B total, 22B active) — flagship open-weight MoE models. Hardware fit, quality, license, and what it actually takes to run them locally.
- DeepSeek R1 vs DeepSeek R1 Distill Qwen 32B — frontier vs local-capable reasoning
DeepSeek R1 (671B) vs DeepSeek R1 Distill Qwen 32B (32B) — when the full reasoning model is worth the hardware, and what the distill captures vs loses.