RUNLOCALAI · v38

Operator-grade instrument for local-AI hardware intelligence. Hand-written verdicts. Real benchmarks. Reproducible commands.

OP·Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

GPU hierarchy for local AI

Every GPU ranked for local AI inference

One screen. Every catalog GPU sorted by tier, with estimated tok/s for the four canonical model sizes (7B, 14B, 32B, 70B at Q4_K_M). Measurements where we have them, bandwidth-derived estimates where we don't — every cell labeled so you know what you're reading. Methodology at /methodology.

  • A-tier (Consumer flagship): 4
  • B-tier (Enthusiast): 10
  • C-tier (Mid-range): 18
  • D-tier (Budget): 43
  • M-tier (Mobile): 7
  • S-tier (Workstation / Datacenter): 26
  • E-tier (Apple Silicon / Edge): 19

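| Tier | GPU | Vendor | Year | VRAM | BW (GB/s) | Price | 7B Q4 | 14B Q4 | 32B Q4 | 70B Q4 | Rating |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | Apple Mac Studio (M3 Ultra) | apple | 2025 | 0 GB | ? | $4,999 | ? | ? | ? | ? | 10.0 |
| A | MacBook Pro 16" M4 Max | apple | 2024 | 0 GB | ? | $3,999 | ? | ? | ? | ? | 10.0 |
| A | NVIDIA GeForce RTX 5090 | nvidia | 2025 | 32 GB | 1792 | $2,499 | 195● | 105–148 | 46–64 | — | 9.6 |
| A | NVIDIA GeForce RTX 4090 | nvidia | 2022 | 24 GB | 1008 | $1,899 | 150● | 59–83 | 37● | 8● | 8.8 |
| B | Lenovo Legion 5 Pro Gen 7 (RTX 3080 16GB) | nvidia | 2022 | 16 GB | ? | $1,499 | ? | ? | ? | ? | 9.3 |
| B | NVIDIA GeForce RTX 3090 Ti | nvidia | 2022 | 24 GB | ? | $1,199 | ? | ? | ? | ? | 8.8 |
| B | AMD Radeon RX 7900 XTX | amd | 2022 | 24 GB | 960 | $899 | 86● | 56–79 | 25–34 | — | 8.6 |
| B | NVIDIA GeForce RTX 3090 | nvidia | 2020 | 24 GB | ? | $899 | 105● | ? | ? | ? | 8.5 |
| B | NVIDIA GeForce RTX 5080 | nvidia | 2025 | 16 GB | 960 | $1,199 | 132● | 56–79 | — | — | 8.1 |
| B | NVIDIA GeForce RTX 4080 Super | nvidia | 2024 | 16 GB | 736 | $1,099 | 82–114 | 43–61 | — | — | 8.1 |
| B | NVIDIA GeForce RTX 4070 Ti Super | nvidia | 2024 | 16 GB | ? | $829 | ? | ? | ? | ? | 8.1 |
| B | NVIDIA GeForce RTX 5070 Ti | nvidia | 2025 | 16 GB | ? | $849 | ? | ? | ? | ? | 8.1 |
| B | NVIDIA GeForce RTX 4080 | nvidia | 2022 | 16 GB | ? | $1,099 | ? | ? | ? | ? | 7.8 |
| B | NVIDIA GeForce RTX 3080 Ti | nvidia | 2021 | 12 GB | 912 | $480 | 101–142 | 54–75 | — | — | 7.3 |
| C | AMD Radeon RX 7900 XT | amd | 2022 | 20 GB | ? | $729 | ? | ? | ? | ? | 8.1 |
| C | NVIDIA GeForce RTX 5060 Ti 16GB | nvidia | 2025 | 16 GB | ? | $459 | ? | ? | ? | ? | 8.1 |
| C | AMD Radeon RX 7900 GRE | amd | 2024 | 16 GB | 576 | $549 | 64–90 | 34–47 | — | — | 7.9 |
| C | AMD Radeon RX 9070 | amd | 2025 | 16 GB | ? | $569 | ? | ? | ? | ? | 7.9 |
| C | AMD Radeon RX 9070 XT | amd | 2025 | 16 GB | ? | $649 | ? | ? | ? | ? | 7.9 |
| C | NVIDIA GeForce RTX 4060 Ti 16GB | nvidia | 2023 | 16 GB | ? | $449 | ? | 28● | ? | ? | 7.8 |
| C | AMD Radeon RX 6950 XT | amd | 2022 | 16 GB | 576 | $580 | 64–90 | 34–47 | — | — | 7.6 |
| C | AMD Radeon RX 7800 XT | amd | 2023 | 16 GB | ? | $459 | ? | ? | ? | ? | 7.6 |
| C | NVIDIA GeForce RTX 4070 Super | nvidia | 2024 | 12 GB | ? | $619 | ? | ? | ? | ? | 7.6 |
| C | NVIDIA GeForce RTX 5070 | nvidia | 2025 | 12 GB | ? | $599 | ? | ? | ? | ? | 7.6 |
| C | AMD Radeon RX 6800 | amd | 2020 | 16 GB | 512 | $380 | 57–80 | 30–42 | — | — | 7.3 |
| C | AMD Radeon RX 6800 XT | amd | 2020 | 16 GB | 512 | $450 | 57–80 | 30–42 | — | — | 7.3 |
| C | AMD Radeon RX 6900 XT | amd | 2020 | 16 GB | 512 | $500 | 57–80 | 30–42 | — | — | 7.3 |
| C | NVIDIA GeForce RTX 3080 12GB | nvidia | 2022 | 12 GB | ? | $449 | ? | ? | ? | ? | 7.3 |
| C | NVIDIA GeForce RTX 4070 | nvidia | 2023 | 12 GB | ? | $549 | ? | ? | ? | ? | 7.3 |
| C | NVIDIA GeForce RTX 4070 Ti | nvidia | 2023 | 12 GB | ? | $749 | ? | ? | ? | ? | 7.3 |
| C | NVIDIA GeForce RTX 2080 Ti | nvidia | 2018 | 11 GB | 616 | $380 | 68–96 | 36–51 | — | — | 6.6 |
| C | NVIDIA GeForce RTX 3070 Ti | nvidia | 2021 | 8 GB | 608 | $350 | 68–95 | — | — | — | 5.0 |
| D | AMD Radeon RX 7600 XT | amd | 2024 | 16 GB | ? | $309 | ? | ? | ? | ? | 7.9 |
| D | AMD Radeon RX 6750 XT | amd | 2022 | 12 GB | 432 | $320 | 48–67 | 25–36 | — | — | 7.1 |
| D | AMD Radeon RX 7700 XT | amd | 2023 | 12 GB | ? | $379 | ? | ? | ? | ? | 7.1 |
| D | NVIDIA GeForce RTX 3060 12GB | nvidia | 2021 | 12 GB | 360 | $249 | 40–56 | 21–30 | — | — | 7.0 |
| D | AMD Radeon RX 6700 XT | amd | 2021 | 12 GB | 384 | $280 | 43–60 | 23–32 | — | — | 6.8 |
| D | NVIDIA GeForce GTX 1080 Ti | nvidia | 2017 | 11 GB | 484 | $250 | 54–75 | 28–40 | — | — | 6.6 |
| D | Intel Arc A770 16GB | intel | 2022 | 16 GB | ? | $269 | ? | ? | ? | ? | 6.5 |
| D | NVIDIA GeForce RTX 3080 10GB | nvidia | 2020 | 10 GB | ? | $379 | ? | ? | ? | ? | 6.5 |
| D | Intel Arc B580 | intel | 2024 | 12 GB | ? | $269 | ? | ? | ? | ? | 6.3 |
| D | Intel Arc B570 | intel | 2025 | 10 GB | ? | $219 | ? | ? | ? | ? | 5.8 |
| D | NVIDIA GeForce RTX 5060 | nvidia | 2025 | 8 GB | ? | $299 | ? | ? | ? | ? | 5.6 |
| D | NVIDIA GeForce RTX 5060 Ti 8GB | nvidia | 2025 | 8 GB | ? | $379 | ? | ? | ? | ? | 5.6 |
| D | NVIDIA GeForce RTX 3050 | nvidia | 2022 | 8 GB | 224 | $200 | 25–35 | — | — | — | 5.3 |
| D | NVIDIA GeForce RTX 4060 | nvidia | 2023 | 8 GB | ? | $279 | ? | ? | ? | ? | 5.3 |
| D | NVIDIA GeForce RTX 4060 Ti 8GB | nvidia | 2023 | 8 GB | ? | $369 | ? | ? | ? | ? | 5.3 |
| D | NVIDIA GeForce RTX 2080 Super | nvidia | 2019 | 8 GB | 496 | $320 | 55–77 | — | — | — | 5.1 |
| D | NVIDIA GeForce RTX 2070 | nvidia | 2018 | 8 GB | 448 | $240 | 50–70 | — | — | — | 5.1 |
| D | AMD Radeon RX 6650 XT | amd | 2022 | 8 GB | 280 | $230 | 31–44 | — | — | — | 5.1 |
| D | NVIDIA GeForce GTX 1070 Ti | nvidia | 2017 | 8 GB | 256 | $160 | 28–40 | — | — | — | 5.1 |
| D | NVIDIA GeForce RTX 3060 Ti | nvidia | 2020 | 8 GB | 448 | $280 | 50–70 | — | — | — | 5.0 |
| D | NVIDIA GeForce RTX 3070 | nvidia | 2020 | 8 GB | ? | $269 | ? | ? | ? | ? | 5.0 |
| D | NVIDIA GeForce RTX 2060 Super | nvidia | 2019 | 8 GB | 448 | $220 | 50–70 | — | — | — | 4.8 |
| D | NVIDIA GeForce RTX 2070 Super | nvidia | 2019 | 8 GB | 448 | $280 | 50–70 | — | — | — | 4.8 |
| D | AMD Radeon RX 6600 XT | amd | 2021 | 8 GB | 256 | $200 | 28–40 | — | — | — | 4.8 |
| D | AMD Radeon RX 6600 | amd | 2021 | 8 GB | 224 | $180 | 25–35 | — | — | — | 4.8 |
| D | NVIDIA GeForce GTX 1080 | nvidia | 2016 | 8 GB | 320 | $180 | 36–50 | — | — | — | 4.6 |
| D | NVIDIA GeForce GTX 1070 | nvidia | 2016 | 8 GB | 256 | $140 | 28–40 | — | — | — | 4.6 |
| D | AMD Radeon RX 580 8GB | amd | 2017 | 8 GB | 256 | $80 | 28–40 | — | — | — | 3.8 |
| D | AMD Radeon RX 5700 XT | amd | 2019 | 8 GB | 448 | $200 | 50–70 | — | — | — | 3.5 |
| D | AMD Radeon RX 5500 XT 8GB | amd | 2019 | 8 GB | 224 | $110 | 25–35 | — | — | — | 3.5 |
| D | NVIDIA GeForce GTX 1660 Super | nvidia | 2019 | 6 GB | 336 | $150 | 37–52 | — | — | — | 2.8 |
| D | NVIDIA GeForce RTX 2060 | nvidia | 2019 | 6 GB | 336 | $180 | 37–52 | — | — | — | 2.8 |
| D | NVIDIA GeForce GTX 1660 Ti | nvidia | 2019 | 6 GB | 288 | $160 | 32–45 | — | — | — | 2.8 |
| D | NVIDIA GeForce GTX 1660 | nvidia | 2019 | 6 GB | 192 | $130 | 21–30 | — | — | — | 2.8 |
| D | NVIDIA GeForce GTX 1060 6GB | nvidia | 2016 | 6 GB | 192 | $110 | 21–30 | — | — | — | 2.6 |
| D | AMD Radeon 880M (Strix Point iGPU) | amd | 2024 | 0 GB | 102 | ? | ? | ? | ? | ? | 2.4 |
| D | AMD Radeon 780M (Phoenix iGPU) | amd | 2023 | 0 GB | 89 | ? | ? | ? | ? | ? | 2.1 |
| D | NVIDIA GeForce GTX 1650 Super | nvidia | 2019 | 4 GB | 192 | $140 | — | — | — | — | 1.8 |
| D | NVIDIA GeForce GTX 1650 | nvidia | 2019 | 4 GB | 128 | $130 | — | — | — | — | 1.8 |
| D | AMD Radeon RX 5600 XT | amd | 2020 | 6 GB | 336 | $140 | 37–52 | — | — | — | 1.7 |
| D | NVIDIA GeForce GTX 1050 Ti | nvidia | 2016 | 4 GB | 112 | $90 | — | — | — | — | 1.3 |
| D | NVIDIA GeForce GTX 1060 3GB | nvidia | 2016 | 3 GB | 192 | $70 | — | — | — | — | 1.1 |
| D | AMD Radeon RX 570 | amd | 2017 | 4 GB | 224 | $60 | — | — | — | — | 1.0 |
| M | ASUS ROG Strix Scar 18 (RTX 5090 Mobile) | nvidia | 2025 | 24 GB | ? | $3,999 | ? | ? | ? | ? | 9.6 |
| M | Razer Blade 16 (2025, RTX 5090 Mobile) | nvidia | 2025 | 24 GB | ? | $4,499 | ? | ? | ? | ? | 9.6 |
| M | Framework Laptop 16 (RX 7700S) | amd | 2024 | 8 GB | ? | $1,699 | ? | ? | ? | ? | 8.9 |
| M | NVIDIA GeForce RTX 3080 16GB (Mobile) | nvidia | 2022 | 16 GB | 512 | ? | 79● | 30–42 | — | — | 8.8 |
| M | NVIDIA GeForce RTX 5090 Mobile | nvidia | 2025 | 24 GB | ? | ? | ? | ? | ? | ? | 8.6 |
| M | NVIDIA GeForce RTX 4090 Mobile | nvidia | 2023 | 16 GB | ? | ? | ? | ? | ? | ? | 7.3 |
| M | NVIDIA GeForce RTX 3050 Ti (Mobile) | nvidia | 2021 | 4 GB | 192 | ? | — | — | — | — | 1.5 |
| S | AMD Instinct MI300A (APU) | amd | 2023 | 128 GB | ? | ? | ? | ? | ? | ? | 10.0 |
| S | AMD Instinct MI300X | amd | 2023 | 192 GB | ? | $15,000 | ? | ? | ? | ? | 10.0 |
| S | AMD Instinct MI325X | amd | 2024 | 256 GB | ? | $20,000 | ? | ? | ? | ? | 10.0 |
| S | AMD Instinct MI355X | amd | 2025 | 288 GB | ? | $25,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA B200 | nvidia | 2024 | 192 GB | ? | $40,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA DGX Spark (Project Digits) | nvidia | 2025 | 0 GB | ? | $3,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA GB200 NVL72 | nvidia | 2024 | 13824 GB | ? | ? | ? | ? | ? | ? | 10.0 |
| S | NVIDIA H100 NVL | nvidia | 2023 | 188 GB | ? | $60,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA H100 PCIe | nvidia | 2022 | 80 GB | ? | $25,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA H100 SXM | nvidia | 2022 | 80 GB | ? | $30,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA H200 | nvidia | 2024 | 141 GB | ? | $31,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA L40 | nvidia | 2022 | 48 GB | ? | $8,000 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA L40S | nvidia | 2023 | 48 GB | ? | $8,500 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA RTX 6000 Ada Generation | nvidia | 2022 | 48 GB | ? | $6,499 | ? | ? | ? | ? | 10.0 |
| S | NVIDIA RTX PRO 6000 Blackwell | nvidia | 2025 | 96 GB | ? | $8,999 | ? | ? | ? | ? | 10.0 |
| S | AMD Instinct MI210 | amd | 2022 | 64 GB | ? | $8,500 | ? | ? | ? | ? | 9.8 |
| S | AMD Instinct MI250X | amd | 2021 | 128 GB | ? | $13,000 | ? | ? | ? | ? | 9.7 |
| S | NVIDIA A100 80GB SXM | nvidia | 2020 | 80 GB | ? | $17,000 | ? | ? | ? | ? | 9.7 |
| S | NVIDIA A40 | nvidia | 2020 | 48 GB | ? | $5,500 | ? | ? | ? | ? | 9.7 |
| S | NVIDIA RTX A6000 (Ampere) | nvidia | 2020 | 48 GB | ? | $3,500 | ? | ? | ? | ? | 9.7 |
| S | NVIDIA RTX 5000 Ada Generation | nvidia | 2023 | 32 GB | ? | $4,000 | ? | ? | ? | ? | 9.5 |
| S | NVIDIA A100 40GB | nvidia | 2020 | 40 GB | ? | $11,000 | ? | ? | ? | ? | 9.2 |
| S | NVIDIA L4 | nvidia | 2023 | 24 GB | ? | $2,500 | ? | ? | ? | ? | 9.0 |
| S | NVIDIA RTX A5000 | nvidia | 2021 | 24 GB | ? | $2,500 | ? | ? | ? | ? | 8.7 |
| S | Intel Gaudi 3 | intel | 2024 | 128 GB | ? | $18,000 | ? | ? | ? | ? | 8.2 |
| S | Intel Gaudi 2 | intel | 2022 | 96 GB | ? | $8,000 | ? | ? | ? | ? | 7.9 |
| E | Apple M3 Ultra | apple | 2025 | 0 GB | ? | ? | ? | ? | ? | 12● | 10.0 |
| E | Apple M4 Max | apple | 2024 | 0 GB | ? | ? | 79● | ? | ? | ? | 10.0 |
| E | Apple M4 Pro | apple | 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 10.0 |
| E | Apple M4 Ultra | apple | 2025 | 0 GB | ? | ? | ? | ? | ? | ? | 10.0 |
| E | Apple M1 Ultra | apple | 2022 | 0 GB | ? | ? | ? | ? | ? | ? | 9.9 |
| E | Apple M2 Ultra | apple | 2023 | 0 GB | ? | ? | ? | ? | ? | ? | 9.9 |
| E | Apple M3 Max | apple | 2023 | 0 GB | ? | ? | 55● | ? | ? | ? | 9.9 |
| E | Apple M2 Max | apple | 2023 | 0 GB | ? | ? | ? | ? | ? | ? | 9.7 |
| E | Apple M1 Max | apple | 2021 | 0 GB | ? | ? | ? | ? | ? | ? | 8.9 |
| E | Qualcomm Snapdragon X Elite | qualcomm | 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 7.3 |
| E | Qualcomm Snapdragon X Plus | qualcomm | 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 5.8 |
| E | Qualcomm Snapdragon 8 Elite | qualcomm | 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 5.3 |
| E | Apple A18 Pro | apple | 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 5.0 |
| E | Apple M4 (iPad Pro) | apple | 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 5.0 |
| E | Google Tensor G4 | google | 2024 | 0 GB | ? | ? | ? | ? | ? | ? | 4.8 |
| E | Apple A17 Pro | apple | 2023 | 0 GB | ? | ? | ? | ? | ? | ? | 4.7 |
| E | Qualcomm Snapdragon 8 Gen 3 | qualcomm | 2023 | 0 GB | ? | ? | ? | ? | ? | ? | 4.5 |
| E | AMD Ryzen AI 9 HX 370 (Strix Point) | amd | 2024 | 0 GB | ? | $1,599 | ? | ? | ? | ? | 3.9 |
| E | Intel Core Ultra 7 258V (Lunar Lake) | intel | 2024 | 0 GB | ? | $1,199 | ? | ? | ? | ? | 3.8 |
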
How to read this table
  • 79● : measured tok/s (operator or community)
  • 130–180 : estimated from bandwidth (50-70% efficiency)
  • — : model doesn't fit in this card's VRAM
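
For anyone consuming the table programmatically, the legend above can be sketched as a small cell parser. This is an illustrative sketch only: the `Cell` type and `parse_cell` name are hypothetical, and treating "?" as "unknown" is an assumption, since the legend does not define that symbol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    kind: str  # "measured", "estimated", "no_fit", or "unknown"
    low: Optional[float] = None
    high: Optional[float] = None


def parse_cell(text: str) -> Cell:
    """Classify one tok/s cell using the legend's three notations."""
    text = text.strip()
    if text == "—":  # em dash: model doesn't fit in VRAM
        return Cell("no_fit")
    if text.endswith("●"):  # dot suffix: measured value
        value = float(text[:-1])
        return Cell("measured", value, value)
    if "–" in text:  # en dash: estimated low–high range
        low, high = text.split("–", 1)
        return Cell("estimated", float(low), float(high))
    # Assumption: anything else (e.g. "?") means no data yet.
    return Cell("unknown")
```

Note that the range separator (en dash) and the "doesn't fit" marker (em dash) are different characters, so the order of the checks matters less than matching each character exactly.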

Estimates use the formula tok/s ≈ memory_bandwidth_GBps ÷ model_weights_GB × efficiency. Memory bandwidth is the dominant constraint for autoregressive decode, because generating each token requires reading roughly the full set of model weights from memory. The 50-70% efficiency band reflects realistic Ollama / llama.cpp / vLLM runtime overhead. See /methodology for the full derivation.
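
A minimal sketch of that formula in Python, for readers who want to reproduce an estimated cell. The function name is hypothetical, and the 4.5 GB weight size for a 7B Q4 model is an assumption back-derived from the table's estimated cells, not a figure the page states.

```python
def estimate_tok_s(bandwidth_gbps, weights_gb, efficiency=(0.5, 0.7)):
    """Return (low, high) decode tok/s from the bandwidth formula."""
    if bandwidth_gbps <= 0 or weights_gb <= 0:
        raise ValueError("bandwidth and weight size must be positive")
    ideal = bandwidth_gbps / weights_gb  # tok/s at 100% efficiency
    low_eff, high_eff = efficiency
    return ideal * low_eff, ideal * high_eff


# RTX 3060 12GB: 360 GB/s. 4.5 GB is an ASSUMED size for 7B Q4 weights.
low, high = estimate_tok_s(360, 4.5)
print(f"{low:.0f}-{high:.0f} tok/s")
```

With those inputs the band rounds to 40 and 56, which lines up with the 40–56 shown for that card in the table. Different quantization variants change the weight size, so treat the second argument as something to measure from your own model file.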

Got a rig? Run a benchmark and turn an estimate into a measured cell. Every measurement improves the table for the next reader.

Best value per model tier

Lowest $/tok-s pick for each model size

For each model tier, the catalog card with the lowest cost-per-tok/s among cards that fit. Computed from current street price ÷ estimated tok/s midpoint. A pick changing here is the live signal that prices or new hardware shifted the value frontier.

  • 7B Q4: AMD Radeon RX 580 8GB, $80, $2.34/tok·s (amd · 8GB · 256 GB/s)
  • 14B Q4: NVIDIA GeForce GTX 1080 Ti, $250, $7.32/tok·s (nvidia · 11GB · 484 GB/s)
  • 32B Q4: AMD Radeon RX 7900 XTX, $899, $30.43/tok·s (amd · 24GB · 960 GB/s)
  • 70B Q4: No card in catalog with both price + bandwidth data

Caveat: $/tok-s is a derived estimate stacked on the bandwidth formula. For workloads where you have a measured benchmark in the table above, trust the measured number first; for unmeasured combinations, this is the ranked best-guess for buyer decisions.
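
The value picks can be approximated with a short Python sketch. Everything here is illustrative: the function names and the `CARDS` sample are hypothetical, the per-tier weight sizes (4.5, 8.5, 19.5 GB) are assumptions back-derived from the table, the 40 GB figure for 70B is a rough placeholder, and 0.6 is simply the midpoint of the 50-70% efficiency band. The minimum-VRAM thresholds come from the workload guidance on this page.

```python
def cost_per_tok_s(price_usd, bandwidth_gbps, weights_gb, efficiency=0.6):
    """Street price divided by the midpoint tok/s estimate."""
    return price_usd / (bandwidth_gbps / weights_gb * efficiency)


def best_value(cards, weights_gb, min_vram_gb):
    """Lowest $/tok-s among cards that fit and have price + bandwidth."""
    candidates = [
        c for c in cards
        if c["vram_gb"] >= min_vram_gb
        and c["price_usd"] is not None
        and c["bandwidth_gbps"] is not None
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda c: cost_per_tok_s(
            c["price_usd"], c["bandwidth_gbps"], weights_gb
        ),
    )


# A small sample of rows copied from the table (None = "?" in the table).
CARDS = [
    {"name": "AMD Radeon RX 580 8GB", "vram_gb": 8, "bandwidth_gbps": 256, "price_usd": 80},
    {"name": "NVIDIA GeForce GTX 1080 Ti", "vram_gb": 11, "bandwidth_gbps": 484, "price_usd": 250},
    {"name": "NVIDIA GeForce RTX 3060 12GB", "vram_gb": 12, "bandwidth_gbps": 360, "price_usd": 249},
    {"name": "NVIDIA GeForce RTX 3080 Ti", "vram_gb": 12, "bandwidth_gbps": 912, "price_usd": 480},
    {"name": "AMD Radeon RX 7900 XTX", "vram_gb": 24, "bandwidth_gbps": 960, "price_usd": 899},
    {"name": "NVIDIA GeForce RTX 3090", "vram_gb": 24, "bandwidth_gbps": None, "price_usd": 899},
]

# (assumed weights in GB, minimum VRAM in GB) per model tier
TIERS = {"7B": (4.5, 6), "14B": (8.5, 11), "32B": (19.5, 22), "70B": (40.0, 48)}

for tier, (weights_gb, min_vram_gb) in TIERS.items():
    pick = best_value(CARDS, weights_gb, min_vram_gb)
    print(tier, pick["name"] if pick else "no card with price + bandwidth")
```

On this sample the sketch selects the same three cards as the picks listed above and finds no candidate for 70B. Cards with a missing price or bandwidth are skipped rather than guessed, which is why the RTX 3090 never competes in the 32B tier here.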

Choosing a GPU for your workload

The hierarchy answers "which is fastest" — but the right card for you depends on which model size you actually want to run. The four most common operator decisions:

  • 7B Q4 (autocomplete, single-model chat) — any card with ≥6 GB VRAM works. The decision shifts to price + power draw + cross-vendor preference. Top D-tier cards (Arc B580, RTX 3060) deliver useful tok/s at <$300.
  • 14B Q4 (coding assistant, mid-size chat) — ≥11 GB VRAM minimum. C-tier (RTX 4070 / RX 7800 XT) is the value sweet spot at $400-600.
  • 32B Q4 (full coding agent, multi-model) — ≥22 GB VRAM. 24 GB cards are the canonical buy: a used RTX 3090 or a new RX 7900 XTX from B-tier, or the A-tier RTX 4090 if budget allows.
  • 70B Q4 (frontier-class local) — ≥48 GB VRAM. Single-card: RTX 6000 Ada / L40S / Mac Studio M3 Ultra. Multi-card: dual 3090 / dual 4090. Workstation tier or above.
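
The VRAM thresholds in the list above can be expressed as a quick lookup. This is a sketch with hypothetical names (`MIN_VRAM_GB`, `models_that_fit`); the thresholds themselves are the ones stated in the four bullets.

```python
# Minimum VRAM (GB) per model size at Q4, as stated in the list above.
MIN_VRAM_GB = {"7B": 6, "14B": 11, "32B": 22, "70B": 48}


def models_that_fit(vram_gb):
    """Return the model sizes whose stated minimum VRAM this card meets."""
    return [size for size, need in MIN_VRAM_GB.items() if vram_gb >= need]


print(models_that_fit(24))  # a 24 GB card
print(models_that_fit(8))   # an 8 GB card
```

This only checks single-card VRAM against the stated minimums. It does not model multi-card setups such as dual 3090, and unified-memory systems that the table lists as 0 GB would need their usable memory passed in explicitly.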

Need it more personalized? Use /choose-my-gpu for a 9-input recommender, or /will-it-run to validate a specific model + GPU combination.