Cloud provider:
Claude Opus 4.7 ($15/75/M)
Claude Sonnet 4.5 ($3/15/M)
GPT-5 ($5/20/M)
Gemini 2.5 Pro ($1.25/10/M)
Claude Haiku 4 ($1/5/M)
GPT-5 mini ($0.5/2/M)
DeepSeek V3 (API) ($0.27/1.1/M)
Llama 3.3 70B (Together) ($0.88/0.88/M)
Llama 3.3 70B (Groq) ($0.59/0.79/M)
Qwen 3 32B (Together) ($0.6/0.6/M)
Qwen 2.5 Coder 32B (DeepInfra) ($0.18/0.18/M)

Horizon: 6 months, 1 year, 2 years, 3 years (typical amortization), 5 years

Your local hardware:
NVIDIA GB200 NVL72 (13824 GB)
AMD Instinct MI355X (288 GB)
AMD Instinct MI325X (256 GB)
NVIDIA B200 (192 GB)
AMD Instinct MI300X (192 GB)
NVIDIA H100 NVL (188 GB)
NVIDIA H200 NVL (PCIe) (141 GB) — $32000
NVIDIA H200 (141 GB)
Intel Gaudi 3 (128 GB)
AMD Instinct MI300A (APU) (128 GB)
AMD Instinct MI250X (128 GB)
Intel Gaudi 2 (96 GB)
NVIDIA RTX PRO 6000 Blackwell (96 GB) — $8999
NVIDIA A100 80GB SXM (80 GB)
NVIDIA H100 SXM (80 GB)
NVIDIA H100 PCIe (80 GB)
AMD Instinct MI210 (64 GB)
NVIDIA RTX 4090 48GB (China-mod) (48 GB) — $2400
NVIDIA RTX 5000 PRO Blackwell 48GB (48 GB) — $5499
NVIDIA RTX A6000 (Ampere) (48 GB) — $3500
NVIDIA RTX 6000 Ada Generation (48 GB) — $6499
NVIDIA L40 (48 GB)
NVIDIA L40S (48 GB)
NVIDIA A40 (48 GB)
NVIDIA A100 40GB (40 GB)
NVIDIA GeForce RTX 5090 (32 GB) — $2499
NVIDIA RTX 5000 Ada Generation (32 GB)
NVIDIA GeForce RTX 5090 Mobile (24 GB)
NVIDIA GeForce RTX 3090 Ti (24 GB) — $1199
NVIDIA L4 (24 GB)
ASUS ROG Strix Scar 18 (RTX 5090 Mobile) (24 GB)
AMD Radeon RX 7900 XTX (24 GB) — $899
NVIDIA GeForce RTX 4090 (24 GB) — $1899
NVIDIA RTX A5000 (24 GB)
Razer Blade 16 (2025, RTX 5090 Mobile) (24 GB)
NVIDIA GeForce RTX 3090 (24 GB) — $899
NVIDIA RTX 2080 Ti 22GB (China-mod) (22 GB) — $350
AMD Radeon RX 7900 XT (20 GB) — $729
NVIDIA GeForce RTX 4080 (16 GB) — $1099
NVIDIA GeForce RTX 5060 Ti 16GB (16 GB) — $459
AMD Radeon RX 7800 XT (16 GB) — $459
AMD Radeon RX 7900 GRE (16 GB) — $549
Intel Arc A770 16GB (16 GB) — $269
Lenovo Legion 5 Pro Gen 7 (RTX 3080 16GB) (16 GB) — $1499
NVIDIA GeForce RTX 4080 Super (16 GB) — $1099
NVIDIA GeForce RTX 3080 16GB (Mobile) (16 GB)
AMD Radeon RX 6900 XT (16 GB) — $500
AMD Radeon RX 9070 XT (16 GB) — $649
AMD Radeon RX 9060 XT (16 GB) — $449
NVIDIA GeForce RTX 5080 (16 GB) — $1199
NVIDIA GeForce RTX 4060 Ti 16GB (16 GB) — $449
NVIDIA GeForce RTX 4090 Mobile (16 GB)
AMD Radeon RX 6800 XT (16 GB) — $450
AMD Radeon RX 6800 (16 GB) — $380
AMD Radeon RX 6950 XT (16 GB) — $580
NVIDIA GeForce RTX 4070 Ti Super (16 GB) — $829
NVIDIA GeForce RTX 5070 Ti (16 GB) — $849
AMD Radeon RX 7600 XT (16 GB) — $309
AMD Radeon RX 9070 (16 GB) — $569
NVIDIA GeForce RTX 4070 (12 GB) — $549
AMD Radeon RX 7700 XT (12 GB) — $379
NVIDIA GeForce RTX 3080 Ti (12 GB) — $480
Intel Arc B580 (12 GB) — $269
NVIDIA GeForce RTX 4070 Ti (12 GB) — $749
AMD Radeon RX 6700 XT (12 GB) — $280
AMD Radeon RX 6750 XT (12 GB) — $320
NVIDIA GeForce RTX 3060 12GB (12 GB) — $249
NVIDIA GeForce RTX 4070 Super (12 GB) — $619
NVIDIA GeForce RTX 5070 (12 GB) — $599
NVIDIA GeForce RTX 3080 12GB (12 GB) — $449
NVIDIA GeForce GTX 1080 Ti (11 GB) — $250
NVIDIA GeForce RTX 2080 Ti (11 GB) — $380
Intel Arc B570 (10 GB)
NVIDIA GeForce RTX 3080 10GB (10 GB) — $379
NVIDIA GeForce RTX 4060 (8 GB) — $279
NVIDIA GeForce RTX 5060 (8 GB) — $299
NVIDIA GeForce RTX 4060 Ti 8GB (8 GB) — $369
Framework Laptop 16 (RX 7700S) (8 GB)
NVIDIA GeForce RTX 2070 Super (8 GB) — $280
AMD Radeon RX 6650 XT (8 GB) — $230
NVIDIA GeForce RTX 3070 Ti (8 GB) — $350
AMD Radeon RX 6600 XT (8 GB) — $200
AMD Radeon RX 6600 (8 GB) — $180
AMD Radeon RX 580 8GB (8 GB) — $80
NVIDIA GeForce RTX 3070 (8 GB) — $269
NVIDIA GeForce RTX 5060 Ti 8GB (8 GB) — $379
NVIDIA GeForce GTX 1070 (8 GB) — $140
NVIDIA GeForce GTX 1080 (8 GB) — $180
NVIDIA GeForce GTX 1070 Ti (8 GB) — $160
NVIDIA GeForce RTX 2060 Super (8 GB) — $220
NVIDIA GeForce RTX 2070 (8 GB) — $240
NVIDIA GeForce RTX 3060 Ti (8 GB) — $280
NVIDIA GeForce RTX 3050 (8 GB) — $200
AMD Radeon RX 5500 XT 8GB (8 GB) — $110
AMD Radeon RX 5700 XT (8 GB) — $200
NVIDIA GeForce RTX 2080 Super (8 GB) — $320
NVIDIA GeForce GTX 1060 6GB (6 GB) — $110
NVIDIA GeForce GTX 1660 Super (6 GB) — $150
NVIDIA GeForce RTX 2060 (6 GB) — $180
AMD Radeon RX 5600 XT (6 GB) — $140
NVIDIA GeForce GTX 1660 (6 GB) — $130
NVIDIA GeForce GTX 1660 Ti (6 GB) — $160
NVIDIA GeForce GTX 1650 (4 GB) — $130
AMD Radeon RX 570 (4 GB) — $60
NVIDIA GeForce GTX 1050 Ti (4 GB) — $90
NVIDIA GeForce RTX 3050 Ti (Mobile) (4 GB)
NVIDIA GeForce GTX 1650 Super (4 GB) — $140
NVIDIA GeForce GTX 1060 3GB (3 GB) — $70

Local model (Q4_K_M assumed):
DeepSeek V4 Pro (1.6T MoE) (1600.0B)
Qwen 3.5 235B-A17B (MoE) (397.0B)
Qwen 3 235B-A22B (235.0B)
DeepSeek R1 (671B reasoning) (671.0B)
DeepSeek V4 Flash (284B MoE) (284.0B)
Llama 3.1 8B Instruct (8.0B)
Llama 4 Scout (109.0B)
Qwen 3 30B-A3B (30.0B)
Llama 3.3 70B Instruct (70.0B)
Qwen 2.5 Coder 32B Instruct (32.0B)
Gemma 4 31B Dense (31.0B)
Qwen 3 32B (32.0B)
Qwen 3 8B (8.0B)
DeepSeek R1 Distill Llama 70B (70.0B)
Mistral Medium 3.5 (675B MoE) (675.0B)
DeepSeek R1 Distill Qwen 32B (32.0B)
GLM-5 (200.0B)
DeepSeek V3 (671B MoE) (671.0B)
Gemma 4 26B MoE (26.0B)
Llama 3.2 3B Instruct (3.0B)
Qwen 3 14B (14.0B)
Mistral Small 3 24B (24.0B)
Nemotron 3 Nano (30B-A3B) (30.0B)
Qwen 2.5 7B Instruct (7.0B)
DeepSeek R1 Distill Qwen 7B (7.0B)
Phi-4 14B (14.0B)
Qwen 2.5 14B Instruct (14.0B)
Gemma 3 27B (27.0B)
Hermes 3 Llama 3.1 8B (8.0B)
Llama 3.1 70B Instruct (70.0B)
Qwen 3.6 35B-A3B (MTP) (35.0B)
Kimi K2.6 (1000.0B)
Mistral Nemo 12B Instruct (12.0B)
Phi-4 Reasoning 14B (14.0B)
Qwen 2.5 32B Instruct (32.0B)
DeepSeek R1 Distill Qwen 14B (14.0B)
Llama 3.1 Nemotron 70B Instruct (70.0B)
QwQ 32B Preview (32.0B)
Gemma 4 E4B (Effective 4B) (4.0B)
Llama 3.2 11B Vision Instruct (11.0B)
Gemma 3 12B (12.0B)
Nemotron 3 Super (120B-A12B) (120.0B)
Qwen 2.5 72B Instruct (72.0B)
Qwen 3 4B (4.0B)
OLMo 2 32B (32.0B)
Phi-3.5 Mini Instruct (3.8B)
DeepSeek Coder V2 Lite (16B) (16.0B)
Hermes 3 Llama 3.1 70B (70.0B)
Llama 3.1 Nemotron Ultra 253B (253.0B)
Llama 4 Maverick (400.0B)
Pixtral 12B (12.0B)
Qwen 3.6 27B (MTP) (27.0B)
Codestral 22B (22.0B)
Llama 3.1 Nemotron Nano 8B (8.0B)
Llama 3.2 1B Instruct (1.0B)
Gemma 2 9B Instruct (9.0B)
Gemma 3 4B (4.0B)
Mistral 7B Instruct v0.3 (7.0B)
Mistral Large 2 (123B) (123.0B)
Gemma 4 E2B (Effective 2B) (2.0B)
Dolphin 3.0 Mistral 24B (24.0B)
Llama 3.2 90B Vision Instruct (90.0B)
Mixtral 8x7B Instruct (47.0B)
Command R+ 104B (104.0B)
Phi-3.5 Vision (4.2B)
Ring-2.6-1T (1000.0B)
Command R 35B (35.0B)
Mixtral 8x22B Instruct (141.0B)
Gemma 3 1B (1.0B)
WizardLM-2 8x22B (141.0B)
Yi 1.5 34B (34.0B)
MedGemma 27B (27.0B)
CodeGemma 7B (7.0B)
Aya 23 35B (35.0B)
Aya 23 8B (8.0B)
Aya Expanse 32B (32.0B)
Baichuan 4 13B (13.0B)
BGE M3 (0.6B)
BGE Reranker v2 M3 (0.6B)
CodeQwen 1.5 7B (7.0B)
Codestral Mamba 7B (7.0B)
Command R+ (Aug 2024) (104.0B)
DBRX Base (132.0B)
DBRX Instruct (132.0B)
DeepSeek Coder V2 236B (236.0B)
DeepSeek Coder V3 (33.0B)
DeepSeek MoE 16B Base (16.0B)
DeepSeek R1 Distill Llama 8B (8.0B)
DeepSeek R1 Distill Mistral 24B (24.0B)
DeepSeek R1 Distill Qwen 1.5B (1.5B)
DeepSeek R1 Distill Qwen 3 32B (32.0B)
DeepSeek V2.5 236B (236.0B)
DeepSeek V3 Lite (16B MoE) (16.0B)
DeepSeek V4 (745.0B)
Devstral Small 2 24B (24.0B)
Dolphin 3 Llama 3.3 70B (70.0B)
Dolphin 3.0 Llama 3.2 3B (3.0B)
EVA Llama 3.3 70B (70.0B)
EXAONE 3.5 2.4B (2.4B)
EXAONE 3.5 32B (32.0B)
EXAONE 3.5 8B (7.8B)
Falcon 3 10B (10.0B)
Falcon 3 7B Instruct (7.0B)
Falcon Mamba 7B (7.0B)
GLM-4 9B (9.0B)
GLM-4V 9B (13.9B)
GLM-5 Pro (144.0B)
Granite 3 MoE (3B active) (16.0B)
Granite 3.0 2B Instruct (2.0B)
Granite 3.0 8B Instruct (8.0B)
Granite 3.2 8B (8.0B)
Granite 3.3 8B (8.0B)
Hermes 3 Llama 3.2 3B (3.0B)
Hermes 4 Llama 3.3 70B (70.0B)
Hunyuan Large 389B MoE (389.0B)
InternLM 2.5 7B Chat (7.0B)
InternLM 3 8B (8.0B)
InternVL 2.5 26B (26.0B)
InternVL 2.5 78B (78.0B)
Jamba 1.5 Large (398.0B)
Jamba 1.5 Mini (52.0B)
Janus-Pro 7B (7.0B)
Kimi K1.5 (200.0B)
Llama 3.2 11B Vision (11.0B)
Llama 3.2 90B Vision (90.0B)
Llama 3.3 8B Instruct (8.0B)
Llama 4 405B (405.0B)
Llama 4 70B (70.0B)
LLaVA 1.6 Mistral 7B (7.0B)
LLaVA-OneVision 7B (7.0B)
Magistral 32B (32.0B)
MiniCPM 3 4B (4.0B)
MiniCPM-V 2.6 8B (8.0B)
MiniCPM-V 3 8B (8.0B)
Ministral 3B Instruct (3.0B)
Ministral 8B Instruct (8.0B)
Mistral Medium 3 24B (dense) (24.0B)
Mistral Saba 24B (24.0B)
Mistral Small 3.2 24B (24.0B)
Molmo 72B (72.0B)
Molmo 7B-D (8.0B)
Moondream 2 (1.9B)
Nemotron 3 Nano 9B (9.0B)
Nemotron 3 Super 49B (49.0B)
Nemotron Mini 4B Instruct (4.0B)
NV-Embed v2 (7.8B)
OLMo 2 13B (13.0B)
OpenBioLLM Llama 3 70B (70.0B)
OpenCoder 8B (8.0B)
PaliGemma 2 10B (10.0B)
PaliGemma 2 3B (3.0B)
Phi-4 Mini 4B (3.8B)
Phi-4 Multimodal (14.0B)
Phi-4 Reasoning Mini 4B (3.8B)
Phind CodeLlama 34B v2 (34.0B)
Qwen 2-VL 7B (7.0B)
Qwen 2.5 0.5B Instruct (0.5B)
Qwen 2.5 1.5B Instruct (1.5B)
Qwen 2.5 3B Instruct (3.0B)
Qwen 2.5 Coder 1.5B (1.5B)
Qwen 2.5 Coder 14B Instruct (14.0B)
Qwen 2.5 Coder 3B (3.0B)
Qwen 2.5 Coder 7B Instruct (7.0B)
Qwen 2.5 Math 72B (72.0B)
Qwen 2.5 Math 7B (7.0B)
Qwen 2.5-VL 3B (3.0B)
Qwen 2.5-VL 72B (72.0B)
Qwen 2.5-VL 7B (7.0B)
Qwen 3 72B (72.0B)
Qwen 3 7B (7.0B)
Qwen 3 Coder 32B (32.0B)
Qwen 3 Embedding 8B (8.0B)
RWKV 7 'Goose' 1.5B (1.5B)
SmolLM 2 1.7B Instruct (1.7B)
SmolLM 2 360M Instruct (0.4B)
SmolLM 3 3B (3.0B)
Stable LM 2 12B (12.0B)
StarCoder 2 15B (15.0B)
StarCoder 2 3B (3.0B)
StarCoder 2 7B (7.0B)
Step-3 (1000.0B)
Tulu 3 70B (70.0B)
Tulu 3 8B (8.0B)
Whisper Large v3 (1.6B)
Whisper Large v3 Turbo (0.8B)
Yi Coder 9B (9.0B)

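
The "Q4_K_M assumed" note suggests that model size is judged against the card's memory at 4-bit quantization. A rough sketch of that check follows. The page does not state its formula, so everything here is an assumption: the figure of about 4.8 bits per weight for Q4_K_M is an approximation, the 1.2 overhead factor for KV cache and runtime buffers is a guess, and the function names are hypothetical.

```python
# Assumed average bits per weight for Q4_K_M (approximation, not from the page).
Q4_K_M_BITS_PER_WEIGHT = 4.8


def estimated_weights_gb(params_billions: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * Q4_K_M_BITS_PER_WEIGHT / 8


def fits_in_vram(params_billions: float, vram_gb: float, overhead: float = 1.2) -> bool:
    """True if the weights plus an assumed overhead margin fit in the given VRAM."""
    return estimated_weights_gb(params_billions) * overhead <= vram_gb


if __name__ == "__main__":
    # Llama 3.3 70B Instruct (70.0B) on an RTX 4090 (24 GB) versus an H100 SXM (80 GB).
    print(round(estimated_weights_gb(70.0), 1))
    print(fits_in_vram(70.0, 24), fits_in_vram(70.0, 80))
```

Under these assumptions a 70B model needs roughly 42 GB for weights alone, so it would not fit on a single 24 GB card but would fit on an 80 GB one. Mixture-of-experts models still need all parameters resident, so the total count, not the active count, is what this estimate uses.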
§ Verdict at 500,000 tokens/day
Pick local hardware and a model to see the crossover analysis.
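
The crossover itself can be approximated by hand. The sketch below is not the page's actual calculation, which is not shown; it assumes that a price such as "$3/15/M" means $3 per million input tokens and $15 per million output tokens, that the daily volume splits evenly between input and output, and that electricity is zero unless supplied. The function names and the `output_share` and `daily_power_cost` parameters are hypothetical.

```python
from typing import Optional


def daily_cloud_cost(tokens_per_day: float,
                     input_price_per_m: float,
                     output_price_per_m: float,
                     output_share: float = 0.5) -> float:
    """Dollars per day spent on the API for a given input/output token mix."""
    input_tokens = tokens_per_day * (1 - output_share)
    output_tokens = tokens_per_day * output_share
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


def breakeven_days(hardware_price: float,
                   cloud_cost_per_day: float,
                   daily_power_cost: float = 0.0) -> Optional[float]:
    """Days until the hardware purchase is repaid by avoided API spend.

    Returns None when running locally saves nothing per day.
    """
    saving = cloud_cost_per_day - daily_power_cost
    if saving <= 0:
        return None
    return hardware_price / saving


if __name__ == "__main__":
    # Claude Sonnet 4.5 ($3/15/M) at 500,000 tokens/day versus an RTX 4090 ($1899).
    cost = daily_cloud_cost(500_000, 3, 15)
    days = breakeven_days(1899, cost)
    print(cost, days)  # 4.5 422.0
    print(days <= 3 * 365)  # True: inside the 3-year horizon
```

With these inputs the card pays for itself in about 422 days, well inside the 3-year horizon. The comparison ignores any quality gap between the cloud model and the local one, and it assumes the local hardware can sustain the required throughput.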