Hardware
79 GPUs, SoCs, and laptops covered. What runs on each, with real benchmarks.
GPUs
NVIDIA GB200 NVL72
72-GPU Blackwell rack with 36 Grace CPUs. Hyperscale-only — relevant context here for understanding 'what frontier training runs on'.
AMD Instinct MI355X
Latest CDNA 4. 288GB HBM3e — currently the highest VRAM per chip on the market.
AMD Instinct MI325X
256GB HBM3e — direct competitor to NVIDIA H200 with more memory.
NVIDIA B200
Datacenter Blackwell. 192GB HBM3e per chip, ~8 TB/s bandwidth. Cloud-tier — you rent these by the hour.
AMD Instinct MI300X
192GB HBM3 datacenter card. Used by Microsoft, Oracle, Meta cloud deployments.
NVIDIA H100 NVL
Dual-card H100 with 188GB combined memory. Built for LLM serving.
NVIDIA H200
Hopper refresh — 141GB HBM3e at ~4.8 TB/s. Datacenter-class; rentable on RunPod, Lambda, etc.
Intel Gaudi 3
Intel's enterprise AI accelerator. 128GB HBM2e. Habana stack required — limited ecosystem support.
AMD Instinct MI250X
Previous-gen CDNA 2. 128GB HBM2e. Powered the Frontier supercomputer.
NVIDIA RTX PRO 6000 Blackwell
Pro Blackwell — 96GB GDDR7 ECC. The single-card answer to 70B and 100B+ local inference.
Intel Gaudi 2
Previous-gen Habana accelerator. 96GB HBM2e.
NVIDIA H100 PCIe
PCIe Hopper. Lower power, lower bandwidth than SXM. Server-tier.
NVIDIA H100 SXM
Hopper SXM5 — 80GB HBM3 at 3.35 TB/s. The workhorse GPU of the post-ChatGPT training boom. Cloud-rentable.
NVIDIA A100 80GB SXM
Ampere datacenter flagship. 80GB HBM2e at 2 TB/s. Still common at cloud providers.
AMD Instinct MI210
64GB CDNA 2. Lower-power AMD datacenter option.
NVIDIA L40S
Ada-gen datacenter card. 48GB GDDR6 — popular at cloud GPU rentals as a budget H100 alternative.
NVIDIA L40
Original Ada datacenter. Slower than L40S. 48GB GDDR6.
NVIDIA RTX 6000 Ada Generation
Pro Ada — 48GB ECC. Pre-Blackwell workstation default.
NVIDIA A40
Ampere workstation/datacenter hybrid. 48GB GDDR6.
NVIDIA RTX A6000 (Ampere)
Ampere-gen workstation card with 48GB. Common in AI labs; used market is reasonable for 48GB at this point.
NVIDIA A100 40GB
Original A100. 40GB HBM2 at 1.55 TB/s. Trained the early generation of frontier models.
NVIDIA GeForce RTX 5090
Blackwell flagship. 32GB GDDR7 on a 512-bit bus delivers ~1.79 TB/s memory bandwidth — the new top of consumer hardware for local LLM inference.
NVIDIA RTX 5000 Ada Generation
32GB workstation Ada. Mid-tier pro card.
NVIDIA GeForce RTX 5090 Mobile
Mobile Blackwell flagship. 24GB GDDR7 in a laptop is the new high-water mark.
NVIDIA L4
Inference-focused Ada datacenter card. Low-power 24GB suitable for 7B-14B serving.
AMD Radeon RX 7900 XTX
AMD's 24GB challenger to the 4090. ROCm Linux now solid for llama.cpp and vLLM. Best price-per-VRAM-GB on the new market.
NVIDIA GeForce RTX 3090 Ti
Highest-tier Ampere consumer card. Used market gold for AI: 24GB at sub-$1200 in 2026.
NVIDIA GeForce RTX 4090
The community-default high-end local-AI card from 2022 to 2025. 24GB GDDR6X at ~1 TB/s makes 32B Q4 comfortably loadable; 70B Q4 needs a second card or heavy CPU offload.
NVIDIA RTX A5000
24GB Ampere workstation card. Tighter power envelope than RTX 3090.
NVIDIA GeForce RTX 3090
The original 24GB CUDA value pick. Used market still strong in 2026 — many AI hobbyists run dual 3090 setups for 70B inference.
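The rule of thumb behind entries like this: quantized weight size ≈ parameter count × bits per weight ÷ 8, plus overhead for KV cache, activations, and runtime buffers. A minimal sketch of that arithmetic (the ~1.2× overhead factor and ~4.5 effective bits/weight for Q4-style quants are illustrative assumptions; real usage varies with context length):

```python
def quantized_vram_gb(params_billion: float, bits_per_weight: float,
                      overhead: float = 1.2) -> float:
    """Estimate VRAM needed to load a quantized model.

    overhead (~1.2x, an assumed fudge factor) covers KV cache,
    activations, and runtime buffers on top of the raw weights.
    """
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits = 1 GB
    return weight_gb * overhead

# 70B at ~4.5 effective bits/weight (Q4-class quant): roughly 47 GB,
# hence dual 24GB cards rather than a single 3090/4090.
print(f"{quantized_vram_gb(70, 4.5):.1f} GB")
# 7B at the same quant fits comfortably in an 8GB card:
print(f"{quantized_vram_gb(7, 4.5):.1f} GB")
```

This is why the 24GB tier tops out around 32B Q4 on a single card, while 70B Q4 pushes into dual-GPU or high-unified-memory territory.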
AMD Radeon RX 7900 XT
20GB RDNA 3. Cheaper alternative to XTX.
NVIDIA GeForce RTX 5070 Ti
16GB Blackwell at the upper-mid price tier. Strong 14B–32B model performance.
NVIDIA GeForce RTX 5080
Second-tier Blackwell. 16GB GDDR7, ~960 GB/s bandwidth. Fastest 16GB consumer card on the market.
AMD Radeon RX 9070
16GB RDNA 4 at sub-$600. ROCm + Vulkan supported.
NVIDIA GeForce RTX 5060 Ti 16GB
The 16GB sub-$500 sweet spot. Best value for entering local AI seriously.
AMD Radeon RX 9070 XT
RDNA 4 flagship. 16GB at $599 — best AMD value for local AI in 2026.
NVIDIA GeForce RTX 4080 Super
Refreshed 4080 with 16GB GDDR6X. Slightly behind 5080 but well-supported.
NVIDIA GeForce RTX 4070 Ti Super
16GB upgrade of the 4070 Ti. Solid mid-high pick for local AI.
AMD Radeon RX 7600 XT
Sub-$330 16GB AMD. Memory-bandwidth-limited but great VRAM-per-dollar.
AMD Radeon RX 7800 XT
16GB RDNA 3 mid-range.
NVIDIA GeForce RTX 4090 Mobile
Mobile Ada flagship. 16GB VRAM in a laptop. Premium gaming and AI laptop default.
NVIDIA GeForce RTX 4060 Ti 16GB
The poster child of 'cheap 16GB CUDA card'. Memory bandwidth is mediocre but 16GB at $400-something opens up 14B Q4.
NVIDIA GeForce RTX 3080 16GB (Mobile)
Laptop variant of Ampere. 16GB VRAM in a portable form factor was rare and remains a sleeper pick on the used market.
NVIDIA GeForce RTX 4080
Original 4080. 16GB GDDR6X. Still capable for 14B–32B Q4 work.
Intel Arc A770 16GB
Alchemist 16GB. Cheapest path to that VRAM tier. Vulkan llama.cpp is the most-tested route.
NVIDIA GeForce RTX 5070
Mid-range Blackwell with 12GB. 7B-14B Q4 territory.
NVIDIA GeForce RTX 4070 Super
Refreshed 4070. Strong mid-range value for 12GB-tier local AI.
Intel Arc B580
Battlemage architecture. 12GB at $250 — the budget compute card. IPEX-LLM and Vulkan are usable paths for AI.
NVIDIA GeForce RTX 4070
Original 4070. 12GB Ada. Now eclipsed by 4070 Super at the same price.
AMD Radeon RX 7700 XT
12GB RDNA 3.
NVIDIA GeForce RTX 4070 Ti
12GB Ada — fits 7B–14B Q4 with usable context.
NVIDIA GeForce RTX 3080 12GB
Mid-life 12GB refresh of the 3080. Decent 7B–14B card on the used market.
NVIDIA GeForce RTX 3060 12GB
The community pick for 'cheapest CUDA card with serious VRAM'. The value floor for local AI in 2026.
Intel Arc B570
10GB Battlemage at sub-$220. Entry budget compute.
NVIDIA GeForce RTX 3080 10GB
Original 10GB 3080. Tight on VRAM for AI but still capable for 7B work.
NVIDIA GeForce RTX 5060
Entry Blackwell. 8GB limits it to 7B Q4 with short context.
NVIDIA GeForce RTX 5060 Ti 8GB
8GB Blackwell. Capable of 7B Q4 only — go 16GB SKU instead for AI work.
NVIDIA GeForce RTX 4060
Entry-level Ada. 8GB limits it to 7B Q4.
NVIDIA GeForce RTX 4060 Ti 8GB
8GB version — go 16GB SKU for AI work.
NVIDIA GeForce RTX 3070
8GB Ampere. Fits 7B Q4 only.
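The bandwidth figures quoted throughout these entries matter because single-stream decoding is memory-bandwidth-bound: each generated token requires streaming roughly all model weights from VRAM once, so throughput can't exceed bandwidth ÷ model size. A back-of-envelope sketch (bandwidth numbers are the spec figures quoted above; real throughput lands well below the ceiling due to KV-cache traffic and kernel overhead):

```python
def decode_tps_ceiling(bandwidth_gb_s: float, model_gb: float) -> float:
    """Theoretical single-stream decode ceiling in tokens/sec:
    every decoded token reads every weight once, so the hard limit
    is memory bandwidth divided by model size in GB."""
    return bandwidth_gb_s / model_gb

# RTX 5090 (~1790 GB/s) with a 32B model at Q4 (~18 GB of weights):
print(round(decode_tps_ceiling(1790, 18)))   # ceiling ~99 tok/s
# M4 Max (546 GB/s) with a 70B model at Q4 (~40 GB of weights):
print(round(decode_tps_ceiling(546, 40)))    # ceiling ~14 tok/s
```

This is also why two cards with the same VRAM can differ sharply in decode speed: the 7600 XT and 4060 Ti 16GB hold the same models as a 4080, but at a fraction of the bandwidth.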
APUs
Laptops
ASUS ROG Strix Scar 18 (RTX 5090 Mobile)
Desktop-replacement gaming/AI laptop with better thermals than ultraslims.
Razer Blade 16 (2025, RTX 5090 Mobile)
Top-end Windows AI laptop with 24GB RTX 5090 Mobile.
Lenovo Legion 5 Pro Gen 7 (RTX 3080 16GB)
Ryzen 7 6800H + RTX 3080 16GB Mobile. The reference 'serious local-AI laptop' build. Look for the 16GB SKU.
Framework Laptop 16 (RX 7700S)
Modular AMD laptop. Limited GPU but the platform is the appeal.
MacBook Pro 16" M4 Max
16-inch M4 Max — 128GB unified at 546 GB/s. The most capable AI laptop in 2026.
Pre-built desktops
NVIDIA DGX Spark (Project Digits)
NVIDIA's desktop AI box — Grace Blackwell GB10 with 128GB unified LPDDR5X. The closest a consumer can get to running 200B-class models locally.
Apple Mac Studio (M3 Ultra)
Top-spec Mac Studio with M3 Ultra. Up to 512GB unified memory in custom configs.
Apple Silicon / SoCs
Apple M4 Ultra
Two-chip Ultra fusing two M4 Max dies. Up to 256GB unified memory at 1.1 TB/s. The single highest-VRAM consumer rig you can buy in a Mac Studio.
Apple M3 Ultra
M3 Ultra — up to 512GB unified in Mac Studio top spec. 819 GB/s bandwidth.
Qualcomm Snapdragon X Plus
Lower-tier Snapdragon X. 45 TOPS NPU.
Qualcomm Snapdragon X Elite
Windows-on-ARM SoC with a 45 TOPS NPU. Limited LLM ecosystem in 2026 but improving via DirectML and ONNX paths.
Apple M4 Pro
Mid-tier M4 — 273 GB/s bandwidth, up to 48GB.
Apple M4 Max
M4 Max — 546 GB/s memory bandwidth, up to 128GB unified. Most capable laptop SoC for 70B+ models.
Apple M3 Max
M3 Max — 400 GB/s bandwidth, up to 128GB.
Apple M2 Ultra
M2 Ultra — up to 192GB at 800 GB/s. Ships in Mac Studio and Mac Pro; a common host for large local models.
Apple M2 Max
M2 Max — 400 GB/s bandwidth, up to 96GB.
Apple M1 Ultra
Original Ultra — 800 GB/s. 64–128GB unified. Still capable for 70B Q4.
Apple M1 Max
Original M1 Max. 400 GB/s. 32–64GB unified.