
MPS (Metal Performance Shaders)

MPS (Metal Performance Shaders) is Apple's library of GPU-optimized compute kernels built on Metal, exposed in PyTorch as the mps device backend. Calling model.to("mps") runs the model's operations on the Apple Silicon GPU through MPS kernels.
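A minimal sketch of that device selection in PyTorch, falling back to CPU on machines without MPS support (the tensor shapes are arbitrary, for illustration only):

```python
import torch

# Prefer the Apple GPU when PyTorch was built with MPS support; otherwise use CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

x = torch.randn(4, 8, device=device)
w = torch.randn(8, 2, device=device)
y = x @ w  # dispatched to MPS kernels on Apple Silicon, plain CPU BLAS elsewhere
print(y.shape, y.device)
```

The same pattern applies to modules: `model.to(device)` moves parameters and buffers, after which inputs must live on the same device.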

MPS is workable for inference of small models but has historically been incomplete: many operators lack MPS kernels and fall back to the CPU, FP16 is supported but BF16 is not on older silicon, and large allocations can raise RuntimeError: MPS backend out of memory even when unified memory is still free, because the backend caps allocations at roughly 80% of it by default.
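PyTorch exposes two environment variables for exactly these failure modes; they must be set before torch initializes the MPS backend. The values below are a sketch — disabling the allocation watermark entirely can let a runaway allocation destabilize the whole machine:

```python
import os

# Must be set before `import torch` so the MPS backend reads them at initialization.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"         # fall back to CPU for ops without MPS kernels
os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = "0.0"  # lift the allocator's upper memory limit (use with care)
```

Setting them in the shell (`export PYTORCH_ENABLE_MPS_FALLBACK=1`) before launching Python achieves the same thing.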

For local LLM inference, llama.cpp's native Metal kernels and MLX-LM both outperform PyTorch MPS by 1.5–3×. Use MPS for quick PyTorch experiments; use llama.cpp or MLX for production.
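For comparison, the Metal-native paths look like this; the model file and model id are illustrative placeholders, not recommendations:

```shell
# llama.cpp: -ngl 99 offloads all layers to the Metal GPU (Metal is the default build on macOS)
./llama-cli -m ./model.gguf -ngl 99 -p "Hello"

# mlx-lm: generate with an MLX-format model (illustrative model id)
mlx_lm.generate --model mlx-community/Llama-3.2-1B-Instruct-4bit --prompt "Hello"
```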

Related terms

  • Metal (Apple)
  • MLX (Apple)

See also

  • hardware: apple-m4-max
  • tool: llama-cpp
  • tool: mlx-lm

Reviewed by Fredoline Eruo. See our editorial policy.

Buyer guides
  • Best Mac for local AI →
  • Best budget Mac →
When it doesn't work
  • MLX out of memory →
  • MPS fallback to CPU →
  • llama.cpp Metal crash →
Compare hardware
  • M4 Max vs RTX 4090 →
  • Mac Studio vs Windows AI PC →
Hardware
  • Apple M4 Max →