Edge AI
AI at the edge — IoT cameras, industrial sensors, retail kiosks. Mid-tier between mobile SoCs and datacenter.
Setup walkthrough
- Edge AI sits between TinyML (microcontrollers, <1W) and datacenter (>100W). Typical hardware: NVIDIA Jetson family, Google Coral, Hailo-8, Intel Movidius, or x86 edge servers.
- For a quick start with x86 edge: use any mini PC (Intel NUC, ~$300-500) + Ollama. Install Ubuntu → `ollama pull llama3.2:3b` → run inference at 15-30 tok/s on CPU.
- For computer vision at the edge: `pip install ultralytics` → YOLO11n on a mini PC with integrated GPU runs 15-30 fps for 1080p video.
- For industrial edge (factory floor, retail kiosk): use an edge gateway that collects sensor data, runs local inference, and sends only aggregated results to the cloud.
- Typical edge stack: MQTT (sensor data ingestion) → local AI model (anomaly detection, object counting) → Node-RED (workflow) → InfluxDB (time-series storage) → Grafana (dashboard).
- First edge AI deployment (mini PC + camera + YOLO) in 1-2 hours.
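The "local AI model" step of the stack above can be as simple as a statistical anomaly check. Here is a minimal sketch of that step, with a rolling z-score standing in for the model and plain functions standing in for the MQTT ingest and publish ends; the window size, threshold, and function names are illustrative assumptions, not part of any specific stack.

```python
import json
from statistics import mean, stdev

# Hypothetical sketch: the "local AI model" step of the edge stack,
# reduced to a rolling z-score anomaly check on sensor readings.
# In a real stack the readings arrive via MQTT and the alerts are
# published back to the broker; here both ends are plain Python.

WINDOW = 50          # readings kept for the rolling baseline
Z_THRESHOLD = 3.0    # flag readings more than 3 sigma from the mean

def detect_anomalies(readings):
    """Return only the aggregated events worth sending upstream."""
    events = []
    for i in range(WINDOW, len(readings)):
        window = readings[i - WINDOW:i]
        mu, sigma = mean(window), stdev(window)
        if sigma > 0 and abs(readings[i] - mu) / sigma > Z_THRESHOLD:
            events.append({"index": i, "value": readings[i]})
    return events

# Normal vibration around 1.0 with one spike injected at index 75.
data = [1.0 + 0.01 * (i % 5) for i in range(100)]
data[75] = 5.0
alerts = detect_anomalies(data)
print(json.dumps(alerts))  # only event metadata leaves the device
```

Note that only the event record crosses the network boundary; the raw waveform stays on the device, which is the whole point of the MQTT → local inference → aggregate pattern.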
The cheap setup
A used Intel NUC or Lenovo ThinkCentre Tiny (~$200-300, 16 GB RAM, 256 GB SSD). Runs Ollama with 3B-7B models at 15-30 tok/s on CPU. YOLO11n object detection at 15-30 fps on integrated graphics. USB camera ($30). For a retail kiosk counting customers: $330 total (NUC + camera) runs customer counting, dwell time tracking, and basic analytics locally. For an industrial vibration sensor: Raspberry Pi 5 ($80) + accelerometer ($10) runs anomaly detection inference on vibration data locally. Edge AI at $300 gets you real-time inference on sensor data and camera feeds — the hardware is commodity.
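The dwell-time tracking mentioned for the kiosk is mostly bookkeeping once the detector assigns stable track IDs per person per frame (as YOLO11's tracking mode does). A minimal sketch, assuming an upstream tracker yields one set of visitor IDs per frame; the frame rate and function names are illustrative.

```python
# Hypothetical sketch: dwell-time bookkeeping for the retail-kiosk setup.
# Assumes an upstream tracker (e.g. YOLO11 with tracking enabled) yields
# one set of visitor track IDs per frame; we keep only first/last frame.

FPS = 15  # frames per second from the cheap-setup camera

def dwell_times(frames):
    """frames: list of sets of track IDs seen in each frame.
    Returns {track_id: dwell time in seconds}."""
    first_seen, last_seen = {}, {}
    for frame_no, ids in enumerate(frames):
        for tid in ids:
            first_seen.setdefault(tid, frame_no)
            last_seen[tid] = frame_no
    return {tid: (last_seen[tid] - first_seen[tid]) / FPS
            for tid in first_seen}

# Visitor 1 present for frames 0-29, visitor 2 for frames 15-29.
frames = [{1} for _ in range(15)] + [{1, 2} for _ in range(15)]
times = dwell_times(frames)
print(len(times), times)  # customer count plus per-visitor dwell seconds
```

The customer count is just `len(times)`, and the hourly summary sent to the cloud can be a handful of numbers rather than any video.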
The serious setup
For production edge AI: NVIDIA Jetson Orin NX 16 GB ($600, see /hardware/jetson-ai). Runs YOLO11x at 60+ fps for 4K video, Llama 3.2 3B at 30-50 tok/s, and Whisper medium for speech recognition — all simultaneously on 15-25W. For multi-camera industrial deployments (quality inspection, safety monitoring): one Jetson Orin handles 4-8 cameras. For rugged environments (heat, dust, vibration): Advantech or OnLogic industrial edge PCs ($1,500-2,500) with an RTX A2000 GPU. Total per edge node: $600-2,500. Edge AI hardware cost is dominated by environmental requirements (IP rating, temperature range), not AI compute.
Common beginner mistake
The mistake: Treating edge AI like a mini datacenter — streaming all raw sensor data (camera feeds, vibration waveforms, audio) to a central server for inference, then wondering why the network bill is $5,000/month and latency is 500ms.

Why it fails: Edge AI exists because bandwidth costs dominate at scale. 100 cameras at 1080p/30fps = 600 Mbps continuous upload. That's $2,000-5,000/month in cloud egress fees. Latency makes real-time applications (robot control, safety systems) impossible — 500ms round-trip is an eternity when a robot arm needs to stop in 50ms.

The fix: Process at the edge. Run inference locally. Send only metadata to cloud (event counts, anomaly alerts, hourly summaries). The edge device tells the cloud "camera 3 detected an anomaly at 14:32" not "here's 24 hours of 1080p video." Edge AI is an architecture pattern, not a hardware spec. Process locally, aggregate centrally.
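The 600 Mbps figure follows from a per-stream bitrate of roughly 6 Mbps for 1080p/30fps H.264. A quick back-of-envelope check, where the per-stream bitrate and per-event metadata size are illustrative assumptions, not measurements:

```python
# Back-of-envelope check on the bandwidth claim above.

CAMERAS = 100
STREAM_MBPS = 6            # rough H.264 bitrate for one 1080p/30fps stream
EVENT_BYTES = 200          # one JSON metadata event ("camera 3, anomaly, 14:32")
EVENTS_PER_CAM_PER_HOUR = 60

raw_mbps = CAMERAS * STREAM_MBPS
print(f"raw streaming: {raw_mbps} Mbps continuous")

# The same fleet sending only metadata events:
edge_bps = CAMERAS * EVENTS_PER_CAM_PER_HOUR * EVENT_BYTES * 8 / 3600
print(f"edge metadata: {edge_bps:.0f} bits/s average")

# Monthly raw upload volume, the basis of the egress-fee estimate:
seconds_per_month = 30 * 24 * 3600
raw_tb = raw_mbps / 8 * seconds_per_month / 1e6  # MB/s * s -> MB -> TB
print(f"raw upload per month: ~{raw_tb:.0f} TB")
```

Raw streaming works out to roughly 194 TB of egress per month, versus a few kilobits per second for metadata — five orders of magnitude apart, which is why the architecture, not the hardware, is the decision that matters.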
Recommended setup for edge AI
Browse all tools for runtimes that fit this workload.
Reality check
Local AI workloads have real hardware constraints that vary by task type. VRAM ceiling decides what model fits; bandwidth decides decode speed; compute decides prefill speed. Pick the GPU tier that fits your actual workload, not the spec sheet.
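The "does the model fit" question above comes down to weights plus KV cache. A rough sketch of that budget, assuming Q4 weights at ~0.5 bytes/parameter and a generic 7B model with grouped-query attention; the layer count, head geometry, and bytes-per-weight are illustrative assumptions:

```python
# Rough VRAM budget sketch for the "does it fit" question.
# Layer counts, GQA heads, and bytes/weight are illustrative
# (here: a generic 7B model with grouped-query attention, Q4 weights).

def vram_gb(params_b, bytes_per_weight, layers, kv_heads, head_dim,
            ctx_tokens, kv_bytes=2):
    weights = params_b * 1e9 * bytes_per_weight
    # KV cache: 2 tensors (K and V) per layer, per cached token
    kv = 2 * layers * kv_heads * head_dim * kv_bytes * ctx_tokens
    return (weights + kv) / 1e9

need = vram_gb(params_b=7, bytes_per_weight=0.5, layers=32,
               kv_heads=8, head_dim=128, ctx_tokens=8192)
print(f"~{need:.1f} GB before activations and runtime overhead")
```

Under these assumptions a 7B Q4 model at 8K context needs roughly 4.6 GB before activations and runtime overhead, which is why an 8 GB card is comfortable here but tight at long context or larger models.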
Common mistakes
- Buying for spec-sheet VRAM without modeling KV cache + activation overhead
- Underestimating quantization quality loss below Q4
- Skipping flash-attention support (real perf gap on long context)
- Ignoring sustained-load thermals (laptops thermal-throttle within 30 min)
What breaks first
The errors most operators hit when running edge AI locally. Each links to a diagnose+fix walkthrough.
Before you buy
Verify your specific hardware can handle edge AI before committing money.
Edge and embedded AI lives outside the desktop GPU world, but the iGPU and eGPU buyer questions still apply for the next tier up.