Google Coral / Edge TPU
Google Coral's Edge TPU delivers low-power on-device inference for quantized TensorFlow Lite models.
Setup walkthrough
- Buy a Google Coral USB Accelerator ($60) or Coral Dev Board Mini ($100). The Coral Edge TPU is a dedicated INT8 inference chip — 4 TOPS at 2W.
- `pip install pycoral`, then download a pre-compiled TensorFlow Lite model from coral.ai/models (classification, detection, segmentation, pose estimation).
- Python script for object detection:
```python
from pycoral.utils import edgetpu
from pycoral.adapters import common, detect
import cv2

interpreter = edgetpu.make_interpreter("mobilenet_ssd_v2_coco_quant_edgetpu.tflite")
interpreter.allocate_tensors()

# The model expects a fixed-size RGB input; cv2 loads BGR at the original size
img = cv2.imread("photo.jpg")
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
common.set_input(interpreter, cv2.resize(rgb, common.input_size(interpreter)))

interpreter.invoke()
for d in detect.get_objects(interpreter, score_threshold=0.5):
    print(f"{d.id}: {d.score:.2f} at ({d.bbox.xmin}, {d.bbox.ymin})")
```
- First detection in <10 ms — the Edge TPU runs at 100+ fps for MobileNet-class models.
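To verify latency and fps numbers like these on your own device, a minimal timing harness helps. This is a generic sketch (the `benchmark` helper is hypothetical, not part of pycoral); on a Coral you would pass `interpreter.invoke` as the callable:

```python
import time

def benchmark(fn, warmup=5, runs=50):
    """Return (avg_ms, fps) for a zero-arg callable.

    Warmup calls are discarded: the very first invoke() on an
    Edge TPU also uploads the model to the chip, so it is much slower.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    avg_ms = elapsed / runs * 1000
    return avg_ms, 1000 / avg_ms

# On a Coral: avg_ms, fps = benchmark(interpreter.invoke)
```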
- Critical limitation: Coral TPU only runs TensorFlow Lite INT8 models that are compiled specifically for the Edge TPU. You can't run PyTorch models, LLMs, or mixed-precision models. Coral is for vision classification/detection/segmentation, not general-purpose AI.
- For custom models: train in TensorFlow → quantize to INT8 → compile with the Edge TPU Compiler → deploy. The compiler is strict: it maps ops onto the Edge TPU only up to the first unsupported op in the graph, and everything after that point falls back to the much slower CPU. Many common ops are unsupported.
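The "quantize to INT8" step maps each float tensor onto 8-bit integers with a scale and zero point. A minimal sketch of that affine mapping (illustrative only; in practice the TFLite converter picks scales from a representative dataset):

```python
def quantize(x, scale, zero_point):
    """Affine quantization: q = round(x / scale) + zero_point, clamped to int8."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Inverse mapping; the Edge TPU computes entirely in the integer domain."""
    return (q - zero_point) * scale

# A float in [0, 1) with scale 1/255 and zero point -128:
q = quantize(0.5, 1 / 255, -128)
x = dequantize(q, 1 / 255, -128)  # recovered value is within one scale step of 0.5
```

The quality loss of quantization is exactly this rounding error, which is why a representative calibration set matters: it determines the scale that keeps the error small for real inputs.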
The cheap setup
Google Coral USB Accelerator (~$60) + Raspberry Pi 4 ($55) + Pi Camera ($25) + SD card ($10). Total: ~$150. Runs MobileNet SSD at 100+ fps, PoseNet at 30+ fps, face detection at 100+ fps. For a DIY security camera with person detection: $150 handles real-time detection with <10ms inference latency at 2W power. For a wildlife camera classifying animals: Coral + Pi + solar panel = a $200 self-powered AI camera that runs for months. Coral is the cheapest way to add GPU-class vision inference to a Raspberry Pi. The tradeoff: only TensorFlow Lite vision models.
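For the DIY security camera above, the application logic on top of the detector is small. A sketch of the filtering step, assuming the COCO label map shipped with Coral models where class id 0 is "person" (the namedtuples mimic the objects pycoral's detection adapter returns):

```python
from collections import namedtuple

# Shape of the objects returned by pycoral's detection adapter
BBox = namedtuple("BBox", ["xmin", "ymin", "xmax", "ymax"])
Detection = namedtuple("Detection", ["id", "score", "bbox"])

PERSON_CLASS_ID = 0  # from coco_labels.txt (assumption: COCO-trained model)

def people_in_frame(detections, min_score=0.6):
    """Keep only confident person detections; ignore everything else."""
    return [d for d in detections if d.id == PERSON_CLASS_ID and d.score >= min_score]

# Example frame: one confident person, one dog, one low-confidence person
frame = [
    Detection(0, 0.91, BBox(10, 20, 110, 220)),
    Detection(17, 0.88, BBox(200, 40, 300, 140)),
    Detection(0, 0.30, BBox(5, 5, 50, 90)),
]
alerts = people_in_frame(frame)  # only the 0.91 person detection survives
```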
The serious setup
Coral doesn't scale to "serious" — it's deliberately capped. A Coral Dev Board Micro ($100) maxes out at 4 TOPS INT8, ~300 fps for MobileNet. For production edge vision (factory QA, retail analytics): deploy one Coral per camera stream, or upgrade to a Jetson Orin Nano ($500, see /hardware/jetson-ai) for 40 TOPS plus general-purpose CUDA. Coral's niche is ultra-low-power, ultra-low-cost, single-purpose vision inference. For general-purpose edge AI (LLMs, multi-model pipelines, custom architectures), Coral is the wrong tool. Use Coral when you need: <5W, <$100, vision-only, supported TF Lite ops. Use Jetson for everything else.
Common beginner mistake
The mistake: buying a Coral TPU, installing PyTorch, and trying to run a Llama model on it, then getting "unsupported operation" errors for every layer.
Why it fails: the Edge TPU is not a GPU. It's an ASIC that executes exactly one format: TensorFlow Lite INT8 models with a limited op set. No PyTorch. No FP16. No attention mechanisms. No dynamic shapes. It's designed for convolutional vision models (MobileNet, EfficientNet).
The fix: check the Edge TPU supported-ops list before designing your model (coral.ai/docs/edgetpu/models-intro). If your model uses any unsupported op (LSTM, attention, custom layers, most normalization ops), the compiler rejects it. Coral is for "camera sees something → classify/detect" workflows. For anything involving transformers, LLMs, or generative models, use a Jetson or an x86 edge PC. Coral is a one-trick pony, but for that one trick (real-time vision classification) it's the best pony on the market at 2W.
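That pre-flight check can be automated before you train anything: extract your model's op list and diff it against the supported set. A sketch, where the supported-op names are a partial, illustrative subset (the authoritative list lives at coral.ai/docs/edgetpu/models-intro):

```python
# Partial, illustrative subset of Edge TPU supported ops -- consult coral.ai for the real list
EDGE_TPU_SUPPORTED = {"CONV_2D", "DEPTHWISE_CONV_2D", "FULLY_CONNECTED",
                      "MAX_POOL_2D", "AVERAGE_POOL_2D", "RESHAPE",
                      "SOFTMAX", "CONCATENATION", "ADD", "MUL"}

def unsupported_ops(model_ops):
    """Return the ops the compiler would reject or push back to the CPU."""
    return sorted(set(model_ops) - EDGE_TPU_SUPPORTED)

# A MobileNet-style model passes cleanly:
print(unsupported_ops(["CONV_2D", "DEPTHWISE_CONV_2D", "SOFTMAX"]))   # []
# A transformer-style model does not:
print(unsupported_ops(["FULLY_CONNECTED", "BATCH_MATMUL", "SOFTMAX"]))  # ['BATCH_MATMUL']
```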