RUNLOCALAI · v38

Operator-grade instrument for local-AI hardware intelligence. Hand-written verdicts. Real benchmarks. Reproducible commands.

OP · Fredoline Eruo
DISCLOSURE

Some links on this site are affiliate links (Amazon Associates and other first-class retailers). When you buy through them, we earn a small commission at no extra cost to you. Affiliate links do not influence our verdicts — there are cards we rate highly that we don't have affiliate relationships with, and cards that sell well that we refuse to recommend. Read more →

SYS · ONLINE · UPTIME 100% · 2026 · operator-owned

> Stream Visualizer

Watch tokens appear at the exact speed your hardware would produce them. Pick a GPU + model + cloud comparison, hit ⟳ Race, and see which finishes first. It's a visceral way to feel the difference between 32 tok/s and 300 tok/s.

No inference actually runs — we animate a sample at the bandwidth-derived tok/s for the chosen rig. Same math as the quant advisor and cost-vs-cloud calculator. Cloud baselines are median observed streaming rates from US POPs, not provider-advertised TPS.
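The bandwidth-derived estimate works because LLM decode is memory-bound: each generated token streams roughly the whole quantized model through the GPU's memory bus once, so tok/s ≈ bandwidth ÷ model size. A minimal sketch of that idea — the function names and example figures are illustrative assumptions, not the site's actual advisor code:

```typescript
// Rough decode-speed estimate for a memory-bound LLM: every token
// reads ~the full quantized model from VRAM once.
// NOTE: names and constants are illustrative, not RUNLOCALAI's exact math.

function estimateTokPerSec(bandwidthGBps: number, modelSizeGB: number): number {
  return bandwidthGBps / modelSizeGB;
}

// Per-token reveal delay for the stream animation, in milliseconds.
function revealDelayMs(tokPerSec: number): number {
  return 1000 / tokPerSec;
}

// Example: a ~1008 GB/s card running an 8B model at Q4_K_M (~4.9 GB).
const tps = estimateTokPerSec(1008, 4.9); // ≈ 206 tok/s
const delay = revealDelayMs(tps);         // ≈ 4.9 ms between tokens
```

The same two-line model is why the visualizer can animate without running inference: the per-token delay is fully determined by the rig's bandwidth and the model's on-disk size.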

Configure the race

The URL updates as you change fields — share it with ⎘ Copy.
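Config-in-URL state like this is typically a `URLSearchParams` round-trip. A hypothetical sketch — the `RaceConfig` shape and query keys are made-up examples, not the page's real parameters:

```typescript
// Serialize a race config into a query string and back.
// The RaceConfig fields and defaults are hypothetical examples.

interface RaceConfig {
  gpu: string;
  model: string;
  cloud: string;
}

function toQuery(cfg: RaceConfig): string {
  return new URLSearchParams({
    gpu: cfg.gpu,
    model: cfg.model,
    cloud: cfg.cloud,
  }).toString();
}

function fromQuery(qs: string): RaceConfig {
  const p = new URLSearchParams(qs);
  return {
    gpu: p.get("gpu") ?? "rtx-4090",       // fall back to a default rig
    model: p.get("model") ?? "llama-3-8b",
    cloud: p.get("cloud") ?? "gpt-4o",
  };
}

const qs = toQuery({ gpu: "rtx-4090", model: "llama-3-8b", cloud: "gpt-4o" });
// In the browser, the page would then update the address bar with:
// history.replaceState(null, "", `?${qs}`);
```

Using `history.replaceState` rather than `pushState` keeps field tweaks from flooding the back button.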

Live stream at your hardware's speed

TypeScript async/await refactor — typical Cline/Aider output
Pick local hardware + a model to start the stream.

Where to go from here

Quant Advisor →

Local TPS depends heavily on quant. Drill into Q4 vs Q5 vs Q6 for the picked model + hardware.
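The quant dependence falls straight out of the bandwidth-bound math: fewer bits per weight means fewer bytes streamed per token, hence more tok/s. A rough sketch using ballpark effective bits-per-weight figures for common llama.cpp quants — treat the constants as assumptions, not measurements:

```typescript
// Approximate effective bits-per-weight for common llama.cpp quants
// (includes scale/metadata overhead; values are ballpark assumptions).
const BITS_PER_WEIGHT: Record<string, number> = {
  Q4_K_M: 4.8,
  Q5_K_M: 5.7,
  Q6_K: 6.6,
  Q8_0: 8.5,
};

// Quantized model size in GB for a given parameter count (billions).
function modelSizeGB(paramsB: number, quant: string): number {
  return (paramsB * BITS_PER_WEIGHT[quant]) / 8;
}

// Memory-bound decode estimate: bandwidth divided by bytes per token.
function tokPerSec(bandwidthGBps: number, paramsB: number, quant: string): number {
  return bandwidthGBps / modelSizeGB(paramsB, quant);
}

// An 8B model on a ~1008 GB/s card: Q4 is noticeably faster than Q6.
for (const q of ["Q4_K_M", "Q5_K_M", "Q6_K"]) {
  console.log(q, tokPerSec(1008, 8, q).toFixed(0), "tok/s");
}
```

The spread is the whole trade: roughly a third more speed going from Q6_K down to Q4_K_M, paid for in quantization quality.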

Cost vs Cloud →

Speed is one half of the cloud-vs-local math. See the money side — exactly what each provider would charge.
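The money side reduces to a per-million-token comparison: the provider's output-token price vs. local electricity at your wall wattage and tok/s. A sketch with placeholder figures — the wattage, speed, and rate are illustrative assumptions, and this ignores hardware amortization:

```typescript
// Electricity cost of generating one million tokens locally.
// All numbers in the example are illustrative assumptions;
// hardware amortization is deliberately left out of this sketch.
function localCostPerMTok(
  watts: number,
  tokPerSec: number,
  usdPerKWh: number
): number {
  const hours = 1_000_000 / tokPerSec / 3600; // time to emit 1M tokens
  return (watts / 1000) * hours * usdPerKWh;  // kWh * $/kWh
}

// Example: a 350 W rig at 100 tok/s with $0.15/kWh electricity
// lands around $0.15 per million tokens; compare that directly
// against a provider's advertised $/MTok output price.
const local = localCostPerMTok(350, 100, 0.15);
```

Faster rigs win twice here: higher tok/s shrinks both the wait and the kWh per million tokens.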

Stack Builder →

Like the speed but unsure about the rest? Go back one step — pick the whole stack from use case down.

Submit a benchmark →

The TPS above is extrapolated. If you have a measured number for this hardware + model + quant combo, submit it — it'll replace the estimate.