RunLocalAI Public API
JSON endpoints for the model directory, hardware specs, tool reviews, and benchmark data. Free tier: 1,000 calls per month with email signup. Build on top of us — every integration becomes a backlink, and the data improves as we run more benchmarks.
Quick start
Three steps from zero to your first response.
curl -X POST https://runlocalai.co/api/v1/keys \
-H "Content-Type: application/json" \
-d '{
"name": "Your Name",
"email": "you@example.com",
"use_case": "Building a hardware compatibility checker for our internal devs"
}'
curl https://runlocalai.co/api/v1/models \
-H "Authorization: Bearer rla_xxx..."
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Tier: free
Endpoints
/api/v1/models
List all published open-weight models.
family: llama, qwen, mistral, phi, gemma, deepseek, etc.
min_params_b / max_params_b: bracket by parameter count
limit: default 100, max 200
curl 'https://runlocalai.co/api/v1/models?family=qwen&min_params_b=7&limit=20' \
-H "Authorization: Bearer $RLA_KEY"
/api/v1/models/{slug}
Single model with full architecture metadata + benchmarks.
curl https://runlocalai.co/api/v1/models/llama-3.1-8b-instruct \
-H "Authorization: Bearer $RLA_KEY"
Includes per-variant quantization sizes, KV-cache architecture (num_kv_heads, head_dim, etc.), and every benchmark we have for this model.
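The KV-cache fields are what make a memory estimate possible. As a rough sketch, assume the response also exposes a layer count (called `num_layers` here, a hypothetical field name) and that keys and values are stored at 2 bytes per element. The cache size for one sequence then follows from a simple product. The numbers below are entered by hand for illustration, not read from the API.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens, bytes_per_element=2):
    """Approximate KV-cache size in bytes for a single sequence.

    The leading factor of 2 accounts for storing both keys and values.
    `num_layers` is an assumed field; num_kv_heads and head_dim are named in the docs.
    """
    return 2 * num_layers * num_kv_heads * head_dim * context_tokens * bytes_per_element


# Hand-entered example values (32 layers, 8 KV heads, head_dim 128), not API output.
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, context_tokens=8192)
print(f"{size / 2**30:.2f} GiB")  # prints "1.00 GiB"
```

Add this figure to the quantized weight size to judge whether a model plus its context fits in a given amount of VRAM.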
/api/v1/hardware
List all GPUs, SoCs, and laptops in the database.
vendor: nvidia, amd, intel, apple, qualcomm
type: gpu, cpu, apu, soc, laptop, desktop
min_vram_gb: filter by minimum VRAM
curl 'https://runlocalai.co/api/v1/hardware?vendor=nvidia&min_vram_gb=16' \
-H "Authorization: Bearer $RLA_KEY"
Returns memory bandwidth, FP16 TFLOPS, backend support, and current street price.
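The same filtered query can be assembled from Python using only the standard library. This is a minimal sketch: `build_request` and `BASE_URL` are names made up for the example, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://runlocalai.co/api/v1"


def build_request(path, api_key, **filters):
    """Build an authenticated GET request; filters set to None are dropped."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{BASE_URL}/{path}" + (f"?{query}" if query else "")
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})


# "rla_xxx" is a placeholder key, as in the curl examples.
req = build_request("hardware", "rla_xxx", vendor="nvidia", min_vram_gb=16, type=None)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call. The same helper should work for the other list endpoints, since they all take query-string filters.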
/api/v1/tools
Runners, GUIs, agents — the local-AI tooling landscape.
curl 'https://runlocalai.co/api/v1/tools?category=runner' \
-H "Authorization: Bearer $RLA_KEY"
Categories: runner, gui, server, orchestrator, finetuner, quantizer, agent, ide.
/api/v1/benchmarks
Tokens-per-second measurements with full provenance.
model, hardware, tool: filter by entity slug
verified=true or owner_only=true: only owner-run benchmarks
limit: default 100, max 500
curl 'https://runlocalai.co/api/v1/benchmarks?hardware=rtx-5080&owner_only=true' \
-H "Authorization: Bearer $RLA_KEY"
Each benchmark includes source ('owner' / 'community' / 'official'), source URL when community-sourced, and run date.
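Because every benchmark carries its provenance, results can be summarized per source rather than mixed together. The sketch below assumes each record is a JSON object with `source` and `tokens_per_second` keys. The second key name is a guess, since the response schema is not shown here, and the sample rows are placeholders rather than real measurements.

```python
from statistics import median


def median_tps_by_source(benchmarks):
    """Group tokens-per-second values by provenance and return each group's median."""
    grouped = {}
    for row in benchmarks:
        grouped.setdefault(row["source"], []).append(row["tokens_per_second"])
    return {source: median(values) for source, values in grouped.items()}


# Placeholder rows for illustration only; these are not real measurements.
sample = [
    {"source": "owner", "tokens_per_second": 10.0},
    {"source": "owner", "tokens_per_second": 20.0},
    {"source": "community", "tokens_per_second": 30.0},
]
print(median_tps_by_source(sample))  # prints {'owner': 15.0, 'community': 30.0}
```

The median is used here instead of the mean so that a single outlier run does not dominate the summary.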
Rate limits & tiers
Quota resets on the first of each month UTC. Burst rate limit is 10 requests per second. Responses are cached at the CDN edge for 5 minutes.
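A client can track its remaining quota from the `X-RateLimit-*` response headers shown in the quick start and space its calls to stay under the burst limit. This is a minimal sketch that assumes those three headers accompany every response. The helper names are invented for the example.

```python
def parse_rate_limit(headers):
    """Extract quota fields from response headers (names matched case-insensitively)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "limit": int(lowered["x-ratelimit-limit"]),
        "remaining": int(lowered["x-ratelimit-remaining"]),
        "tier": lowered["x-ratelimit-tier"],
    }


def min_interval_seconds(burst_per_second=10):
    """Smallest spacing between calls that stays within the burst limit."""
    return 1.0 / burst_per_second


# Header values copied from the quick-start example response.
quota = parse_rate_limit({
    "X-RateLimit-Limit": "1000",
    "X-RateLimit-Remaining": "999",
    "X-RateLimit-Tier": "free",
})
print(quota, min_interval_seconds())
```

Sleeping for `min_interval_seconds()` between requests is a simple throttle. Since responses are cached for 5 minutes at the edge, re-polling the same URL more often than that is unlikely to return newer data.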
Attribution
We don't require attribution to use the API, but we appreciate it. A simple link works:
Data: <a href="https://runlocalai.co">RunLocalAI</a>
For research papers, please cite the page URL of the relevant model/hardware/benchmark and include the access date.
Errors
401: Missing or invalid API key
403: Key revoked
404: Entity not found by slug
429: Monthly quota exhausted (resets first of next month UTC)
500: Database or upstream issue. Email hello@runlocalai.co with the request ID.
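The status codes above map naturally onto a small client-side handler. The retry policy in this sketch is a suggestion, not documented behavior. It treats only 500 as possibly transient, because a 429 here signals an exhausted monthly quota that a quick retry cannot fix.

```python
ERROR_MEANINGS = {
    401: "Missing or invalid API key",
    403: "Key revoked",
    404: "Entity not found by slug",
    429: "Monthly quota exhausted",
    500: "Database or upstream issue",
}


def should_retry(status):
    """Assumed policy: only a 500 is worth retrying; the other codes need a fix first."""
    return status == 500


def describe(status):
    """Return a human-readable explanation for a documented status code."""
    return ERROR_MEANINGS.get(status, f"Undocumented status {status}")


print(describe(429), should_retry(429))  # prints "Monthly quota exhausted False"
```

When a 500 persists after a retry or two, keep the request ID from the response so it can be included in the support email.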
Building something with this API? Tell us — we'll feature good integrations on the homepage.