OpenRouter
Unified API marketplace for 300+ LLMs across 60+ providers — the "Stripe for inference," no markup on provider prices.
1. Core Product / Service
OpenRouter is a router/aggregator sitting in front of every major LLM provider. One API key, one OpenAI-compatible endpoint (/v1/chat/completions), and access to 300+ active models from 60+ providers as of 2025 [1].
Key capabilities:
- OpenAI-compatible API — drop-in replacement for the `openai` SDK; switching models = changing a string.
- Automatic failover — if the primary provider 5xx's, the request transparently re-routes to a backup provider for the same model.
- Routing modes — cheapest available, lowest latency, highest throughput, or pinned-provider.
- BYOK (Bring Your Own Key) — users can attach their own provider keys; OpenRouter charges 5% of the equivalent OpenRouter price after the first 1M requests/month [2].
- Credits + free tier — pay-as-you-go credits; some open-weight models (DeepSeek, Llama, GLM) have free-tier daily quotas.
- Caching / KV cache pass-through — passes through provider-side KV cache discounts (e.g. DeepSeek input-cache hit pricing).
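The "switching models = changing a string" claim can be sketched with nothing but the standard library. The endpoint path and headers follow the OpenAI convention the docs describe; the base URL and model slugs here are illustrative assumptions, not verified values:

```python
import json

# Assumed OpenRouter base URL; the endpoint path follows the OpenAI convention.
OPENROUTER_BASE = "https://openrouter.ai/api/v1"

def build_chat_request(model: str, prompt: str, api_key: str):
    """Build an OpenAI-style chat-completion request for OpenRouter.

    Switching providers is just a different `model` string; everything
    else about the request is unchanged.
    """
    url = f"{OPENROUTER_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

# Same code path, different model string:
req_deepseek = build_chat_request("deepseek/deepseek-v3.2", "hi", "sk-or-...")
req_glm = build_chat_request("z-ai/glm-5.1", "hi", "sk-or-...")
```

Any HTTP client can then POST `body` to `url` with `headers`; with the official `openai` SDK the same effect is a one-line `base_url` override at client construction.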
It also publishes a public State of AI dataset — usage-share rankings across all routed traffic, which has become an industry-standard benchmark for "which models are people actually using" [3].
2. Target Users & Pain Points
Primary user: developers building LLM apps who don't want vendor lock-in.
Pain points solved:
- Billing fragmentation — instead of 6+ provider invoices (Anthropic, OpenAI, Google, DeepSeek, Moonshot, Together…), one credit balance.
- Provider switching cost — testing a new model goes from "open new account, get API key, integrate new SDK" to "change one model string."
- Reliability / failover — single provider outage no longer kills the app.
- Rate-limit pooling — OpenRouter's aggregate quota with each provider is far higher than what individuals can negotiate.
- Discovery — model cards with live pricing, latency, and throughput stats let devs shop without manual benchmarking.
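The reliability point is easiest to see by contrast with the retry loop a single-provider client must hand-roll. A minimal sketch of that loop, where `call` stands in for any provider SDK call (this is what OpenRouter's server-side failover does for you, not code from its docs):

```python
def complete_with_fallback(models, call):
    """Try each model string in order until one provider succeeds.

    `call` is any function (model -> response) that raises on provider
    errors. OpenRouter runs the equivalent of this loop server-side,
    re-routing a failed request to a backup provider for the same model.
    """
    last_err = None
    for model in models:
        try:
            return call(model)
        except Exception as err:
            last_err = err  # provider down or rate-limited; try the next one
    raise RuntimeError(f"all providers failed: {last_err}")
```

With a router in front, the application passes one model string and this loop disappears from client code entirely.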
Scale signal: ~2.5M users, 150K+ monthly actives, >1M developers lifetime, >50% of usage from outside the US [3][4].
3. Competitive Landscape
| Feature | OpenRouter | together-ai | portkey-ai | eden-ai |
|---|---|---|---|---|
| Primary mode | Aggregator/router | Inference provider (own GPUs) + router | Gateway + observability | Aggregator |
| Model count | 300+ | ~200 (mostly OSS) | 50+ via passthrough | 100+ (incl. non-LLM AI) |
| Hosts own models | No (pure router) | Yes (own H100/H200 fleet) | No | No |
| Free tier | Generous (daily quota on OSS models) | Limited credits | None (paid SaaS) | Limited |
| Markup on inference | 0% (5.5% fee on credit purchases) | Sets own price (margin built in) | 0% (charges per-request) | Has markup |
| Observability | Basic (logs, stats) | Basic | Enterprise (the moat) | Basic |
| Open-source SDK | OpenAI-compatible | Their own + OpenAI-compat | Multi-provider SDK | Multi-provider SDK |
Differentiation:
- vs. together-ai: Together hosts its own GPUs and competes on inference price + speed for OSS models. OpenRouter doesn't run hardware — it's a pure marketplace, including Together as one of its providers.
- vs. portkey-ai: Portkey leads on enterprise observability/governance (logs, policies, PII redaction). OpenRouter's observability is basic; the moat is breadth + free tier + community traffic.
- vs. eden-ai: Eden bundles non-LLM AI (vision, OCR, speech) with LLMs. OpenRouter is LLM-pure but deeper.
OpenRouter's actual moat: distribution (every "open source ChatGPT clone" defaults to OpenRouter) + the rankings dataset (publishing usage share creates a flywheel where providers compete to be on the leaderboard).
4. Unique Observations
Jimmy has been running OpenRouter as the default LLM provider in hermes-openrouter-models and has tracked behavior over months:
- Default model: `deepseek/deepseek-v3.2`. Most aliases (`qwen`, `qwen-flash`, `deepseek`, `glm`) resolve through OpenRouter rather than direct provider connections (local: `daily_log-2026-04-04.md`).
- GLM 5.1 update: Hermes config tracks `z-ai/glm-5.1` with a 202k context window via OpenRouter routing.
- Kimi via OpenRouter: Moonshot's Kimi K2 series is on OpenRouter and DeepInfra; third-party inference is sometimes cheaper than Moonshot direct because providers compete on margin (local: `2026-04-01-diary.md`). For coding workloads Jimmy switched to `kimi-coding` via direct Moonshot auth (see hermes-openrouter-models) — direct providers can offer prompt-caching discounts the OpenRouter passthrough doesn't always preserve.
- Latency tax is real: routing through OpenRouter adds noticeable latency vs. direct. In Jimmy's TTFT benchmarks, MiniMax M2.7 via OpenRouter was 1.8s vs. Gemini 3.1 Pro direct at 1.1s (local: `2026-04-02-diary-claudecode.md`). For latency-critical Claude Code sessions, direct provider auth wins.
- Tempo MPP angle: Tempo's Money-Per-Prompt service exposes OpenRouter at `https://openrouter.mpp.tempo.xyz/v1/chat/completions`, enabling crypto-native (x402-style) per-call payment with no Stripe account. Open question Jimmy has tracked: whether KV cache discounts pass through this proxy chain (see claude-code-sessions usage notes).
- Provider economics insight: OpenRouter's 0% inference markup is the strategic anchor — they take a ~5% credit-purchase fee instead, which is structurally better for trust (providers don't see a competitor undercutting them). This is what made OpenRouter the default for the Cline / Continue / OpenWebUI ecosystem.
- The "OpenRouter rankings as quality signal" observation: in agent payment / ai-inference-engines design discussions, Jimmy has flagged OpenRouter's per-model latency + uptime stats as a de facto reputation system for inference providers — i.e. it's already doing what an agent-payment QoS layer would need (local: `daily_log-2026-04-04.md`).
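The "rankings as reputation system" idea above can be made concrete with a toy scoring function over the per-provider stats OpenRouter publishes (uptime, latency, throughput). The field names and weights here are invented for illustration, not anything OpenRouter or an agent-payment layer actually uses:

```python
def provider_score(uptime: float, p50_latency_s: float, throughput_tps: float) -> float:
    """Toy reputation score from three published per-provider stats:
    uptime fraction, median latency (seconds), tokens/sec throughput.
    Weights are arbitrary illustration.
    """
    if not 0.0 <= uptime <= 1.0:
        raise ValueError("uptime must be a fraction in [0, 1]")
    # Reward uptime heavily, penalize latency, mildly reward throughput.
    return 100.0 * uptime - 10.0 * p50_latency_s + 0.1 * throughput_tps

# A slower but near-perfectly-available provider can outrank a fast flaky one:
steady = provider_score(uptime=0.999, p50_latency_s=1.8, throughput_tps=60)
flaky = provider_score(uptime=0.90, p50_latency_s=1.1, throughput_tps=60)
```

The point of the sketch is only that a QoS layer needs exactly these inputs, and OpenRouter already collects and publishes them across all routed traffic.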
5. Financials / Funding
- June 2025 — Seed + Series A combined: $40M (some sources $40.5M), led by Andreessen Horowitz (Seed) and Menlo Ventures (Series A); participation from Sequoia, Figma, and angels including Fred Ehrsam. Valuation reported at ~$500M [5][6][7].
- In talks (per Sacra, 2026): $120M round at $1.3B valuation — not yet confirmed closed [3].
- Inference spend processed (annualized run-rate):
- Oct 2024: $10M
- May 2025: $100M+
- by 2026: continuing to scale [3]
- OpenRouter's own revenue (annualized, Sacra estimate):
- May 2025: ~$5M
- Oct 2025: ~$10M
- Early 2026: ~$50M [3]
- Token throughput: ~5T tokens/week (Apr 2025) → >20T tokens/week (Apr 2026), ~4× YoY; >1T tokens/day by late 2025; cumulative dataset crossed 100T tokens for the State-of-AI report [1][3].
- Revenue model: 0% markup on inference; 5.5% fee on credit purchases ($0.80 min); 5% on crypto top-ups; 5% on BYOK above 1M req/month [2].
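The fee structure in the last bullet is easy to mis-read as a markup on inference. A small sketch of the effective cost per top-up path, using only the figures stated above (the function names are mine):

```python
def card_topup_fee(amount_usd: float) -> float:
    """Fee on a card credit purchase: 5.5% with a $0.80 minimum."""
    return max(0.055 * amount_usd, 0.80)

def crypto_topup_fee(amount_usd: float) -> float:
    """Fee on a crypto top-up: flat 5%."""
    return 0.05 * amount_usd

def effective_markup(amount_usd: float, fee: float) -> float:
    """Inference itself carries 0% markup, so the only effective markup
    is the top-up fee amortized over the credits it buys."""
    return fee / amount_usd
```

So a $100 card top-up costs $5.50 in fees (5.5% effective), while a $10 top-up pays the $0.80 minimum (8% effective): small top-ups are where the minimum bites.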
6. People & Relationships
- Founders:
- Alex Atallah — CEO. Former co-founder & CTO of OpenSea (the dominant NFT marketplace, 2018–2022). Stepped down from OpenSea in July 2022 "to build something zero-to-one"; founded OpenRouter in 2023 [6].
- Louis Vichy — co-founder [6].
- Lead investors: Andreessen Horowitz (Seed lead, also published the "100T token State of AI" co-report [4]), Menlo Ventures (Series A lead), Sequoia Capital, Figma, Fred Ehrsam (Coinbase / Paradigm) [5][6].
- Top providers (partners) — non-exhaustive:
- deepseek — among the highest-volume models on the platform (DeepSeek V3.2 / R1 routinely top the rankings).
- kimi (Moonshot) — Kimi K2 series available via OpenRouter; sometimes cheaper than direct.
- together-ai — listed as one of the inference backends for OSS models.
- Anthropic, OpenAI, Google (Gemini), xAI, Mistral, Meta (Llama), Z.ai (GLM), MiniMax, Alibaba (Qwen), DeepInfra, Fireworks, Groq — all plug in as routable providers.
Sources
[1] OpenRouter, "State of AI 2025: 100T Token LLM Usage Study," https://openrouter.ai/state-of-ai (2026-05-09)
[2] OpenRouter Docs, "FAQ — Pricing & Fees," https://openrouter.ai/docs/faq (2026-05-09)
[3] Sacra, "OpenRouter revenue, valuation & funding," https://sacra.com/c/openrouter/ (2026-05-09)
[4] Andreessen Horowitz, "Investing in OpenRouter" + "State of AI: 100T Token Study," https://a16z.com/announcement/investing-in-openrouter/ (2026-05-09)
[5] GlobeNewswire, "OpenRouter raises $40 million to scale up multi-model inference for enterprise," 2025-06-25, https://www.globenewswire.com/news-release/2025/06/25/3105125/0/en/OpenRouter-raises-40-million-to-scale-up-multi-model-inference-for-enterprise.html (2026-05-09)
[6] The Block, "OpenSea co-founder Alex Atallah raises $40 million for AI startup OpenRouter," https://www.theblock.co/post/360093/opensea-co-founder-alex-atallah-raises-40-million-for-ai-startup-openrouter (2026-05-09)
[7] Orrick, "AI Inference at Scale: OpenRouter Raises Series Seed and Series A Financing," https://www.orrick.com/en/News/2025/06/AI-Inference-at-Scale-OpenRouter-Raises-Series-Seed-and-Series-A-Financing (2026-05-09)
Local sources:
- `raw/2026-04-01-diary.md` — Kimi via OpenRouter cheaper than direct; OpenRouter token-counting methodology (input+output combined, cache-aware)
- `raw/2026-04-02-diary-claudecode.md` — TTFT benchmark MiniMax-M2.7-via-OpenRouter (1.8s) vs. direct providers; gpt-4o-mini cost via Tempo MPP
- `raw/daily_log-2026-04-04.md` + `raw/diary-claudecode-2026-04-04.md` — Tempo `openrouter.mpp.tempo.xyz` proxy, KV cache pass-through question, agent-payment QoS = OpenRouter rankings analogy
- `raw/jclaw-2026-04-04.md` — model-routing entity notes