OpenRouter
Unified API marketplace for 300+ LLMs across 60+ providers — the "Stripe for inference," no markup on provider prices.
1. Core Product / Service
OpenRouter is a router/aggregator sitting in front of every major LLM provider. One API key, one OpenAI-compatible endpoint (/v1/chat/completions), and access to 300+ active models from 60+ providers as of 2025 [1].
Key capabilities:
- OpenAI-compatible API — drop-in replacement for the `openai` SDK; switching models = changing a string.
- Automatic failover — if the primary provider 5xx's, the request transparently re-routes to a backup provider for the same model.
- Routing modes — cheapest available, lowest latency, highest throughput, or pinned-provider.
- BYOK (Bring Your Own Key) — users can attach their own provider keys; OpenRouter charges 5% of the equivalent OpenRouter price after the first 1M requests/month [2].
- Credits + free tier — pay-as-you-go credits; some open-weight models (DeepSeek, Llama, GLM) have free-tier daily quotas.
- Caching / KV cache pass-through — passes through provider-side KV cache discounts (e.g. DeepSeek input-cache hit pricing).
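The "switching models = changing a string" claim can be sketched with nothing but the standard library. The endpoint path and headers follow the OpenAI convention the docs describe; the base URL and model slugs here are illustrative assumptions, not verified values:

```python
import json

# Assumed OpenRouter base URL; the endpoint path follows the OpenAI convention.
OPENROUTER_BASE = "https://openrouter.ai/api/v1"

def build_chat_request(model: str, prompt: str, api_key: str):
    """Build an OpenAI-style chat-completion request for OpenRouter.

    Switching providers is just a different `model` string; everything
    else about the request is unchanged.
    """
    url = f"{OPENROUTER_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

# Same code path, different model string:
req_deepseek = build_chat_request("deepseek/deepseek-v3.2", "hi", "sk-or-...")
req_glm = build_chat_request("z-ai/glm-5.1", "hi", "sk-or-...")
```

Any HTTP client can then POST `body` to `url` with `headers`; with the official `openai` SDK the same effect is a one-line `base_url` override at client construction.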
It also publishes a public State of AI dataset — usage-share rankings across all routed traffic, which has become an industry-standard benchmark for "which models are people actually using" [3].
2. Target Users & Pain Points
Primary user: developers building LLM apps who don't want vendor lock-in.
Pain points solved:
- Billing fragmentation — instead of 6+ provider invoices (Anthropic, OpenAI, Google, DeepSeek, Moonshot, Together…), one credit balance.
- Provider switching cost — testing a new model goes from "open new account, get API key, integrate new SDK" to "change one model string."
- Reliability / failover — single provider outage no longer kills the app.
- Rate-limit pooling — OpenRouter's aggregate quota with each provider is far higher than what individuals can negotiate.
- Discovery — model cards with live pricing, latency, and throughput stats let devs shop without manual benchmarking.
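The reliability point is easiest to see by contrast with the retry loop a single-provider client must hand-roll. A minimal sketch of that loop, where `call` stands in for any provider SDK call (this is what OpenRouter's server-side failover does for you, not code from its docs):

```python
def complete_with_fallback(models, call):
    """Try each model string in order until one provider succeeds.

    `call` is any function (model -> response) that raises on provider
    errors. OpenRouter runs the equivalent of this loop server-side,
    re-routing a failed request to a backup provider for the same model.
    """
    last_err = None
    for model in models:
        try:
            return call(model)
        except Exception as err:
            last_err = err  # provider down or rate-limited; try the next one
    raise RuntimeError(f"all providers failed: {last_err}")
```

With a router in front, the application passes one model string and this loop disappears from client code entirely.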
Scale signal: ~2.5M users, 150K+ monthly actives, >1M developers lifetime, >50% of usage from outside the US [3][4].
3. Competitive Landscape
| Feature | OpenRouter | together-ai | portkey-ai | eden-ai |
|---|---|---|---|---|
| Primary mode | Aggregator/router | Inference provider (own GPUs) + router | Gateway + observability | Aggregator |
| Model count | 300+ | ~200 (mostly OSS) | 50+ via passthrough | 100+ (incl. non-LLM AI) |
| Hosts own models | No (pure router) | Yes (own H100/H200 fleet) | No | No |
| Free tier | Generous (daily quota on OSS models) | Limited credits | None (paid SaaS) | Limited |
| Markup on inference | 0% (5.5% fee on credit purchases) | Sets own price (margin built in) | 0% (charges per-request) | Has markup |
| Observability | Basic (logs, stats) | Basic | Enterprise (the moat) | Basic |
| Open-source SDK | OpenAI-compatible | Their own + OpenAI-compat | Multi-provider SDK | Multi-provider SDK |
Differentiation:
- vs. together-ai: Together hosts its own GPUs and competes on inference price + speed for OSS models. OpenRouter doesn't run hardware — it's a pure marketplace, including Together as one of its providers.
- vs. portkey-ai: Portkey leads on enterprise observability/governance (logs, policies, PII redaction). OpenRouter's observability is basic; the moat is breadth + free tier + community traffic.
- vs. eden-ai: Eden bundles non-LLM AI (vision, OCR, speech) with LLMs. OpenRouter is LLM-pure but deeper.
OpenRouter's actual moat: distribution (every "open source ChatGPT clone" defaults to OpenRouter) + the rankings dataset (publishing usage share creates a flywheel where providers compete to be on the leaderboard).
4. Unique Observations
Jimmy has been running OpenRouter as the default LLM provider in hermes-openrouter-models and has tracked behavior over months:
- Default model: `deepseek/deepseek-v3.2`. Most aliases (`qwen`, `qwen-flash`, `deepseek`, `glm`) resolve through OpenRouter rather than direct provider connections (local: `daily_log-2026-04-04.md`).
- GLM 5.1 update: Hermes config tracks `z-ai/glm-5.1` with a 202k context window via OpenRouter routing.
- Kimi via OpenRouter: Moonshot's Kimi K2 series is on OpenRouter and DeepInfra; third-party inference is sometimes cheaper than Moonshot direct because providers compete on margin (local: `2026-04-01-diary.md`). For coding workloads Jimmy switched to `kimi-coding` via direct Moonshot auth (see hermes-openrouter-models) — direct providers can offer prompt-caching discounts the OpenRouter passthrough doesn't always preserve.
- Latency tax is real: routing through OpenRouter adds noticeable latency vs. direct. In Jimmy's TTFT benchmarks, MiniMax M2.7 via OpenRouter was 1.8s vs. Gemini 3.1 Pro direct at 1.1s (local: `2026-04-02-diary-claudecode.md`). For latency-critical Claude Code sessions, direct provider auth wins.
- Tempo MPP angle: Tempo's Money-Per-Prompt service exposes OpenRouter at `https://openrouter.mpp.tempo.xyz/v1/chat/completions`, enabling crypto-native (x402-style) per-call payment with no Stripe account. Open question Jimmy has tracked: whether KV cache discounts pass through this proxy chain (see claude-code-sessions usage notes).
- Provider economics insight: OpenRouter's 0% inference markup is the strategic anchor — they take a ~5% credit-purchase fee instead, which is structurally better for trust (providers don't see a competitor undercutting them). This is what made OpenRouter the default for the Cline / Continue / OpenWebUI ecosystem.
- The "OpenRouter rankings as quality signal" observation: in agent payment / ai-inference-engines design discussions, Jimmy has flagged OpenRouter's per-model latency + uptime stats as a de facto reputation system for inference providers — i.e. it's already doing what an agent-payment QoS layer would need (local: `daily_log-2026-04-04.md`).
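The "rankings as reputation system" idea above can be made concrete with a toy scoring function over the per-provider stats OpenRouter publishes (uptime, latency, throughput). The field names and weights here are invented for illustration, not anything OpenRouter or an agent-payment layer actually uses:

```python
def provider_score(uptime: float, p50_latency_s: float, throughput_tps: float) -> float:
    """Toy reputation score from three published per-provider stats:
    uptime fraction, median latency (seconds), tokens/sec throughput.
    Weights are arbitrary illustration.
    """
    if not 0.0 <= uptime <= 1.0:
        raise ValueError("uptime must be a fraction in [0, 1]")
    # Reward uptime heavily, penalize latency, mildly reward throughput.
    return 100.0 * uptime - 10.0 * p50_latency_s + 0.1 * throughput_tps

# A slower but near-perfectly-available provider can outrank a fast flaky one:
steady = provider_score(uptime=0.999, p50_latency_s=1.8, throughput_tps=60)
flaky = provider_score(uptime=0.90, p50_latency_s=1.1, throughput_tps=60)
```

The point of the sketch is only that a QoS layer needs exactly these inputs, and OpenRouter already collects and publishes them across all routed traffic.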
5. Financials / Funding
- June 2025 — Seed + Series A combined: $40M (some sources $40.5M), led by Andreessen Horowitz (Seed) and Menlo Ventures (Series A); participation from Sequoia, Figma, and angels including Fred Ehrsam. Valuation reported at ~$500M [5][6][7].
- In talks (per Sacra, 2026): $120M round at $1.3B valuation — not yet confirmed closed [3].
- Inference spend processed (annualized run-rate):
- Oct 2024: $10M
- May 2025: $100M+
- by 2026: continuing to scale [3]
- OpenRouter's own revenue (annualized, Sacra estimate):
- May 2025: ~$5M
- Oct 2025: ~$10M
- Early 2026: ~$50M [3]
- Token throughput: ~5T tokens/week (Apr 2025) → >20T tokens/week (Apr 2026), ~4× YoY; >1T tokens/day by late 2025; cumulative dataset crossed 100T tokens for the State-of-AI report [1][3].
- Revenue model: 0% markup on inference; 5.5% fee on credit purchases ($0.80 min); 5% on crypto top-ups; 5% on BYOK above 1M req/month [2].
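The fee structure in the last bullet is easy to mis-read as a markup on inference. A small sketch of the effective cost per top-up path, using only the figures stated above (the function names are mine):

```python
def card_topup_fee(amount_usd: float) -> float:
    """Fee on a card credit purchase: 5.5% with a $0.80 minimum."""
    return max(0.055 * amount_usd, 0.80)

def crypto_topup_fee(amount_usd: float) -> float:
    """Fee on a crypto top-up: flat 5%."""
    return 0.05 * amount_usd

def effective_markup(amount_usd: float, fee: float) -> float:
    """Inference itself carries 0% markup, so the only effective markup
    is the top-up fee amortized over the credits it buys."""
    return fee / amount_usd
```

So a $100 card top-up costs $5.50 in fees (5.5% effective), while a $10 top-up pays the $0.80 minimum (8% effective): small top-ups are where the minimum bites.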
6. People & Relationships
- Founders:
- Alex Atallah — CEO. Former co-founder & CTO of OpenSea (the dominant NFT marketplace, 2018–2022). Stepped down from OpenSea in July 2022 "to build something zero-to-one"; founded OpenRouter in 2023 [6].
- Louis Vichy — co-founder [6].
- Lead investors: Andreessen Horowitz (Seed lead, also published the "100T token State of AI" co-report [4]), Menlo Ventures (Series A lead), Sequoia Capital, Figma, Fred Ehrsam (Coinbase / Paradigm) [5][6].
- Top providers (partners) — non-exhaustive:
- deepseek — among the highest-volume models on the platform (DeepSeek V3.2 / R1 routinely top the rankings).
- kimi (Moonshot) — Kimi K2 series available via OpenRouter; sometimes cheaper than direct.
- together-ai — listed as one of the inference backends for OSS models.
- Anthropic, OpenAI, Google (Gemini), xAI, Mistral, Meta (Llama), Z.ai (GLM), MiniMax, Alibaba (Qwen), DeepInfra, Fireworks, Groq — all plug in as routable providers.
Sources
[1] OpenRouter, "State of AI 2025: 100T Token LLM Usage Study," https://openrouter.ai/state-of-ai (2026-05-09)
[2] OpenRouter Docs, "FAQ — Pricing & Fees," https://openrouter.ai/docs/faq (2026-05-09)
[3] Sacra, "OpenRouter revenue, valuation & funding," https://sacra.com/c/openrouter/ (2026-05-09)
[4] Andreessen Horowitz, "Investing in OpenRouter" + "State of AI: 100T Token Study," https://a16z.com/announcement/investing-in-openrouter/ (2026-05-09)
[5] GlobeNewswire, "OpenRouter raises $40 million to scale up multi-model inference for enterprise," 2025-06-25, https://www.globenewswire.com/news-release/2025/06/25/3105125/0/en/OpenRouter-raises-40-million-to-scale-up-multi-model-inference-for-enterprise.html (2026-05-09)
[6] The Block, "OpenSea co-founder Alex Atallah raises $40 million for AI startup OpenRouter," https://www.theblock.co/post/360093/opensea-co-founder-alex-atallah-raises-40-million-for-ai-startup-openrouter (2026-05-09)
[7] Orrick, "AI Inference at Scale: OpenRouter Raises Series Seed and Series A Financing," https://www.orrick.com/en/News/2025/06/AI-Inference-at-Scale-OpenRouter-Raises-Series-Seed-and-Series-A-Financing (2026-05-09)
Local sources:
- `raw/2026-04-01-diary.md` — Kimi via OpenRouter cheaper than direct; OpenRouter token-counting methodology (input+output combined, cache-aware)
- `raw/2026-04-02-diary-claudecode.md` — TTFT benchmark MiniMax-M2.7-via-OpenRouter (1.8s) vs. direct providers; gpt-4o-mini cost via Tempo MPP
- `raw/daily_log-2026-04-04.md` + `raw/diary-claudecode-2026-04-04.md` — Tempo `openrouter.mpp.tempo.xyz` proxy, KV cache pass-through question, agent-payment QoS = OpenRouter rankings analogy
- `raw/jclaw-2026-04-04.md` — model-routing entity notes