RadixArk
The commercial vehicle for SGLang. Open-source inference engine + enterprise hosting + Miles RL post-training framework; Accel/Spark lead; $100M seed / $400M valuation.
1. Core Product / Service
- SGLang (open-source inference engine): originated in 2023 from UC Berkeley LMSYS (Ion Stoica's lab); the core innovation is RadixAttention — radix-tree-based token-level KV cache automatic reuse, with higher prefix hit rates than vLLM's block-level hashing in multi-turn conversations and structured outputs [4]. Currently deployed across hundreds of thousands of GPUs, producing trillions of tokens per day, used by Google, Microsoft, NVIDIA, Oracle, AMD, Nebius, LinkedIn, xAI, Thinking Machines Lab [2].
- Miles: enterprise-facing RL post-training framework for LLMs/VLMs, forked from slime and co-evolving with it, open-sourced at github.com/radixark/miles [3]. Tightly coupled with the SGLang inference loop; the headline feature is a fast RL training loop.
- Enterprise hosted inference (commercial version): managed inference + hardware adaptation + SLA, built on SGLang + Miles.
Performance comparison (H100, ref. 2026-04-07 notes): SGLang ~16,200 tok/s vs vLLM ~12,500 tok/s, a ~30% throughput advantage [local].
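The RadixAttention idea above can be illustrated with a toy token-level prefix cache (a trie-style sketch, not SGLang's actual implementation; all class and method names here are hypothetical):

```python
class RadixNode:
    """One node in a toy prefix tree keyed by single token IDs."""
    def __init__(self):
        self.children = {}   # token id -> RadixNode
        self.has_kv = False  # stand-in for an attached KV-cache entry

class ToyRadixCache:
    """Token-level prefix cache: insert served token sequences, then
    look up the longest cached prefix of a new request -- the tokens
    whose KV entries can be reused instead of recomputed."""
    def __init__(self):
        self.root = RadixNode()

    def insert(self, tokens):
        node = self.root
        for t in tokens:
            node = node.children.setdefault(t, RadixNode())
            node.has_kv = True

    def match_prefix(self, tokens):
        node, hit = self.root, 0
        for t in tokens:
            nxt = node.children.get(t)
            if nxt is None or not nxt.has_kv:
                break
            node, hit = nxt, hit + 1
        return hit  # number of leading tokens whose KV can be reused

# Multi-turn chat: turn 2 shares the system prompt + turn-1 history.
cache = ToyRadixCache()
cache.insert([1, 2, 3, 4, 5])            # turn 1: prompt + reply tokens
print(cache.match_prefix([1, 2, 3, 9]))  # → 3 tokens reusable
```

A production radix tree compresses runs of tokens into single edges; the one-token-per-node trie above keeps only the matching logic that makes token-level reuse finer-grained than block-level hashing.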
2. Target Users & Pain Points
- Target customers: enterprises self-hosting LLMs, model companies, new hardware vendors (needing day-0 SGLang support).
- Pain points:
- High repeated-prefix compute cost in multi-turn / agent workflows → RadixAttention solves this.
- Models iterate fast (DeepSeek V3.x, Kimi K2.x, Llama 4) and in-house inference teams can't keep up → rely on SGLang for day-0 support.
- RL post-training infra and inference engine are fragmented → Miles + SGLang as one.
Official model-vendor endorsements (as of 2026-04): day-0 support for DeepSeek V3/R1/V3-0324; listed as a recommended engine for Meta Llama 4, Mistral Large 3, and Moonshot Kimi K2/K2.5 [local].
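The third pain point (fragmented RL infra vs inference engine) can be sketched as a generic rollout loop; this is purely illustrative Python, and `engine`, `trainer`, `reward_fn`, and every method name are hypothetical stand-ins, not the Miles or SGLang API:

```python
def rl_post_training_loop(engine, trainer, reward_fn, prompts, steps):
    """Generic RL post-training loop: the inference engine produces
    rollouts, a reward function scores them, the trainer updates the
    policy, and refreshed weights flow straight back into the engine.
    All objects are hypothetical stand-ins for illustration only."""
    for _ in range(steps):
        # Rollout phase: sample completions from the current policy.
        completions = [engine.generate(p) for p in prompts]
        # Scoring phase: reward each (prompt, completion) pair.
        rewards = [reward_fn(p, c) for p, c in zip(prompts, completions)]
        # Update phase: e.g. a PPO/GRPO-style policy gradient step.
        new_weights = trainer.step(prompts, completions, rewards)
        # Tight coupling: push weights back without an export/import
        # round-trip between separate training and serving stacks.
        engine.load_weights(new_weights)
```

The claimed advantage of bundling Miles with SGLang is that the `load_weights` hand-off stays in-process rather than crossing a boundary between two separately operated systems.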
3. Competitive Landscape
| | RadixArk (SGLang) | inferact (vLLM) | Modular (MAX + BentoML) | Fireworks AI | together-ai |
|---|---|---|---|---|---|
| Engine | SGLang (open source) | vLLM (open source) | MAX (proprietary) | Proprietary closed source | Proprietary + multi OSS |
| Valuation | $400M | $800M | $1.6B | ~$10B+ | Mid-high billions |
| Seed | $100M (2026-05) | $150M (2026-01) | — | Late stage | Late stage |
| Lead | Accel + Spark | a16z + Lightspeed | — | — | — |
| Core innovation | RadixAttention (token-level KV reuse) | PagedAttention (block hashing) | Mojo + compilation stack | Closed-source optimization | Closed source + OSS integration |
| Hardware coverage | Mainstream NVIDIA/AMD, gradually expanding | NVIDIA/AMD/TPU/Gaudi/Neuron | Proprietary compiler stack | NVIDIA-primary | Multiple |
Differentiation summary: vLLM wins on breadth, community, and operational simplicity; SGLang wins on prefix-cache efficiency, structured outputs, and agent workflows. The two form a duopoly; GitHub stars: vLLM ~65K vs SGLang ~16K [local].
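The prefix-cache differentiation in the table reduces to a granularity argument: block-level hashing can only reuse whole, fully filled blocks, while token-level matching reuses any prefix length. A back-of-envelope sketch (the block size of 16 is illustrative, not vLLM's actual configuration):

```python
def reusable_tokens_block(shared_prefix_len, block_size=16):
    """Block-hash reuse: only completely filled blocks are reusable,
    so up to block_size - 1 shared tokens are recomputed."""
    return (shared_prefix_len // block_size) * block_size

def reusable_tokens_token_level(shared_prefix_len):
    """Token-level radix-tree reuse: every shared token is reusable."""
    return shared_prefix_len

shared = 1000  # e.g. system prompt + chat history shared across turns
print(reusable_tokens_block(shared))        # → 992 (62 full 16-token blocks)
print(reusable_tokens_token_level(shared))  # → 1000
```

The per-request gap is bounded by one block, so the advantage shows up mainly in workloads with many short, highly overlapping requests (multi-turn chat, agent tool-call loops) rather than in single long generations.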
4. Unique Observations
- RL infra integration is the real moat: Inferact has no Miles-equivalent RL post-training framework. Among the players tracked in ai-inference-engines, RadixArk is the only one covering inference + RL post-training in one stop. Customers don't just deploy models; they continuously train new versions, and once coupled, switching is hard.
- DeepSeek chose not to build its own commercialization: DeepSeek V3/R1 chose to contribute optimization back to vLLM rather than spin out its own inference company. Essentially, model companies don't want to be distracted by infra (ref. 2026-04-07 notes). This strategy actually gives SGLang/RadixArk room to do "day-0 adaptation + hosting" middleware.
- Unusual investor composition: the seed round simultaneously took direct investment from NVIDIA (NVentures), AMD, and MediaTek — three chip vendors — plus angel checks from Intel CEO Lip-Bu Tan / Broadcom CEO Hock Tan. Hardware neutrality is almost forcibly written into the shareholder structure, a subtle differentiation from a16z-camp-bound Inferact.
- See ai-inference-engines, gpu-kernel-optimization.
5. Financials / Funding
| Round | Date | Amount | Valuation | Lead | Follow |
|---|---|---|---|---|---|
| Seed | 2026-05-05 | $100M | $400M post-money | Accel + Spark Capital (co-lead) | NVentures (NVIDIA), AMD, MediaTek, Salience Capital, HOF Capital, Walden Catalyst, A&E Investments, LDV Partners, WTT Investment [2] |
Angel investor lineup: Igor Babuschkin (xAI co-founder), Lip-Bu Tan (Intel CEO), Hock Tan (Broadcom CEO), John Schulman (OpenAI / Thinking Machines), Soumith Chintala (PyTorch / Thinking Machines CTO), Olivier Pomel (Datadog), Thomas Wolf (Hugging Face), William Fedus (Periodic Labs), Robert Nishihara (Anyscale), Logan Kilpatrick (Gemini Product Lead) [2].
Note: in 2026-01, TechCrunch reported a rumored $400M valuation (pre-launch); at the 2026-05 launch this was confirmed as a $100M seed at $400M post-money [1][2].
6. People & Relationships
- Founders:
- Ying Sheng (CEO) — former xAI engineer, one of the main authors of SGLang, UC Berkeley.
- Banghua Zhu (co-founder) — former NVIDIA systems background.
- Academic origin: UC Berkeley LMSYS (Ion Stoica's lab), which also incubated vLLM → inferact.
- Core investors: Accel (lead), Spark Capital (co-lead), NVIDIA NVentures, AMD, MediaTek, HOF Capital, Walden Catalyst.
- Direct competitor: inferact (vLLM commercialization, same-lab rival).
- Ecosystem customers (publicly disclosed SGLang usage): xAI, Google, Microsoft, NVIDIA, Oracle, Nebius, LinkedIn, Thinking Machines Lab; model-side official recommendations from DeepSeek and Kimi.
- Related infra entities: together-ai (inference API platform, partially based on vLLM/SGLang), runpod (GPU cloud, often used as the SGLang deployment substrate), openrouter (routing layer, SGLang as downstream backend).
Sources
- [1] TechCrunch, "Sources: Project SGLang spins out as RadixArk with $400M valuation as inference market explodes", 2026-01-21
- [2] BusinessWire, "RadixArk Launches with $100M in Seed Funding Led by Accel...", 2026-05-05
- [3] GitHub, radixark/miles README (accessed 2026-05-09)
- [4] Accel, "Investing in RadixArk: Building the Open Infrastructure for AI" (accessed 2026-05-09)
- local: 2026-04-01-diary.md (vLLM/SGLang baseline comparison)
- local: daily_log-2026-04-08.md (Inferact / RadixArk session notes, performance data, DeepSeek strategic analysis)
- local: ai-agent-platforms.md