Company

RadixArk

The commercial vehicle for SGLang. Open-source inference engine + enterprise hosting + Miles RL post-training framework; Accel/Spark lead; $100M seed / $400M valuation.

1. Core Product / Service

  • SGLang (open-source inference engine): originated in 2023 from UC Berkeley LMSYS (Ion Stoica's lab); the core innovation is RadixAttention — automatic token-level KV-cache reuse built on a radix tree, achieving higher prefix hit rates than vLLM's block-level hashing in multi-turn conversations and structured outputs [4]. Currently deployed across hundreds of thousands of GPUs, producing trillions of tokens per day, and used by Google, Microsoft, NVIDIA, Oracle, AMD, Nebius, LinkedIn, xAI, and Thinking Machines Lab [2].
  • Miles: enterprise-facing RL post-training framework for LLMs/VLMs, forked from slime and co-evolving with it, open-sourced at github.com/radixark/miles [3]. Tightly coupled with SGLang's inference loop; the headline feature is a fast RL training loop.
  • Enterprise hosted inference (commercial version): managed inference + hardware adaptation + SLA based on SGLang + Miles.

Performance comparison (H100, ref. 2026-04-07 notes): SGLang ~16,200 tok/s vs vLLM ~12,500 tok/s [local].
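
The RadixAttention idea above can be sketched as a toy token-level prefix cache. This is a hypothetical simplification for illustration only: SGLang's actual implementation uses a compressed radix tree over KV-cache pages with reference counting and eviction, none of which appears here.

```python
# Toy token-level prefix cache illustrating the RadixAttention idea:
# match a new request against previously served token sequences and
# reuse the KV entries of every shared leading token.
# Simplified sketch: an uncompressed trie, no KV storage, no eviction.

class TrieNode:
    def __init__(self):
        self.children = {}  # token id -> TrieNode

class PrefixCache:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, tokens):
        """Record a served request's token sequence for later reuse."""
        node = self.root
        for t in tokens:
            node = node.children.setdefault(t, TrieNode())

    def match_prefix(self, tokens):
        """Return the length of the longest cached prefix, i.e. how many
        leading tokens could skip recomputation."""
        node, matched = self.root, 0
        for t in tokens:
            if t not in node.children:
                break
            node = node.children[t]
            matched += 1
        return matched

cache = PrefixCache()
cache.insert([1, 2, 3, 4, 5])            # turn 1 of a conversation
print(cache.match_prefix([1, 2, 3, 9]))  # -> 3: three leading tokens reusable
```

The point of matching at token granularity is that a follow-up request reuses every shared leading token, rather than only whole fixed-size cache blocks.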

2. Target Users & Pain Points

  • Target customers: enterprises self-hosting LLMs, model companies, new hardware vendors (needing day-0 SGLang support).
  • Pain points:
    • High repeated-prefix compute cost in multi-turn / agent workflows → RadixAttention solves this.
    • Models iterate fast (DeepSeek V3.x, Kimi K2.x, Llama 4), in-house inference teams can't keep up → use SGLang for day-0 support.
    • RL post-training infra and inference engine are fragmented → Miles + SGLang as one.
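
To make the first pain point concrete, here is a hypothetical back-of-envelope comparison of how many prefix tokens each matching scheme can reuse. The block size of 16 and the prefix lengths are illustrative assumptions, not measured figures.

```python
# Back-of-envelope comparison (hypothetical numbers): token-level matching
# reuses every shared token; block-level matching reuses only whole blocks,
# losing the partial-block tail of the shared prefix.

BLOCK = 16  # illustrative block size

def token_level_hit(shared_prefix_len):
    # every shared leading token is reusable
    return shared_prefix_len

def block_level_hit(shared_prefix_len):
    # only fully shared blocks are reusable
    return (shared_prefix_len // BLOCK) * BLOCK

for shared in (37, 100, 1000):
    print(shared, token_level_hit(shared), block_level_hit(shared))
# e.g. a 37-token shared prefix: 37 reusable at token level, 32 at block level
```

The gap per request is bounded by one block, but in agent workflows with many short shared prefixes the lost tails add up.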

Officially recommended binding relationships (2026-04): DeepSeek V3/R1/V3-0324 day-0 support; one of the recommended engines for Meta Llama 4, Mistral Large 3, and Moonshot Kimi K2/K2.5 [local].

3. Competitive Landscape

|  | RadixArk (SGLang) | inferact (vLLM) | Modular (MAX + BentoML) | Fireworks AI | together-ai |
|---|---|---|---|---|---|
| Engine | SGLang (open source) | vLLM (open source) | MAX (proprietary) | Proprietary, closed source | Proprietary + multiple OSS |
| Valuation | $400M | $800M | $1.6B | ~$10B+ | Mid-high billions |
| Latest round | Seed, $100M (2026-05) | Seed, $150M (2026-01) | Late stage | Late stage | |
| Lead | Accel + Spark | a16z + Lightspeed | | | |
| Core innovation | RadixAttention (token-level KV reuse) | PagedAttention (block hashing) | Mojo + compilation stack | Closed-source optimization | Closed source + OSS integration |
| Hardware coverage | Mainstream NVIDIA/AMD, gradually expanding | NVIDIA/AMD/TPU/Gaudi/Neuron | Proprietary compiler stack | NVIDIA-primary | Multiple |

Differentiation summary: vLLM wins on breadth, community, operational simplicity; SGLang wins on prefix cache efficiency, structured outputs, agent workflows. The two are a duopoly; open-source stars vLLM ~65K vs SGLang ~16K [local].

4. Unique Observations

  • RL infra integration is the real moat: Inferact has no Miles-equivalent RL post-training framework. Among the players in ai-inference-engines, the only one-stop vendor covering inference + RL post-training is RadixArk. Customers don't just deploy models, they continuously train new versions — once coupled, switching is hard.
  • DeepSeek chose not to build its own commercialization: for V3/R1, DeepSeek contributed optimizations back to vLLM rather than spinning out its own inference company. Essentially, model companies don't want to be distracted by infra (ref. 2026-04-07 notes). This strategy gives SGLang/RadixArk room to operate as "day-0 adaptation + hosting" middleware.
  • Unusual investor composition: the seed round simultaneously took direct investment from NVIDIA (NVentures), AMD, and MediaTek — three chip vendors — plus angel checks from Intel CEO Lip-Bu Tan / Broadcom CEO Hock Tan. Hardware neutrality is almost forcibly written into the shareholder structure, a subtle differentiation from a16z-camp-bound Inferact.
  • See ai-inference-engines, gpu-kernel-optimization.

5. Financials / Funding

| Round | Date | Amount | Valuation | Lead | Follow |
|---|---|---|---|---|---|
| Seed | 2026-05-05 | $100M | $400M post-money | Accel + Spark Capital (co-lead) | NVentures (NVIDIA), AMD, MediaTek, Salience Capital, HOF Capital, Walden Catalyst, A&E Investments, LDV Partners, WTT Investment [2] |

Angel investor lineup: Igor Babuschkin (xAI co-founder), Lip-Bu Tan (Intel CEO), Hock Tan (Broadcom CEO), John Schulman (OpenAI / Thinking Machines), Soumith Chintala (PyTorch / Thinking Machines CTO), Olivier Pomel (Datadog), Thomas Wolf (Hugging Face), William Fedus (Periodic Labs), Robert Nishihara (Anyscale), Logan Kilpatrick (Gemini Product Lead) [2].

Note: in 2026-01, TechCrunch reported a rumored $400M valuation pre-launch; the 2026-05 launch confirmed the terms as a $100M seed at a $400M post-money valuation [1][2].

6. People & Relationships

  • Founders:
    • Ying Sheng (CEO) — former xAI engineer, one of the main authors of SGLang, UC Berkeley.
    • Banghua Zhu (co-founder) — former NVIDIA systems background.
  • Academic origin: UC Berkeley LMSYS (Ion Stoica's lab), which also incubated vLLM → inferact.
  • Core investors: Accel (lead), Spark Capital (co-lead), NVIDIA NVentures, AMD, MediaTek, HOF Capital, Walden Catalyst.
  • Direct competitor: inferact (vLLM commercialization, same-lab rival).
  • Ecosystem customers (publicly disclosed SGLang usage): xAI, Google, Microsoft, NVIDIA, Oracle, Nebius, LinkedIn, Thinking Machines Lab; model-side official recommendations from DeepSeek and Kimi (Moonshot).
  • Related infra entities: together-ai (inference API platform, partially based on vLLM/SGLang), runpod (GPU cloud, often used as the SGLang deployment substrate), openrouter (routing layer, SGLang as downstream backend).

Sources

  • [1] TechCrunch, "Sources: Project SGLang spins out as RadixArk with $400M valuation as inference market explodes", 2026-01-21
  • [2] BusinessWire, "RadixArk Launches with $100M in Seed Funding Led by Accel...", 2026-05-05
  • [3] GitHub radixark/miles README (accessed 2026-05-09)
  • [4] Accel, "Investing in RadixArk: Building the Open Infrastructure for AI" (accessed 2026-05-09)
  • local: 2026-04-01-diary.md (vLLM/SGLang baseline comparison)
  • local: daily_log-2026-04-08.md (Inferact / RadixArk session notes, performance data, DeepSeek strategic analysis)
  • local: ai-agent-platforms.md
Last compiled: 2026-05-09