RadixArk
The commercial vehicle for SGLang. Open-source inference engine + enterprise hosting + Miles RL post-training framework; Accel/Spark lead; $100M seed / $400M valuation.
1. Core Product / Service
- SGLang (open-source inference engine): originated in 2023 from UC Berkeley LMSYS (Ion Stoica's lab); the core innovation is RadixAttention — radix-tree-based token-level KV cache automatic reuse, with higher prefix hit rates than vLLM's block-level hashing in multi-turn conversations and structured outputs [4]. Currently deployed across hundreds of thousands of GPUs, producing trillions of tokens per day, used by Google, Microsoft, NVIDIA, Oracle, AMD, Nebius, LinkedIn, xAI, Thinking Machines Lab [2].
- Miles: enterprise-facing RL post-training framework for LLMs/VLMs, forked from slime and co-evolving with it, open-sourced at github.com/radixark/miles [3]. Tightly coupled with the SGLang inference loop; the headline feature is a fast RL training loop.
- Enterprise hosted inference (commercial version): managed inference + hardware adaptation + SLA, built on SGLang + Miles.
Performance comparison (H100, ref. 2026-04-07 notes): SGLang ~16,200 tok/s vs vLLM ~12,500 tok/s, a ~30% throughput advantage [local].
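The RadixAttention idea above can be illustrated with a toy token-level prefix cache (a trie-style sketch, not SGLang's actual implementation; all class and method names here are hypothetical):

```python
class RadixNode:
    """One node in a toy prefix tree keyed by single token IDs."""
    def __init__(self):
        self.children = {}   # token id -> RadixNode
        self.has_kv = False  # stand-in for an attached KV-cache entry

class ToyRadixCache:
    """Token-level prefix cache: insert served token sequences, then
    look up the longest cached prefix of a new request -- the tokens
    whose KV entries can be reused instead of recomputed."""
    def __init__(self):
        self.root = RadixNode()

    def insert(self, tokens):
        node = self.root
        for t in tokens:
            node = node.children.setdefault(t, RadixNode())
            node.has_kv = True

    def match_prefix(self, tokens):
        node, hit = self.root, 0
        for t in tokens:
            nxt = node.children.get(t)
            if nxt is None or not nxt.has_kv:
                break
            node, hit = nxt, hit + 1
        return hit  # number of leading tokens whose KV can be reused

# Multi-turn chat: turn 2 shares the system prompt + turn-1 history.
cache = ToyRadixCache()
cache.insert([1, 2, 3, 4, 5])            # turn 1: prompt + reply tokens
print(cache.match_prefix([1, 2, 3, 9]))  # → 3 tokens reusable
```

A production radix tree compresses runs of tokens into single edges; the one-token-per-node trie above keeps only the matching logic that makes token-level reuse finer-grained than block-level hashing.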
2. Target Users & Pain Points
- Target customers: enterprises self-hosting LLMs, model companies, new hardware vendors (needing day-0 SGLang support).
- Pain points:
- High repeated-prefix compute cost in multi-turn / agent workflows → RadixAttention solves this.
- Models iterate fast (DeepSeek V3.x, Kimi K2.x, Llama 4) and in-house inference teams can't keep up → rely on SGLang for day-0 support.
- RL post-training infra and inference engine are fragmented → Miles + SGLang as one.
Official model-vendor endorsements (as of 2026-04): day-0 support for DeepSeek V3/R1/V3-0324; listed as a recommended engine for Meta Llama 4, Mistral Large 3, and Moonshot Kimi K2/K2.5 [local].
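The third pain point (fragmented RL infra vs inference engine) can be sketched as a generic rollout loop; this is purely illustrative Python, and `engine`, `trainer`, `reward_fn`, and every method name are hypothetical stand-ins, not the Miles or SGLang API:

```python
def rl_post_training_loop(engine, trainer, reward_fn, prompts, steps):
    """Generic RL post-training loop: the inference engine produces
    rollouts, a reward function scores them, the trainer updates the
    policy, and refreshed weights flow straight back into the engine.
    All objects are hypothetical stand-ins for illustration only."""
    for _ in range(steps):
        # Rollout phase: sample completions from the current policy.
        completions = [engine.generate(p) for p in prompts]
        # Scoring phase: reward each (prompt, completion) pair.
        rewards = [reward_fn(p, c) for p, c in zip(prompts, completions)]
        # Update phase: e.g. a PPO/GRPO-style policy gradient step.
        new_weights = trainer.step(prompts, completions, rewards)
        # Tight coupling: push weights back without an export/import
        # round-trip between separate training and serving stacks.
        engine.load_weights(new_weights)
```

The claimed advantage of bundling Miles with SGLang is that the `load_weights` hand-off stays in-process rather than crossing a boundary between two separately operated systems.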
3. Competitive Landscape
| | RadixArk (SGLang) | inferact (vLLM) | Modular (MAX + BentoML) | Fireworks AI | together-ai |
|---|---|---|---|---|---|
| Engine | SGLang (open source) | vLLM (open source) | MAX (proprietary) | Proprietary closed source | Proprietary + multi OSS |
| Valuation | $400M | $800M | $1.6B | ~$10B+ | Mid-high billions |
| Seed | $100M (2026-05) | $150M (2026-01) | — | Late stage | Late stage |
| Lead | Accel + Spark | a16z + Lightspeed | — | — | — |
| Core innovation | RadixAttention (token-level KV reuse) | PagedAttention (block hashing) | Mojo + compilation stack | Closed-source optimization | Closed source + OSS integration |
| Hardware coverage | Mainstream NVIDIA/AMD, gradually expanding | NVIDIA/AMD/TPU/Gaudi/Neuron | Proprietary compiler stack | NVIDIA-primary | Multiple |
Differentiation summary: vLLM wins on breadth, community, and operational simplicity; SGLang wins on prefix-cache efficiency, structured outputs, and agent workflows. The two form a duopoly; GitHub stars: vLLM ~65K vs SGLang ~16K [local].
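The prefix-cache differentiation in the table reduces to a granularity argument: block-level hashing can only reuse whole, fully filled blocks, while token-level matching reuses any prefix length. A back-of-envelope sketch (the block size of 16 is illustrative, not vLLM's actual configuration):

```python
def reusable_tokens_block(shared_prefix_len, block_size=16):
    """Block-hash reuse: only completely filled blocks are reusable,
    so up to block_size - 1 shared tokens are recomputed."""
    return (shared_prefix_len // block_size) * block_size

def reusable_tokens_token_level(shared_prefix_len):
    """Token-level radix-tree reuse: every shared token is reusable."""
    return shared_prefix_len

shared = 1000  # e.g. system prompt + chat history shared across turns
print(reusable_tokens_block(shared))        # → 992 (62 full 16-token blocks)
print(reusable_tokens_token_level(shared))  # → 1000
```

The per-request gap is bounded by one block, so the advantage shows up mainly in workloads with many short, highly overlapping requests (multi-turn chat, agent tool-call loops) rather than in single long generations.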
4. Unique Observations
- RL infra integration is the real moat: Inferact has no Miles-equivalent RL post-training framework. Among the players tracked in ai-inference-engines, RadixArk is the only one covering inference + RL post-training in one stop. Customers don't just deploy models; they continuously train new versions, and once coupled, switching is hard.
- DeepSeek chose not to build its own commercialization: DeepSeek V3/R1 chose to contribute optimization back to vLLM rather than spin out its own inference company. Essentially, model companies don't want to be distracted by infra (ref. 2026-04-07 notes). This strategy actually gives SGLang/RadixArk room to do "day-0 adaptation + hosting" middleware.
- Unusual investor composition: the seed round simultaneously took direct investment from NVIDIA (NVentures), AMD, and MediaTek — three chip vendors — plus angel checks from Intel CEO Lip-Bu Tan / Broadcom CEO Hock Tan. Hardware neutrality is almost forcibly written into the shareholder structure, a subtle differentiation from a16z-camp-bound Inferact.
- See ai-inference-engines, gpu-kernel-optimization.
5. Financials / Funding
| Round | Date | Amount | Valuation | Lead | Follow |
|---|---|---|---|---|---|
| Seed | 2026-05-05 | $100M | $400M post-money | Accel + Spark Capital (co-lead) | NVentures (NVIDIA), AMD, MediaTek, Salience Capital, HOF Capital, Walden Catalyst, A&E Investments, LDV Partners, WTT Investment [2] |
Angel investor lineup: Igor Babuschkin (xAI co-founder), Lip-Bu Tan (Intel CEO), Hock Tan (Broadcom CEO), John Schulman (OpenAI / Thinking Machines), Soumith Chintala (PyTorch / Thinking Machines CTO), Olivier Pomel (Datadog), Thomas Wolf (Hugging Face), William Fedus (Periodic Labs), Robert Nishihara (Anyscale), Logan Kilpatrick (Gemini Product Lead) [2].
Note: in 2026-01, TechCrunch reported a rumored $400M valuation (pre-launch); at the 2026-05 launch this was confirmed as a $100M seed at $400M post-money [1][2].
6. People & Relationships
- Founders:
- Ying Sheng (CEO) — former xAI engineer, one of the main authors of SGLang, UC Berkeley.
- Banghua Zhu (co-founder) — former NVIDIA systems background.
- Academic origin: UC Berkeley LMSYS (Ion Stoica's lab), which also incubated vLLM → inferact.
- Core investors: Accel (lead), Spark Capital (co-lead), NVIDIA NVentures, AMD, MediaTek, HOF Capital, Walden Catalyst.
- Direct competitor: inferact (vLLM commercialization, same-lab rival).
- Ecosystem customers (publicly disclosed SGLang usage): xAI, Google, Microsoft, NVIDIA, Oracle, Nebius, LinkedIn, Thinking Machines Lab; model-side official recommendations from DeepSeek and Kimi.
- Related infra entities: together-ai (inference API platform, partially based on vLLM/SGLang), runpod (GPU cloud, often used as the SGLang deployment substrate), openrouter (routing layer, SGLang as downstream backend).
Sources
- [1] TechCrunch, "Sources: Project SGLang spins out as RadixArk with $400M valuation as inference market explodes", 2026-01-21
- [2] BusinessWire, "RadixArk Launches with $100M in Seed Funding Led by Accel...", 2026-05-05
- [3] GitHub, radixark/miles README (accessed 2026-05-09)
- [4] Accel, "Investing in RadixArk: Building the Open Infrastructure for AI" (accessed 2026-05-09)
- local: 2026-04-01-diary.md (vLLM/SGLang baseline comparison)
- local: daily_log-2026-04-08.md (Inferact / RadixArk session notes, performance data, DeepSeek strategic analysis)
- local: ai-agent-platforms.md