Company
DeepSeek
Hangzhou-based open-weight LLM lab funded by a quant hedge fund — became the symbol of cost-efficient frontier AI after January 2025.
1. Core Product / Service
DeepSeek builds open-weight large language models with a Mixture-of-Experts (MoE) architecture. Public-facing surfaces:
- DeepSeek API — chat + reasoning endpoints, OpenAI-compatible (client sketch after this list).
- Open-weight releases on Hugging Face (V3, V3.1, V3.2, R1, V4 Flash, V4 Pro), code under MIT, weights under DeepSeek's own model license [1][2].
- Web/app chat at chat.deepseek.com.
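Because the API speaks the OpenAI wire protocol, the stock `openai` client works with only the base URL swapped. A minimal sketch, assuming the documented V3/R1-era model aliases (`deepseek-chat`, `deepseek-reasoner`); V4-era aliases were not confirmed at compile time:

```python
# Minimal DeepSeek API call via the stock OpenAI SDK -- only base_url changes.
# Model aliases are the documented V3/R1-era ones; V4 aliases may differ.
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",                     # a DeepSeek key, not an OpenAI key
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-chat",                # "deepseek-reasoner" for the R1 line
    messages=[{"role": "user", "content": "Summarize MoE routing in one sentence."}],
)
print(resp.choices[0].message.content)
```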
Latest releases (as of 2026-05-09):
- V4 Pro (preview, 2026-04-24): 1.6T total / 49B active params, 1M context. Claims ~27% of V3.2's per-token inference FLOPs and ~10% of its KV cache at 1M context (back-of-envelope sketch after this list) [3][4].
- V4 Flash: 284B total / 13B active, 1M context [3].
- R1: reasoning model line, originally released 2025-01-20 [5].
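Why the FLOP claim hinges on the KV cache: at 1M context, per-token attention compute over the cache dwarfs the ~2×N_active parameter compute. A back-of-envelope sketch using the standard 2·N FLOPs/token rule; the KV sizes below are placeholder guesses (only the ~10× KV-cache reduction mirrors the release claim, and V3.2 is assumed to keep V3's 37B active params), so the printed ratio is illustrative rather than the measured 27%:

```python
# Back-of-envelope per-token decode FLOPs: ~2*N_active for weight matmuls
# plus ~2*(KV elements read) for attention over the cache. KV sizes are
# illustrative placeholders; only their 10:1 ratio mirrors the release claim.
def decode_flops(n_active, kv_elems_per_token, ctx_len, kv_read_frac=1.0):
    param_flops = 2 * n_active                                    # active-weight matmuls
    attn_flops = 2 * kv_elems_per_token * ctx_len * kv_read_frac  # cache reads
    return param_flops + attn_flops

v32_like = decode_flops(37e9, kv_elems_per_token=70_000, ctx_len=1_000_000)
v4_like  = decode_flops(49e9, kv_elems_per_token=7_000,  ctx_len=1_000_000)
print(f"ratio ~= {v4_like / v32_like:.2f}")  # toy gives ~0.10, not the quoted 0.27:
# at 1M tokens the cache term dominates, so shrinking the KV cache (and reading
# it sparsely) moves per-token FLOPs far more than active-param growth does.
```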
2. Target Users & Pain Points
- Developers / startups wanting frontier-tier reasoning at a fraction of OpenAI/Anthropic prices.
- Self-hosters / sovereign AI — open weights let them run on their own GPUs (vLLM/SGLang Day-0 support; see ai-inference-engines and the serving sketch at the end of this section).
- Researchers — published technical reports for V3 and R1 are unusually detailed for a frontier lab.
Pain points addressed: closed-model lock-in, US-API egress cost, training-cost transparency.
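For the self-hosting case above, a minimal vLLM serving sketch. `deepseek-ai/DeepSeek-R1` is a real Hugging Face repo id; treat the model string and GPU count as placeholders for whichever open-weight release and hardware is actually in play:

```python
# Minimal self-hosting sketch with vLLM's offline API, assuming a multi-GPU
# node; the repo id and tensor_parallel_size are placeholders to adapt.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-R1",   # any open-weight DeepSeek release
    tensor_parallel_size=8,            # shard weights/experts across 8 GPUs
    trust_remote_code=True,            # DeepSeek ships custom modeling code
)
params = SamplingParams(temperature=0.6, max_tokens=512)
out = llm.generate(["Why did open weights matter in 2025?"], params)
print(out[0].outputs[0].text)
```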
3. Competitive Landscape
| Lab | Origin | Open weights? | Latest flagship | Positioning |
|---|---|---|---|---|
| DeepSeek | China (Hangzhou) | Yes (MIT code + model license) | V4 Pro 1.6T MoE | Cheapest frontier-tier, MoE-native |
| OpenAI | US | No | GPT-5 family | Closed leader, premium price |
| Anthropic | US | No | Claude Opus/Sonnet 4.x | Closed, agentic / coding focus |
| kimi (Moonshot) | China | Partial (K2 weights open) | K2.6 | Long-context + agent tooling |
| Qwen (Alibaba) | China | Yes (Apache 2.0) | Qwen3 family | Broadest model zoo, multimodal |
Differentiation: DeepSeek is the most aggressive on inference cost per quality unit and on architecture novelty (MoE routing, multi-head latent attention). Strategic choice to upstream optimizations into ai-inference-engines (vLLM) rather than build their own engine — keeps team focused on model training [local: daily_log-2026-04-08.md].
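For readers new to the architecture claim, a generic top-k MoE router sketch — the textbook mechanism, not DeepSeek's exact design (the V3 report adds shared experts, sigmoid gating, and auxiliary-loss-free load balancing on top of this):

```python
# Textbook top-k MoE routing in numpy -- the mechanism behind "1.6T total /
# 49B active": each token activates only k of the experts. DeepSeek's real
# router differs (shared experts, aux-loss-free balancing, per the V3 report).
import numpy as np

def route(x, gate_w, k=2):
    """Return the top-k expert ids per token and their mixing weights."""
    logits = x @ gate_w                          # (tokens, n_experts) affinities
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts
    scores = np.take_along_axis(logits, topk, axis=-1)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return topk, e / e.sum(axis=-1, keepdims=True)

x = np.random.randn(4, 16)        # 4 tokens, hidden size 16
gate_w = np.random.randn(16, 8)   # learned gate over 8 routed experts
ids, weights = route(x, gate_w)   # only 2 of 8 experts run per token
```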
4. Unique Observations
- Inference-engine strategy: DeepSeek explicitly chose to merge optimizations back into vLLM and SGLang rather than launch a competing inference product. Reasoning per internal notes: a ~500-person team can't both train new models and maintain a multi-hardware inference stack — let the engine ecosystem distribute their models for them [local: daily_log-2026-04-08.md].
- Search stack: independent Chinese labs (DeepSeek, Kimi, MiniMax) reportedly mix self-built crawl with Exa.ai for grounding, unlike Baidu/Alibaba/Tencent which use in-house engines [local: 2026-04-01-diary.md].
- OpenRouter behavior quirk: DeepSeek models served via openrouter can have reasoning enabled by the host, while the direct DeepSeek API defaults reasoning off — relevant for hermes-openrouter-models routing (sketch after this list).
- V4 inference-FLOP claim: V4 Pro's 27%-of-V3.2 single-token compute at 1M context is the headline architectural improvement, not raw benchmark scores [3][4].
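The OpenRouter quirk above in code form. This assumes OpenRouter's unified `reasoning` request field and its `deepseek/...` model slugs; both are host-side conventions worth verifying against current docs, not part of DeepSeek's own API:

```python
# Same OpenAI-style call, but through OpenRouter with its host-side reasoning
# toggle. The `reasoning` extra_body field is OpenRouter's convention (assumed
# here from its docs); the direct DeepSeek API has no such per-request switch.
from openai import OpenAI

or_client = OpenAI(api_key="sk-or-...", base_url="https://openrouter.ai/api/v1")
resp = or_client.chat.completions.create(
    model="deepseek/deepseek-chat",               # OpenRouter slug, not "deepseek-chat"
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    extra_body={"reasoning": {"enabled": True}},  # host decides how this maps
)
print(resp.choices[0].message.content)
```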
5. Financials / Funding
- No external venture funding: DeepSeek has not raised a public VC round, and venture firms initially passed because there was no near-term exit path [6].
- Parent / funder: High-Flyer (幻方量化), a Chinese quantitative hedge fund founded 2016 in Hangzhou by Liang Wenfeng + Zhejiang University classmates. High-Flyer subsidizes DeepSeek's GPU clusters and operating costs [6][7].
- Ownership: as of May 2024, Liang Wenfeng personally held ~84% of DeepSeek via two shell entities [6].
- Disclosed training cost: V3 base model trained for ~$5.576M in H800 rental-equivalent compute (2,048× H800, ~55 days; arithmetic check after this list), plus ~$294K for the R1 RL phase — figures cover compute only, not salaries / data / failed runs / hardware capex [8]. The $5.6M number is the one that triggered the market reaction below.
- Market impact event (2025-01-27): R1 release on 2025-01-20 plus the cheap-training narrative drove a ~17% Nvidia drop, wiping ~$589B in market cap in a single day — the largest single-day loss in US stock-market history at the time. Broadcom (-17%), Micron (-12%), AMD (-6%) sold off in sympathy [9]. Nvidia recovered ~76% from that low by April 2026.
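Arithmetic check on the disclosed V3 figure, using the technical report's own accounting (2.788M total H800 GPU-hours at a $2/hour rental-equivalent rate; the 2,048 GPUs × ~55 days above covers the pre-training bulk of those hours):

```python
# Reproducing the disclosed V3 compute cost from the report's accounting.
gpu_hours = 2.788e6   # total H800 GPU-hours (pre-training + extension + post)
rate = 2.00           # assumed USD per H800 GPU-hour (rental-equivalent)
print(f"${gpu_hours * rate / 1e6:.3f}M")  # -> $5.576M, compute only
```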
6. People & Relationships
- Founder / CEO: Liang Wenfeng (梁文锋) — also CEO of High-Flyer, Zhejiang University grad, started algorithmic trading during the 2008 financial crisis.
- Parent: High-Flyer hedge fund (Hangzhou, founded 2016).
- Ecosystem allies: vLLM / ai-inference-engines (DeepSeek upstreams MoE / expert-parallel optimizations there); SGLang (Day-0 support for V3/R1).
- Distribution partners: openrouter (third-party host, often cheaper than direct API), DeepInfra, Together.
- Peer Chinese labs: kimi (Moonshot), MiniMax, Qwen — overlapping but differentiated positioning (see section 3).
Related
Last compiled: 2026-05-09