Company
DeepSeek
Hangzhou-based open-weight LLM lab funded by a quant hedge fund — became the symbol of cost-efficient frontier AI after January 2025.
1. Core Product / Service
DeepSeek builds open-weight large language models with a Mixture-of-Experts (MoE) architecture. Public-facing surfaces:
- DeepSeek API — chat + reasoning endpoints, OpenAI-compatible (client sketch after this list).
- Open-weight releases on Hugging Face (V3, V3.1, V3.2, R1, V4 Flash, V4 Pro), code under MIT, weights under DeepSeek's own model license [1][2].
- Web/app chat at chat.deepseek.com.
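Because the API speaks the OpenAI wire protocol, the stock `openai` client works with only the base URL swapped. A minimal sketch, assuming the documented V3/R1-era model aliases (`deepseek-chat`, `deepseek-reasoner`); V4-era aliases were not confirmed at compile time:

```python
# Minimal DeepSeek API call via the stock OpenAI SDK -- only base_url changes.
# Model aliases are the documented V3/R1-era ones; V4 aliases may differ.
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",                     # a DeepSeek key, not an OpenAI key
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-chat",                # "deepseek-reasoner" for the R1 line
    messages=[{"role": "user", "content": "Summarize MoE routing in one sentence."}],
)
print(resp.choices[0].message.content)
```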
Latest releases (as of 2026-05-09):
- V4 Pro (preview, 2026-04-24): 1.6T total / 49B active params, 1M context. Claims ~27% of V3.2's per-token inference FLOPs and ~10% of its KV cache at 1M context (back-of-envelope sketch after this list) [3][4].
- V4 Flash: 284B total / 13B active, 1M context [3].
- R1: reasoning model line, originally released 2025-01-20 [5].
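Why the FLOP claim hinges on the KV cache: at 1M context, per-token attention compute over the cache dwarfs the ~2×N_active parameter compute. A back-of-envelope sketch using the standard 2·N FLOPs/token rule; the KV sizes below are placeholder guesses (only the ~10× KV-cache reduction mirrors the release claim, and V3.2 is assumed to keep V3's 37B active params), so the printed ratio is illustrative rather than the measured 27%:

```python
# Back-of-envelope per-token decode FLOPs: ~2*N_active for weight matmuls
# plus ~2*(KV elements read) for attention over the cache. KV sizes are
# illustrative placeholders; only their 10:1 ratio mirrors the release claim.
def decode_flops(n_active, kv_elems_per_token, ctx_len, kv_read_frac=1.0):
    param_flops = 2 * n_active                                    # active-weight matmuls
    attn_flops = 2 * kv_elems_per_token * ctx_len * kv_read_frac  # cache reads
    return param_flops + attn_flops

v32_like = decode_flops(37e9, kv_elems_per_token=70_000, ctx_len=1_000_000)
v4_like  = decode_flops(49e9, kv_elems_per_token=7_000,  ctx_len=1_000_000)
print(f"ratio ~= {v4_like / v32_like:.2f}")  # toy gives ~0.10, not the quoted 0.27:
# at 1M tokens the cache term dominates, so shrinking the KV cache (and reading
# it sparsely) moves per-token FLOPs far more than active-param growth does.
```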
2. Target Users & Pain Points
- Developers / startups wanting frontier-tier reasoning at a fraction of OpenAI/Anthropic prices.
- Self-hosters / sovereign AI — open weights let them run on their own GPUs (vLLM/SGLang Day-0 support; see ai-inference-engines and the serving sketch at the end of this section).
- Researchers — published technical reports for V3 and R1 are unusually detailed for a frontier lab.
Pain points addressed: closed-model lock-in, US-API egress cost, training-cost transparency.
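For the self-hosting case above, a minimal vLLM serving sketch. `deepseek-ai/DeepSeek-R1` is a real Hugging Face repo id; treat the model string and GPU count as placeholders for whichever open-weight release and hardware is actually in play:

```python
# Minimal self-hosting sketch with vLLM's offline API, assuming a multi-GPU
# node; the repo id and tensor_parallel_size are placeholders to adapt.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-R1",   # any open-weight DeepSeek release
    tensor_parallel_size=8,            # shard weights/experts across 8 GPUs
    trust_remote_code=True,            # DeepSeek ships custom modeling code
)
params = SamplingParams(temperature=0.6, max_tokens=512)
out = llm.generate(["Why did open weights matter in 2025?"], params)
print(out[0].outputs[0].text)
```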
3. Competitive Landscape
| Lab | Origin | Open weights? | Latest flagship | Positioning |
|---|---|---|---|---|
| DeepSeek | China (Hangzhou) | Yes (MIT code + model license) | V4 Pro 1.6T MoE | Cheapest frontier-tier, MoE-native |
| OpenAI | US | No | GPT-5 family | Closed leader, premium price |
| Anthropic | US | No | Claude Opus/Sonnet 4.x | Closed, agentic / coding focus |
| kimi (Moonshot) | China | Partial (K2 weights open) | K2.6 | Long-context + agent tooling |
| Qwen (Alibaba) | China | Yes (Apache 2.0) | Qwen3 family | Broadest model zoo, multimodal |
Differentiation: DeepSeek is the most aggressive on inference cost per quality unit and on architecture novelty (MoE routing, multi-head latent attention). Strategic choice to upstream optimizations into ai-inference-engines (vLLM) rather than build their own engine — keeps team focused on model training [local: daily_log-2026-04-08.md].
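For readers new to the architecture claim, a generic top-k MoE router sketch — the textbook mechanism, not DeepSeek's exact design (the V3 report adds shared experts, sigmoid gating, and auxiliary-loss-free load balancing on top of this):

```python
# Textbook top-k MoE routing in numpy -- the mechanism behind "1.6T total /
# 49B active": each token activates only k of the experts. DeepSeek's real
# router differs (shared experts, aux-loss-free balancing, per the V3 report).
import numpy as np

def route(x, gate_w, k=2):
    """Return the top-k expert ids per token and their mixing weights."""
    logits = x @ gate_w                          # (tokens, n_experts) affinities
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts
    scores = np.take_along_axis(logits, topk, axis=-1)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return topk, e / e.sum(axis=-1, keepdims=True)

x = np.random.randn(4, 16)        # 4 tokens, hidden size 16
gate_w = np.random.randn(16, 8)   # learned gate over 8 routed experts
ids, weights = route(x, gate_w)   # only 2 of 8 experts run per token
```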
4. Unique Observations
- Inference-engine strategy: DeepSeek explicitly chose to merge optimizations back into vLLM and SGLang rather than launch a competing inference product. Reasoning per internal notes: a ~500-person team can't both train new models and maintain a multi-hardware inference stack — let the engine ecosystem distribute their models for them [local: daily_log-2026-04-08.md].
- Search stack: independent Chinese labs (DeepSeek, Kimi, MiniMax) reportedly mix self-built crawl with Exa.ai for grounding, unlike Baidu/Alibaba/Tencent which use in-house engines [local: 2026-04-01-diary.md].
- OpenRouter behavior quirk: DeepSeek models served via openrouter can have reasoning enabled by the host, while the direct DeepSeek API defaults reasoning off — relevant for hermes-openrouter-models routing (sketch after this list).
- V4 inference-FLOP claim: V4 Pro's 27%-of-V3.2 single-token compute at 1M context is the headline architectural improvement, not raw benchmark scores [3][4].
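The OpenRouter quirk above in code form. This assumes OpenRouter's unified `reasoning` request field and its `deepseek/...` model slugs; both are host-side conventions worth verifying against current docs, not part of DeepSeek's own API:

```python
# Same OpenAI-style call, but through OpenRouter with its host-side reasoning
# toggle. The `reasoning` extra_body field is OpenRouter's convention (assumed
# here from its docs); the direct DeepSeek API has no such per-request switch.
from openai import OpenAI

or_client = OpenAI(api_key="sk-or-...", base_url="https://openrouter.ai/api/v1")
resp = or_client.chat.completions.create(
    model="deepseek/deepseek-chat",               # OpenRouter slug, not "deepseek-chat"
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    extra_body={"reasoning": {"enabled": True}},  # host decides how this maps
)
print(resp.choices[0].message.content)
```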
5. Financials / Funding
- No external venture funding: DeepSeek has not raised a public VC round, and venture firms initially passed because there was no near-term exit path [6].
- Parent / funder: High-Flyer (幻方量化), a Chinese quantitative hedge fund founded 2016 in Hangzhou by Liang Wenfeng + Zhejiang University classmates. High-Flyer subsidizes DeepSeek's GPU clusters and operating costs [6][7].
- Ownership: as of May 2024, Liang Wenfeng personally held ~84% of DeepSeek via two shell entities [6].
- Disclosed training cost: V3 base model trained for ~$5.576M in H800 rental-equivalent compute (2,048× H800, ~55 days; arithmetic check after this list), plus ~$294K for the R1 RL phase — figures cover compute only, not salaries / data / failed runs / hardware capex [8]. The $5.6M number is the one that triggered the market reaction below.
- Market impact event (2025-01-27): R1 release on 2025-01-20 plus the cheap-training narrative drove a ~17% Nvidia drop, wiping ~$589B in market cap in a single day — the largest single-day loss in US stock-market history at the time. Broadcom (-17%), Micron (-12%), AMD (-6%) sold off in sympathy [9]. Nvidia recovered ~76% from that low by April 2026.
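Arithmetic check on the disclosed V3 figure, using the technical report's own accounting (2.788M total H800 GPU-hours at a $2/hour rental-equivalent rate; the 2,048 GPUs × ~55 days above covers the pre-training bulk of those hours):

```python
# Reproducing the disclosed V3 compute cost from the report's accounting.
gpu_hours = 2.788e6   # total H800 GPU-hours (pre-training + extension + post)
rate = 2.00           # assumed USD per H800 GPU-hour (rental-equivalent)
print(f"${gpu_hours * rate / 1e6:.3f}M")  # -> $5.576M, compute only
```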
6. People & Relationships
- Founder / CEO: Liang Wenfeng (梁文锋) — also CEO of High-Flyer, Zhejiang University grad, started algorithmic trading during the 2008 financial crisis.
- Parent: High-Flyer hedge fund (Hangzhou, founded 2016).
- Ecosystem allies: vLLM / ai-inference-engines (DeepSeek upstreams MoE / expert-parallel optimizations there); SGLang (Day-0 support for V3/R1).
- Distribution partners: openrouter (third-party host, often cheaper than direct API), DeepInfra, Together.
- Peer Chinese labs: kimi (Moonshot), MiniMax, Qwen — overlapping but differentiated positioning (see section 3).
Related
Last compiled: 2026-05-09