Company
SambaNova Systems
Proprietary RDU (Reconfigurable Dataflow Unit) plus a full-stack AI system; originally focused on the enterprise / government market, pivoting to an open external token API from 2024.
1. Core Product / Service
SambaNova is an AI systems company vertically integrated down to the chip, with three product layers:
- SN40L RDU: second-generation reconfigurable dataflow chip (HBM3 plus a three-tier memory hierarchy); its strength is fitting very large models (Llama 405B class) on a single system without heavy model parallelism — minimal sharding yields low latency.
- DataScale systems / SambaNova Suite: bundles RDU + software + pre-trained models for enterprise / government customers, typically sold as an integrated "AI-as-a-Service" contract.
- SambaNova Cloud (launched 2024): externally open token API with serverless billing — Llama 3.1 405B, Llama 3.3 70B, DeepSeek V3 / R1, Qwen2.5, Whisper [2]. This is SambaNova's key product pivot from "selling enterprise systems" to "selling tokens".
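The token API in the last bullet can be driven like any OpenAI-style chat endpoint. A minimal sketch — the endpoint URL, model identifier, and payload shape here are assumptions based on the common OpenAI-compatible convention, not taken from SambaNova's docs:

```python
import json
import urllib.request

# Assumed endpoint: most token-API vendors mirror the OpenAI chat-completions
# route; check SambaNova's own docs for the real URL and model names.
API_URL = "https://api.sambanova.ai/v1/chat/completions"  # assumption

def build_payload(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send(payload: dict, api_key: str) -> dict:
    """POST the payload with bearer auth and return the parsed JSON reply."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_payload("Meta-Llama-3.1-405B-Instruct", "Say hello.")
```

Serverless billing then meters the input/output tokens of each such call, which is what the per-token prices in section 4 apply to.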
2. Target Users & Pain Points
- Government / sovereign AI / regulated industries: federal agencies, banks, healthcare — organizations that need on-prem deployment and isolation and won't accept cloud-only. This is SambaNova's long-standing strength.
- Enterprise fine-tuning plus deployment in one purchase: buying SambaNova gets chip + software + a fine-tuned open-source model suite, simplifying procurement.
- Pain point addressed: the traditional NVIDIA path requires separate GPU purchases, a self-built inference stack, and ops; SambaNova is a "turn-key AI rack" — but also carries the strongest lock-in (proprietary SDK).
- SambaNova Cloud positioning: use the same RDU systems to sell tokens externally, giving developers a low-friction entry point — whether it can compete long-term with Groq / Cerebras on speed and pricing remains to be seen.
3. Competitive Landscape
| Competitor | Positioning | Vs. SambaNova |
|---|---|---|
| groq | LPU + speed narrative | Both non-GPU; Groq goes developer cloud, SambaNova goes enterprise / government |
| cerebras | Wafer-scale WSE-3 | Both non-mainstream hardware paths; Cerebras is more aggressive at the single-chip level, while SambaNova's system integration is more mature |
| NVIDIA + DGX Cloud | GPU + ecosystem | NVIDIA's general-purpose ecosystem dominates; SambaNova differentiates on full-stack integrated contracts |
| fireworks-ai / together-ai | GPU software stack | SambaNova has a structural advantage on single-stream latency for very large 405B-class models |
| AWS Bedrock / Azure | Cloud-vendor hosted | Clouds win on distribution; SambaNova wins on private deployment |
Differentiation: RDU design + three-tier memory + full-stack packaging. For single-stream inference on Llama-405B-class very large models, low sharding → low latency is a physical advantage.
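The low-sharding → low-latency point can be made concrete with a back-of-envelope decode model (a hypothetical sketch with made-up numbers, not measured RDU or GPU figures): single-stream decoding is memory-bandwidth-bound, so per-token time is roughly model bytes divided by aggregate bandwidth, plus a per-layer collective cost that appears only when the model is sharded across chips.

```python
def single_stream_tok_per_s(params_b: float, bytes_per_param: float,
                            bw_tb_s_per_chip: float, n_chips: int,
                            collective_us_per_layer: float,
                            n_layers: int) -> float:
    """Back-of-envelope single-stream decode rate, bandwidth-bound regime.

    All inputs are illustrative assumptions. Per generated token the system
    streams all weights once (split over chips); when sharded, each layer
    also pays a fixed tensor-parallel collective cost.
    """
    mem_s = (params_b * 1e9 * bytes_per_param
             / (bw_tb_s_per_chip * 1e12 * n_chips))
    comm_s = collective_us_per_layer * 1e-6 * n_layers if n_chips > 1 else 0.0
    return 1.0 / (mem_s + comm_s)

# 405B params at 1 byte/param (fp8), 126 layers, 25 µs per collective —
# all hypothetical values for illustration:
one_box   = single_stream_tok_per_s(405, 1.0, 20.0, 1, 25.0, 126)
eight_way = single_stream_tok_per_s(405, 1.0, 2.5,  8, 25.0, 126)
```

Holding aggregate bandwidth fixed (20 TB/s in both configurations above), the unsharded case wins purely by skipping the collectives — which is the physical argument behind the single-system 405B claim.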
4. Unique Observations
- Per-token pricing (SambaNova Cloud, as of 2026-05): Llama 3.1 405B ~$5/M input + $10/M output (early pricing on the high side, now converging with peers); Llama 3.3 70B ~$0.60/M blended; DeepSeek V3 ~$1/M blended; Llama 3.1 8B ~$0.10/M [1]. The ~200+ tok/s speed on 405B is SambaNova Cloud's headline claim (405B serverless on GPU systems typically runs at <30 tok/s).
- vs first-party price gap: Llama 3.1 405B on SambaNova at ~$7.5/M blended vs GPT-4o at ~$10/M blended — the gap is small. But SambaNova's selling point isn't being cheap; it's "the same price point buys a very large open-source model at much higher speed".
- vs speed peers: against Groq's LPU speed on 70B, SambaNova's real niche is at 405B — running 405B on Groq requires many more LPUs and capacity is tight, while SambaNova's system architecture is better suited to single-instance serving of very large models.
- Inference engine: fully proprietary and closed-source. The RDU compiler + software stack is the company's moat, but every new model architecture (a new attention variant, a new MoE topology) requires SambaNova's software team to port it — iteration speed cannot keep up with vLLM / SGLang.
- Capital / compute model: in-house chip design with fab partners (GlobalFoundries), i.e. a fabless path; the "NVIDIA price hike" pressure GPU players face reappears here as in-house development cycles plus tape-out cost.
- Take rate: the data center is its own RDU racks; gross margin = token sale price minus amortized depreciation. Early per-unit RDU compute cost is higher than an H100's (low volume), but it avoids NVIDIA's ~60% gross-margin tax. Cloud volume needs to grow before the fleet amortizes.
- Strategic dilemma: high valuation ($5.1B post-money in 2021), but long enterprise sales cycles plus a cloud business chasing Groq; multiple rounds of layoffs / restructuring in 2024-2026 reflect the pains of the "sell systems → sell tokens" pivot.
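The "blended" rates quoted above fold input and output prices into one number; the note's ~$7.5/M for 405B implies a 1:1 input:output mix of the $5/$10 rates. A small helper reproduces that arithmetic (the 1:1 mix is an inference from the numbers, not a stated convention):

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_share: float = 0.5) -> float:
    """Blend per-million-token input/output prices at a given input share."""
    return input_share * input_per_m + (1.0 - input_share) * output_per_m

# Llama 3.1 405B at $5/M input, $10/M output, assumed 1:1 mix:
print(blended_price(5.0, 10.0))  # 7.5
```

Real workloads skew input-heavy (RAG, long prompts), so the same list prices can blend noticeably cheaper — e.g. a 3:1 input-heavy mix drops the figure to $6.25/M.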
5. Financials / Funding
| Round | Date | Amount | Valuation | Lead |
|---|---|---|---|---|
| Series A-C | 2017-2020 | ~$450M cumulative | — | Walden, GV, Intel Capital |
| Series D | 2021-04 | $678M | $5.1B post | SoftBank Vision Fund 2 |
- Founded: 2017 (Stanford / Sun Microsystems alumni)
- Total raised: ~$1.1B+
- Customers: US Department of Energy labs (Argonne, LLNL), Saudi research institutions, several major banks (few public case studies)
- Reports: multiple layoff rounds in 2023-2024; SambaNova Cloud was the key 2024 repositioning
6. People & Relationships
- Co-founders: Rodrigo Liang (CEO, ex-Oracle / Sun), Kunle Olukotun (Stanford professor, early multi-core CPU pioneer), Christopher Ré (Stanford professor, ML systems / Snorkel co-founder).
- Investors: SoftBank Vision Fund 2, BlackRock, Intel Capital, Google Ventures, Walden International, Atlantic Bridge, Celesta Capital.
- Customers: Argonne National Lab, Lawrence Livermore National Lab, Saudi Aramco, Saudi public cloud projects, Analog Devices.
- Competes with: groq, cerebras, NVIDIA, fireworks-ai, together-ai.
Sources
- [1] https://sambanova.ai/pricing (2026-05-10)
- [2] https://sambanova.ai/blog/sambanova-cloud (2026-05-10)
- [3] https://www.crunchbase.com/organization/sambanova-systems (2026-05-10)
- [4] https://artificialanalysis.ai/providers/sambanova (2026-05-10)
Last compiled: 2026-05-10