Company

SambaNova Systems

Proprietary RDU (Reconfigurable Dataflow Unit) plus a full-stack AI system; originally an enterprise / government systems player, pivoting to an externally open token API from 2024.

1. Core Product / Service

SambaNova is an AI systems company vertically integrated down to the chip, with three product layers:

  • SN40L RDU: second-generation reconfigurable dataflow chip (HBM3 plus a three-tier memory hierarchy); its strength is holding very large models (Llama 405B class) on a single system without heavy model parallelism, and minimal sharding translates into low latency.
  • DataScale systems / SambaNova Suite: bundles RDU hardware, software, and pre-trained models, sold to enterprises and governments, typically as an integrated "AI-as-a-Service" contract.
  • SambaNova Cloud (2024 launch): an externally open token API with serverless billing, serving Llama 3.1 405B, Llama 3.3 70B, DeepSeek V3 / R1, Qwen2.5, and Whisper [2]. This is SambaNova's key product pivot from "selling enterprise systems" to "selling tokens" (a minimal call sketch follows below).
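
SambaNova Cloud is, in practice, consumed as a standard chat-completions API. The sketch below shows what sampling it from Python might look like; the base URL, environment variable name, and model identifier are assumptions to be verified against SambaNova's current documentation, not details confirmed by this note.

```python
# Minimal sketch of calling SambaNova Cloud via an OpenAI-compatible client.
# Assumptions: base URL, SAMBANOVA_API_KEY env var, and the model id shown;
# check SambaNova's docs for current values before relying on them.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key=os.environ["SAMBANOVA_API_KEY"],
)

resp = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",    # illustrative model id
    messages=[{"role": "user", "content": "One-sentence summary of dataflow architectures."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```

The calling convention itself illustrates the pivot: the same request shape works against any OpenAI-compatible token vendor, so developer switching costs are near zero and the competition reduces to price, speed, and model catalog.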

2. Target Users & Pain Points

  • Government / sovereign AI / regulated industries: federal agencies, banks, and healthcare organizations that need on-prem deployment and isolation and will not accept cloud-only offerings. This is SambaNova's long-standing strength.
  • Enterprise fine-tuning plus deployment in one: buying SambaNova gets you the chip, the software, and a suite of fine-tuned open-source models, which simplifies procurement.
  • Pain point addressed: the traditional NVIDIA path requires separate GPU purchases, a self-built inference stack, and ongoing ops; SambaNova is a turn-key AI rack, but it also carries the strongest lock-in (proprietary SDK).
  • SambaNova Cloud positioning: the same RDU systems are used to sell tokens externally, giving developers a low-friction way to try the hardware; whether it can compete with Groq / Cerebras on speed and price over the long term remains to be seen.

3. Competitive Landscape

| Competitor | Positioning | Vs. SambaNova |
| --- | --- | --- |
| groq | LPU + speed narrative | Both non-GPU; Groq targets the developer cloud, SambaNova targets enterprise / government |
| cerebras | Wafer-scale WSE-3 | Both non-mainstream hardware paths; Cerebras is more aggressive at the single-chip level, SambaNova's system integration is more mature |
| NVIDIA + DGX Cloud | GPU + ecosystem | NVIDIA's general-purpose ecosystem dominates; SambaNova differentiates on full-stack integrated contracts |
| fireworks-ai / together-ai | GPU software stack | SambaNova has a structural advantage in single-stream latency for 405B-class models |
| AWS Bedrock / Azure | Cloud-vendor hosted | Clouds win on distribution; SambaNova wins on private deployment |

Differentiation: RDU design, three-tier memory, and full-stack packaging. For single-stream inference on Llama-405B-class models, low sharding yields a physical latency advantage.

4. Unique Observations

  • Per-token pricing (SambaNova Cloud, 2026-05): Llama 3.1 405B ~$5/M input + $10/M output (early pricing on the high side, converging toward peers); Llama 3.3 70B ~$0.60/M blended; DeepSeek V3 ~$1/M blended; Llama 3.1 8B ~$0.10/M [1]. The 405B speed claim of ~200+ tok/s is SambaNova Cloud's headline (on GPU-based serverless offerings, 405B typically runs at <30 tok/s).
  • vs first-party models: Llama 3.1 405B on SambaNova is ~$7.5/M blended vs GPT-4o at ~$10/M blended, so the price gap is small. SambaNova's selling point is not that it is cheap, but that the same price point buys a very large open-source model served very fast (see the blended-cost sketch after this list).
  • vs speed peers: Groq's LPU speed numbers are headlined at 70B; SambaNova's real niche is 405B, where Groq needs far more LPUs and capacity is tight, while SambaNova's system architecture is better suited to serving a very large model as a single instance.
  • Inference engine: entirely proprietary and closed-source. The RDU compiler and software stack are the company's moat, but any new model architecture (a new attention variant, a new MoE topology) must be ported by SambaNova's software team, so support cadence cannot keep up with vLLM / SGLang.
  • Capital / compute model: fabless in-house chip design with foundry partners (GlobalFoundries); the "NVIDIA price hike" pressure that GPU players face shows up instead as in-house chip development cycles plus tape-out cost.
  • Take rate: the data center is its own RDU racks, so gross margin is token sale price minus amortized depreciation. Early RDU per-unit compute cost is higher than an H100 (low volume), but it avoids NVIDIA's ~60% gross-margin tax; cloud volume needs to grow for the amortization to pay off.
  • Strategic dilemma: a high valuation ($5B post, 2021) against long enterprise sales cycles and a cloud business chasing Groq; multiple rounds of layoffs / restructuring in 2024-2026 reflect the pains of the "sell systems → sell tokens" pivot.
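
The "blended" figures above are simple weighted averages over input and output list prices. Below is a quick sketch of the arithmetic in Python, using the prices quoted in this note and assuming a 1:1 input:output token mix (which is what makes $5/$10 per M land at ~$7.5/M blended); real traffic is usually input-heavy, which pulls the effective rate down.

```python
# Blended $/M-token arithmetic behind the figures quoted above.
# Prices come from this note; the 50/50 input:output split is an assumption.

def blended_price(input_per_m: float, output_per_m: float, input_share: float = 0.5) -> float:
    """Weighted-average price per million tokens for a given input share."""
    return input_share * input_per_m + (1.0 - input_share) * output_per_m

# Llama 3.1 405B on SambaNova Cloud: ~$5/M input, ~$10/M output [1]
print(blended_price(5.0, 10.0))                    # 7.5 -> the ~$7.5/M blended figure
print(blended_price(5.0, 10.0, input_share=0.8))   # 6.0 -> input-heavy (80/20) workload

# Example bill: 2M input tokens + 0.5M output tokens at 405B list prices
print(2 * 5.0 + 0.5 * 10.0)                        # 15.0 -> $15.00
```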

5. Financials / Funding

| Round | Date | Amount | Valuation | Lead |
| --- | --- | --- | --- | --- |
| Series A-C | 2017-2020 | ~$450M cumulative | n/a | Walden, GV, Intel Capital |
| Series D | 2021-04 | $678M | $5.1B post | SoftBank Vision Fund 2 |
  • Founded: 2017 (Stanford / Sun Microsystems alumni)
  • Total raised: ~$1.1B+
  • Customers: US Department of Energy (Argonne, LLNL), Saudi scientific institutions, multiple top banks (few public cases)
  • Press reports: multiple rounds of layoffs in 2023-2024; the 2024 SambaNova Cloud launch is the key repositioning

6. People & Relationships

  • Co-founders: Rodrigo Liang (CEO, ex-Oracle / Sun), Kunle Olukotun (Stanford professor, early multi-core CPU pioneer), Christopher Ré (Stanford professor, ML systems / Snorkel co-founder).
  • Investors: SoftBank Vision Fund 2, BlackRock, Intel Capital, Google Ventures, Walden International, Atlantic Bridge, Celesta Capital.
  • Customers: Argonne National Lab, Lawrence Livermore National Lab, Saudi Aramco, Saudi public cloud projects, Analog Devices.
  • Competes with: groq, cerebras, NVIDIA, fireworks-ai, together-ai.

Sources

Last compiled: 2026-05-10