Company

SambaNova Systems

Proprietary RDU (Reconfigurable Dataflow Unit) plus a full-stack AI system; originally an enterprise / government systems player, pivoting to an externally open token API from 2024.

1. Core Product / Service

SambaNova is an AI systems company vertically integrated down to the chip, with three product layers:

  • SN40L RDU: second-generation reconfigurable dataflow chip (HBM3 plus a three-tier memory hierarchy); its strength is holding very large models (Llama 405B class) on a single system without heavy model parallelism, and minimal sharding translates into low latency.
  • DataScale systems / SambaNova Suite: bundles RDU hardware, software, and pre-trained models, sold to enterprises and governments, typically as an integrated "AI-as-a-Service" contract.
  • SambaNova Cloud (2024 launch): an externally open token API with serverless billing, serving Llama 3.1 405B, Llama 3.3 70B, DeepSeek V3 / R1, Qwen2.5, and Whisper [2]. This is SambaNova's key product pivot from "selling enterprise systems" to "selling tokens" (a minimal call sketch follows below).
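
SambaNova Cloud is, in practice, consumed as a standard chat-completions API. The sketch below shows what sampling it from Python might look like; the base URL, environment variable name, and model identifier are assumptions to be verified against SambaNova's current documentation, not details confirmed by this note.

```python
# Minimal sketch of calling SambaNova Cloud via an OpenAI-compatible client.
# Assumptions: base URL, SAMBANOVA_API_KEY env var, and the model id shown;
# check SambaNova's docs for current values before relying on them.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.sambanova.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key=os.environ["SAMBANOVA_API_KEY"],
)

resp = client.chat.completions.create(
    model="Meta-Llama-3.1-405B-Instruct",    # illustrative model id
    messages=[{"role": "user", "content": "One-sentence summary of dataflow architectures."}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```

The calling convention itself illustrates the pivot: the same request shape works against any OpenAI-compatible token vendor, so developer switching costs are near zero and the competition reduces to price, speed, and model catalog.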

2. Target Users & Pain Points

  • Government / sovereign AI / regulated industries: federal agencies, banks, and healthcare organizations that need on-prem deployment and isolation and will not accept cloud-only offerings. This is SambaNova's long-standing strength.
  • Enterprise fine-tuning plus deployment in one: buying SambaNova gets you the chip, the software, and a suite of fine-tuned open-source models, which simplifies procurement.
  • Pain point addressed: the traditional NVIDIA path requires separate GPU purchases, a self-built inference stack, and ongoing ops; SambaNova is a turn-key AI rack, but it also carries the strongest lock-in (proprietary SDK).
  • SambaNova Cloud positioning: the same RDU systems are used to sell tokens externally, giving developers a low-friction way to try the hardware; whether it can compete with Groq / Cerebras on speed and price over the long term remains to be seen.

3. Competitive Landscape

| Competitor | Positioning | Vs. SambaNova |
| --- | --- | --- |
| groq | LPU + speed narrative | Both non-GPU; Groq targets the developer cloud, SambaNova targets enterprise / government |
| cerebras | Wafer-scale WSE-3 | Both non-mainstream hardware paths; Cerebras is more aggressive at the single-chip level, SambaNova's system integration is more mature |
| NVIDIA + DGX Cloud | GPU + ecosystem | NVIDIA's general-purpose ecosystem dominates; SambaNova differentiates on full-stack integrated contracts |
| fireworks-ai / together-ai | GPU software stack | SambaNova has a structural advantage in single-stream latency for 405B-class models |
| AWS Bedrock / Azure | Cloud-vendor hosted | Clouds win on distribution; SambaNova wins on private deployment |

Differentiation: RDU design, three-tier memory, and full-stack packaging. For single-stream inference on Llama-405B-class models, low sharding yields a physical latency advantage.

4. Unique Observations

  • Per-token pricing (SambaNova Cloud, 2026-05): Llama 3.1 405B ~$5/M input + $10/M output (early pricing on the high side, converging toward peers); Llama 3.3 70B ~$0.60/M blended; DeepSeek V3 ~$1/M blended; Llama 3.1 8B ~$0.10/M [1]. The 405B speed claim of ~200+ tok/s is SambaNova Cloud's headline (on GPU-based serverless offerings, 405B typically runs at <30 tok/s).
  • vs first-party models: Llama 3.1 405B on SambaNova is ~$7.5/M blended vs GPT-4o at ~$10/M blended, so the price gap is small. SambaNova's selling point is not that it is cheap, but that the same price point buys a very large open-source model served very fast (see the blended-cost sketch after this list).
  • vs speed peers: Groq's LPU speed numbers are headlined at 70B; SambaNova's real niche is 405B, where Groq needs far more LPUs and capacity is tight, while SambaNova's system architecture is better suited to serving a very large model as a single instance.
  • Inference engine: entirely proprietary and closed-source. The RDU compiler and software stack are the company's moat, but any new model architecture (a new attention variant, a new MoE topology) must be ported by SambaNova's software team, so support cadence cannot keep up with vLLM / SGLang.
  • Capital / compute model: fabless in-house chip design with foundry partners (GlobalFoundries); the "NVIDIA price hike" pressure that GPU players face shows up instead as in-house chip development cycles plus tape-out cost.
  • Take rate: the data center is its own RDU racks, so gross margin is token sale price minus amortized depreciation. Early RDU per-unit compute cost is higher than an H100 (low volume), but it avoids NVIDIA's ~60% gross-margin tax; cloud volume needs to grow for the amortization to pay off.
  • Strategic dilemma: a high valuation ($5B post, 2021) against long enterprise sales cycles and a cloud business chasing Groq; multiple rounds of layoffs / restructuring in 2024-2026 reflect the pains of the "sell systems → sell tokens" pivot.
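
The "blended" figures above are simple weighted averages over input and output list prices. Below is a quick sketch of the arithmetic in Python, using the prices quoted in this note and assuming a 1:1 input:output token mix (which is what makes $5/$10 per M land at ~$7.5/M blended); real traffic is usually input-heavy, which pulls the effective rate down.

```python
# Blended $/M-token arithmetic behind the figures quoted above.
# Prices come from this note; the 50/50 input:output split is an assumption.

def blended_price(input_per_m: float, output_per_m: float, input_share: float = 0.5) -> float:
    """Weighted-average price per million tokens for a given input share."""
    return input_share * input_per_m + (1.0 - input_share) * output_per_m

# Llama 3.1 405B on SambaNova Cloud: ~$5/M input, ~$10/M output [1]
print(blended_price(5.0, 10.0))                    # 7.5 -> the ~$7.5/M blended figure
print(blended_price(5.0, 10.0, input_share=0.8))   # 6.0 -> input-heavy (80/20) workload

# Example bill: 2M input tokens + 0.5M output tokens at 405B list prices
print(2 * 5.0 + 0.5 * 10.0)                        # 15.0 -> $15.00
```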

5. Financials / Funding

| Round | Date | Amount | Valuation | Lead |
| --- | --- | --- | --- | --- |
| Series A-C | 2017-2020 | ~$450M cumulative | n/a | Walden, GV, Intel Capital |
| Series D | 2021-04 | $678M | $5.1B post | SoftBank Vision Fund 2 |
  • Founded: 2017 (Stanford / Sun Microsystems alumni)
  • Total raised: ~$1.1B+
  • Customers: US Department of Energy (Argonne, LLNL), Saudi scientific institutions, multiple top banks (few public cases)
  • Press reports: multiple rounds of layoffs in 2023-2024; the 2024 SambaNova Cloud launch is the key repositioning

6. People & Relationships

  • Co-founders: Rodrigo Liang (CEO, ex-Oracle / Sun), Kunle Olukotun (Stanford professor, early multi-core CPU pioneer), Christopher Ré (Stanford professor, ML systems / Snorkel co-founder).
  • Investors: SoftBank Vision Fund 2, BlackRock, Intel Capital, Google Ventures, Walden International, Atlantic Bridge, Celesta Capital.
  • Customers: Argonne National Lab, Lawrence Livermore National Lab, Saudi Aramco, Saudi public cloud projects, Analog Devices.
  • Competes with: groq, cerebras, NVIDIA, fireworks-ai, together-ai.

Sources

Last compiled: 2026-05-10