Company

Lambda Labs

The "Superintelligence Cloud" — DL-native GPU cloud + on-prem AI servers, founded 2012 by the Balaban brothers, now a top-tier neocloud rivaling coreweave and nebius.

1. Core Product / Service

Three product lines wrapping a single bet on NVIDIA-dense AI infrastructure:

  • Lambda Cloud (on-demand & reserved GPU instances) — per-minute billing, no egress fees, on-demand only (no spot). Catalog spans NVIDIA B200, H100/H200 SXM, GH200, A100, A6000, and legacy V100/A10 [1]. (API sketch after this list.)
  • 1-Click Clusters — production-ready InfiniBand-fabric clusters from 16 to 2,000+ B200 or H100 GPUs, self-serve provisioning. Headline rates B200 $8.87–$9.86/GPU-hr, H100 $5.54–$6.16/GPU-hr [1].
  • Lambda Hyperplane (on-prem servers) — 4×/8× H100 SXM5 HGX boxes with NVLink+NVSwitch, AMD EPYC 9004-series CPUs, 256GB–8TB DDR5, up to 32 petaFLOPS FP8. Ships with Lambda Stack (CUDA, cuDNN, PyTorch, TensorFlow pre-installed). Scales into multi-node Lambda Echelon clusters via Mellanox InfiniBand [2].
  • Lambda Chat / inference — hosts DeepSeek-R1 and other OSS models; signals a push from training-only into ai-inference-engines territory.
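
For on-demand capacity, the whole lifecycle is scriptable against Lambda's public Cloud API. A minimal sketch, assuming the documented REST endpoints under https://cloud.lambdalabs.com/api/v1; the instance-type name, region, and SSH-key name are illustrative placeholders, and the response shapes should be checked against the current API reference:

```python
# Sketch: listing capacity and launching an on-demand instance via Lambda's
# public Cloud API. Endpoint paths per Lambda's API docs; field names below
# reflect the documented response shape at time of writing — verify before use.
import os
import requests

API = "https://cloud.lambdalabs.com/api/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['LAMBDA_API_KEY']}"}

# 1. What can we launch, where, and at what price?
types = requests.get(f"{API}/instance-types", headers=HEADERS).json()["data"]
for name, entry in types.items():
    regions = [r["name"] for r in entry["regions_with_capacity_available"]]
    if regions:  # only show types with live capacity
        price = entry["instance_type"]["price_cents_per_hour"] / 100
        print(f"{name}: ${price}/hr in {regions}")

# 2. Launch one node (instance-type, region, and key names are illustrative).
launch = requests.post(
    f"{API}/instance-operations/launch",
    headers=HEADERS,
    json={
        "region_name": "us-east-1",                # pick one from step 1
        "instance_type_name": "gpu_8x_h100_sxm5",  # example type name
        "ssh_key_names": ["my-key"],               # must already exist in the account
        "quantity": 1,
    },
)
print(launch.json())  # instance IDs on success; billing is per-minute from boot
```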

Differentiator vs. generic IaaS: opinionated DL software stack and tight NVIDIA integration (early-access partner for H200 / Blackwell).
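
Concretely, on a Lambda Stack image (cloud instance or Hyperplane box) the framework stack is already wired to the GPUs, so a sanity check like this runs with zero setup:

```python
# No pip/conda step: CUDA-enabled PyTorch ships pre-installed on Lambda Stack.
import torch

print(torch.__version__, "| CUDA", torch.version.cuda)
print("GPUs visible:", torch.cuda.device_count())
print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA H100 80GB HBM3"
```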

2. Target Users & Pain Points

  • AI research labs & frontier-model startups training 7B–400B parameter models — need long-horizon reserved B200/H100 capacity with InfiniBand, can't get fast allocation from AWS/GCP.
  • ML engineers & solo researchers spinning up a single 8×H100 box for a week — Lambda's per-minute billing + pre-baked CUDA/PyTorch image removes a day of yak-shaving vs. raw EC2; see the cost sketch below the list.
  • Enterprise on-prem buyers (regulated industries, gov-adjacent) — buy Hyperplane servers outright instead of renting; In-Q-Tel is on the cap table, signaling defense/intel use cases.

Pains addressed: hyperscaler GPU scarcity, slow spin-up, surprise egress bills, environment-setup tax.
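
Putting the week-long 8×H100 scenario in dollars, using the low end of the on-demand H100 range from the section 3 table; the $0.09/GB egress figure is a typical hyperscaler rate used for contrast, not a quoted one:

```python
# One week on a single 8x H100 node at Lambda's low-end on-demand rate (~$2.99/GPU-hr).
gpus, rate_per_gpu_hr = 8, 2.99
hours = 7 * 24
print(f"7-day 8xH100 run: ${gpus * rate_per_gpu_hr * hours:,.0f}")  # ~$4,017

# Per-minute billing: a 100h37m job bills as ~100.62 h, not rounded up to 101 h.
print(f"100h37m at 8 x $2.99: ${8 * 2.99 * (100 + 37 / 60):,.2f}")

# No egress fees: pulling a 2 TB checkpoint out costs $0 here; at a typical
# hyperscaler rate of ~$0.09/GB it would add ~$184 on its own.
print(f"2 TB egress @ $0.09/GB elsewhere: ${2048 * 0.09:,.0f}")
```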

3. Competitive Landscape

| Provider | On-demand H100/hr | Strengths | Weaknesses |
|---|---|---|---|
| Lambda | ~$2.99–$4.29 (varies by SKU/cluster tier) [1] | DL-native stack, no egress, on-prem + cloud, strong NVIDIA ties | On-demand only (no spot), pricier than marketplaces |
| coreweave | ~$2.39–$4.76 | Public co., gigawatt scale, hyperscaler-grade SLAs | Enterprise-skewed, less self-serve |
| runpod | ~$1.99–$2.99 (community/secure mix) | Per-second billing, serverless GPU, dev-friendly | Less enterprise-ready, smaller fleet |
| nebius | ~$2.00–$3.50 | EU footprint, public co. (Yandex spin-out) | Newer brand in the US |
| vast-ai | ~$1.49–$2.50 | Cheapest marketplace | Reliability variance, no enterprise SLA |
| together-ai | n/a (inference-as-a-service) | Tuned inference for OSS LLMs | Not raw GPU rental |
| AWS/GCP/Azure | $4.00–$12.00+ | Integrated cloud, compliance | Allocation gated, egress fees, slow |

Lambda's wedge is the "premium neocloud" middle: cheaper and faster to provision than the hyperscalers, more reliable and more software-complete than vast-ai / runpod.
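
To put the wedge in dollar terms, here is the same 7-day, 8-GPU job priced at the midpoint of each range in the table above (midpoints are this note's simplification, not quoted rates):

```python
# Rough cost of 1,344 GPU-hours (8 GPUs x 7 days) at each provider's range midpoint.
ranges = {  # $/GPU-hr, from the comparison table above
    "vast-ai": (1.49, 2.50),
    "runpod": (1.99, 2.99),
    "nebius": (2.00, 3.50),
    "coreweave": (2.39, 4.76),
    "Lambda": (2.99, 4.29),
    "AWS/GCP/Azure": (4.00, 12.00),
}
gpu_hours = 8 * 24 * 7
for provider, (lo, hi) in ranges.items():
    print(f"{provider:>13}: ~${(lo + hi) / 2 * gpu_hours:>6,.0f}")
```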

4. Unique Observations

  • Pivot history: started in 2012 as a facial-recognition startup; pivoted to GPU servers when DL exploded. The DL-stack DNA (Lambda Stack is downloaded by hundreds of thousands of researchers) is the cheapest distribution channel any neocloud has — coreweave had to buy reach via deals; Lambda already had it.
  • Capital structure asymmetry: $480M Series D (Feb 2025) at ~$2.5B valuation, then $1.5B+ Series E (Nov 2025) led by TWG Global + USIT. Combined ~$2B+ in <12 months — only coreweave (public) raised more in the neocloud cohort.
  • Strategic investor mix is unusual: NVIDIA + In-Q-Tel + Karpathy + ARK + supply-chain players (Pegatron, Supermicro, Wistron, Wiwynn). The OEMs-as-investors angle gives Lambda preferential allocation on B200/Blackwell systems — a hard moat in a supply-constrained market.
  • CEO change May 2026: Michel Combes (telco veteran, ex-Sprint/Altice) brought in as CEO; founder Stephen Balaban moved to CTO, Michael Balaban to CPO. Signals a shift from founder mode to gigawatt-scale ops execution — same playbook coreweave ran pre-IPO.
  • No spot tier is a deliberate choice — premium positioning, but cedes the price-sensitive long tail to runpod / vast-ai. See runpod-gpu-inference for that tier's economics.

5. Financials / Funding

| Round | Date | Amount | Lead(s) | Valuation |
|---|---|---|---|---|
| Series C | 2024 | $320M | Thomas Tull / USIT | ~$1.5B |
| Series D | Feb 2025 | $480M | Andra Capital, SGW | ~$2.5B [3] |
| Series E | Nov 2025 | $1.5B+ | TWG Global, USIT | undisclosed (rumored ~$4B+) [4] |

Notable Series D/E investors: NVIDIA, In-Q-Tel, Andrej Karpathy, ARK Invest, G Squared, Pegatron, Supermicro, Wistron, Wiwynn, Crescent Cove, 1517 [3][4].

Footprint: 25,000+ NVIDIA GPUs deployed across owned/leased data centers (as of Series D announcement); Series E earmarked for gigawatt-scale buildout [3][4].
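
A loudly hedged sense of scale: versus the ~25,000 GPUs deployed at Series D, a back-of-envelope for what 1 GW could host. The ~1.4 kW all-in per GPU figure (700 W H100 SXM TDP plus host, network, and cooling overhead) is this note's assumption, not a Lambda number.

```python
# Back-of-envelope: GPUs supportable by 1 GW at an assumed ~1.4 kW all-in per H100.
watts_per_gpu_all_in = 1_400  # assumption: 700 W SXM TDP + facility overhead
gw = 1_000_000_000            # 1 gigawatt
print(f"~{gw / watts_per_gpu_all_in:,.0f} GPUs")  # ~714,286 vs ~25,000 at Series D
```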

6. People & Relationships

  • Stephen Balaban — co-founder, CTO (May 2026–). Ex-CEO. Public face on LinkedIn / NVIDIA GTC.
  • Michael Balaban — co-founder, Chief Product Officer (May 2026–).
  • Michel Combes — CEO (May 2026–), ex-Sprint, ex-Altice.
  • John Donovan — Chairman, ex-CEO AT&T Communications.
  • Investors / strategic: NVIDIA, In-Q-Tel, TWG Global, USIT, ARK Invest, Andrej Karpathy.
  • Adjacent: competes with coreweave, nebius, runpod, vast-ai; complementary to inference-layer players like together-ai and serves customers who also use ai-inference-engines frameworks on top.

Sources

Last compiled: 2026-05-09