Company
NVIDIA
The de facto monopoly of AI compute — H100/B200 silicon plus the CUDA software moat that every alternative chip is benchmarked against.
1. Core Product / Service
NVIDIA designs the GPUs and surrounding stack that train and serve essentially every frontier AI model. Active SKUs in 2026:
- Hopper (H100 / H200) — TSMC 4N, 80–141 GB HBM. H100 still ~700–1,000 TFLOPS FP16 [1]; the workhorse of 2023–2025 builds, now seeing cloud rental prices decline.
- Blackwell (B200 / GB200 NVL72) — TSMC 4NP dual-die, 192 GB HBM3e, ~10 PFLOPS FP4 per chip; rack-scale GB200 NVL72 packages 72 GPUs + 36 Grace CPUs over NVLink [1].
- Blackwell Ultra (B300) and the next Rubin generation are on the roadmap; Rubin is expected on TSMC N3.
Beyond silicon, NVIDIA is steadily integrating up-stack: CUDA + cuDNN + TensorRT-LLM (kernel layer, see gpu-kernel-optimization); NIM inference microservices; DGX Cloud managed clusters rented through hyperscalers; NVLink / Spectrum-X / Quantum networking; Omniverse / Isaac / Drive vertical stacks. The hardware is now a pretext to sell the platform.
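The rack-scale numbers above are easiest to sanity-check as arithmetic. A back-of-envelope sketch using the headline per-chip figure (peak marketing numbers, not sustained throughput — actual delivered FLOPS depend heavily on workload):

```python
# Back-of-envelope: aggregate FP4 throughput of one GB200 NVL72 rack,
# from the headline figures quoted above (assumed peak, not sustained).
B200_FP4_PFLOPS = 10   # ~10 PFLOPS FP4 per Blackwell chip (headline)
GPUS_PER_NVL72 = 72    # GB200 NVL72: 72 GPUs + 36 Grace CPUs per rack

rack_fp4_pflops = B200_FP4_PFLOPS * GPUS_PER_NVL72
print(f"GB200 NVL72 headline FP4: ~{rack_fp4_pflops} PFLOPS "
      f"(~{rack_fp4_pflops / 1000:.2f} EFLOPS) per rack")
```

That is, on paper, roughly an exaflop-class machine in a single NVLink domain — which is why the NVL72 is sold and scheduled as one unit, not as 72 GPUs.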
2. Target Users & Pain Points
Three buyer tiers:
- Hyperscalers + frontier labs (Microsoft, Meta, Google, Amazon, OpenAI, Anthropic, xAI) — buy GB200 racks by the gigawatt; pain solved is supply, not selection.
- Neoclouds (coreweave lambda-labs nebius runpod) — get early allocation in exchange for capex commitment; their entire business is reselling NVIDIA hours.
- Enterprises / sovereigns — buy DGX systems or DGX Cloud; pain solved is "we want AI infra without owning a fab-out supply chain."
Allocation, not list price, is the real product. The waitlist is the moat.
3. Competitive Landscape
| Vendor | 2026 share of data-center AI accelerator revenue | Strength | Weakness |
|---|---|---|---|
| NVIDIA | ~80–90% | CUDA, supply, generality, network | Margin under political pressure; captive-chip incumbents nibble |
| amd | ~5–7% | MI300X/MI355X HBM advantage, ROCm improving | Software gap, smaller fleet |
| google-tpu | captive + Anthropic | Best perf/$ on Google's stack | Captive to GCP |
| aws-trainium | captive + Anthropic | Cheap silicon for one customer | Software immature |
| microsoft-maia | captive | Co-designed with OpenAI | Maia 100 never shipped externally |
| cerebras / Groq / SambaNova | <1% | Wafer-scale / LPU streaming inference | Niche |
4. Unique Observations
- B200 unit economics are obscene. Estimated COGS ~$6,400 vs ASP ~$40,000 — ~84% gross margin per unit; H100 ASP $25–40K against ~$3,320 COGS — ~88% gross margin [2]. HBM is now ~45% of B200 COGS (vs 41% on H100), which is why HBM3e supply effectively gates how many Blackwells exist.
- CUDA is a moat that compounds. Every major framework optimization (FlashAttention, vLLM, SGLang) ships CUDA-first; ROCm trails by 6–18 months; captive chips (TPU/Trainium/Maia) require model rewrites their owners are willing to absorb but the open market is not. See gpu-kernel-optimization and ai-inference-engines.
- TSMC capacity is the actual bottleneck — not the GPU itself but the tsmc CoWoS advanced-packaging line. CoWoS sold out through 2026; capacity ramping ~35K wafers/mo (late 2024) → ~130K wafers/mo (late 2026) [6]. A B200 without CoWoS is a die that can't be packaged with HBM. Industry-wide demand exceeds supply by 1.4–1.6x [3].
- Move-up via NIM and DGX Cloud is the threat to its own customers. NIM packages CUDA + TensorRT-LLM + a model into a deployable container; DGX Cloud rents NVIDIA-built clusters through Azure/GCP/Oracle. NVIDIA explicitly stops short of competing with hyperscalers on raw IaaS — but NIM lets it tax the L3 inference layer directly. Neoclouds like coreweave are one product launch away from being disintermediated upward.
- Concentration on top 4 customers. Microsoft, Meta, Google, Amazon plus OpenAI/xAI/Anthropic effectively dictate roadmap. Hyperscalers building captive silicon (google-tpu aws-trainium microsoft-maia) is the slow-moving structural risk that NVIDIA is pricing in via NIM/DGX moves.
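The unit-economics claim above is just a margin calculation on the estimated COGS/ASP figures [2] — third-party estimates, not NVIDIA disclosures, and the H100 ASP is a range, so the midpoint below is an assumption:

```python
# Per-unit gross margin sketch from the COGS/ASP estimates above [2].
# These are third-party teardown estimates, not NVIDIA disclosures.
def gross_margin(asp: float, cogs: float) -> float:
    """Gross margin as a fraction of average selling price."""
    return (asp - cogs) / asp

b200 = gross_margin(asp=40_000, cogs=6_400)
h100 = gross_margin(asp=32_500, cogs=3_320)  # midpoint of the $25-40K ASP range

print(f"B200: {b200:.0%} gross margin")  # ~84%
print(f"H100: {h100:.0%} gross margin")  # ~90% at the midpoint ASP
```

Note the H100 figure lands near 87–92% depending on where in the $25–40K ASP range a given deal prices, which brackets the ~88% cited above.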
5. Financials / Funding
- Q1 FY2026 revenue: $44.1B total, data center $39.1B (+69% YoY) [1]
- Full-year FY2026 guidance implies $200B+ in revenue; the exit-quarter trajectory points toward ~$78B/quarter [1]
- Gross margin: ~70%+ corporate, ~75–80% on data center products
- Market cap: ~$3.5T+ range (mid-2026), among the largest publicly traded companies globally
- Capex (NVIDIA's spend, not customers'): minimal — fabless model, asset-light
- Strategic stakes taken: ~13% of coreweave post Jan-2026 $2B investment; smaller stakes in Nebius and others — NVIDIA is becoming an LP in its own customers
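The quarterly figures above annualize as follows — a naive ×4 run rate that ignores sequential growth, which is exactly the gap between the ~$176B Q1 annualization and the $200B+ full-year figure:

```python
# Annualizing the Q1 FY2026 figures quoted above [1].
# Simple x4 run rate; understates a steeply growing year by construction.
q1_fy2026_total = 44.1  # $B, total revenue
q1_fy2026_dc = 39.1     # $B, data center segment

run_rate = q1_fy2026_total * 4
dc_share = q1_fy2026_dc / q1_fy2026_total

print(f"Q1 annualized run rate: ~${run_rate:.0f}B")     # ~$176B
print(f"Data center share of revenue: {dc_share:.0%}")  # ~89%
```

The ~89% data-center share is the other headline: gaming, pro-viz, and automotive are now rounding errors on this income statement.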
6. People & Relationships
- CEO: Jensen Huang (co-founder, since 1993)
- CFO: Colette Kress
- Head of Enterprise / DC: Ian Buck (CUDA originator), Jay Puri (worldwide field ops)
- Foundry partner: tsmc (4N → 4NP → N3 / N2)
- HBM suppliers: SK hynix (lead), Samsung, Micron
- Top customers: Microsoft, Meta, Google, Amazon, OpenAI (via Microsoft + direct), xAI, Anthropic, Oracle
- Captive-cloud partners: coreweave (~13% stake), lambda-labs, nebius, runpod, together-ai
- Direct competitors / alternatives: amd intel huawei-ascend google-tpu aws-trainium aws-inferentia microsoft-maia cerebras
Last compiled: 2026-05-10