Company
NVIDIA
The de facto monopoly of AI compute — H100/B200 silicon plus the CUDA software moat that every alternative chip is benchmarked against.
1. Core Product / Service
NVIDIA designs the GPUs and surrounding stack that train and serve essentially every frontier AI model. Active SKUs in 2026:
- Hopper (H100 / H200) — TSMC 4N, 80–141 GB HBM. H100 still ~700–1,000 TFLOPS FP16 [1]; the workhorse of 2023–2025 builds, now seeing cloud rental prices decline.
- Blackwell (B200 / GB200 NVL72) — TSMC 4NP dual-die, 192 GB HBM3e, ~10 PFLOPS FP4 per chip; rack-scale GB200 NVL72 packages 72 GPUs + 36 Grace CPUs over NVLink [1].
- Blackwell Ultra (B300) and the next Rubin generation are on the roadmap; Rubin is expected on TSMC N3.
Beyond silicon, NVIDIA is steadily integrating up-stack: CUDA + cuDNN + TensorRT-LLM (kernel layer, see gpu-kernel-optimization); NIM inference microservices; DGX Cloud managed clusters rented through hyperscalers; NVLink / Spectrum-X / Quantum networking; Omniverse / Isaac / Drive vertical stacks. The hardware is now a pretext to sell the platform.
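The rack-scale numbers above are easiest to sanity-check as arithmetic. A back-of-envelope sketch using the headline per-chip figure (peak marketing numbers, not sustained throughput — actual delivered FLOPS depend heavily on workload):

```python
# Back-of-envelope: aggregate FP4 throughput of one GB200 NVL72 rack,
# from the headline figures quoted above (assumed peak, not sustained).
B200_FP4_PFLOPS = 10   # ~10 PFLOPS FP4 per Blackwell chip (headline)
GPUS_PER_NVL72 = 72    # GB200 NVL72: 72 GPUs + 36 Grace CPUs per rack

rack_fp4_pflops = B200_FP4_PFLOPS * GPUS_PER_NVL72
print(f"GB200 NVL72 headline FP4: ~{rack_fp4_pflops} PFLOPS "
      f"(~{rack_fp4_pflops / 1000:.2f} EFLOPS) per rack")
```

That is, on paper, roughly an exaflop-class machine in a single NVLink domain — which is why the NVL72 is sold and scheduled as one unit, not as 72 GPUs.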
2. Target Users & Pain Points
Three buyer tiers:
- Hyperscalers + frontier labs (Microsoft, Meta, Google, Amazon, OpenAI, Anthropic, xAI) — buy GB200 racks by the gigawatt; pain solved is supply, not selection.
- Neoclouds (coreweave lambda-labs nebius runpod) — get early allocation in exchange for capex commitment; their entire business is reselling NVIDIA hours.
- Enterprises / sovereigns — buy DGX systems or DGX Cloud; pain solved is "we want AI infra without owning a fab-out supply chain."
Allocation, not list price, is the real product. The waitlist is the moat.
3. Competitive Landscape
| Vendor | 2026 share of data-center AI accelerator revenue | Strength | Weakness |
|---|---|---|---|
| NVIDIA | ~80–90% | CUDA, supply, generality, network | Margin under political pressure; captive-chip incumbents nibble |
| amd | ~5–7% | MI300X/MI355X HBM advantage, ROCm improving | Software gap, smaller fleet |
| google-tpu | captive + Anthropic | Best perf/$ on Google's stack | Captive to GCP |
| aws-trainium | captive + Anthropic | Cheap silicon for one customer | Software immature |
| microsoft-maia | captive | Co-designed with OpenAI | Maia 100 never shipped externally |
| cerebras / Groq / SambaNova | <1% | Wafer-scale / LPU streaming inference | Niche |
4. Unique Observations
- B200 unit economics are obscene. Estimated COGS ~$6,400 vs ASP ~$40,000 — ~84% gross margin per unit; H100 ASP $25–40K against ~$3,320 COGS — ~88% gross margin [2]. HBM is now ~45% of B200 COGS (vs 41% on H100), which is why HBM3e supply effectively gates how many Blackwells exist.
- CUDA is a moat that compounds. Every major framework optimization (FlashAttention, vLLM, SGLang) ships CUDA-first; ROCm trails by 6–18 months; captive chips (TPU/Trainium/Maia) require model rewrites their owners are willing to absorb but the open market is not. See gpu-kernel-optimization and ai-inference-engines.
- TSMC capacity is the actual bottleneck — not the GPU itself but the tsmc CoWoS advanced-packaging line. CoWoS sold out through 2026; capacity ramping ~35K wafers/mo (late 2024) → ~130K wafers/mo (late 2026) [6]. A B200 without CoWoS is a die that can't be packaged with HBM. Industry-wide demand exceeds supply by 1.4–1.6x [3].
- Move-up via NIM and DGX Cloud is the threat to its own customers. NIM packages CUDA + TensorRT-LLM + a model into a deployable container; DGX Cloud rents NVIDIA-built clusters through Azure/GCP/Oracle. NVIDIA explicitly stops short of competing with hyperscalers on raw IaaS — but NIM lets it tax the L3 inference layer directly. Neoclouds like coreweave are one product launch away from being disintermediated upward.
- Concentration on top 4 customers. Microsoft, Meta, Google, Amazon plus OpenAI/xAI/Anthropic effectively dictate roadmap. Hyperscalers building captive silicon (google-tpu aws-trainium microsoft-maia) is the slow-moving structural risk that NVIDIA is pricing in via NIM/DGX moves.
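The unit-economics claim above is just a margin calculation on the estimated COGS/ASP figures [2] — third-party estimates, not NVIDIA disclosures, and the H100 ASP is a range, so the midpoint below is an assumption:

```python
# Per-unit gross margin sketch from the COGS/ASP estimates above [2].
# These are third-party teardown estimates, not NVIDIA disclosures.
def gross_margin(asp: float, cogs: float) -> float:
    """Gross margin as a fraction of average selling price."""
    return (asp - cogs) / asp

b200 = gross_margin(asp=40_000, cogs=6_400)
h100 = gross_margin(asp=32_500, cogs=3_320)  # midpoint of the $25-40K ASP range

print(f"B200: {b200:.0%} gross margin")  # ~84%
print(f"H100: {h100:.0%} gross margin")  # ~90% at the midpoint ASP
```

Note the H100 figure lands near 87–92% depending on where in the $25–40K ASP range a given deal prices, which brackets the ~88% cited above.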
5. Financials / Funding
- Q1 FY2026 revenue: $44.1B total, data center $39.1B (+69% YoY) [1]
- Full-year FY2026 guidance implies $200B+ in revenue; the exit-quarter trajectory points toward ~$78B/quarter [1]
- Gross margin: ~70%+ corporate, ~75–80% on data center products
- Market cap: ~$3.5T+ range (mid-2026), among the largest publicly traded companies globally
- Capex (NVIDIA's spend, not customers'): minimal — fabless model, asset-light
- Strategic stakes taken: ~13% of coreweave post Jan-2026 $2B investment; smaller stakes in Nebius and others — NVIDIA is becoming an LP in its own customers
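The quarterly figures above annualize as follows — a naive ×4 run rate that ignores sequential growth, which is exactly the gap between the ~$176B Q1 annualization and the $200B+ full-year figure:

```python
# Annualizing the Q1 FY2026 figures quoted above [1].
# Simple x4 run rate; understates a steeply growing year by construction.
q1_fy2026_total = 44.1  # $B, total revenue
q1_fy2026_dc = 39.1     # $B, data center segment

run_rate = q1_fy2026_total * 4
dc_share = q1_fy2026_dc / q1_fy2026_total

print(f"Q1 annualized run rate: ~${run_rate:.0f}B")     # ~$176B
print(f"Data center share of revenue: {dc_share:.0%}")  # ~89%
```

The ~89% data-center share is the other headline: gaming, pro-viz, and automotive are now rounding errors on this income statement.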
6. People & Relationships
- CEO: Jensen Huang (co-founder, since 1993)
- CFO: Colette Kress
- Head of Enterprise / DC: Ian Buck (CUDA originator), Jay Puri (worldwide field ops)
- Foundry partner: tsmc (4N → 4NP → N3 / N2)
- HBM suppliers: SK hynix (lead), Samsung, Micron
- Top customers: Microsoft, Meta, Google, Amazon, OpenAI (via Microsoft + direct), xAI, Anthropic, Oracle
- Captive-cloud partners: coreweave (~13% stake), lambda-labs, nebius, runpod, together-ai
- Direct competitors / alternatives: amd intel huawei-ascend google-tpu aws-trainium aws-inferentia microsoft-maia cerebras
Last compiled: 2026-05-10