AWS
Largest cloud by revenue and the most diversified AI compute platform — runs both NVIDIA H100/H200/B200 fleets and the proprietary Trainium2 silicon backing Anthropic's training stack.
1. Core Product / Service
Amazon Web Services (AWS) is Amazon's cloud business unit; the AI-relevant slice is the EC2 GPU/accelerator family + the managed-model layer (Bedrock, SageMaker).
Compute SKUs that matter for AI:
- EC2 P5 / P5en — 8× NVIDIA H100 (P5) or 8× H200 (P5en), 3.2 Tbps EFA networking, deployed in UltraClusters of >20,000 GPUs [1].
- EC2 P6 (Blackwell) — 8× NVIDIA B200, announced 2025; rolling out into UltraClusters through 2026.
- EC2 Trn2 / Trn2 UltraServer — 16× Trainium2 chips per instance; UltraServer links 4 Trn2 nodes (64 chips) over NeuronLink fabric for frontier training [7].
- EC2 Inf2 — Inferentia2-based, inference-optimized.
- Amazon Bedrock — managed multi-model API (Anthropic Claude, Llama, Mistral, Amazon Nova, Stability) sitting on top of the same hardware.
- SageMaker HyperPod — orchestration for multi-thousand-accelerator training clusters.
Capacity is a deliberate dual track: NVIDIA for breadth-of-customer demand, Trainium for cost-per-token leverage on the Anthropic anchor workload (Project Rainier — ~400k Trainium2 chips dedicated to Anthropic) [6].
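As a concrete illustration of the Bedrock managed-model layer above, here is a minimal sketch of the request shape for Bedrock's Converse API. The model ID and prompt are illustrative placeholders; an actual call would go through `boto3`'s `bedrock-runtime` client (shown in comments, not executed here).

```python
import json

# Illustrative model ID -- check the Bedrock console for currently available models.
MODEL_ID = "anthropic.claude-3-5-sonnet-20241022-v2:0"

def build_converse_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build a request body in the shape Bedrock's Converse API expects."""
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

# With boto3 (requires AWS credentials; not executed in this sketch):
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   resp = client.converse(**build_converse_request("Summarize Project Rainier."))
#   print(resp["output"]["message"]["content"][0]["text"])

print(json.dumps(build_converse_request("Hello"), indent=2))
```

The point for buyers in section 2: this is the same IAM/VPC-scoped `boto3` surface as any other AWS service, which is exactly the "endpoint inside the existing AWS perimeter" pitch.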
2. Target Users & Pain Points
- Frontier AI labs — Anthropic is the marquee customer (training Claude on Trainium2; inference partly on Bedrock).
- Enterprise AI buyers who want a Claude/Llama endpoint inside their existing AWS perimeter (VPC, IAM, audit) → Bedrock.
- AI startups with AWS credits / Activate program scaling onto P5/P5en clusters.
- Inference-heavy SaaS that wants Inferentia2 for cost/token at moderate quality.
Pain points addressed: integrated identity/networking with the rest of AWS; Trainium economics below NVIDIA pricing for the few customers able to co-design (Anthropic); and guaranteed capacity via Capacity Blocks [2].
3. Competitive Landscape
| Provider | Differentiation vs AWS |
|---|---|
| microsoft-azure | OpenAI-exclusive distribution; deeper enterprise + GitHub integration |
| google-cloud | TPU silicon (several generations of custom-ASIC lead over Trainium), Anthropic also a customer |
| oracle-cloud | Aggressive GPU pricing, Stargate / OpenAI mega-contracts |
| coreweave | Pure-play neocloud, often cheaper raw H100/H200 |
| nebius | EU-domiciled neocloud, NVIDIA-aligned |
AWS's edge: scale (largest installed base of enterprise tenants), Anthropic depth, Trainium custom silicon. Disadvantage: late on frontier-model story (Nova underperforms Claude/GPT/Gemini), and Bedrock is a wrapper around partner models more than a 1P play.
4. Unique Observations
- GPU pricing (Capacity Blocks, on-demand, reserved): H100 P5 lists at $98.32/instance·hour ($12.29/H100·hour), both on-demand and under 1-week Capacity Block reservations [2]. At low utilization the effective cost can reach ~$31–$40 per H100·hour, but real customers transact via Savings Plans / Reserved Instances at $3–$6/H100·hour on 1–3-year commitments. H200 P5en Capacity Block list is ~$84.77/H200·hour; B200 P6 has no public pricing as of 2026-05 [1][2]. Effective rates sit sharply below list: frontier customers (Anthropic, Stability) negotiate multi-year commits at discounts that are not publicly disclosed.
- Capacity source mix is roughly 70/30 NVIDIA/Trainium by spend (estimate; AWS does not disclose). Trainium2 share is rising fast on the back of Project Rainier: Anthropic's ~400k-Trainium2 cluster came online late 2025 [6] and is the largest non-NVIDIA AI training cluster in the world.
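The list-vs-effective arithmetic above can be sketched directly. The instance price is the cited P5 list figure [2]; the utilization values are illustrative assumptions, not AWS-published numbers.

```python
P5_INSTANCE_HOURLY = 98.32  # USD list for an 8x H100 P5 instance [2]
GPUS_PER_INSTANCE = 8

def effective_gpu_hourly(instance_hourly: float, gpus: int, utilization: float) -> float:
    """Cost per *useful* GPU-hour: the paid per-GPU rate divided by the
    fraction of reserved time the GPUs actually spend doing work."""
    return instance_hourly / gpus / utilization

# At full utilization the list price works out to the cited $12.29/H100-hour.
full = effective_gpu_hourly(P5_INSTANCE_HOURLY, GPUS_PER_INSTANCE, 1.0)
# At ~33% utilization (hypothetical) the same reservation lands in the
# ~$31-$40/H100-hour band quoted above.
low = effective_gpu_hourly(P5_INSTANCE_HOURLY, GPUS_PER_INSTANCE, 0.33)

print(f"100% utilization: ${full:.2f}/H100-hour")  # $12.29
print(f" 33% utilization: ${low:.2f}/H100-hour")
```

This is why the $3–$6/H100·hour Savings Plan rates only make sense for buyers who can keep clusters busy; low-utilization tenants effectively pay multiples of list per useful hour.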
- AI revenue share — AWS doesn't break it out. Q1 2026 AWS revenue $29.3B (+17% YoY), full-year 2025 $107.6B; CEO commentary frames "AI business" as multi-billion-dollar run-rate growing triple-digit % YoY but the absolute number is undisclosed [5]. Analyst triangulation puts AI-attributable AWS revenue in the $8–$12B annualized range — small share of total but the entire growth story.
- Customer concentration disclosed = Anthropic. Amazon has invested $8B cumulative in Anthropic ($4B Sep 2023 + $4B Nov 2024) with Trainium as the explicit quid pro quo: Anthropic uses AWS as its primary training partner [3][4]. Anthropic in return is the anchor that justifies Trainium2's existence — without that workload, the chip program's economics break. Other major AI labs use AWS opportunistically (Cohere, AI21, Mistral on Bedrock), but Anthropic is the single load-bearing relationship. Compare to microsoft-azure's OpenAI dependency — same pattern, different lab.
- The "Bedrock is captive distribution" point: every frontier model on Bedrock pays AWS a margin without AWS taking any model-development risk. Mirror of Microsoft's OpenAI play but multi-model. Relevant to the L2→L3b vertical-integration row in ai-inference-engines.
5. Financials / Funding
- Parent: Amazon (NASDAQ: AMZN); AWS is a reportable segment.
- Q1 2026 AWS revenue: $29.3B (+17% YoY); operating income $11.5B (39.5% margin) [5].
- FY2025 AWS revenue: $107.6B (+19% YoY) [5].
- AI-attributable revenue: not disclosed; CEO Andy Jassy described AI as a "multi-billion-dollar revenue annualized business growing triple-digit-percent" in earnings calls (2025 Q4, 2026 Q1) [5].
- Capex: Amazon-wide $100B+ guidance for 2025; "vast majority" AI/AWS-related per Jassy.
- Anthropic investment: $8B cumulative ($4B Sep-2023 + $4B Nov-2024) [4].
- Project Rainier (Anthropic Trainium2 cluster): ~400k Trainium2 chips, online Dec 2025 [6].
6. People & Relationships
- AWS CEO: Matt Garman (since June 2024); previously SVP of Sales & Marketing.
- Amazon CEO: Andy Jassy (former AWS CEO; AI strategy primary owner).
- Anchor AI partner: Anthropic ($8B invested; primary training customer; Project Rainier).
- NVIDIA: critical supplier for P5/P5en/P6; also runs DGX Cloud reservations on AWS.
- Other AI partners on Bedrock: Mistral, Cohere, AI21, Stability, Meta (Llama), Anthropic, Amazon Nova (1P).
- Competitors: microsoft-azure, google-cloud, oracle-cloud, coreweave.
Sources
[1] https://aws.amazon.com/ec2/instance-types/p5/ (2026-05-10)
[2] https://aws.amazon.com/ec2/capacityblocks/pricing/ (2026-05-10)
[3] https://www.anthropic.com/news/anthropic-amazon-trainium (2026-05-10)
[4] https://www.cnbc.com/2024/11/22/amazon-to-invest-another-4-billion-in-anthropic-openai-rival.html (2026-05-10)
[5] https://ir.aboutamazon.com/news-release/news-release-details/2026/Amazon.com-Announces-First-Quarter-Results/default.aspx (2026-05-10)
[6] https://www.aboutamazon.com/news/aws/aws-trainium2-ultraserver-anthropic-project-rainier (2026-05-10)
[7] https://aws.amazon.com/blogs/aws/announcing-amazon-ec2-trn2-instances-and-trn2-ultraservers-for-aws-trainium2/ (2026-05-10)