Microsoft Maia 100/200
Microsoft's third-place captive silicon: Maia 100 was a no-show; Maia 200 is the inference chip co-designed with OpenAI that finally ships.
1. Core Product / Service
Maia is Microsoft's custom AI accelerator family, built in-house with deep input from OpenAI. Two generations exist on paper:
- Maia 100 — announced November 2023 with co-design from OpenAI; never made externally available, never rented to cloud customers [3]. Used internally for limited workloads only. Effectively a bring-up generation.
- Maia 200 (codename "Braga") — announced January 2026 as the second generation [1][2]. Specs:
  - TSMC 3nm, 140B+ transistors [3]
  - 216 GB HBM3e
  - >10 PFLOPS FP4, >5 PFLOPS FP8, at a 750 W TDP [3]
  - Deployed in Microsoft's Central US region (Des Moines, Iowa) as of January 2026; expanding to West US 3 (Phoenix) [1]
  - Mass production was delayed ~6 months by OpenAI-requested design changes that destabilized the chip in simulation [7]
Software: Microsoft's internal AI compiler stack (proprietary; less mature than AWS's Neuron SDK or Google's XLA, and far less mature than CUDA).
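Taking the published spec floors [3] at face value, the derived efficiency is easy to sanity-check. A minimal sketch; the ">" figures are floors and sustained throughput will differ, so treat the outputs as order-of-magnitude only:

```python
# Derived efficiency from the published Maia 200 spec floors [3].
# Sustained real-world numbers will differ; order-of-magnitude only.

FP4_PFLOPS = 10.0   # >10 PFLOPS FP4 (spec floor)
FP8_PFLOPS = 5.0    # >5 PFLOPS FP8 (spec floor)
TDP_W = 750.0       # board TDP

print(f"FP4: {FP4_PFLOPS * 1000 / TDP_W:.1f} TFLOPS/W")  # ~13.3
print(f"FP8: {FP8_PFLOPS * 1000 / TDP_W:.1f} TFLOPS/W")  # ~6.7
```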
2. Target Users & Pain Points
Effectively two captive customers:
- Microsoft itself — Azure AI services, Microsoft 365 Copilot inference, Bing/Edge Copilot, GitHub Copilot serving
- OpenAI — runs a subset of GPT-5.2-family inference on Maia 200 [1], but the bulk of OpenAI's compute remains NVIDIA plus the Stargate buildout
External availability is planned but not yet broad: Scott Guthrie indicated "wider customer availability in the future" [1], but as of mid-2026 there is no public Maia 200 instance type.
3. Competitive Landscape
| Chip | HBM capacity | Status | Anchor customer(s) |
|---|---|---|---|
| Maia 100 | 64 GB HBM2e | Internal-only, not externally available [3] | Microsoft (limited) |
| Maia 200 | 216 GB HBM3e [3] | Limited internal deployment, Iowa region | Microsoft + OpenAI (partial) |
| aws-trainium 2 | 96 GB HBM3 | Production-scale, ~500K chips | Anthropic + AWS |
| google-tpu v6e (Trillium) | 32 GB HBM | Production-scale, GA | Google + Anthropic |
| nvidia B200 | 192 GB HBM3e | Mass-market | Everyone |
Among captive hyperscaler silicon, Microsoft is the third to ship at scale, visibly behind Google (TPU v5p was shipping years ago) and Amazon (Trainium 2 already at gigawatt-scale deployment).
4. Unique Observations
- Maia 100 vs Maia 200 status — a frank read. Maia 100 was effectively a bring-up exercise that never reached customer rental [3]. Maia 200 is the real product; "first-party silicon from any major cloud provider" claim notwithstanding [2], it ships years after google-tpu and aws-trainium reached comparable deployment scale.
- OpenAI as substrate co-designer is the most interesting structural story. OpenAI requested design changes that pushed Maia 200 mass production back ~6 months and destabilized simulations [7]. The level of OpenAI input on Microsoft's chip mirrors the Anthropic↔Trainium relationship — every major frontier lab now has a captive-silicon story attached to a hyperscaler. Whether OpenAI's stake in Maia 200 inference scales depends partly on the post-2025 OpenAI-Microsoft contract restructure (which moved OpenAI partially off Azure exclusivity).
- 216 GB HBM3e is genuinely class-leading: larger than B200 (192 GB) and Trainium 2 (96 GB); only amd MI355X (288 GB) carries more. For very large MoE serving this is the actual technical wedge ("the most HBM-rich first-party silicon any cloud has built"); the capacity sketch after this list shows why.
- TSMC 3nm + 140B transistors [3] places Maia 200 on the same advanced-node tier as B200, which means Microsoft is competing for the same tsmc CoWoS allocation as NVIDIA, on top of the CoWoS demand it already generates as an NVIDIA buyer. This is part of why CoWoS is sold out through 2026.
- The captive-cost claim is unverifiable so far. Microsoft has not published $/inference comparisons vs NVIDIA for Maia 200. The strategic logic mirrors aws-inferentia/aws-trainium (drive Azure AI margins by reducing per-token COGS), but the proof is private; the arithmetic is sketched after this list. The 6-month production delay also implies Microsoft paid for the kind of redesign work captive-silicon programs normally absorb without disclosure.
- The "wider availability" tease. If Maia 200 reaches general Azure availability in late 2026/2027, it becomes a meaningful pricing comparable; if it stays internal-only like Maia 100, the program looks more like a Microsoft-OpenAI vertically integrated stack than a true open captive-silicon offer.
5. Financials / Funding
- Parent: Microsoft (NASDAQ: MSFT); market cap in the ~$3T+ range as of mid-2026
- Maia revenue: zero external; impact is through Azure AI margin
- Microsoft AI capex (FY26): $80B+ guided; the portion going to the Maia buildout is undisclosed but materially smaller than its NVIDIA spend
- OpenAI commercial relationship: post-2025 restructure reduced Azure exclusivity; OpenAI now multi-homes across Microsoft (incl. Maia 200) + own Stargate + Oracle + others
6. People & Relationships
- EVP of Microsoft Cloud + AI: Scott Guthrie
- Custom silicon program: Rani Borkar (CVP, Azure Hardware Systems and Infrastructure)
- Foundry: tsmc (3nm)
- HBM: SK hynix (lead), Samsung, Micron
- Co-designer: OpenAI (specific design requests on Maia 200 [7])
- Captive consumers: Microsoft 365 Copilot, Azure AI, Bing/Edge Copilot, GitHub Copilot, Microsoft Security Copilot
- External pilot customer: OpenAI (a subset of GPT-5.2 inference) [1]
- Direct competitors: google-tpu, aws-trainium, aws-inferentia, nvidia, amd