Microsoft Maia 100/200
Microsoft's third-place captive silicon: Maia 100 was a no-show; Maia 200 is the inference chip co-designed with OpenAI that finally ships.
1. Core Product / Service
Maia is Microsoft's custom AI accelerator family, built in-house with deep input from OpenAI. Two generations exist on paper:
- Maia 100 — announced November 2023 with co-design from OpenAI; never made externally available, never rented to cloud customers [3]. Used internally for limited workloads only. Effectively a bring-up generation.
- Maia 200 (codename "Braga") — announced January 2026 as the second generation [1][2]. Specs:
  - TSMC 3nm, 140B+ transistors [3]
  - 216 GB HBM3e
  - >10 PFLOPS FP4, >5 PFLOPS FP8, at a 750 W TDP [3]
  - Deployed in Microsoft's Central US region (Des Moines, Iowa) as of January 2026; expanding to West US 3 (Phoenix) [1]
  - Mass production was delayed ~6 months by OpenAI-requested design changes that destabilized the chip in simulation [7]
Software: Microsoft's internal AI compiler stack (proprietary; less mature than AWS's Neuron SDK or Google's XLA, and far less mature than CUDA).
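Taking the published spec floors [3] at face value, the derived efficiency is easy to sanity-check. A minimal sketch; the ">" figures are floors and sustained throughput will differ, so treat the outputs as order-of-magnitude only:

```python
# Derived efficiency from the published Maia 200 spec floors [3].
# Sustained real-world numbers will differ; order-of-magnitude only.

FP4_PFLOPS = 10.0   # >10 PFLOPS FP4 (spec floor)
FP8_PFLOPS = 5.0    # >5 PFLOPS FP8 (spec floor)
TDP_W = 750.0       # board TDP

print(f"FP4: {FP4_PFLOPS * 1000 / TDP_W:.1f} TFLOPS/W")  # ~13.3
print(f"FP8: {FP8_PFLOPS * 1000 / TDP_W:.1f} TFLOPS/W")  # ~6.7
```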
2. Target Users & Pain Points
Effectively two captive customers:
- Microsoft itself — Azure AI services, Microsoft 365 Copilot inference, Bing/Edge Copilot, GitHub Copilot serving
- OpenAI — runs a subset of GPT-5.2-family inference on Maia 200 [1], but the bulk of OpenAI's compute remains NVIDIA plus the Stargate buildout
External availability is planned but not yet broad: Scott Guthrie indicated "wider customer availability in the future" [1], but as of mid-2026 there is no public Maia 200 instance type.
3. Competitive Landscape
| Chip | HBM capacity | Status | Anchor customer(s) |
|---|---|---|---|
| Maia 100 | 64 GB HBM2e | Internal-only, not externally available [3] | Microsoft (limited) |
| Maia 200 | 216 GB HBM3e [3] | Limited internal deployment, Iowa region | Microsoft + OpenAI (partial) |
| aws-trainium 2 | 96 GB HBM3 | Production-scale, ~500K chips | Anthropic + AWS |
| google-tpu v6e (Trillium) | 32 GB HBM | Production-scale, GA | Google + Anthropic |
| nvidia B200 | 192 GB HBM3e | Mass-market | Everyone |
Among captive hyperscaler silicon, Microsoft is the third to ship at scale, visibly behind Google (TPU v5p was shipping years ago) and Amazon (Trainium 2 already at gigawatt-scale deployment).
4. Unique Observations
- Maia 100 vs Maia 200 status — a frank read. Maia 100 was effectively a bring-up exercise that never reached customer rental [3]. Maia 200 is the real product; "first-party silicon from any major cloud provider" claim notwithstanding [2], it ships years after google-tpu and aws-trainium reached comparable deployment scale.
- OpenAI as substrate co-designer is the most interesting structural story. OpenAI requested design changes that pushed Maia 200 mass production back ~6 months and destabilized simulations [7]. The level of OpenAI input on Microsoft's chip mirrors the Anthropic↔Trainium relationship — every major frontier lab now has a captive-silicon story attached to a hyperscaler. Whether OpenAI's stake in Maia 200 inference scales depends partly on the post-2025 OpenAI-Microsoft contract restructure (which moved OpenAI partially off Azure exclusivity).
- 216 GB HBM3e is genuinely class-leading: larger than B200 (192 GB) and Trainium 2 (96 GB); only amd MI355X (288 GB) carries more. For very large MoE serving this is the actual technical wedge ("the most HBM-rich first-party silicon any cloud has built"); the capacity sketch after this list shows why.
- TSMC 3nm + 140B transistors [3] places Maia 200 on the same advanced-node tier as B200, which means Microsoft is competing for the same tsmc CoWoS allocation as NVIDIA, on top of the CoWoS demand it already generates as an NVIDIA buyer. This is part of why CoWoS is sold out through 2026.
- The captive-cost claim is unverifiable so far. Microsoft has not published $/inference comparisons vs NVIDIA for Maia 200. The strategic logic mirrors aws-inferentia/aws-trainium (drive Azure AI margins by reducing per-token COGS), but the proof is private; the arithmetic is sketched after this list. The 6-month production delay also implies Microsoft paid for the kind of redesign work captive-silicon programs normally absorb without disclosure.
- The "wider availability" tease. If Maia 200 reaches general Azure availability in late 2026/2027, it becomes a meaningful pricing comparable; if it stays internal-only like Maia 100, the program looks more like a Microsoft-OpenAI vertically integrated stack than a true open captive-silicon offer.
5. Financials / Funding
- Parent: Microsoft (NASDAQ: MSFT); market cap in the ~$3T+ range as of mid-2026
- Maia revenue: zero external; impact is through Azure AI margin
- Microsoft AI capex (FY26): $80B+ guided; the portion going to the Maia buildout is undisclosed but materially smaller than its NVIDIA spend
- OpenAI commercial relationship: post-2025 restructure reduced Azure exclusivity; OpenAI now multi-homes across Microsoft (incl. Maia 200) + own Stargate + Oracle + others
6. People & Relationships
- EVP of Microsoft Cloud + AI: Scott Guthrie
- Custom silicon program: Rani Borkar (CVP, Azure Hardware Systems and Infrastructure)
- Foundry: tsmc (3nm)
- HBM: SK hynix (lead), Samsung, Micron
- Co-designer: OpenAI (specific design requests on Maia 200 [7])
- Captive consumers: Microsoft 365 Copilot, Azure AI, Bing/Edge Copilot, GitHub Copilot, Microsoft Security Copilot
- External pilot customer: OpenAI (a subset of GPT-5.2 inference) [1]
- Direct competitors: google-tpu, aws-trainium, aws-inferentia, nvidia, amd