
Azure OpenAI Service

Microsoft's hosted entry point for OpenAI models, physically sharing underlying compute with OpenAI (and in the other direction, OpenAI runs on Azure). It has captured the largest enterprise share of GPT usage through enterprise compliance, regional deployment options, and the Microsoft-OpenAI investment relationship.

1. Core Product / Service

Azure OpenAI Service (AOAI) is the only official distribution channel for OpenAI models outside OpenAI's own API, and Microsoft's flagship product at L3b.

  • Model menu: GPT-5 series (GPT-5, GPT-5-Pro, GPT-5-Mini, GPT-5-Nano; released in sync with OpenAI), GPT-4o / GPT-4o-mini / GPT-4.1, o3 / o4-mini / o3-mini, DALL-E 3, Whisper, TTS, text-embedding-3-large/small.
  • Deployment types: Standard (shared pay-per-token), Provisioned Throughput Units (PTU) (reserved capacity, hourly billing, low-latency guarantee), Global Standard / DataZone Standard (cross-region routing optimizing throughput/price), Batch (async, 50% discount).
  • Enterprise features: Microsoft Entra ID (formerly Azure Active Directory) integration, customer-managed keys, private endpoints, VNet integration, Azure Policy compliance, GDPR / HIPAA / FedRAMP High.
  • AI Foundry / Azure AI Studio: an upper-layer RAG / agent / fine-tuning framework that ties OpenAI models to Azure data services and Microsoft Graph (Office 365).
  • Microsoft first-party models: the Phi-4 series and other self-trained models sit under the same AOAI entry point; Azure AI Foundry additionally hosts Llama / DeepSeek / Mistral / Cohere (comparable to Bedrock / Vertex in this respect), but the AOAI product name strictly covers OpenAI / Microsoft models.
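For concreteness, the deployment model above changes how requests are addressed versus OpenAI direct: AOAI routes by a customer-chosen deployment name under a per-resource endpoint, authenticated with an `api-key` header (or an Entra ID token) rather than a Bearer key. A minimal sketch of the request shape; the resource name, deployment name, and api-version below are placeholders, not values from this document:

```python
# Sketch: the request shape of an Azure OpenAI chat-completions call.
# AOAI routes by *deployment name* (chosen when you deploy a model into
# your resource), not by a raw model id as OpenAI direct does.
import json

def aoai_chat_request(resource: str, deployment: str, api_version: str,
                      api_key: str, messages: list) -> dict:
    """Build (but do not send) an AOAI chat-completions request."""
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/chat/completions?api-version={api_version}")
    return {
        "url": url,
        "headers": {"api-key": api_key, "Content-Type": "application/json"},
        "body": json.dumps({"messages": messages}),
    }

req = aoai_chat_request("contoso-aoai", "gpt-4o-mini", "2024-06-01",
                        "<key>", [{"role": "user", "content": "hi"}])
print(req["url"])
```

The same call against OpenAI direct would target `api.openai.com` with a model id in the body; the per-resource endpoint is what lets AOAI pin traffic to a region for the data-residency commitments described below.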

2. Target Users & Pain Points

  • Large Microsoft 365 / Azure customers: their data already runs on Azure and they already use Microsoft 365 Copilot, so AOAI is an extension of an existing contract; they are unlikely to sign with OpenAI directly.
  • European / regulated-region enterprises: the OpenAI direct API historically was not available in every region, and data residency was inflexible; AOAI offers dedicated deployments plus data-residency commitments in the EU, UK, Australia, Canada, and other regions.
  • Large government / defense: GPT is available on Azure Government Cloud — no equivalent OpenAI direct API channel.
  • Pain points: compliance approval + regional residency + existing Azure contracts → the three main reasons AOAI is preferred over OpenAI direct.

3. Competitive Landscape

| Competitor | Positioning | Vs. AOAI |
| --- | --- | --- |
| OpenAI direct API | 1P direct | Near price parity, but OpenAI direct gets models first; AOAI lags by weeks to months; enterprise compliance usually pushes customers to AOAI |
| aws-bedrock | Anthropic Claude's main venue | Mirror positions: Bedrock = AWS + Claude, AOAI = Azure + GPT |
| gcp-vertex | Google Gemini + Claude + Mistral | Vertex hedges across multiple models; AOAI is committed to GPT |
| together-ai / fireworks-ai | 3P open-source token APIs | Lower prices, but GPT is irreplaceable |

Differentiation: exclusive non-first-party distribution of OpenAI models + the Azure AD / compliance ecosystem + the Microsoft 365 Copilot traffic base.

4. Unique Observations

  • Per-token pricing (Standard, 2026-05):
    • GPT-5: $1.25/M input + $10/M output — parity with OpenAI direct
    • GPT-5-Mini: $0.25/M input + $2/M output — parity
    • GPT-5-Nano: $0.05/M input + $0.40/M output — parity
    • GPT-4o: $2.50/M input + $10/M output — parity
    • GPT-4o-mini: $0.15/M input + $0.60/M output — parity
    • o3-mini: $1.10/M input + $4.40/M output — parity [1]
  • vs 1P price gap (take rate): strict price parity, so the nominal take rate is 0%. Microsoft recoups value through other dimensions:
    1. Microsoft holds 49% of OpenAI's economic rights (publicly reported) — OpenAI runs GPT training + inference on Azure, generating large compute spend that flows back to Azure revenue;
    2. Office 365 / Microsoft 365 Copilot packages GPT into $30/month/seat SKUs, where take rate sits at the application layer (not at AOAI);
    3. AOAI customers inevitably incur other Azure service spend (storage / networking / data), where take rate is actually higher.
  • vs third parties: outside OpenAI direct and AOAI, OpenAI models have no other legal token API channel. This is genuine "hard exclusivity," unlike Bedrock / Vertex: Claude on Bedrock is also available on Vertex and via Anthropic direct, but GPT has no plan B outside OpenAI / AOAI.
  • Inference engine: not publicly disclosed; presumed to be OpenAI's in-house stack (Triton / self-developed) on Azure's NDv5 / NVIDIA H100 / H200 / B200 GPU pool. Microsoft also runs part of the AOAI workload on its in-house Maia 100 chip (launched 2024), another case of a hyperscaler using in-house silicon to amortize inference costs.
  • Compute sourcing: 100% Azure first-party data centers — Microsoft has invested ~$13B+ in OpenAI, providing massive H100 capacity for OpenAI training + AOAI inference. OpenAI used Azure GPUs to train GPT-5; inference servers are shared between OpenAI direct + AOAI capacity.
  • Hidden take-rate mechanism: OpenAI rents Azure GPUs at a negotiated discount, not at market price. The two companies' financial structures are intertwined, so the token API's "parity pricing" is surface-level; actual profit sharing is determined by private terms between them.
  • Launch lag: after the GPT-4o release, AOAI lagged by roughly 1-2 weeks; some niche features (Realtime API, early fine-tuning versions) lag further. This is the trade-off enterprise users accept versus OpenAI direct: compliance in exchange for speed of access.
  • Strategic risk: The OpenAI / Microsoft relationship has been tense multiple times in 2024-2026 (OpenAI exploring other clouds, Stargate project pulling OpenAI toward Oracle / SoftBank); if OpenAI renegotiates to multi-cloud GPT, AOAI's "hard exclusivity" narrative breaks down.
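Tying the numbers above together: the Standard per-token prices listed in this section, combined with the 50% Batch discount from section 1, give a quick back-of-envelope cost model. The token volumes below are illustrative, not from this document:

```python
# Cost estimate from the Standard per-token prices listed above,
# expressed as USD per 1M tokens (input, output).
PRICES = {
    "gpt-5":       (1.25, 10.00),
    "gpt-5-mini":  (0.25,  2.00),
    "gpt-5-nano":  (0.05,  0.40),
    "gpt-4o":      (2.50, 10.00),
    "gpt-4o-mini": (0.15,  0.60),
    "o3-mini":     (1.10,  4.40),
}

def monthly_cost(model: str, in_tokens: int, out_tokens: int,
                 batch: bool = False) -> float:
    """USD cost for a token volume; Batch halves the price on both sides."""
    p_in, p_out = PRICES[model]
    cost = (in_tokens * p_in + out_tokens * p_out) / 1_000_000
    return cost * 0.5 if batch else cost

# e.g. 100M input + 20M output tokens on GPT-5:
print(monthly_cost("gpt-5", 100_000_000, 20_000_000))        # 325.0
print(monthly_cost("gpt-5", 100_000_000, 20_000_000, True))  # 162.5
```

Because the prices are at parity, the same arithmetic applies to OpenAI direct at Standard rates; the levers that actually change the bill are Batch, PTU reservations, and DataZone/Global routing.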

5. Financials / Business Scale

  • GA date: 2023-01
  • Microsoft investment in OpenAI: cumulatively ~$13B+ (including compute subsidy, cash); Microsoft holds OpenAI economic rights (49% profit share, up to a specific return cap).
  • AOAI revenue: not separately disclosed, but Microsoft's FY2024 earnings calls repeatedly cited an AI business surpassing $13B ARR, driven mainly by Copilot + AOAI.
  • Customers: 60,000+ Azure OpenAI enterprise customers (self-reported); including Coca-Cola, Mercedes, Moody's, ICRC, KPMG, Unilever, BMW, US Army (CoPilot Air Force).

6. People & Relationships

  • Parent: Microsoft Azure, Satya Nadella (CEO Microsoft), Scott Guthrie (EVP Cloud + AI).
  • Strategic partner: OpenAI — Sam Altman, Greg Brockman; OpenAI is the major compute consumer mutually bound with Azure.
  • Microsoft in-house AI team: Mustafa Suleyman (CEO Microsoft AI, joined 2024; co-founder of DeepMind / Inflection), Microsoft Research / Phi team.
  • Competes with: aws-bedrock, gcp-vertex, OpenAI direct API (internal "friendly competition"), together-ai / fireworks-ai (in open-source model tier).
  • Hosts models from: OpenAI (full GPT + DALL-E + Whisper + Embeddings), Microsoft (Phi series); the same portal also carries Llama / DeepSeek / Mistral / Cohere / xAI Grok etc. (under Azure AI Foundry, not the strict AOAI menu).

Sources

Last compiled: 2026-05-10