Azure API Management (APIM) is licensed across seven tiers in 2026: Consumption (serverless, pay-per-call at ~$3.50 per million calls with the first million free), Developer (non-SLA, ~$48/month), Basic v2 (~$210/month), Standard v2 (~$734/month), Basic (~$155/month), Standard (~$704/month), and Premium (~$3,000/unit/month with multi-region scale-out, internal VNet integration, and self-hosted gateways at additional cost). The v2 tiers introduced in 2024 are the structural rebuild: a lower entry price than Premium, integrated virtual-network support, and a modernised gateway architecture. Premium remains the only tier with active-active multi-region, a self-hosted gateway entitlement (an included allowance per Premium unit, then $0.20/hour per additional gateway), and the integration controls most regulated industries require. The biggest commercial fact for EA buyers: APIM consumption counts toward the Microsoft Azure Consumption Commitment (MACC), so right-sizing the APIM tier against actual gateway throughput is the largest cost lever. We routinely see customers on Premium for workloads that fit Standard v2 at one-quarter the run-rate.
The APIM tiers in 2026
Azure API Management was historically licensed in four classic tiers (Developer / Basic / Standard / Premium) plus a Consumption serverless option. The 2024 v2 release added two new tiers (Basic v2, Standard v2) that modernise the gateway architecture. Seven tiers are therefore active in 2026, with Premium v2 in preview alongside them:
| Tier | 2026 list price | SLA | Key constraints |
|---|---|---|---|
| Consumption | $0 base + ~$3.50/million calls (first 1M free) | 99.95% | Serverless, no VNet, no self-hosted gateway, max 50 APIs |
| Developer | ~$48/month per unit | None | Non-production only; do not deploy to production |
| Basic | ~$155/month per unit | 99.95% | Classic tier; no multi-region |
| Standard | ~$704/month per unit | 99.95% | Classic tier; no multi-region |
| Basic v2 | ~$210/month per unit | 99.95% | No multi-region; up to 4 capacity units |
| Standard v2 | ~$734/month per unit | 99.95% | VNet integration; up to 10 capacity units; no multi-region |
| Premium | ~$3,000/month per unit | 99.99% | Multi-region, internal VNet, self-hosted gateway included |
| Premium v2 (preview) | Pricing varies | 99.99% | Modernised Premium; in preview through 2026 |
Three tier decisions matter most. Developer is non-SLA and explicitly not supported for production; we routinely see audit-time discoveries where Developer was deployed to production by mistake. Basic v2 vs Basic: v2 is the right default for any new non-Premium deployment because of its modernised gateway and forward-compatible architecture, but at the list prices quoted in this article (~$210 vs ~$155 per month) it is not the cheaper of the two, so check the current price sheet before assuming a saving. Premium vs Standard v2: the multi-region and self-hosted gateway entitlements drive the Premium decision; without those requirements, Standard v2 is structurally cheaper.
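The decision logic above can be sketched as a small shortlist function. This is a minimal illustration, not a sizing tool: the `Requirements` fields and the `shortlist_tier` name are hypothetical, the rules are only those stated in this section, and throughput is deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical requirement flags for one APIM workload."""
    production: bool
    multi_region: bool = False
    self_hosted_gateway: bool = False
    vnet_integration: bool = False


def shortlist_tier(req: Requirements) -> str:
    """Return the lowest-priced tier from the table that meets the stated requirements."""
    if not req.production:
        # Developer carries no SLA, so it is only a candidate outside production.
        return "Developer"
    if req.multi_region or req.self_hosted_gateway:
        # Both entitlements are described as Premium-only in this article.
        return "Premium"
    if req.vnet_integration:
        # Assumption: VNet integration starts at Standard v2, as the table lists it.
        return "Standard v2"
    return "Basic v2"


print(shortlist_tier(Requirements(production=True, vnet_integration=True)))
```

The ordering of the checks mirrors the argument in the text: a Premium-only requirement disqualifies the cheaper tiers first, and only then does price decide.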
Capacity units: the sizing decision
APIM scales horizontally through capacity units. A capacity unit is a Microsoft-defined throughput envelope that varies by tier: ~2,500 requests/second on Premium, ~1,000 RPS on Standard v2, ~250 RPS on Basic v2. Throughput is approximate — actual capacity depends on policy complexity, payload size, and back-end latency. The pricing scales linearly: 4 Premium units = $12,000/month list before any EA discount.
The sizing trap is over-provisioning by default. The Microsoft sizing recommendation typically assumes worst-case payload size and policy complexity, which inflates the recommended capacity unit count by 30–60% over actual observed throughput needs. The buyer-side counter is to instrument the gateway through Azure Monitor / Application Insights for two weeks before sizing, then size against observed p95 RPS rather than the theoretical maximum.
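Sizing against observed p95 can be sketched as follows. The per-unit throughput and prices are the approximate figures quoted in this section; the function names, the nearest-rank percentile method, and the 20% `headroom` default are assumptions for illustration, and the samples would come from whatever export of gateway metrics is available.

```python
import math

# Approximate per-unit throughput (RPS) and monthly list price quoted in this section.
TIERS = {
    "Basic v2": {"rps_per_unit": 250, "price": 210},
    "Standard v2": {"rps_per_unit": 1000, "price": 734},
    "Premium": {"rps_per_unit": 2500, "price": 3000},
}


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of RPS samples."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n, 1-based
    return ordered[rank - 1]


def units_needed(observed_rps, tier, headroom=0.2):
    """Capacity units to carry observed_rps plus a safety margin (headroom is an assumption)."""
    target = observed_rps * (1 + headroom)
    return max(1, math.ceil(target / TIERS[tier]["rps_per_unit"]))


def monthly_list_cost(units, tier):
    """List cost before any EA discount; pricing scales linearly with unit count."""
    return units * TIERS[tier]["price"]


# Illustrative samples only: 100 readings from 10 to 1,000 RPS.
samples = [10 * i for i in range(1, 101)]
observed = p95(samples)
units = units_needed(observed, "Standard v2")
print(observed, units, monthly_list_cost(units, "Standard v2"))
```

Because real throughput depends on policy complexity, payload size, and back-end latency, the per-unit RPS values are best replaced with figures measured on the workload itself.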
Self-hosted gateways and Premium-only entitlements
The self-hosted gateway lets organisations run APIM gateway components in their own infrastructure (AKS, on-premises, edge) while keeping the management plane in Azure. Self-hosted gateways are exclusive to the Premium tier. The entitlement structure is 1 free self-hosted gateway per APIM Premium unit, then $0.20/hour per additional gateway. For multi-cluster Kubernetes deployments, hybrid topologies, or edge gateway scenarios, the self-hosted entitlement is often the structural reason to be on Premium.
Two commercial considerations. First, the self-hosted gateway price of $0.20/hour ($144/month) per additional gateway adds up quickly in multi-cluster scenarios: 20 self-hosted gateways beyond the included entitlement cost $2,880/month before counting the Premium unit itself. Second, customers running 10+ self-hosted gateways should compute whether multiple Premium units (each carrying its own included gateway entitlement) cost less than one Premium unit plus per-hour overflow.
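The breakeven comparison can be sketched with a few lines of arithmetic. The prices are the list figures quoted in this article; the function names are hypothetical, and `free_per_unit` is left as a parameter because the included entitlement should be read from the current price sheet rather than assumed.

```python
GATEWAY_MONTHLY = 144          # $0.20/hour over a 720-hour month, as quoted above
PREMIUM_UNIT_MONTHLY = 3000    # approximate list price per Premium unit


def monthly_cost(premium_units, gateways, free_per_unit):
    """List cost of Premium units plus gateways beyond the included entitlement."""
    included = premium_units * free_per_unit
    overflow = max(0, gateways - included)
    return premium_units * PREMIUM_UNIT_MONTHLY + overflow * GATEWAY_MONTHLY


def cheapest_unit_count(gateways, free_per_unit, min_units=1, max_units=10):
    """Premium unit count in [min_units, max_units] with the lowest monthly list cost."""
    return min(
        range(min_units, max_units + 1),
        key=lambda units: monthly_cost(units, gateways, free_per_unit),
    )


# 21 gateways on one Premium unit with 1 included gateway: 20 overflow gateways.
print(monthly_cost(1, 21, 1))
print(cheapest_unit_count(21, free_per_unit=4))
```

For example, `monthly_cost(1, 21, 1)` reproduces the 20-gateway overflow case above: $3,000 plus 20 × $144. At these list figures the entitlement of each extra Premium unit offsets only `free_per_unit` × $144 per month, so `min_units` should reflect the units already required for throughput or regional coverage before the comparison is run.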
MACC application and EA economics
APIM consumption is fully MACC-eligible. Every Premium unit, every capacity unit overflow, every self-hosted gateway hour, and every Consumption-tier call counts toward Microsoft Azure Consumption Commitment burn. For customers with a sized MACC, sizing APIM correctly drives whether the platform sits comfortably inside the commit or pushes it into over-burn.
The EA negotiation lever sequence for APIM-heavy environments: (1) right-size capacity units against observed throughput, (2) consolidate gateway instances into fewer Premium units where regional redundancy is genuinely required, (3) resolve self-hosted-gateway sprawl by computing the multi-Premium-unit breakeven, (4) explicitly include APIM consumption in MACC sizing rather than letting it land outside the commit at on-demand pricing.
EA negotiation levers for Azure API Management
- Premium-to-Standard-v2 downshift audit. For workloads without multi-region or self-hosted-gateway requirements, Standard v2 delivers the same operational posture at one-quarter the run-rate. The default Microsoft architecture suggestion is Premium; the buyer-side default is Standard v2 unless a requirement disqualifies it.
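- Basic v2 vs Basic refactor. Every classic Basic unit should be assessed for migration to Basic v2 for its modernised gateway and forward-compatible architecture. Confirm the price delta against the current price sheet, since the list figures quoted in this article put Basic v2 above classic Basic.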
- Capacity-unit right-sizing. Default sizing is consistently 30–60% over actual observed throughput. Instrument before sizing.
- Self-hosted gateway multi-unit breakeven. Customers with 10+ self-hosted gateways should compare multiple Premium units, each with its own included entitlement, against one Premium unit plus per-hour overflow.
- MACC inclusion of APIM consumption. APIM consumption pulled into the MACC commit reduces effective per-unit cost through the EA discount on the consumption layer.
- Developer-tier production audit. Audit every Developer-tier instance against actual deployment topology. A production workload on Developer tier runs with no SLA and surfaces in any Microsoft architecture review.
Anonymised case study: $620K APIM run-rate reduction
A 4,400-employee fintech enterprise carried 4 Premium units in production ($144K annualised), 3 Standard units in pre-production ($25K annualised), and 22 self-hosted gateways above the Premium entitlement ($38K annualised), running a microservices architecture with ~85 internal APIs. We audited the topology and acted on five items:
- Premium-to-Standard-v2 downshift. 2 of 4 Premium units served a single region with no self-hosted gateway requirement and were downshifted to Standard v2: $54K annualised reduction.
- Self-hosted gateway consolidation. The 22 overflow gateways were consolidated to 16 through service-mesh routing rework: $11K annualised reduction.
- Capacity-unit right-sizing. The remaining 2 Premium units were 3-unit sized; after instrumentation they were right-sized to 2-unit: $54K annualised reduction.
- Standard pre-production migration to Basic v2. 3 Standard units became 3 Basic v2 units: $19K annualised reduction.
- MACC alignment. APIM consumption was explicitly included in renewal MACC sizing at the discounted EA rate, capturing an additional $482K of MACC headroom across the platform.
Combined annualised APIM run-rate reduction: $620K against the LSP renewal projection ($138K from the four topology changes plus $482K of MACC headroom).
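The headline figure can be reproduced by summing the line items as stated in the case study. This is only a tally of the published numbers (in $K), with hypothetical variable names; it does not re-derive each item from list prices, since the engagement figures reflect the customer's own rates.

```python
# Annualised line items in $K, exactly as stated in the case study.
line_items = {
    "Premium to Standard v2 downshift": 54,
    "Self-hosted gateway consolidation": 11,
    "Capacity-unit right-sizing": 54,
    "Standard to Basic v2 (pre-production)": 19,
    "MACC headroom": 482,
}

total = sum(line_items.values())
topology_changes = total - line_items["MACC headroom"]

print(f"Total: ${total}K, of which topology changes: ${topology_changes}K")
```

Separating the two components is useful when presenting the result internally, because topology changes lower the bill directly while MACC headroom changes how much of the commit is left for other workloads.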
Azure API Management is one of the most over-sized Azure services we see in EA renewal audits. Pair the APIM audit with the broader MACC structure, the Azure commit-discount levers, the 2026 EA tier-collapse landscape, and the Azure & MACC advisory that turns APIM from a cost line into a managed capability.