Azure Container Apps pricing sits between Functions (event-driven, fully serverless) and AKS (raw Kubernetes you operate yourself). Two commercial models compete: the Consumption plan (billed per vCPU-second, per GiB-second, and per million requests, with scale-to-zero) and Dedicated workload profiles (billed per node-hour on managed compute). Four levers cut ACA spend by 30–50%: enable scale-to-zero on every non-critical app; set min-replicas to 1 only where cold-start latency is intolerable; choose the Consumption plan for bursty or idle workloads and Dedicated for sustained workloads above ~6 vCPUs; and audit egress-heavy apps that should sit behind a CDN or Front Door instead of serving every request directly from scaled-out replicas.
How Azure Container Apps actually bills
Azure Container Apps meters three things on the Consumption plan and one thing on Dedicated:
- Consumption plan vCPU-seconds: per-second metered while a replica is active (scaled above zero); free monthly grant of vCPU-seconds covers light workloads.
- Consumption plan memory GiB-seconds: per-second metered for allocated memory; free monthly grant applies.
- Consumption plan requests: per-million HTTP requests; free monthly grant applies.
- Dedicated workload profile node-hours: per-node-hour for the managed VMs the workload profile reserves (D-, E-, NC-series); replicas run on this reserved capacity rather than being billed per second.
The Consumption plan rewards apps that scale to zero or near-zero; the Dedicated profile rewards apps that run sustained workloads above the Consumption break-even point or require GPU, larger memory, or zone-redundancy features unavailable on Consumption-only environments.
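As a rough illustration of how the three Consumption meters combine, the sketch below estimates one month's bill for a single app. The class and function names are hypothetical, every rate and grant is a caller-supplied placeholder rather than a published price, and the free grant is applied to this one app as a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumptionRates:
    """Placeholder rates and grants; take real values from the current price sheet."""
    per_vcpu_second: float
    per_gib_second: float
    per_million_requests: float
    free_vcpu_seconds: float
    free_gib_seconds: float
    free_requests: float


def consumption_monthly_cost(
    rates: ConsumptionRates,
    vcpu: float,
    memory_gib: float,
    active_seconds: float,
    requests: float,
) -> float:
    """Estimate one month's Consumption bill for replicas of a given size.

    active_seconds is the total replica-seconds spent scaled above zero.
    """
    # Each meter is billed only for usage above its free monthly grant.
    billable_vcpu_s = max(0.0, vcpu * active_seconds - rates.free_vcpu_seconds)
    billable_gib_s = max(0.0, memory_gib * active_seconds - rates.free_gib_seconds)
    billable_requests = max(0.0, requests - rates.free_requests)
    return (
        billable_vcpu_s * rates.per_vcpu_second
        + billable_gib_s * rates.per_gib_second
        + billable_requests / 1_000_000 * rates.per_million_requests
    )
```

Feeding the same app in twice, once with always-on replica-seconds and once with only its measured active seconds, gives a first estimate of what scale-to-zero is worth for that app.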
Consumption plan vs Dedicated workload profile
The decision pivots on average utilisation:
| Pattern | Best on | Reason |
|---|---|---|
| Idle most of the time, brief bursts | Consumption with scale-to-zero | You pay nothing when idle, provided a ~1–3 s cold start is acceptable. |
| Sustained low traffic, small replicas always on | Consumption with min-replicas 1 | Cheaper than reserving a node-hour for the same footprint. |
| Sustained moderate traffic, multiple replicas always on | Dedicated | Crossover ~6 vCPU sustained; node-hour beats per-vCPU-second. |
| GPU workloads, >4 GiB memory per replica, zone-redundant | Dedicated | Consumption-only environments don't support these features. |
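
The crossover in the table can be estimated from your own price sheet. The following is a minimal sketch, assuming a single always-on node and a fixed memory-to-vCPU ratio per replica; the function name and all rates are hypothetical placeholders, not published prices.

```python
def breakeven_vcpus(
    node_hour_rate: float,
    per_vcpu_second: float,
    per_gib_second: float,
    gib_per_vcpu: float,
) -> float:
    """Sustained vCPUs at which one always-on node costs the same as Consumption."""
    # Hourly Consumption cost of one sustained vCPU plus its attached memory.
    consumption_per_vcpu_hour = 3600 * (per_vcpu_second + gib_per_vcpu * per_gib_second)
    return node_hour_rate / consumption_per_vcpu_hour


# Illustrative placeholder rates only; substitute contracted prices.
crossover = breakeven_vcpus(
    node_hour_rate=0.60,
    per_vcpu_second=0.00002,
    per_gib_second=0.000003,
    gib_per_vcpu=2.0,
)
print(f"Dedicated wins above roughly {crossover:.1f} sustained vCPUs")
```

The result only applies if the node actually has that many vCPUs available to replicas; a sustained load below the crossover leaves paid-for node capacity idle.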
Scale-to-zero: the headline lever
Scale-to-zero is the single biggest economic feature of Container Apps. An app with min-replicas 0 pays nothing during idle periods; the only cost is cold-start latency on the first request after idle. For non-customer-facing apps (background workers, internal APIs, admin tools, dev/test environments), that latency is irrelevant; for customer-facing apps with intermittent traffic, it may not be.
The audit lever: inventory every Container App with min-replicas >= 1; categorise by customer-facing vs internal; flip internal apps and dev/test to min-replicas 0 with appropriate warm-up via scheduled health checks. Most enterprise estates have 50–70% of apps that could safely scale to zero but don't because the default was 1.
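The audit can be expressed as a simple filter over an app inventory. This is a sketch only: `AppRecord` and its fields are hypothetical and would need to be populated from your own export of app configuration and ownership metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppRecord:
    """Hypothetical inventory row; the fields are assumptions for illustration."""
    name: str
    min_replicas: int
    customer_facing: bool
    environment: str  # e.g. "prod", "dev", "test"


def scale_to_zero_candidates(apps: list[AppRecord]) -> list[str]:
    """Return names of always-on apps that are internal or non-production."""
    return [
        app.name
        for app in apps
        if app.min_replicas >= 1
        and (not app.customer_facing or app.environment in {"dev", "test"})
    ]
```

Customer-facing production apps are deliberately left out of the result; they need a measured cold-start decision rather than a blanket flip.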
The Container Apps quickstart sets min-replicas to 1 by default for "predictable performance." This is convenient but expensive at scale. Microsoft account teams don't volunteer the scale-to-zero conversation for non-critical apps because it reduces consumption. The buyer's posture: scale-to-zero by default, opt in to min-replicas 1 only where measured cold-start cost exceeds the standing-replica cost. Tie the decision to your MACC commitment rather than to the app team's comfort.
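The opt-in rule reduces to one comparison. In this sketch the function name is hypothetical, and the cost per cold start is a business estimate you must supply yourself (for example from abandoned sessions or SLA penalties); it is not something Azure reports.

```python
def keep_one_replica_warm(
    standing_replica_cost: float,
    cold_starts_per_month: int,
    cost_per_cold_start: float,
) -> bool:
    """True when the measured monthly cold-start cost exceeds the standing-replica cost."""
    return cold_starts_per_month * cost_per_cold_start > standing_replica_cost
```

If the comparison is close, defaulting to min-replicas 0 keeps the burden of proof on the standing replica.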
Replica sizing and scale rules
Replica sizing (the vCPU and memory allocated per replica) directly determines the cost of each scaled-out unit. Defaults are generous so that containers start reliably; production workloads rarely need them. The right posture: profile the container, set vCPU and memory to the measured P95 plus 20% headroom, and configure scale rules (HTTP concurrency, CPU, KEDA-style external metrics) so that replica count follows demand rather than a fixed replica count.
The most common waste pattern: max-replicas set to 10 or 30 "in case of burst" with no monitoring or alerting. Runaway scaling under malformed requests or denial-of-service-style traffic can produce significant unexpected charges. Set max-replicas to a realistic ceiling and alert on sustained scaling near that ceiling.
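The P95-plus-headroom rule can be sketched as follows. The function names are hypothetical, the nearest-rank percentile is one of several reasonable definitions, and the rounding `step` is an assumption: check which vCPU and memory increments your plan actually accepts.

```python
import math


def nearest_rank_percentile(samples: list[float], percentile: int = 95) -> float:
    """Nearest-rank percentile, using integer arithmetic for the rank."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (percentile * len(ordered) + 99) // 100  # ceil(percentile * n / 100)
    return ordered[max(rank, 1) - 1]


def recommended_allocation(
    samples: list[float],
    headroom: float = 0.20,
    step: float = 0.25,
) -> float:
    """P95 of observed usage plus headroom, rounded up to the next allocation step."""
    target = nearest_rank_percentile(samples) * (1 + headroom)
    return math.ceil(target / step) * step
```

With a measured P95 of 0.3 vCPU, the target is 0.36 vCPU, which rounds up to 0.5 vCPU at a 0.25 step.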
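One way to express the alert condition is a check for a sustained run of samples near the ceiling. This is a sketch under stated assumptions: the function name, the 80% threshold, and the five-sample window are illustrative choices, and the replica counts would come from whatever monitoring you already collect.

```python
def sustained_near_ceiling(
    replica_counts: list[int],
    max_replicas: int,
    threshold: float = 0.8,
    min_consecutive: int = 5,
) -> bool:
    """True if replicas stayed at or above threshold * max_replicas for a run of samples."""
    run = 0
    for count in replica_counts:
        if count >= threshold * max_replicas:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0  # the run must be unbroken to count as sustained
    return False
```

Requiring consecutive samples avoids alerting on a single legitimate burst while still catching scaling that stays pinned near the ceiling.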
Egress optimisation
Container Apps egress bills at standard Azure egress rates. For apps serving static or cacheable content, putting Azure Front Door or Azure CDN in front of Container Apps reduces egress cost dramatically (the cache absorbs most of the traffic). For app-to-app traffic, keeping replicas in the same Container Apps environment uses internal networking and avoids cross-region or cross-VNet egress charges.
Anonymised case study: $215K Container Apps reduction
A SaaS client ran 47 Container Apps across two Container Apps environments, with $480K/year ACA spend. The audit found: 32 of 47 apps had min-replicas 1 (only 8 needed sub-second cold-start); replica sizing defaulted to 1 vCPU / 2 GiB despite measured P95 of 0.3 vCPU / 0.7 GiB; max-replicas defaulted to 10 across all apps with no alerting; one chatty internal API was egress-heavy and didn't sit behind Front Door. Remediation: 24 apps moved to min-replicas 0; replica vCPU/memory right-sized; max-replicas set per-app with alerting; internal API moved behind Front Door with sensible caching. Annual saving: $215K (45% of prior spend). The client now benchmarks Container Apps against Functions for new event-driven workloads.
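The case-study figures can be cross-checked with a few lines of arithmetic. All inputs below are taken from the paragraph above; only the variable names are new.

```python
prior_spend = 480_000       # annual ACA spend before the audit, USD
annual_saving = 215_000     # reported annual saving, USD
always_on_apps = 32         # apps found with min-replicas 1
needed_warm = 8             # apps that needed sub-second cold start

moved_to_zero = always_on_apps - needed_warm   # apps flipped to min-replicas 0
saving_share = annual_saving / prior_spend     # fraction of prior spend saved
remaining_spend = prior_spend - annual_saving  # annual spend after remediation

# Over-provisioning factors: default allocation divided by measured P95.
vcpu_overprovision = 1.0 / 0.3
memory_overprovision = 2.0 / 0.7

print(moved_to_zero, f"{saving_share:.1%}", remaining_spend)
print(round(vcpu_overprovision, 2), round(memory_overprovision, 2))
```

The 24 apps moved to min-replicas 0 match the 32 always-on apps minus the 8 that needed to stay warm, and the saving works out to just under 45% of prior spend.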
Where to take this from here
Container Apps is the cheapest managed container option in the Azure portfolio for the right workload shape. Sequence the work: scale-to-zero audit first; plan selection (Consumption vs Dedicated) second; replica sizing third; egress and caching fourth. Pair this with AKS licensing if you are evaluating the move from raw Kubernetes, Functions consumption vs Premium for event-driven alternatives, and App Service pricing for web-app patterns where ACA may not fit. For commitment design, see the MACC explainer. For renewal leverage, see the EA tier collapse 2026 playbook. For end-to-end support, our Azure & MACC Advisory covers compute platform selection as part of total Azure cost discipline. Request a discovery call to benchmark your Container Apps spend.