Azure Rightsizing: A Practical Guide for Enterprise Organisations

Why Azure Advisor Is Not Enough

Azure Advisor is the starting point for most enterprise rightsizing programmes, and the ending point for most that fail to deliver meaningful results. Microsoft's native recommendation engine applies conservative thresholds (typically quoted as sub-5% CPU utilisation over 14 days, although both the CPU threshold and the lookback period are configurable), focuses primarily on compute, and cannot account for the organisational dynamics that determine whether a recommendation can actually be acted upon. The result is a rightsizing analysis that identifies 5–10% of Azure spend as addressable, when the genuine opportunity is typically 20–35%.

The gap between what Azure Advisor surfaces and the total rightsizing opportunity exists because enterprise Azure waste is distributed across five categories, of which compute oversizing is only one. The other four — orphaned resources, development environment inefficiency, storage waste, and PaaS over-provisioning — require analysis that Azure Advisor does not perform, using thresholds that are relevant to enterprise environments rather than conservative enough to avoid false positives on mission-critical production systems.

This guide provides a practical framework for identifying and eliminating waste across all five categories, structured as a 90-day programme that delivers measurable cost reduction while avoiding the operational risk that derails many rightsizing initiatives. The broader Azure cost context — including Reserved Instance strategy, AHUB, and MACC negotiation — is in the Azure Cost Optimisation Complete Guide.

28%: the average proportion of Azure spend identified as immediately addressable waste in enterprise rightsizing analyses, combining all five waste categories with appropriate thresholds. Azure Advisor alone typically surfaces 5–10%.

The Five Categories of Enterprise Azure Waste

Structuring rightsizing around the five waste categories allows effort to be directed where the financial impact is greatest, and allows different categories to be addressed in parallel by different teams without creating dependencies that slow the overall programme.

1. Orphaned Resources

Typical savings: 3–6% of Azure spend

Managed disks with no VM attachment, unused public IP addresses, resource groups that hold nothing but leftovers such as DNS zones, and load balancers with no backends. Immediately eliminable.

2. Idle and Oversized VMs

Typical savings: 8–15% of Azure spend

VMs at sub-20% peak CPU combined with low network/memory utilisation. The largest single waste category by value in most enterprise estates.

3. Development Environment Inefficiency

Typical savings: 4–8% of Azure spend

Dev/test VMs running 24/7 at production VM sizes. Auto-shutdown and start/stop automation capture 20–25% of dev/test cost within 30 days.

4. Storage Waste

Typical savings: 3–5% of Azure spend

Premium SSD where Standard SSD or HDD would suffice, snapshots with no retention policy, underuse of the Cool and Archive tiers, and Hot-tier blobs that have not been accessed in 90+ days.

5. PaaS SKU Over-Provisioning

Typical savings: 4–8% of Azure spend

Azure SQL, App Service Plans, AKS node pools, and Databricks clusters sized for maximum load rather than typical load. Elastic pools, autoscale, and Spot VMs address this category.
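As a rough illustration of how the five ranges above combine, the sketch below applies them to a hypothetical monthly bill. The spend figure and function name are assumptions for illustration; only the percentage ranges come from the category list. A naive sum gives 22–42%, somewhat wider than the 20–35% cited earlier, presumably because few estates sit at the upper end of every category at once.

```python
# Typical savings ranges per waste category, as quoted above (% of Azure spend).
CATEGORY_RANGES = {
    "Orphaned resources": (3, 6),
    "Idle and oversized VMs": (8, 15),
    "Development environment inefficiency": (4, 8),
    "Storage waste": (3, 5),
    "PaaS SKU over-provisioning": (4, 8),
}


def savings_envelope(monthly_spend: float) -> dict:
    """Return the (low, high) monthly saving per category plus a naive total."""
    rows = {
        name: (monthly_spend * low / 100, monthly_spend * high / 100)
        for name, (low, high) in CATEGORY_RANGES.items()
    }
    total_low = sum(low for low, _ in rows.values())
    total_high = sum(high for _, high in rows.values())
    rows["Total (naive sum)"] = (total_low, total_high)
    return rows


if __name__ == "__main__":
    # Hypothetical estate spending 1,000,000 per month.
    for name, (low, high) in savings_envelope(1_000_000).items():
        print(f"{name:40s} {low:>10,.0f} to {high:>10,.0f}")
```

The naive total is best read as an upper envelope for planning, not a forecast.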

Going Beyond Azure Advisor: The Toolset

Azure Advisor provides CPU utilisation data from Azure Monitor, but it cannot correlate CPU with memory, network, and disk I/O simultaneously — which means a VM running at 4% CPU but 90% memory utilisation appears as a rightsizing candidate when it is actually appropriately sized for a memory-intensive workload. Using Azure Advisor recommendations without this additional context leads to rightsizing actions that trigger performance problems, destroying confidence in the programme and causing stakeholders to block further optimisation.

The Correct Data Sources for Each Waste Category

For orphaned resources: Azure Resource Graph queries provide the most comprehensive view of unattached disks, unused public IPs, empty resource groups, and load balancers without backend pools. Resource Graph supports complex queries across subscriptions that Azure Advisor cannot match. A well-constructed KQL query set can inventory all orphaned resource types across an entire EA enrolment in minutes.
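A minimal sketch of the orphaned-resource inventory, assuming the rows have already been exported from Resource Graph as dictionaries. The KQL string is a starting point to validate in your own tenant, and the property names checked in Python (managedBy, ipConfiguration, natGateway, backendAddressPools) are assumptions about the shape of the exported rows, not a complete definition of "orphaned".

```python
from typing import Optional

# KQL text for Azure Resource Graph: managed disks with no owning VM.
UNATTACHED_DISKS_KQL = """
Resources
| where type =~ 'microsoft.compute/disks'
| where isempty(managedBy)
| project id, name, resourceGroup, subscriptionId
"""


def classify_orphan(resource: dict) -> Optional[str]:
    """Return an orphan label for one resource row, or None if it is in use."""
    rtype = str(resource.get("type", "")).lower()
    props = resource.get("properties") or {}

    if rtype == "microsoft.compute/disks":
        # Assumed shape: attached disks carry the VM id in managedBy.
        if not resource.get("managedBy"):
            return "unattached disk"

    elif rtype == "microsoft.network/publicipaddresses":
        # Assumed shape: an associated IP has an ipConfiguration or natGateway.
        if not props.get("ipConfiguration") and not props.get("natGateway"):
            return "unused public IP"

    elif rtype == "microsoft.network/loadbalancers":
        pools = props.get("backendAddressPools") or []
        has_members = any(
            (pool.get("properties") or {}).get("backendIPConfigurations")
            or (pool.get("properties") or {}).get("loadBalancerBackendAddresses")
            for pool in pools
        )
        if not has_members:
            return "load balancer without backends"

    return None


def find_orphans(resources: list) -> dict:
    """Map resource id to orphan label for every orphaned row."""
    found = {}
    for resource in resources:
        label = classify_orphan(resource)
        if label:
            found[resource["id"]] = label
    return found
```

Disks held by backup or replication tooling can also appear unattached, so the output is a review list before it is a deletion list.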

For idle and oversized VMs: Azure Monitor metrics with a 90-day lookback (not the 14-day window Azure Advisor uses) across CPU, memory, network in/out, and disk I/O simultaneously. A VM should only be considered a rightsizing candidate if all four metrics show consistent underutilisation — not just CPU. This eliminates the false positives that come from single-metric analysis.
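The all-four-metrics rule can be sketched as a simple filter. This is an illustration under stated assumptions: the metric values are taken to be 90-day peaks already pulled from Azure Monitor and expressed as a percentage of the VM's limit, and every ceiling except the 20% peak-CPU figure used elsewhere in this guide is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VmUtilisation:
    """90-day peak utilisation for one VM, each as a percentage of its limit."""
    name: str
    cpu_pct: float
    memory_pct: float
    network_pct: float
    disk_io_pct: float


# Hypothetical ceilings. Only the 20% peak-CPU figure appears in this guide.
THRESHOLDS = {
    "cpu_pct": 20.0,
    "memory_pct": 40.0,
    "network_pct": 20.0,
    "disk_io_pct": 20.0,
}


def busy_metrics(vm: VmUtilisation) -> list:
    """Return the metrics that are at or above their ceiling."""
    return [m for m, ceiling in THRESHOLDS.items() if getattr(vm, m) >= ceiling]


def is_rightsizing_candidate(vm: VmUtilisation) -> bool:
    """A VM qualifies only when every metric is below its ceiling."""
    return not busy_metrics(vm)


if __name__ == "__main__":
    cache_node = VmUtilisation("cache-01", cpu_pct=4, memory_pct=90,
                               network_pct=10, disk_io_pct=5)
    print(cache_node.name, is_rightsizing_candidate(cache_node),
          busy_metrics(cache_node))
```

Returning the list of busy metrics alongside the verdict gives workload owners the evidence for why a low-CPU VM was left alone.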

For development environments: Azure Tags are the prerequisite. Without consistent environment tagging (Production / Staging / Development / Test), it is impossible to identify dev/test resources systematically. The first step in this category is often remediating the tagging gap rather than implementing auto-shutdown directly.
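The tagging gap can be measured before any auto-shutdown work starts. The sketch below assumes resources are available as dictionaries with a tags mapping, and that the tag key is called Environment; both are illustrative assumptions, while the four accepted values are the ones listed above.

```python
from typing import Optional

# Accepted environment values from the text, normalised to lower case.
VALID_ENVIRONMENTS = {"production", "staging", "development", "test"}


def environment_of(resource: dict, tag_key: str = "Environment") -> Optional[str]:
    """Return the normalised environment tag, or None if missing or invalid."""
    tags = resource.get("tags") or {}
    for key, value in tags.items():
        # Match the key without regard to case; normalise the value.
        if key.lower() == tag_key.lower():
            normalised = str(value).strip().lower()
            return normalised if normalised in VALID_ENVIRONMENTS else None
    return None


def tagging_gap(resources: list) -> list:
    """Ids of resources that cannot be classified by environment."""
    return [r["id"] for r in resources if environment_of(r) is None]


def dev_test_resources(resources: list) -> list:
    """Ids of resources eligible for dev/test controls such as auto-shutdown."""
    return [
        r["id"] for r in resources
        if environment_of(r) in {"development", "test"}
    ]
```

Resources returned by tagging_gap are the remediation backlog; only those returned by dev_test_resources should receive shutdown schedules.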

For storage waste: Azure Cost Management storage cost reports combined with Azure Storage blob inventory (a feature that generates a report of the containers and blobs in a storage account, including last-access timestamps where last access time tracking is enabled). Storage that has not been accessed in 90+ days and is sitting in the Hot tier is an immediate candidate for Cool tier migration.
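The 90-day rule can be applied to an exported inventory with a short filter. This is a sketch only: the field names (name, access_tier, last_access_time) are hypothetical stand-ins for whatever columns your inventory export contains, and blobs with no recorded access time are skipped, not assumed idle.

```python
from datetime import datetime, timedelta, timezone


def cool_tier_candidates(blobs: list, now: datetime, max_idle_days: int = 90) -> list:
    """Return names of Hot-tier blobs not accessed within max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    candidates = []
    for blob in blobs:
        if blob.get("access_tier") != "Hot":
            continue  # already in a cheaper tier
        last_access = blob.get("last_access_time")
        if last_access is None:
            continue  # no access data, so no safe conclusion
        if last_access <= cutoff:
            candidates.append(blob["name"])
    return candidates


if __name__ == "__main__":
    now = datetime(2024, 6, 30, tzinfo=timezone.utc)
    sample = [
        {"name": "logs/2023.tar", "access_tier": "Hot",
         "last_access_time": datetime(2024, 1, 5, tzinfo=timezone.utc)},
        {"name": "site/index.html", "access_tier": "Hot",
         "last_access_time": datetime(2024, 6, 29, tzinfo=timezone.utc)},
    ]
    print(cool_tier_candidates(sample, now))
```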

For PaaS over-provisioning: Service-specific metrics from Azure Monitor. Azure SQL Database CPU, DTU utilisation, and connection count; App Service Plan CPU and memory; AKS node pool utilisation across the cluster; Azure Databricks cluster utilisation by job. Each PaaS service requires a specific analysis approach — there is no generic rightsizing methodology that works across all PaaS services.

Waste Category	Primary Tool	Analysis Period	Action Type
Orphaned Resources	Azure Resource Graph	Point-in-time	Delete (no operational risk)
Idle/Oversized VMs	Azure Monitor (90-day, multi-metric)	90 days	Resize or deallocate
Dev Environment	Azure Tags + Cost Management	30 days	Auto-shutdown policy
Storage	Storage Inventory + Access Tiers	90 days last-access	Tier migration
PaaS SKU	Service-specific Monitor metrics	30–90 days	SKU downsize / autoscale

The 90-Day Rightsizing Programme

Rightsizing initiatives that are structured as open-ended cost reduction programmes without time-bound phases consistently deliver slower and smaller savings than the underlying opportunity justifies. A 90-day programme with defined outputs for each phase creates the urgency and accountability that enterprise organisations need to move from analysis to action.

Days 1–15: Inventory and Triage

Run Resource Graph queries across all EA subscriptions to identify orphaned resources. Pull 90-day Azure Monitor data for all VMs. Export Storage Inventory reports. Tag remediation for environment classification where tagging is absent. Produce a waste register: resource ID, waste category, estimated monthly saving, operational risk assessment (none / low / medium / high), team owner.
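The waste register described above maps naturally onto a small record type. The sketch below is one possible shape, with field names chosen for illustration; the five fields and the four risk levels are the ones listed in the text.

```python
from collections import defaultdict
from dataclasses import dataclass

RISK_LEVELS = ("none", "low", "medium", "high")


@dataclass
class WasteEntry:
    """One row of the waste register."""
    resource_id: str
    category: str
    monthly_saving: float
    risk: str
    owner: str

    def __post_init__(self):
        # Reject risk labels outside the agreed four-level scale.
        if self.risk not in RISK_LEVELS:
            raise ValueError(f"unknown risk level: {self.risk}")


def saving_by_category(register: list) -> dict:
    """Total estimated monthly saving per waste category."""
    totals = defaultdict(float)
    for entry in register:
        totals[entry.category] += entry.monthly_saving
    return dict(totals)


def zero_risk_entries(register: list) -> list:
    """Entries that can move straight to the zero-risk elimination phase."""
    return [entry for entry in register if entry.risk == "none"]
```

Keeping the register in a structured form makes the closing savings realisation report a comparison against the same rows, not a fresh analysis.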

Days 15–30: Zero-Risk Elimination

Execute deletion of all zero-operational-risk orphaned resources identified during the Days 1–15 inventory: unattached disks, unused public IPs, empty resource groups, load balancers without backends. Implement auto-shutdown policies on confirmed development VMs. Execute storage tier migrations for blobs with 90+ day last-access dates. Target: 5–8% Azure cost reduction within 30 days.
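To see how an auto-shutdown schedule translates into savings, the sketch below works through the arithmetic. The 12-hour weekday schedule, the 60% compute share of dev/test cost and the 60% schedule coverage are hypothetical inputs chosen for illustration, not figures from this guide; with them the blended result lands near the 20–25% dev/test saving quoted earlier.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def compute_hours_saved_pct(hours_per_day: float, days_per_week: int) -> float:
    """Percentage of VM running hours removed by a start/stop schedule."""
    running = hours_per_day * days_per_week
    return 100 * (1 - running / HOURS_PER_WEEK)


def blended_saving_pct(hours_saved_pct: float, compute_share: float,
                       coverage: float) -> float:
    """Saving as a share of total dev/test cost.

    compute_share: fraction of dev/test cost that is VM compute (disks and
                   other always-on charges continue while a VM is stopped).
    coverage:      fraction of dev/test VMs actually placed on the schedule.
    """
    return hours_saved_pct * compute_share * coverage


if __name__ == "__main__":
    saved = compute_hours_saved_pct(hours_per_day=12, days_per_week=5)
    print(f"Running hours removed: {saved:.1f}%")
    print(f"Blended dev/test saving: {blended_saving_pct(saved, 0.6, 0.6):.1f}%")
```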

Days 30–60: VM Rightsizing

Present the VM rightsizing register to workload owners with utilisation evidence. Seek written sign-off from the application team for each resize action, applying the five-business-day default-proceed rule described under Failure Pattern 3 so that silence does not stall a batch. Implement in batches with a pre-agreed rollback window (typically 48–72 hours). Document actual post-resize performance metrics to validate the recommendation and build confidence in the programme for subsequent rounds.

Days 60–90: PaaS Optimisation and Automation

Implement autoscale on App Service Plans and AKS node pools where load patterns support it. Resize Azure SQL Databases with consistently low DTU/CPU utilisation. Explore Elastic Pool consolidation for SQL Databases with complementary usage patterns. Deploy Azure Policy guardrails to prevent large-SKU deployments without approval. Produce a savings realisation report against the initial waste register.

The Organisational Challenge: Why Rightsizing Programmes Fail

The technical dimensions of rightsizing are genuinely straightforward. The organisational dimensions are where most programmes fail. Understanding the failure patterns in advance allows programme design to avoid them.

Failure Pattern 1: The Resource Owner Veto

In most enterprise organisations, the team that provisioned a resource has implicit veto power over any rightsizing action on it. This veto is exercised through a combination of technical uncertainty ("I'm not sure the application can handle a smaller VM") and organisational inertia ("we've always run this way"). Without executive mandate that shifts the burden of proof — from "prove it's safe to resize" to "prove you need this size" — rightsizing programmes stall at the engagement phase.

The solution is executive sponsorship that is visible, specific, and carries consequences. A message from the CTO or CFO that says "we will resize or shut down all VMs with sub-20% peak CPU by the end of Q3 unless application teams can demonstrate a business case for current sizing" is a different mandate than "please review your VM utilisation." Only the former produces action at scale.

Failure Pattern 2: Resurface Time

A rightsizing programme that achieves 25% Azure cost reduction at the end of Q2 and then watches costs return to 85% of previous levels by Q4 is not a success — it is a demonstration that the organisation cannot sustain savings without governance infrastructure. The issue is not the technical rightsizing; it is the absence of controls that prevent re-provisioning at previous sizes.

The governance controls that prevent resurface are: Azure Policy definitions that block deployment of VM SKUs above a defined size without a documented approval workflow; budget alerts calibrated to pre-rightsizing spend levels that trigger automatically when costs approach previous baselines; and a quarterly rightsizing review cycle where the waste register is re-run and new waste is identified and actioned within 30 days.
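The first of these controls can be sketched as an Azure Policy definition built in Python and serialised to JSON. This is a minimal illustration modelled on the common "allowed VM sizes" pattern: the display name, parameter name and SKU list are assumptions, and an approved exception would typically be handled outside the rule itself (for example through a policy exemption), since the rule alone only denies.

```python
import json


def build_vm_sku_policy(allowed_skus: list) -> dict:
    """Build a policy definition that denies VM sizes outside an approved list."""
    return {
        "properties": {
            "displayName": "Restrict VM sizes to an approved list (illustrative)",
            "mode": "Indexed",
            "parameters": {
                "listOfAllowedSKUs": {
                    "type": "Array",
                    "metadata": {"displayName": "Approved VM sizes"},
                    "defaultValue": list(allowed_skus),
                }
            },
            "policyRule": {
                "if": {
                    "allOf": [
                        {"field": "type",
                         "equals": "Microsoft.Compute/virtualMachines"},
                        {"not": {
                            "field": "Microsoft.Compute/virtualMachines/sku.name",
                            "in": "[parameters('listOfAllowedSKUs')]",
                        }},
                    ]
                },
                "then": {"effect": "Deny"},
            },
        }
    }


if __name__ == "__main__":
    # Hypothetical approved sizes; choose these from post-rightsizing data.
    policy = build_vm_sku_policy(["Standard_D2s_v5", "Standard_D4s_v5"])
    print(json.dumps(policy, indent=2))
```

Starting with an audit effect in place of Deny is a common way to measure the impact before enforcing the rule.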

Failure Pattern 3: Analysis Paralysis

Enterprise rightsizing programmes with too many stakeholders, too many approval gates, and too much analysis time consistently deliver results 6–12 months later than necessary. The 90-day programme structure addresses this directly by forcing time-bound actions and limiting the approval complexity to a binary: the owner objects with evidence within 5 business days, or the rightsizing action proceeds automatically.

The binary approval constraint is uncomfortable for organisations accustomed to open-ended review cycles, but it is the only mechanism that prevents analysis paralysis. Application teams that have 30 days to object to a VM resize will use all 30 days and then request an extension. Application teams that have 5 days will review the utilisation data, confirm there is no risk, and approve.
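The default-proceed rule is simple enough to express as a small decision function. The sketch below assumes a five-business-day window counted from the day after the request and ignores public holidays; the function names and status strings are illustrative.

```python
from datetime import date, timedelta
from typing import Optional


def add_business_days(start: date, days: int) -> date:
    """Return the date that falls `days` business days (Mon-Fri) after start."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def resolve_action(requested_on: date, today: date, approved: bool = False,
                   objection_evidence: Optional[str] = None,
                   window_days: int = 5) -> str:
    """Decide the state of one rightsizing action under default-proceed."""
    if objection_evidence:
        return "blocked: owner objected with evidence"
    if approved:
        return "proceed: approved"
    deadline = add_business_days(requested_on, window_days)
    if today > deadline:
        return "proceed: default"
    return "pending"


if __name__ == "__main__":
    requested = date(2024, 1, 1)  # a Monday
    print(add_business_days(requested, 5))
    print(resolve_action(requested, date(2024, 1, 9)))
```

An objection without evidence does not block the action in this sketch, which mirrors the shift in the burden of proof described under Failure Pattern 1.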

Programme Design Principle

The most important design decision in any enterprise rightsizing programme is the approval structure. Binary approval with a default-proceed outcome (the action proceeds unless the owner objects with evidence) consistently delivers 3–5x more savings than opt-in approval structures (the action requires positive approval to proceed). The difference is entirely organisational, not technical.

Rightsizing and Reserved Instance Interaction

One of the most common and costly errors in enterprise Azure cost management is implementing rightsizing and Reserved Instances as independent workstreams without coordinating them. The error creates two specific problems that erode savings from both programmes.

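First, purchasing RIs before rightsizing means committing to a specific VM size that may subsequently be identified as oversized. A 3-year RI for a Standard_D16s_v5 that gets rightsized to a Standard_D8s_v5 leaves an RI commitment that either wastes capacity or requires an exchange, with the friction and potential cost that entails. Instance size flexibility can reapply the unused portion of the reservation to other VMs in the same size group, but only where such VMs exist and are not already covered.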

Second, rightsizing after RI purchase means the rightsizing analysis must account for existing RI commitments. A VM covered by a 3-year RI that would be a rightsizing candidate if it were at pay-as-you-go rates is not a financial rightsizing candidate — the RI cost is sunk regardless of whether the VM runs at the committed size. Rightsizing it reduces performance headroom without reducing cost.

The correct sequencing is to complete the rightsizing programme first — or at minimum to complete the rightsizing analysis and identify the confirmed-stable-size workloads — before purchasing RIs. RI commitments should be based on post-rightsizing VM sizes, not pre-rightsizing sizes. The decision framework for RI vs Savings Plan selection is in the dedicated comparison guide.
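The sunk-cost point can be made concrete with a few lines of arithmetic. The hourly rates below are hypothetical, 730 is used as an approximate number of hours in a month, and the sketch deliberately ignores instance size flexibility and reservation exchanges, either of which can recover some value from a freed reservation.

```python
HOURS_PER_MONTH = 730  # approximate


def monthly_saving_from_resize(current_hourly: float, target_hourly: float,
                               ri_covered: bool,
                               hours: int = HOURS_PER_MONTH) -> float:
    """Cash saving per month from moving a VM to a smaller size.

    For an RI-covered VM the reservation is paid for regardless of the
    running size, so the resize frees capacity but saves no cash by itself.
    """
    if ri_covered:
        return 0.0
    return (current_hourly - target_hourly) * hours


if __name__ == "__main__":
    # Hypothetical pay-as-you-go rates for a larger and a smaller size.
    print(monthly_saving_from_resize(0.80, 0.40, ri_covered=False))
    print(monthly_saving_from_resize(0.80, 0.40, ri_covered=True))
```

Running the rightsizing analysis with the ri_covered flag populated from the reservation inventory keeps RI-covered VMs out of the cash-saving totals in the waste register.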

Placing Rightsizing in the Broader Azure Cost Picture

Rightsizing is the most tactically immediate Azure cost lever — some categories of waste can be eliminated within 30 days of project initiation. But it is also the lever with the shallowest long-term impact if not sustained through governance, and the one that produces the most internal friction due to its direct engagement with application teams.

The commercial levers — MACC negotiation, AHUB activation, RI restructuring — have longer lead times but produce savings that are structurally durable: they apply to consumption automatically without per-resource action, and they do not erode as new resources are provisioned. The optimal Azure cost programme combines all four levers: start with rightsizing for immediate impact, layer in AHUB activation (60-day programme, high-ROI), restructure RI commitments based on post-rightsizing stable-size workloads, and negotiate the MACC in the context of the EA renewal.

The combined outcome of all four levers, implemented in sequence over 12 months, consistently delivers 35–45% total Azure cost reduction — from the same workloads, with no reduction in capability or performance. The full programme framework is in the Azure Cost Optimisation Complete Guide and the Azure Cost Optimisation White Paper.

For organisations approaching an EA renewal alongside an Azure cost optimisation programme, the timing coordination between the optimisation work and the commercial negotiation is material. An organisation that has completed rightsizing and AHUB activation before entering MACC negotiations has a more defensible consumption forecast and a stronger commercial position than one that is optimising and negotiating simultaneously. The Azure Cost Management advisory service manages this sequencing as part of the engagement framework.
