The single most expensive Fabric mistake is buying a capacity SKU without measuring actual workload requirements first. Enterprises that guess at sizing — or rely on Microsoft's "recommended" sizing (which errs large) — routinely over-provision by 2–4x. At $3,000–$25,000/month per capacity, a 2x over-provision costs $36,000–$300,000/year. Equally dangerous: under-provisioning causes throttling that makes Fabric feel broken, triggering expensive emergency upgrades. Both scenarios are avoidable with a structured capacity planning approach.
Independent Advisory. Zero Vendor Bias.
500+ Microsoft EA engagements. $2.1B in managed spend. 32% average cost reduction. We negotiate on your behalf — never Microsoft's.
View Advisory Services →
Understanding Capacity Units (CU) and How Fabric Consumes Them
Microsoft Fabric measures compute in Capacity Units (CU). Each Fabric workload consumes CUs at different rates depending on the operation type, data volume, and concurrency. The key properties:
- CU consumption is continuous: A running Spark job consumes CUs in real time. A Power BI report query consumes a burst of CUs for the duration of the query and then releases them.
- CU consumption is workload-specific: A Power BI refresh consumes CUs differently from a Spark batch job. Real-time intelligence (KQL) has a continuous low baseline. Data Factory pipelines burst during execution.
- CU consumption is time-shared: Your F64 (64 CU) capacity can serve a 100 CU Spark job and a 30 CU Power BI refresh simultaneously, absorbing the demand above the 64 CU limit up to the smoothing threshold and queuing the remainder. This is the smoothing model (see below).
CU Consumption by Workload Type
| Fabric Workload | CU Consumption Pattern | Typical CU Range | Duration Pattern |
|---|---|---|---|
| Power BI Interactive Report Query | Burst spike on query, release immediately | 0.5–20 CU per query | Seconds |
| Power BI Scheduled Refresh (small model) | Sustained during refresh | 5–30 CU | Minutes |
| Power BI Scheduled Refresh (large model >1 GB) | Sustained high during refresh | 20–128 CU | 10–60 min |
| Dataflow Gen2 | Sustained moderate | 4–32 CU | Minutes to hours |
| Data Engineering (Spark small job) | Ramp up, burst, release | 16–64 CU | 5–30 min |
| Data Engineering (Spark large job) | Sustained high | 64–512 CU | 30 min–4 hours |
| Data Warehouse (T-SQL query) | Burst during query | 2–64 CU | Seconds to minutes |
| Real-Time Intelligence (KQL continuous) | Continuous low baseline | 2–8 CU constant | Continuous |
| Eventstream ingestion | Rate-proportional constant | 2–16 CU | Continuous |
| Data Factory pipeline (simple) | Low overhead + burst on activity | 1–8 CU overhead + data movement | Minutes to hours |
The Smoothing Model — Fabric's Burst Mechanism
Fabric does not hard-cap CU usage at your SKU limit. Instead, it uses a "smoothing window" of 10 minutes. If a workload bursts above the capacity limit briefly, Fabric borrows CU headroom from the smoothing window. At F64, a 10-minute smoothing window gives the capacity 64 CU × 10 minutes = 640 CU-minutes of borrowed headroom to absorb short bursts. Workloads that continuously exceed the capacity limit are throttled.
Throttling occurs in stages: interactive workloads first receive a 25% slowdown. Sustained overuse leads to further delays and eventually job rejection. Background workloads (scheduled refreshes, Spark batch) are deprioritised and can be delayed by up to 24 hours before they fail. This staging allows the platform to stay functional during bursts — but it means capacity under-provisioning creates unreliable, variable-latency behaviour that degrades user experience gradually rather than failing hard.
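The smoothing arithmetic above can be sketched as a simple budget check. This is a deliberately simplified model of the behaviour the article describes, not Microsoft's actual scheduler; the SKU size and window length are the figures used in this section.

```python
# Simplified sketch of the smoothing model described above: a burst above the
# SKU limit is absorbed until its excess CU-minutes exhaust the window budget.

def burst_headroom_cu_minutes(sku_cu: int, window_minutes: int = 10) -> int:
    """CU-minutes of demand above the SKU limit the window can absorb."""
    return sku_cu * window_minutes

def is_throttled(demand_cu: float, sku_cu: int, burst_minutes: float,
                 window_minutes: int = 10) -> bool:
    """True once a burst's excess CU-minutes exceed the smoothing budget."""
    excess = max(0.0, demand_cu - sku_cu) * burst_minutes
    return excess > burst_headroom_cu_minutes(sku_cu, window_minutes)

# A 100 CU Spark job on F64 for 5 minutes: (100 - 64) * 5 = 180 CU-minutes
# of excess, inside the 640 CU-minute budget, so it is absorbed.
print(is_throttled(demand_cu=100, sku_cu=64, burst_minutes=5))   # False
# The same job sustained for 30 minutes exhausts the budget.
print(is_throttled(demand_cu=100, sku_cu=64, burst_minutes=30))  # True
```

The model makes the failure mode concrete: short spikes are free, sustained over-demand is not.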
The Capacity Metrics App: Your Sizing Instrument
The Microsoft Fabric Capacity Metrics App is a free Power BI template app available from AppSource. It connects to your Fabric capacity's usage data and provides the telemetry needed for right-sizing decisions. Install it on day 1 of any Fabric deployment or trial.
Key Metrics to Monitor
| Metric | What It Tells You | Action Threshold |
|---|---|---|
| % Capacity Utilisation (Average) | Overall capacity demand vs supply | Scale up if avg >70% sustained; scale down if <35% |
| % Capacity Utilisation (Peak) | Highest instantaneous demand in period | Investigate if peak >400% of capacity for >10 min |
| Throttling Events Count | How often capacity was exceeded and throttled | Any throttling of interactive workloads is unacceptable |
| CU by Workspace | Which workspaces consume most capacity | Identify top 3 consumers; optimise or isolate |
| CU by Item Type | Which workload type (Spark, Power BI, etc.) drives demand | Match optimisation effort to largest consumer |
| Background vs Interactive Split | Batch vs real-time workload balance | If background >70%, consider time-shifting jobs |
| Overload Duration | Total time capacity was in overloaded state | Reduce to <5% of operating hours |
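The action thresholds in the table can be collapsed into a single review checklist. The function below is a hypothetical helper, and the field names are illustrative rather than the Capacity Metrics App's actual export schema; the thresholds are the ones stated in the table.

```python
# Hypothetical checklist applying the table's action thresholds to a snapshot
# of Capacity Metrics App figures. Field names are illustrative only.

def capacity_actions(avg_util_pct: float, peak_util_pct: float,
                     interactive_throttle_events: int,
                     background_share_pct: float,
                     overload_pct_of_hours: float) -> list[str]:
    actions = []
    if avg_util_pct > 70:
        actions.append("scale up: average utilisation sustained above 70%")
    elif avg_util_pct < 35:
        actions.append("scale down: average utilisation below 35%")
    if peak_util_pct > 400:
        actions.append("investigate peaks above 400% of capacity")
    if interactive_throttle_events > 0:
        actions.append("interactive throttling detected: unacceptable")
    if background_share_pct > 70:
        actions.append("consider time-shifting background jobs")
    if overload_pct_of_hours > 5:
        actions.append("reduce overload duration below 5% of hours")
    return actions

# An overloaded capacity trips four of the checks at once.
print(capacity_actions(82, 450, 3, 60, 8))
```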
The 30-Day Sizing Protocol
When sizing a new Fabric capacity or validating an existing one, follow this 30-day measurement protocol:
- Week 1–2 (baseline): Run representative workloads on trial F64 capacity. Do not restrict or modify workloads. Log all scheduled refreshes, ad-hoc queries, and Spark jobs.
- Week 3 (stress test): Run a parallel workload stress period — simulate end-of-month report runs, batch pipeline executions, and concurrent Power BI user sessions all occurring simultaneously. This reveals worst-case peak.
- Week 4 (analysis): Pull Capacity Metrics App data. Calculate P95 utilisation (not just average). Map throttling events to user complaints or job delays. Determine whether observed throttling would be acceptable in production.
- Sizing decision: If P95 utilisation is below 75% with no throttling of interactive workloads, F64 is right-sized. If P95 exceeds 85% or interactive throttling occurred, move to F128. If P95 is below 45%, consider F32 (check Power BI licensing first — below F64, report consumers need Power BI Pro licenses, and Copilot/AI features require F64 or larger).
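The week-4 decision step can be expressed as a small function. A minimal sketch using the thresholds stated in the protocol (75% / 85% / 45% P95 bands); treat them as starting points, not hard rules.

```python
# Week-4 sizing decision from the 30-day protocol, as a sketch.
# Thresholds are the article's P95 bands for a trial F64 capacity.

def sizing_decision(p95_util_pct: float, interactive_throttled: bool) -> str:
    if interactive_throttled or p95_util_pct > 85:
        return "F128"  # under-provisioned at F64
    if p95_util_pct < 45:
        return "F32 (verify Power BI licensing/feature needs first)"
    if p95_util_pct < 75:
        return "F64"   # right-sized
    return "F64, re-measure"  # 75-85% band: hold and monitor

print(sizing_decision(62, False))  # F64
print(sizing_decision(90, False))  # F128
```

Note that any interactive throttling forces the upgrade regardless of the P95 figure, matching the protocol's "any throttling of interactive workloads is unacceptable" rule.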
Workload Profiles: Common Sizing Scenarios
Scenario 1 — Power BI-Dominant Organisation (BI Replacement)
Profile: Migrating from Power BI Premium P1, primarily Power BI reports and dashboards, 500–2,000 report consumers, 20–50 creators, minimal data engineering.
CU consumption analysis: The Power BI refresh schedule drives the dominant CU pattern. If 30 semantic models refresh daily at 06:00, the burst period is 06:00–07:00 with 30 × 15 CU average = 450 CU instantaneous peak. That sustained peak overwhelms F64 (64 CU) and its smoothing window, forcing an upgrade to F128 (128 CU) with heavy reliance on smoothing. Staggering the refresh schedule across 3 hours instead reduces the peak to 150 CU, a series of short bursts the F64 smoothing window can absorb.
Cost optimisation: Staggered refreshes + F64 (≈$4,440/month at the 3-year reserved rate of $6.08/hour) vs unoptimised + F128 (≈$8,880/month) is roughly a $53,000/year difference. A 2-hour refresh schedule optimisation generates significant ROI.
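The staggering effect is simple arithmetic. The sketch below assumes each refresh draws a flat average CU for its duration and that staggering spreads the refreshes evenly; real refresh profiles are spikier.

```python
# Back-of-envelope peak for the refresh-staggering example above.
# Assumes flat average CU per refresh and even spreading across the window.

def refresh_peak_cu(models: int, avg_cu_per_refresh: float,
                    stagger_hours: float = 1.0) -> float:
    """Instantaneous peak when refreshes are spread over stagger_hours.

    With a 1-hour window all refreshes overlap; spreading the same load
    across N hours divides the concurrent peak by N.
    """
    return models * avg_cu_per_refresh / stagger_hours

print(refresh_peak_cu(30, 15, stagger_hours=1))  # 450.0 CU, all at 06:00
print(refresh_peak_cu(30, 15, stagger_hours=3))  # 150.0 CU staggered
```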
Scenario 2 — Data Engineering Platform (Analytics Migration)
Profile: Migrating from Azure Synapse + Azure Data Factory, 10–30 data engineers, Spark-heavy ETL, moderate Power BI, nightly batch window.
CU consumption analysis: Nightly Spark batch runs 22:00–05:00, consuming 200–400 CU for 7 hours. Daytime Power BI queries consume 20–60 CU. Because the Spark peak and the Power BI peak never overlap, size to the larger smoothed demand rather than the additive sum: F128 or F256 is the right sizing. Workload management rules (deprioritise background during business hours) allow F128 to serve both use cases without premium F256 spend.
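The non-additive sizing point can be shown with a toy 24-hour demand profile. The hourly figures below are illustrative, chosen to match the scenario's ranges; they are not measured data.

```python
# Toy 24-hour CU demand profile for the temporally separated workloads above:
# Spark batch 22:00-05:00 at up to 400 CU, Power BI 08:00-18:00 at up to
# 60 CU, a small idle floor otherwise. Figures are illustrative.

hourly_demand = [
    400 if (h >= 22 or h < 5) else (60 if 8 <= h < 18 else 10)
    for h in range(24)
]

additive_sum = 400 + 60            # naive sizing: 460 CU
actual_peak = max(hourly_demand)   # real instantaneous requirement: 400 CU

print(additive_sum, actual_peak)
```

Because the two peaks never coincide, the capacity only ever has to serve the larger of the two, which smoothing and workload management then shrink further toward the F128/F256 range.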
Scenario 3 — Real-Time Intelligence (Streaming Analytics)
Profile: Eventstream ingestion from IoT/telemetry, KQL dashboards, real-time alerting, Data Activator triggers.
CU consumption analysis: Continuous workloads have a baseline consumption floor. A high-volume Eventstream at 10,000 events/second consumes ~12 CU continuously. A KQL query-heavy dashboard with 200 concurrent users consumes burst CU per query. The continuous nature means smoothing doesn't help — you need sufficient capacity to handle the constant baseline plus query bursts. F64 minimum; F128 for high-volume streaming scenarios.
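For continuous workloads the sizing rule is additive, the opposite of scenario 2: capacity must cover the constant baseline plus concurrent query bursts. A minimal sketch; the 12 CU baseline comes from the scenario above, while the per-query CU figure and concurrency are assumptions for illustration.

```python
# Streaming capacity requirement: constant baseline + concurrent query bursts.
# Smoothing cannot absorb a permanent baseline, so both terms add.

def required_cu(baseline_cu: float, concurrent_queries: int,
                cu_per_query: float) -> float:
    return baseline_cu + concurrent_queries * cu_per_query

# 12 CU Eventstream baseline (from the scenario) plus an assumed 50
# concurrent KQL queries at ~1 CU each.
demand = required_cu(baseline_cu=12, concurrent_queries=50, cu_per_query=1.0)
print(demand)  # 62.0 -> fits F64 with little headroom; F128 at higher volume
```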
Get an Independent Second Opinion
Before you commit to a Fabric capacity SKU, have an independent adviser validate your sizing. Avoid Microsoft's tendency to over-recommend larger SKUs.
Request a Consultation →
Multi-Capacity Architecture: When to Split
Single capacity is simpler and more efficient — workloads time-share the CU pool and the Capacity Metrics App provides unified visibility. Multiple capacities are appropriate in specific scenarios:
| Scenario | Architecture | Reason |
|---|---|---|
| Production vs Dev/Test isolation | 2 capacities: production + dev/test (smaller SKU) | Dev jobs should not impact production report SLAs |
| Regulatory data separation | Separate capacity for PII/regulated data | GDPR/HIPAA data residency; audit trail isolation |
| Department chargeback | Capacity per department | Accurate IT cost allocation requires separate meters |
| Geography/data residency | Capacity per Azure region | Capacities are region-scoped; data stays in region |
| Large enterprise: scale ceiling | Multiple F capacities | Maximum single F SKU is F2048; larger estates need sharding |
The most common multi-capacity pattern: F128 or F256 for production analytics + F8 or F16 for development. The production capacity runs with reserved pricing (3-year); the dev capacity runs pay-as-you-go with pause enabled: active only during business hours, paused overnight and weekends. Dev capacity effective monthly cost at 8 hours/day × 22 workdays = 176 hours: F8 at $1.44/hour (one-eighth of F64's $11.52/hour PAYG rate) ≈ $253/month vs ≈$1,050/month always-on.
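The pause-enabled cost calculation generalises to any SKU. A sketch under the assumption of flat hourly PAYG billing that stops while paused; plug in your own regional $/hour rate.

```python
# Cost sketch for a pause-enabled dev capacity: pay-as-you-go bills only for
# active hours. Rates vary by SKU and region; pass your own $/hour figure.

def paused_monthly_cost(rate_per_hour: float, hours_per_day: float = 8,
                        workdays: int = 22) -> float:
    """Monthly cost when the capacity runs only during business hours."""
    return rate_per_hour * hours_per_day * workdays

def always_on_monthly_cost(rate_per_hour: float,
                           hours_in_month: int = 730) -> float:
    return rate_per_hour * hours_in_month

rate = 1.44  # illustrative F8 PAYG rate: one eighth of F64's $11.52/hour
print(round(paused_monthly_cost(rate)))     # ~253 for 176 active hours
print(round(always_on_monthly_cost(rate)))  # ~1051 if never paused
```

At roughly a quarter of the always-on cost, the pause schedule, not the SKU choice, is the dominant dev/test cost lever.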
Capacity Reservation Strategy: When to Commit
F SKU reserved capacity (1-year or 3-year) provides a 31–47% discount vs pay-as-you-go. The break-even point for committing to reserved pricing:
| F SKU | PAYG ($/hour) | 1-Year Reserved ($/hour) | 3-Year Reserved ($/hour) | Break-Even vs PAYG (days/month of use needed) |
|---|---|---|---|---|
| F64 | $11.52 | $8.00 (31% discount) | $6.08 (47% discount) | 1-year: 21 days/month; 3-year: 16 days/month |
| F128 | $23.04 | $16.00 (31% discount) | $12.16 (47% discount) | 1-year: 21 days/month; 3-year: 16 days/month |
| F256 | $46.08 | $32.00 (31% discount) | $24.32 (47% discount) | 1-year: 21 days/month; 3-year: 16 days/month |
Production analytics capacities running 24/7 should be on 3-year reserved pricing. Development/test capacities that are paused overnight and on weekends (active ~176 hours/month out of 720) should stay on pay-as-you-go — reserved pricing only makes sense at >60% utilisation of the reserved period.
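The break-even figures in the table follow from one ratio: reserved monthly cost divided by the PAYG hourly rate gives the hours per month at which PAYG catches up. A sketch using the table's F64 rates:

```python
# Reserved vs PAYG break-even, using the table's F64 rates.
# break-even hours = (reserved $/hr * hours in month) / PAYG $/hr

def break_even_hours(payg_rate: float, reserved_rate: float,
                     hours_in_month: int = 730) -> float:
    reserved_monthly = reserved_rate * hours_in_month
    return reserved_monthly / payg_rate

one_year = break_even_hours(11.52, 8.00)    # ~507 h, ~21 days of 24/7 use
three_year = break_even_hours(11.52, 6.08)  # ~385 h, ~16 days
print(round(one_year), round(three_year))

# A paused dev capacity at ~176 active hours/month sits far below either
# break-even, which is why it should stay on pay-as-you-go.
```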
📄 Free Guide: Azure Cost Optimisation Guide
Includes Fabric reserved capacity strategy, MACC alignment, and Azure Reserved Instance frameworks for enterprise analytics.
Download Free Guide →
Frequently Asked Questions
How do I size a Microsoft Fabric F SKU?
Use the Fabric Capacity Metrics App (free from AppSource) to measure actual CU consumption across all workloads over a representative 2–4 week period. Target 60–70% average utilisation with sufficient headroom for peak bursts. If average utilisation sustains above 70% on an F64, move to F128.
What happens when Fabric capacity is exceeded?
Fabric uses a 'smoothing' model rather than hard throttling. Short bursts above capacity are absorbed. Sustained over-utilisation triggers throttling: interactive workloads experience delays (25% slowdown initially, escalating to job rejection). Background workloads can be delayed up to 24 hours.
What is the Fabric Capacity Metrics App?
The Microsoft Fabric Capacity Metrics App is a free Power BI template app from AppSource that displays CU consumption by workload, item, and time period. It is the essential right-sizing tool. Install it immediately when starting any Fabric deployment.
Can I run multiple workspaces on one Fabric capacity?
Yes. Multiple workspaces share the CU pool. This enables compute time-sharing — overnight Spark jobs and daytime Power BI reports share capacity without conflict. Workload management settings allow prioritisation of interactive over background workloads.
Should I use one large Fabric capacity or multiple smaller ones?
One large capacity is generally more efficient. Multiple capacities are appropriate for production/dev isolation, regulatory data separation, department chargeback, or geographic distribution. Most enterprises run 2–3 capacities: production, dev/test, and optionally a regulated-data capacity.