Hi everyone,
I’m currently managing a Microsoft Fabric F128 capacity with multiple workspaces assigned to it, and over the last month I’ve been facing frequent capacity breaches, mainly due to interactive user operations. I understand the basics, but I’m struggling to clearly understand how Fabric capacity behaves once it starts getting heavily used.
I’m hoping the community can help me clarify the following:
How exactly do interactive queries work in Fabric, and how can a single report page or visual push capacity usage to 200%–250%, even when the capacity is already at 100%?
What is the exact sequence after capacity reaches 100%—how do smoothing, throttling, and interactive query rejection work, and when does Fabric decide to move from one stage to the next?
How does the “burnout” (over-consumption) mechanism actually work—when is excess capacity usage burned off, and what happens if interactive activity continues during that period?
If a specific visual or page is identified as the main consumer, what typically makes a visual so expensive in Fabric that it can dominate overall capacity usage?
Could dynamic Row-Level Security (RLS) be a major contributor to high interactive capacity usage, and are there known best practices to reduce its impact in Fabric? (I am using the SEARCH function in one of my RLS roles.)
At the moment it feels a bit chaotic to manage: capacity spikes, throttling, user-facing delays, and difficulty explaining why it's happening in a simple way.
Please help me understand it from scratch.
I haven't had to manage capacity so I can't give you direct answers. But,
Here is my Fabric Licensing from Scratch post:
https://www.sqlgene.com/2024/12/07/fabric-licensing-from-scratch/
An F128 can burst up to 12x consumption.
https://learn.microsoft.com/en-us/fabric/data-warehouse/burstable-capacity#sku-guardrails
Interactive operations are smoothed (also known as "burndown") over a 5 minute period. Background ones are smoothed over a 24 hour period.
https://learn.microsoft.com/en-us/fabric/data-warehouse/compute-capacity-smoothing-throttling#smoothing
Throttling doesn't happen until your accumulated overage exceeds 10 minutes of future capacity (overage protection). Between 10 and 60 minutes of overage, interactive requests are delayed by 20 seconds. Past 60 minutes, interactive jobs are rejected. Past 24 hours, everything is rejected.
https://learn.microsoft.com/en-us/fabric/enterprise/throttling#throttle-triggers-and-throttle-stages
Taking a guess, I would anticipate huge amounts of memory usage, work landing in the formula engine instead of the storage engine, very large models, or bi-directional filtering as possible causes of heavy load. It should be reasonably easy to walk through a Power BI report with Performance Analyzer on and identify likely culprits.
RLS is functionally a DAX filter that is pre-applied. RLS roles should not be dynamic, aside from a simple filter comparing against USERNAME() / USERPRINCIPALNAME(). SEARCH is likely extremely expensive because the RLS filter gets evaluated row by row. It should be easy to simulate the impact with DAX Studio and some custom DAX queries.
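To make the comparison concrete, here's roughly what the two filter shapes look like; DimUser and its columns are placeholder names, so treat this as a sketch rather than your exact model:

```
-- Expensive shape: SEARCH has to be evaluated for every row of the secured table,
-- typically as a formula-engine callback.
SEARCH ( USERPRINCIPALNAME (), DimUser[AllowedEmails], 1, 0 ) > 0

-- Cheap shape: a plain equality the engine can apply as a simple column filter.
DimUser[Email] = USERPRINCIPALNAME ()
```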
You can use KQL and Log Analytics to identify the source of your performance problem.
https://blog.crossjoin.co.uk/2025/09/14/how-to-get-the-details-of-power-bi-operations-seen-in-the-capacity-metrics-app/
Heavy PQ transformations are always the biggest culprit.
Cool, good to know. I've done a lot of performance tuning, but never in regard to a premium capacity.
Not true; PQ transformations are background CU usage, not interactive.
Not necessarily true. It depends on the Capacity size. On anything bigger than F32, it would be the interactive operations that cause the spikes, at least in my experience.
Thanks a lot for this insightful reply. So, like you said, if I change it to a basic RLS filter like [Email] = USERPRINCIPALNAME(), can I expect a considerable difference in capacity consumption?
Yes, but it's proportional to the size of the table you were filtering to begin with. If it was a thousand-row dimension, probably not a big deal. If it's a 100-million-row table, the change is going to be immense.
Often times you can also improve performance by moving your filtering up from the fact table to a dimension table.
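As a rough sketch of what that means for the RLS rule itself (all names invented here):

```
-- Same rule, two placements.

-- On the 100-million-row fact table: the filter is evaluated against every fact row.
FactSales[OwnerEmail] = USERPRINCIPALNAME ()

-- On a small dimension instead: only DimOwner gets filtered, and the one-to-many
-- relationship propagates that filter down to FactSales.
DimOwner[Email] = USERPRINCIPALNAME ()
```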
My favourite one so far was a notebook set up so badly it took 12 hrs to run at 100% of an F64. It was BAAAD. Re-engineered properly, it ran in 6 minutes without cracking 30%.
Love such optimization comments. Question: 30% of an F64 still feels like a lot for a notebook? Curious to know what the notebook does.
Can't say, unfortunately. I just wanted to let people know that enthusiastic amateurs can easily get it THAT wrong.
Considering notebooks hit background usage, that would mean it's 30% of an F64 for an entire 24 hours. Which prices that notebook at $1.5k/month lol
Concurrent user sessions being active can lead to overconsumption. Fabric will break these sessions after some time, throwing errors to users. If the problem isn't addressed, it will repeat.
One visual with bad measures, crazy filters, or a personalized view can easily eat all the resources.
Yes. If it's applied to transactional data instead of dimensions, it will consume much more capacity. So it's important not only what your RLS looks like, but also where it is applied. DAX Studio can help you measure the impact; it can simulate RLS too.
Question:
1. The capacity will allow you to burn down the debt at future timepoints in the next 5 minutes (10 timepoints), right? If that doesn't happen, throttling starts. What is the use of this? How does it help the debt burn down faster? Users will still come and interact with the report without knowing any of this, right?
2. Could you please recommend a good blog or video on DAX Studio?
I strongly recommend the blog of Chris Webb; that's my bible for plenty of the things that can crash a capacity. You never mentioned what kind of artifacts you have there - is it PBI only or Fabric too?

Next to that, have you heard of the possibility to link workspaces to Log Analytics to get the events that take place? That way I can very quickly go from Capacity Metrics to the details to extract the Operation ID (check the blog above on that), then query Log Analytics for that event and focus on the CommandBegin, CommandEnd, and Execution Metrics steps. The first two will help you see what DAX was executed.

There's a simpler approach too: with your Fabric Admin rights active, you can click on Workspaces and you'll see a hidden one, the Admin monitoring workspace, which currently has four reports. One of them, Usage insights if I remember correctly, has statistics on performance, so you can track how much compute workspaces/reports/users consumed. There's a detailed page where you will find individual queries - not the DAX itself, but user name, artifacts, performance, and a few more things.

Finally, there's the very nice FUAM solution, which everyone finds quite useful, so you can give it a try too.

P.S. In my tenant, a single visual almost broke an F256 due to CONTAINS operations on a high-cardinality string column :) So you're not alone.
What kind of data size are you dealing with? DM me, I'm more than happy to help for free for a couple of hours.
So many potential issues. My first thought, if this is new behavior, is to try to determine what has changed.
https://preview.redd.it/w6ok408rxt7g1.png?width=320&format=png&auto=webp&s=eb03c8f2c73a829e5ab51dc40d6e3b955ddce332
If you are running models in Direct Lake mode, one thing to watch out for is fallback to DirectQuery. You can spot it using Performance Analyzer (the visual's query will show up as a Direct query). Adding a warehouse view to a model is one of a number of ways to trigger a fallback.
Other thoughts are the usual suspects, like someone using a table filter inside a CALCULATE. Again, Performance Analyzer might help with finding an expensive measure and then inspecting the code.
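For example, with placeholder names (a Sales table with a Region column and a [Total Sales] base measure), the two patterns would look something like this:

```
-- Table filter: FILTER iterates the whole (expanded) Sales table row by row.
EU Sales (slow) =
CALCULATE ( [Total Sales], FILTER ( Sales, Sales[Region] = "EU" ) )

-- Column filter: shorthand for FILTER ( ALL ( Sales[Region] ), ... ),
-- which resolves to a cheap single-column filter.
EU Sales (fast) =
CALCULATE ( [Total Sales], Sales[Region] = "EU" )
```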
Running Best Practice Analyzer on hungry models might illuminate things.
If interactive queries are spiking that high, it is usually bad data models, bad DAX, or an excessive amount of DAX. Going to the report usually makes it clear, with visuals taking 30+ seconds to load.
Expensive Power Query operations will show up in the background metric.
As always it depends, but I usually see this as DAX in visuals performing operations that could be pushed back up to SQL or into Power Query. Pay the cost once during refresh instead of every time a user looks at or interacts with a visual.
On the other hand, I have seen a report creator use crazy DAX like this because they were trapped and that was all that was available to them, for one reason or another (SAP source system; any Power Query beyond the most basic filtering was extremely dangerous and unreliable).
Best is to download the PBIX and look at the visual queries in DAX Studio. Look at how much the storage engine versus the formula engine is being used. Even smallish tables can generate huge row counts of intermediate results if cross joins happen. Sometimes the tiniest tweaks to DAX measures can return the same answer with 99% fewer rows in between.
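As a sketch of the kind of cross-join blow-up meant here (names are invented, and it assumes every Sales row has a positive amount so both versions return the same count):

```
-- Slow: materializes |Customer| x |Date| combinations in the formula engine,
-- then evaluates [Total Sales] for each one.
Pairs With Sales (slow) =
COUNTROWS (
    FILTER (
        CROSSJOIN ( VALUES ( Customer[CustomerKey] ), VALUES ( 'Date'[Date] ) ),
        [Total Sales] > 0
    )
)

-- Fast: only the combinations that actually exist in the Sales fact table get materialized.
Pairs With Sales (fast) =
COUNTROWS ( SUMMARIZE ( Sales, Customer[CustomerKey], 'Date'[Date] ) )
```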
IF statements combined with BLANK management in DAX are a common culprit for performance issues.
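A typical example of that pattern, with made-up measure names:

```
-- Forcing blanks to zero defeats blank-row elimination: the visual now returns
-- every combination of the axis columns, not just the ones that have data.
Sales (zero filled) =
IF ( ISBLANK ( [Total Sales] ), 0, [Total Sales] )

-- Usually cheaper: leave blanks as blanks and only convert to zero where the
-- report genuinely needs to show it.
Sales (sparse) =
[Total Sales]
```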