POWER BI: DATAFLOWS – ENHANCED COMPUTE ENGINE (ECE)

|
Tony Consentino
|
7 minute read

It is common to compare SQL Server Analysis Services (aka SSAS or AAS) to the DataFlow Enhanced Compute Engine, as many believe performance may be lacking today in DataFlows. However, this is inaccurate, Microsoft are actively migrating all SSAS functionality to Power BI. In fact, a dataset sits on top of a typical SSAS tabular model.

As you may know, Power BI is an enterprise-ready BI platform through which you can transform your company data into actionable analytics. A key pillar is dataflows (which I’m a huge advocate for), self-service data preparation which is designed to collect, clean, combine and enrich your company data. To improve the scale and performance of these tasks, the enhanced compute engine is available for all newly created DataFlows. This functionality is only available to Power BI Premium users to:

  1. Speed up refresh operations like performing joinsdistinctfilters, and group by.
  2. Have DirectQuery access over dataflows which can utilise the ECE for performance.
  3. Access any ETL(Extract, Transform, Load) based transformation step within dataflows to use a SQL Server-based cached. This includes any SQL compute operations that need query folding like count, join, sum etc.

Query folding is Power Query generating a single query statement to retrieve and transform source-based data. The Power Query mashup engine tries to perform query folding whenever possible to get maximum efficiency and performance. If your data source is OData or a CSV file in blob storage, then query folding may not be supported. However, a data source like SQL Server or a relational database will support query folding. For more information, see https://docs.microsoft.com/en-us/power-query/power-query-folding.
 
20X Performance Boost (with millions of rows of data)
How do you get up to 20X performance boost with loading and querying your company data? Simply switch on the ECE. To ensure your dataflow is available for DirectQuery access, the enhanced compute engine must be in its optimised state. To enable DirectQuery for dataflows, set the new Enhanced compute engine settings option to On.


 
The ECE loads your dataflow table data into a SQL-based cache inside the Power BI Service. Please note, you do not have access to the SQL-based cache regardless of your Premium licensing.  Using SQL clustered columnstore indices and various other clever optimisations, you can get up to a 20X performance boost. Computed entities and DirectQuery connections against the dataflow in Premium can then be fulfilled by reading from the cache instead of reading from storage (Azure CDM Gen2) and flat files (CSV and JSON) as Dataflows in Power BI Pro users do.


 
If you have millions or billions of rows of data, getting your reporting users a Power BI Premium license is a great idea. Microsoft have just announced a more affordable Premium license called “Premium Per User”, which will cost around $25 to $30 per user. It’s due to be release around April/May 2021.

If you'd like to learn more about Power BI capabilities and what it could do for your business, get in touch with us today to discuss your requirements and how to get started. 
Email Fiona - f[email protected] or call on 0415 870 135

Tony Consentino