Automated Semantic Model Health Checks: BPA + Memory Analyzer + Lakehouse in One Fabric Notebook

Automated Semantic Model Health Checks with Fabric Notebook

Every Power BI semantic model starts clean. But over time — as measures pile up, relationships multiply, and formatting inconsistencies creep in — models silently degrade. By the time someone notices a slow report or a questionable DAX pattern, the damage is already done.

What if your models could check themselves — every week, automatically — and write the results to a place where you can query, compare, and trend them over time?

That’s exactly what this Fabric Notebook does. It combines three analyses into one script, saves everything to a Lakehouse, and runs on a schedule. No manual effort after the initial setup.

Why Run BPA and Memory Analyzer Together?

Microsoft Fabric comes with two built-in analysis notebooks: the Best Practice Analyzer (BPA) and the Memory Analyzer (also known as Vertipaq Analyzer). Most people run them separately, if they run them at all. But running them together gives you the full picture of your model’s health:

  • BPA catches structural problems — missing format strings, duplicate measures, hidden foreign keys not set, unused objects, DAX anti-patterns. It tells you what’s wrong with how the model is built.
  • Memory Analyzer shows you where the weight is — which tables, columns, and segments consume the most memory. A model can pass every BPA rule and still be bloated because of high-cardinality columns or unnecessary detail data.

Together, they answer both questions: “Is this model built correctly?” and “Is this model built efficiently?”

On top of that, the script also captures seven INFO.* DAX system views — tables, columns, measures, relationships, storage tables, storage columns, and storage segments. These are the raw metadata that power tools like DAX Studio’s Vertipaq Analyzer, but now they live in your Lakehouse as queryable Delta tables.

Direct Lake Models: What You Need to Know

This notebook works with both Import and Direct Lake semantic models — but the Memory Analyzer results mean different things depending on which mode you’re using.

Import Models: Stable, Predictable Memory

For Import models, memory analysis is straightforward. The entire dataset is loaded into memory at refresh time and stays there. When you run Memory Analyzer, you see the full footprint — every table, every column, every segment. The results are stable and reproducible.

Direct Lake Models: Dynamic, Query-Driven Memory

Direct Lake models load data on demand — only the columns and rows that are actually queried get paged into memory. This means the Memory Analyzer shows you a snapshot of what’s currently cached, not the theoretical maximum size of the model.

If you run the analysis right after opening the semantic model, you’ll see almost nothing in memory — the cache is cold. To get realistic memory metrics for a Direct Lake model:

  1. Warm up the cache first. Run your most common reports or execute representative DAX queries to load the frequently accessed columns.
  2. Run the notebook after typical usage. Schedule it for a time when the model has been actively queried during business hours — not at 3 AM when the cache is empty.
  3. Interpret the results as “working set” size. The memory footprint you see reflects what’s needed to serve recent queries, not the full lakehouse table size. This is actually more useful for capacity planning — it tells you the realistic memory demand.

BPA rules, on the other hand, work identically for both Import and Direct Lake — they analyze the model’s structure (measures, relationships, formatting), not the data in memory.

The Three-Step Pipeline

The notebook runs sequentially through three steps:

Step 1: Best Practice Analyzer

Uses fabric.run_model_bpa() to evaluate the semantic model against the standard rule set. Results include the rule category (Formatting, Maintenance, Performance, DAX Expressions), severity, affected object, and a description of the issue.

BPA Results showing severity breakdown by category
BPA output — 116 rules checked across DAX Expressions (12), Formatting (27), Maintenance (45), and Performance (15) categories with severity breakdown

Step 2: Memory Analyzer

Uses fabric.model_memory_analyzer() to produce a tabbed summary: Model Summary, Tables, Partitions, Columns (Total Size), Columns (Temperature), Relationships, and Hierarchies. This is the same analysis DAX Studio provides — but running directly inside Fabric.

Memory Analyzer output with Model Summary
Memory Analyzer — Model Summary showing dataset size (381.91 KB), 7 tables, 27 columns, and INFO table results

Step 3: DAX INFO Views

Seven EVALUATE INFO.*() queries capture the full internal metadata of the model and save each as a Delta table:

  • INFO.TABLES() → table-level metadata (row counts, modified times, storage IDs)
  • INFO.COLUMNS() → column-level details (data types, cardinality hints)
  • INFO.MEASURES() → all measures with their DAX expressions
  • INFO.RELATIONSHIPS() → relationship definitions and cardinality
  • INFO.STORAGETABLES() → physical storage table metadata
  • INFO.STORAGETABLECOLUMNS() → storage-level column statistics
  • INFO.STORAGETABLECOLUMNSEGMENTS() → segment-level compression details

Each result set gets an analysis_timestamp, model_name, and workspace_name column appended before saving — so you can track multiple models over time in the same Lakehouse.

Why Save to a Lakehouse?

Running an analysis notebook and looking at the output once is useful. But the real value comes from persistence:

  • Trend analysis. When BPA results are appended weekly, you can track whether your model is getting better or worse. Did someone introduce 12 new unformatted measures last sprint? You’ll see it.
  • Cross-model comparison. Run the same notebook against multiple semantic models (just change the configuration variables). Query all results in one place to find which models need the most attention.
  • SQL Analytics Endpoint. Every Delta table in a Fabric Lakehouse is automatically exposed via a SQL endpoint. That means you can query BPA results from SSMS, Azure Data Studio, or even build a Power BI report on top of your model health data.
  • Audit trail. When stakeholders ask “how has model quality improved this quarter?”, you have the data to answer.
SQL query results from bpa_analysis_results table
Querying bpa_analysis_results via SQL in SSMS — DemoLakehouse with all 8 Delta tables, showing Category, Rule_Name, Severity, and other columns

Why Schedule It?

A one-time health check is a snapshot. A scheduled health check is a monitoring system.

Fabric Notebooks support native scheduling — no Data Factory pipeline needed. From the top ribbon, click Schedule, set it to Weekly, pick a day and time, and save. The notebook runs unattended, appends results to your Lakehouse, and you’re done.

Fabric Notebook schedule configuration
Schedule setup — Weekly on Monday at 12:00 PM, Brussels timezone, running from April 2026 to April 2027

This means:

  • No manual intervention. Models are checked every week without anyone clicking “Run”.
  • Early warning system. New BPA violations show up within a week of being introduced.
  • Memory regression detection. If a model’s memory footprint suddenly doubles after a data source change, the INFO tables will show it in the next scheduled run.
  • Compliance documentation. Some organizations require periodic checks on data model quality. A scheduled Lakehouse write creates an immutable record.

Setup: Five Minutes to Automated Model Monitoring

Here’s the complete setup process:

  1. Create a Lakehouse in your Fabric workspace (or use an existing one).
  2. Create a new Notebook and attach the Lakehouse as the default (left panel → Lakehouses → Add).
  3. Paste the script from GitHub or the Fabric Community Notebook Gallery.
  4. Set the three config variables:
dataset = "Your Semantic Model Name"
workspace = "Your Workspace Name"
lakehouse = "Your_Lakehouse_Name"
  1. Run once to verify everything works.
  2. Schedule via Run → Schedule → Weekly.

The full notebook code is available on GitHub.

What You Get: 8 Delta Tables

After a successful run, your Lakehouse contains:

Table Source Content
bpa_analysis_results BPA All rule violations with severity, category, object info
info_tables INFO.TABLES() Table-level metadata (row counts, storage IDs, modified times)
info_columns INFO.COLUMNS() Column definitions, data types, cardinality
info_measures INFO.MEASURES() Measure names and DAX expressions
info_relationships INFO.RELATIONSHIPS() Relationship definitions and cardinality types
info_storage_tables INFO.STORAGETABLES() Physical storage table metadata
info_storage_columns INFO.STORAGETABLECOLUMNS() Storage-level column statistics
info_storage_segments INFO.STORAGETABLECOLUMNSEGMENTS() Segment-level compression and encoding details

Each table includes analysis_timestamp, model_name, and workspace_name columns — so you can filter, compare, and trend across models and time periods.

Credits

The DAX INFO.* views approach was inspired by Hariharan’s Vertipaq Analyzer queries. SQL view equivalents are available in my PBI-Tools repository. Thanks to Bernat Agulló for pointing me in the right direction.

Resources

Leave a Reply