fabric-dataflows-perf-remediation
npx machina-cli add skill PatrickGallucci/fabric-skills/fabric-dataflows-perf-remediate --openclaw

Dataflows Gen2 Performance Troubleshooting
Systematic diagnostic workflows for identifying and resolving performance issues in Microsoft Fabric Dataflow Gen2 workloads, covering data movement, transformation execution, staging configuration, and destination write optimization.
When to Use This Skill
- Dataflow Gen2 refresh takes longer than expected
- Fast Copy is not being utilized or is failing
- Query folding indicators show red or yellow steps
- Staging-to-destination data movement is a bottleneck
- Gateway-connected dataflows have high latency
- Incremental refresh is slower than full refresh
- Capacity throttling affects dataflow execution
- Refresh failures with permission or connector errors
- Need to automate dataflow monitoring and health checks
- Migrating from Dataflow Gen1 and seeing performance regressions
Prerequisites
- Microsoft Fabric workspace with Data Factory enabled
- Contributor or higher role on the workspace
- PowerShell 7+ with Az.Accounts module for automation scripts
- Access to Fabric Monitoring Hub for refresh history analysis
- Fabric Capacity Metrics app access for CU consumption review
Quick Diagnosis: Symptom-to-Solution Map
| Symptom | Likely Cause | Jump To |
|---|---|---|
| Refresh takes 2x+ longer than Gen1 | Delta Parquet output overhead or staging misconfiguration | Consideration 5 in Performance Guide |
| Fast Copy not activating | Unsupported connector or non-foldable transforms | Fast Copy Diagnostics |
| "Insufficient permissions for staging artifacts" | Creator token expired (90+ days) | Common Errors |
| Slow design-time previews | Large dataset loaded in editor | Design-Time Optimization |
| Staging-to-Lakehouse write is slow | Extra data hop through staging Warehouse | Staging Strategy |
| Gateway dataflows are slow | All transforms running on gateway host | Gateway Optimization |
| Incremental refresh slower than full | Too many small buckets creating overhead | Incremental Refresh Tuning |
| HTTP 430 / capacity throttled | Concurrent Spark jobs exhausting CUs | Capacity Management |
Diagnostic Workflow
Step 1: Check Refresh History
- Open the Fabric workspace and locate the dataflow
- Select the ellipsis (...) > Recent runs
- Review the refresh history for status, duration, and type
- Select a specific refresh Start time to drill into details
- Examine the Tables section for per-entity timing
- Examine the Activities section for destination write timing
- Download detailed logs (bottom-left button) for deep analysis
- Download CSV of refresh runs for trend analysis
Key metrics to capture: total duration, per-table duration, bytes read/written, rows read/written, engine type (Mashup vs CopyActivity vs SQL DW).
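The downloaded CSV of refresh runs can be summarized for trend analysis. The sketch below assumes illustrative column names ("Start time", "End time", "Status"); match them to the headers in your actual export before use.

```python
import csv
import io
from datetime import datetime
from statistics import mean

def summarize_refresh_runs(csv_text):
    """Summarize a refresh-runs CSV export: run count, failures,
    and average/max duration of successful runs in seconds.

    Column names are assumptions; adjust to the real export headers.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    durations = []
    failures = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Status"] != "Succeeded":
            failures += 1
            continue
        start = datetime.strptime(row["Start time"], fmt)
        end = datetime.strptime(row["End time"], fmt)
        durations.append((end - start).total_seconds())
    return {
        "runs": len(durations) + failures,
        "failures": failures,
        "avg_duration_s": mean(durations) if durations else None,
        "max_duration_s": max(durations) if durations else None,
    }

# Sample export: two successful runs (10 min and 25 min) and one failure
sample = """Start time,End time,Status
2024-05-01 02:00:00,2024-05-01 02:10:00,Succeeded
2024-05-02 02:00:00,2024-05-02 02:25:00,Succeeded
2024-05-03 02:00:00,2024-05-03 02:01:00,Failed
"""
stats = summarize_refresh_runs(sample)
```

A rising `avg_duration_s` across weekly exports is the clearest early signal that a bottleneck is developing.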
Step 2: Identify the Bottleneck Component
Dataflow Gen2 has three performance-critical components:
    [Data Source] --> [Dataflow Engine] --> [Data Destination]
          |                   |                     |
      Connectors       Mashup / Fast Copy /     Lakehouse /
      + Gateway        SQL DW Compute           Warehouse /
                                                SQL Database
- Data Source: Connector speed, network latency, gateway throughput
- Dataflow Engine: Query folding, staging, Fast Copy, Mashup engine
- Data Destination: Write performance, Delta Parquet conversion
Step 3: Apply Targeted Optimization
Based on the bottleneck identified, follow the relevant section below.
Fast Copy Diagnostics
Fast Copy provides up to 9x faster ingestion for supported scenarios.
Supported Connectors
ADLS Gen2, Azure Blob Storage, Azure SQL DB, Lakehouse, PostgreSQL, On-premises SQL Server, Warehouse, Oracle, Snowflake, SQL Database in Fabric.
Fast Copy Indicators
| Indicator | Meaning | Action |
|---|---|---|
| Green | Step supported by Fast Copy | No action needed |
| Yellow | Some steps may support Fast Copy | Split query at boundary |
| Red | Step NOT supported by Fast Copy | Move to referenced query |
When Fast Copy Is Disabled
- Transformations beyond select columns, change types, rename, or remove columns (for file sources)
- Non-supported connector in use
- "Require Fast Copy" set on incompatible query (causes failure)
- Destination is not Lakehouse (stage first, then reference)
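The checks above can be sketched as a small eligibility helper. The connector and transform sets mirror the lists in this section; treat them as a point-in-time snapshot, not an authoritative support matrix.

```python
# Snapshot of the supported-connector list from this guide (an assumption,
# not an authoritative matrix; verify against current documentation).
FAST_COPY_CONNECTORS = {
    "ADLS Gen2", "Azure Blob Storage", "Azure SQL DB", "Lakehouse",
    "PostgreSQL", "On-premises SQL Server", "Warehouse", "Oracle",
    "Snowflake", "SQL Database in Fabric",
}

# Transforms that keep Fast Copy active for file sources, per this guide
ALLOWED_FILE_TRANSFORMS = {
    "select columns", "change types", "rename", "remove columns",
}

def fast_copy_eligible(connector, transforms):
    """Return (eligible, reason) for a query's Fast Copy eligibility."""
    if connector not in FAST_COPY_CONNECTORS:
        return False, f"connector '{connector}' not supported"
    blocked = [t for t in transforms if t not in ALLOWED_FILE_TRANSFORMS]
    if blocked:
        return False, f"unsupported transforms {blocked}: split into a referenced query"
    return True, "eligible"
```

Running this against each query in a workspace inventory gives a quick triage list before opening the editor.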
Query Splitting Pattern for Fast Copy
- Remove any red-indicator steps and the destination from the original query
- Verify remaining steps show green indicators
- Right-click the query > Enable staging
- Right-click again > Reference to create a new query
- Add back transformations and destination to the referenced query
- Publish and refresh — first query uses Fast Copy, second uses SQL DW compute
See Performance Optimization Guide for detailed walkthrough.
Staging Strategy Decision Tree
    Is destination a Warehouse?
    ├── YES → Staging REQUIRED (enabled by default)
    │         Write goes directly via SQL DW compute
    │         This is the optimal path for Warehouse destinations
    └── NO → Is destination a Lakehouse?
             ├── YES, with staging enabled → Data moves:
             │         Source → Staging LH → Staging WH → Lakehouse
             │         ⚠ CONSIDER: disable staging to avoid the extra hop,
             │         OR switch the destination to Warehouse instead
             └── YES, with staging disabled → Data writes directly:
                       Source → Lakehouse (via Mashup engine)
                       ✓ Fewer hops, but no SQL DW compute for transforms
Key Decision: If your transforms fold to the source, disable staging for Lakehouse destinations. If transforms are complex (joins, aggregations), either enable staging or switch to Warehouse destination.
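The decision tree above can be encoded as a helper for scripted workspace audits. This is a sketch of the guidance in this section, not official product logic.

```python
def staging_recommendation(destination, transforms_fold_to_source,
                           complex_transforms):
    """Recommend a staging setting per the decision tree in this guide.

    destination: "Warehouse" or "Lakehouse" (other values get no guidance).
    """
    if destination == "Warehouse":
        # Staging is required and enabled by default for Warehouse
        return "Staging required (default); writes go via SQL DW compute."
    if destination == "Lakehouse":
        if transforms_fold_to_source:
            # Folding pushes work to the source; the staging hop adds no value
            return "Disable staging: avoid the extra Staging LH/Staging WH hop."
        if complex_transforms:
            # Joins/aggregations benefit from SQL DW compute
            return ("Enable staging for SQL DW compute, or switch the "
                    "destination to Warehouse.")
        return "Disable staging; the Mashup engine writes directly."
    return "No staging guidance for this destination."
```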
Design-Time Optimization
When working with large datasets in the dataflow editor:
- Use parameters for date filtering: create a `DesignDateFilter` parameter to limit preview data during authoring, then adjust to the full range before publishing
- Switch to Schema view: select Schema view in the editor toolbar to see structure without loading data
- Limit preview rows: Keep preview data small during development
Gateway Performance
When using on-premises data gateway:
- Split dataflows: Separate data movement (gateway → cloud) from transformations
- First dataflow: Use Fast Copy for high-throughput transfer from on-premises to Lakehouse/Warehouse staging
- Second dataflow: Apply transformations using cloud compute on the staged data
- Gateway version: Must be 3000.214.2+ for Fast Copy support; keep within last 6 supported versions
- Detailed logs: Not yet supported for on-premises gateway refreshes (supported for cloud/VNet gateways)
Incremental Refresh Tuning
If incremental refresh is slower than full refresh:
- Increase bucket size: Reduce total bucket count to lower partition management overhead
- Evaluate data volume: For small datasets, full refresh may outperform incremental
- Concurrency control: Adjust max concurrent requests in dataflow settings if source can't handle defaults
- Monitor bucket efficiency: Check if most buckets process zero rows (indicates over-partitioning)
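A quick way to reason about bucket sizing is to compute the bucket count a given range produces and to flag over-partitioning from per-bucket row counts. The threshold below is an assumption for illustration, not a product default.

```python
from datetime import date

def bucket_count(start, end, bucket_days):
    """Number of incremental-refresh buckets covering [start, end)."""
    total_days = (end - start).days
    return -(-total_days // bucket_days)  # ceiling division

def over_partitioned(rows_per_bucket, zero_row_ratio=0.5):
    """Flag over-partitioning when most buckets process zero rows.

    The 0.5 threshold is an illustrative assumption; tune it for your data.
    """
    zero = sum(1 for rows in rows_per_bucket if rows == 0)
    return zero / len(rows_per_bucket) > zero_row_ratio

# Example: two years of history in 1-day buckets creates 731 buckets of
# partition-management overhead; 30-day buckets cut that to 25.
daily = bucket_count(date(2023, 1, 1), date(2025, 1, 1), 1)
monthly = bucket_count(date(2023, 1, 1), date(2025, 1, 1), 30)
```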
Capacity and Throttling
Dataflow Gen2 Compute Meters
| Engine | When Used | Billing Basis |
|---|---|---|
| Standard Compute (Mashup) | Staging disabled or non-foldable queries | Query evaluation time |
| High Scale Compute (SQL DW) | Staging enabled | Lakehouse + Warehouse duration |
| Fast Copy | Supported connectors with Fast Copy enabled | Copy job duration |
All operations are background operations smoothed over 24 hours.
CU Consumption Formula (Standard Compute)
    If QueryDuration < 600s:
        CU_seconds = QueryDuration × 12
    Else:
        CU_seconds = (QueryDuration - 600) × 1.5 + 600 × 12
Reducing Capacity Impact
- Avoid inefficient Power Query logic (expensive merges and sorts)
- Maximize query folding to push work to source systems
- Disable staging for small data volumes or simple transforms
- Don't refresh more frequently than source data updates
- Use data destinations instead of dataflow connectors for consumption
Common Refresh Errors
| Error | Cause | Resolution |
|---|---|---|
| "Insufficient permissions for staging artifacts" | Creator inactive 90+ days or left org | Have creator log in to Fabric; if gone, open support ticket |
| "Expression.Error: import matches no exports" | Unsupported connector in Premium workspace | Check connector compatibility list |
| "Gateway version not supported" | On-premises gateway out of support | Update to latest gateway version |
| "Staging lakehouse couldn't be found" | CI/CD branch workspace missing staging | Create a new Dataflow Gen2 in workspace to trigger staging creation |
| Validation failure on save | Query schema can't be determined in 10 min | Simplify query or check source connectivity |
Modern Evaluator (Preview)
The Modern Evaluator engine can provide significant performance improvements:
- Large data volumes: Shorter processing time, reduced memory usage
- Complex transformations: Improved execution plans for joins across large tables
- Frequent schedules: Cumulative time savings across multiple daily runs
Enable via dataflow settings. Monitor results after enabling — some connectors may not yet be fully optimized.
Automation and Monitoring
Run the Dataflow Health Check Script to programmatically audit dataflow configurations and recent refresh performance across a workspace.
Run the Dataflow Refresh Monitor Script to poll and track active refresh status in real-time.
See REST API Reference for complete API documentation covering CRUD operations, scheduling, refresh triggering, and monitoring.
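For scripted monitoring, requests are built against the Fabric REST API base URL. The endpoint shape below follows the general Fabric item-listing pattern and is an assumption for illustration; verify paths and query parameters against the REST API Reference before relying on them.

```python
from urllib.parse import quote

# Fabric REST API base (verify against the REST API Reference)
API_BASE = "https://api.fabric.microsoft.com/v1"

def list_items_url(workspace_id, item_type="Dataflow"):
    """Build a URL to list items of a given type in a workspace.

    Endpoint shape is an assumption based on the Fabric item-listing
    pattern; authentication (a bearer token) is handled separately.
    """
    return (f"{API_BASE}/workspaces/{quote(workspace_id)}"
            f"/items?type={quote(item_type)}")

url = list_items_url("aaaa-1111")
```

A health-check script would issue a GET against such URLs with an Azure AD bearer token, then page through results and cross-reference each dataflow with its recent refresh history.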
See Performance Optimization Guide for deep-dive scenarios including Fast Copy benchmarks, staging architecture patterns, and query folding analysis techniques.
References
- Performance Optimization Guide — Deep-dive optimization scenarios
- Dataflow REST API Reference — API operations for automation
- Get-DataflowHealthReport.ps1 — Workspace health audit
- Watch-DataflowRefresh.ps1 — Real-time refresh monitor
- Microsoft Learn: Best Practices for Dataflow Gen2 Performance
- Microsoft Learn: Fast Copy in Dataflow Gen2
- Microsoft Learn: View Refresh History and Monitor Dataflows
- Microsoft Learn: Dataflow Gen2 Pricing
Source
git clone https://github.com/PatrickGallucci/fabric-skills