Get the FREE Ultimate OpenClaw Setup Guide →

dbt

npx machina-cli add skill G1Joshi/Agent-Skills/dbt --openclaw
Files (1)
SKILL.md
1.0 KB

dbt (Data Build Tool)

dbt manages data transformation in the warehouse using SQL. v2.0 introduces the Fusion Engine (Rust) for performance.

When to Use

  • Data Modeling: Converting raw tables into "Gold" tables.
  • Testing: not_null, unique tests defined in YAML.
  • Documentation: Auto-generating data dictionaries.

Core Concepts

Models (.sql)

Select statements that dbt compiles into CREATE VIEW/TABLE.

Refs ({{ ref('users') }})

Dependency management. dbt builds the DAG automatically.

Semantic Layer

Defining metrics ("Revenue") in code so all BI tools use the same definition.

Best Practices (2025)

Do:

  • Use Git: Treat data models like software code.
  • Use Incremental Models: Only process new data to save cost.
  • Use dbt Mesh: For cross-project dependencies in large orgs.

Don't:

  • Don't put logic in BI tools: Put it in dbt.

References

Source

git clone https://github.com/G1Joshi/Agent-Skills/blob/main/skills/ai-ml/dbt/SKILL.mdView on GitHub

Overview

dbt orchestrates SQL-based data transformations inside your data warehouse, turning raw tables into reliable, analysis-ready models. It supports data modeling, testing, and automatic documentation, helping data teams deliver trusted analytics. In v2.0, the Fusion Engine (Rust) boosts performance for large pipelines.

How This Skill Works

Models are SQL files that dbt compiles into CREATE VIEW or TABLE statements. Refs ({{ ref('model') }}) manage dependencies and let dbt build a DAG automatically. The semantic layer lets you define metrics in code so all BI tools share the same definitions.

When to Use It

  • Data Modeling: Convert raw tables into Gold tables.
  • Testing: Enforce data quality with not_null and unique tests defined in YAML.
  • Documentation: Auto-generate a data dictionary for datasets.
  • Dependency Management: Use Refs to build and manage the DAG automatically.
  • Semantic Layer: Define metrics in code so BI tools use consistent definitions.

Quick Start

  1. Step 1: Install dbt and initialize a project (dbt init my_project).
  2. Step 2: Create .sql models and YAML tests; reference other models with {{ ref('model') }}.
  3. Step 3: Run dbt: dbt run; dbt test; dbt docs generate && dbt docs serve.

Best Practices

  • Use Git: Treat data models like software code.
  • Use Incremental Models: Only process new data to save cost.
  • Use dbt Mesh: For cross-project dependencies in large orgs.
  • Don't put logic in BI tools: Put it in dbt.
  • Automate tests and documentation to stay in sync with code.

Example Use Cases

  • Transform raw orders into a Gold orders table and incrementally update it.
  • Add not_null and unique tests in YAML to validate customer data.
  • Auto-generate a data dictionary for all models.
  • Define a Revenue metric in the semantic layer and feed it to BI tools.
  • Use {{ ref('orders') }} to model dependencies across multiple projects with dbt Mesh.

Frequently Asked Questions

Add this skill to your agents
Sponsor this space

Reach thousands of developers