data-migration-validator

npx machina-cli add skill a5c-ai/babysitter/data-migration-validator --openclaw
Files (1): SKILL.md (2.7 KB)

Data Migration Validator Skill

Validates data integrity throughout the migration process with comprehensive verification checks and reconciliation reporting.

Purpose

Enable data validation for:

  • Row count validation
  • Checksum verification
  • Sample data comparison
  • Referential integrity checking
  • Business rule validation

Capabilities

1. Row Count Validation

  • Compare source/target counts
  • Track by table/partition
  • Identify discrepancies
  • Generate count reports
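The row-count check above can be sketched as a small comparison routine. This is an illustrative sketch, not the skill's actual implementation: the table names and counts are made up, and a real run would pull both dictionaries from live source and target queries.

```python
def compare_row_counts(source_counts, target_counts):
    """Compare per-table row counts and flag any discrepancies."""
    report = []
    for table, src in sorted(source_counts.items()):
        tgt = target_counts.get(table, 0)
        report.append({
            "name": table,
            "source": src,
            "target": tgt,
            "match": src == tgt,
        })
    return report

# Illustrative counts; a real validator would query both databases.
counts = compare_row_counts(
    {"orders": 120_000, "customers": 45_210},
    {"orders": 120_000, "customers": 45_208},
)
mismatches = [r["name"] for r in counts if not r["match"]]
print(mismatches)  # ['customers']
```

Each entry mirrors the `rowCounts.tables` shape in the output schema below, so the comparison feeds the reconciliation report directly.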

2. Checksum Verification

  • Calculate table checksums
  • Compare hash values
  • Identify data drift
  • Verify data consistency
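One common way to compute a table checksum (a sketch, not necessarily how this skill does it) is to hash each row and combine the hashes with XOR, which makes the result independent of row order — useful when source and target return rows in different orders.

```python
import hashlib

def table_checksum(rows):
    """Order-independent table checksum: hash each row with SHA-256,
    then XOR the row hashes together."""
    acc = 0
    for row in rows:
        row_bytes = "|".join(str(v) for v in row).encode("utf-8")
        acc ^= int.from_bytes(hashlib.sha256(row_bytes).digest()[:16], "big")
    return format(acc, "032x")

source = [(1, "alice"), (2, "bob")]
target = [(2, "bob"), (1, "alice")]  # same rows, different order
print(table_checksum(source) == table_checksum(target))  # True
```

Comparing the resulting hex strings per table detects drift; any single changed field flips the checksum.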

3. Sample Data Comparison

  • Random sample selection
  • Field-by-field comparison
  • Statistical sampling
  • Confidence scoring
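A minimal sampling sketch, assuming records are keyed dictionaries: pick a seeded random sample of keys and compare each record field by field. The record shapes and sample size here are illustrative only.

```python
import random

def sample_compare(source, target, sample_size=3, seed=42):
    """Randomly sample keys from source and compare records field by field."""
    rng = random.Random(seed)  # seeded so samples are reproducible
    keys = rng.sample(sorted(source), min(sample_size, len(source)))
    discrepancies = []
    for k in keys:
        src, tgt = source[k], target.get(k, {})
        for field, value in src.items():
            if tgt.get(field) != value:
                discrepancies.append({"key": k, "field": field,
                                      "source": value, "target": tgt.get(field)})
    bad_keys = {d["key"] for d in discrepancies}
    return {"checked": len(keys),
            "matched": len(keys) - len(bad_keys),
            "discrepancies": discrepancies}

# Illustrative records; key 2's total differs between source and target.
source = {1: {"name": "a", "total": 10},
          2: {"name": "b", "total": 20},
          3: {"name": "c", "total": 30}}
target = {1: {"name": "a", "total": 10},
          2: {"name": "b", "total": 21},
          3: {"name": "c", "total": 30}}
report = sample_compare(source, target)
print(report["checked"], report["matched"])  # 3 2
```

The returned dictionary matches the `samples` section of the output schema below.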

4. Referential Integrity Checking

  • Verify foreign keys
  • Check orphaned records
  • Validate relationships
  • Report violations

5. Business Rule Validation

  • Apply custom rules
  • Check data constraints
  • Verify transformations
  • Validate calculations
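Custom rules can be modeled as named predicates applied to each record. The rules and records below are hypothetical examples, not rules shipped with the skill.

```python
# Each rule is a (name, predicate) pair over one record.
RULES = [
    ("non_negative_total", lambda r: r["total"] >= 0),
    ("discount_within_total", lambda r: r["discount"] <= r["total"]),
]

def validate_rules(records, rules=RULES):
    """Apply every rule to every record; collect failures."""
    failures = []
    for i, record in enumerate(records):
        for name, predicate in rules:
            if not predicate(record):
                failures.append({"record": i, "rule": name})
    return {"passed": len(records) * len(rules) - len(failures),
            "failed": len(failures),
            "failures": failures}

result = validate_rules([
    {"total": 100, "discount": 10},
    {"total": 50, "discount": 60},   # violates discount_within_total
])
print(result["failed"])  # 1
```

The result maps onto the `businessRules` section of the output schema.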

6. Reconciliation Reporting

  • Generate audit reports
  • Track discrepancies
  • Document exceptions
  • Provide sign-off reports

Tool Integrations

Tool                 Purpose                Integration Method
Great Expectations   Data validation        Library
dbt tests            Transform validation   CLI
Custom SQL           Database checks        CLI
DataGrip             Manual verification    GUI
Apache Griffin       Data quality           API

Output Schema

{
  "validationId": "string",
  "timestamp": "ISO8601",
  "results": {
    "rowCounts": {
      "tables": [
        {
          "name": "string",
          "source": "number",
          "target": "number",
          "match": "boolean"
        }
      ]
    },
    "checksums": {
      "tables": [],
      "overall": "string"
    },
    "samples": {
      "checked": "number",
      "matched": "number",
      "discrepancies": []
    },
    "referentialIntegrity": {
      "valid": "boolean",
      "violations": []
    },
    "businessRules": {
      "passed": "number",
      "failed": "number",
      "failures": []
    }
  },
  "summary": {
    "status": "passed|failed|warning",
    "score": "number"
  }
}
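The schema's `summary` block can be derived from the section results. This is a sketch with invented thresholds (100 = passed, 75+ = warning); the skill does not document its actual scoring formula.

```python
def summarize(results):
    """Derive an overall status and score from section results.
    Thresholds here are illustrative, not part of the published schema."""
    checks = [
        all(t["match"] for t in results["rowCounts"]["tables"]),
        not results["samples"]["discrepancies"],
        results["referentialIntegrity"]["valid"],
        results["businessRules"]["failed"] == 0,
    ]
    score = 100 * sum(checks) // len(checks)
    status = "passed" if score == 100 else ("warning" if score >= 75 else "failed")
    return {"status": status, "score": score}

# Illustrative results: everything passes except one business rule.
results = {
    "rowCounts": {"tables": [{"name": "orders", "source": 10,
                              "target": 10, "match": True}]},
    "samples": {"checked": 5, "matched": 5, "discrepancies": []},
    "referentialIntegrity": {"valid": True, "violations": []},
    "businessRules": {"passed": 4, "failed": 1, "failures": [{"rule": "x"}]},
}
print(summarize(results))  # {'status': 'warning', 'score': 75}
```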

Integration with Migration Processes

  • database-schema-migration: Post-migration validation
  • cloud-migration: Data validation

Related Skills

  • schema-comparator: Pre-migration comparison
  • etl-pipeline-builder: Migration execution

Related Agents

  • data-integrity-validator: Orchestrates validation
  • database-migration-orchestrator: Uses for verification

Source

git clone https://github.com/a5c-ai/babysitter.git

The skill definition lives at plugins/babysitter/skills/babysit/process/specializations/code-migration-modernization/skills/data-migration-validator/SKILL.md in that repository.

Overview

Data Migration Validator ensures data integrity throughout the migration by performing row-count validation, checksum verification, sample data comparisons, referential integrity checks, and business rule validation. It produces reconciliation reports and an auditable output schema to document discrepancies and sign-off readiness.

How This Skill Works

The tool compares source and target data per table, calculates row counts and table checksums, performs random or statistical sampling for field-by-field comparisons, validates foreign keys and relationships, and applies custom business rules. Results are compiled into a standardized reconciliation output schema for audit and sign-off.

When to Use It

  • Before migration to establish baseline row counts, checksums, and validation rules per table.
  • During migration to detect drift and discrepancies in near real-time.
  • After migration to certify parity between source and target data across tables and partitions.
  • When validating complex transformations and business rules to ensure calculated fields and constraints are correct.
  • For regulatory audits and sign-off requiring a comprehensive reconciliation report.

Quick Start

  1. Enumerate target tables and establish baseline row counts, checksums, and business rules.
  2. Run the data-migration-validator workflow to compute row counts, checksums, sample data, referential integrity, and rule validation; generate the output schema.
  3. Review the reconciliation report, investigate any discrepancies, and obtain sign-off.

Best Practices

  • Define per-table baselines (row counts and checksums) before starting migration.
  • Use random or statistically significant sampling for data comparisons and track sample size.
  • Run checksum verification with consistent hashing and seeding to detect drift reliably.
  • Validate referential integrity (foreign keys and orphan checks) during the migration window.
  • Automate reconciliation reporting and establish a formal sign-off workflow.

Example Use Cases

  • E-commerce order data migration: compare per-table row counts and run checksum verification; sample orders to verify field values and transformations.
  • Product catalog migration: validate referential integrity between products and categories; ensure no orphaned reference IDs.
  • Customer data transformation: apply business rules on derived fields (e.g., discount eligibility) and verify results match expectations.
  • Cloud-based data warehouse migration: leverage Great Expectations and dbt tests for end-to-end validation with automated reports.
  • ETL pipeline migration: perform regression checks with historical checksums to detect data drift after pipeline changes.
