
characterization-test-generator

npx machina-cli add skill a5c-ai/babysitter/characterization-test-generator --openclaw

Characterization Test Generator Skill

Generates characterization tests (also known as golden master tests or approval tests) to capture existing system behavior before migration, ensuring functional equivalence after changes.

Purpose

Enable behavior preservation during migration through:

  • Golden master test creation
  • Approval test generation
  • Edge case discovery
  • Input/output recording
  • Behavior snapshot capture
  • Regression baseline establishment

Capabilities

1. Golden Master Test Creation

  • Capture current system outputs for given inputs
  • Store outputs as reference "golden" files
  • Generate comparison test infrastructure (see the sketch after this list)
  • Support multiple output formats
  • Enable baseline versioning
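
To make the comparison mechanics concrete, here is a minimal TypeScript sketch of a capture-and-compare helper of the kind the skill scaffolds. The helper name and directory layout are illustrative assumptions, not the skill's actual output.

// Minimal golden-master helper (sketch; paths and names are assumptions)
import * as fs from 'fs';
import * as path from 'path';

const GOLDEN_DIR = './tests/golden-masters';

function verifyAgainstGoldenMaster(name: string, actual: unknown): void {
  const goldenPath = path.join(GOLDEN_DIR, `${name}.golden.json`);
  const serialized = JSON.stringify(actual, null, 2);

  if (!fs.existsSync(goldenPath)) {
    // First run: record current behavior as the baseline.
    fs.mkdirSync(GOLDEN_DIR, { recursive: true });
    fs.writeFileSync(goldenPath, serialized);
    return;
  }

  // Later runs: any divergence from the baseline is a behavior change.
  const golden = fs.readFileSync(goldenPath, 'utf8');
  if (golden !== serialized) {
    throw new Error(`Golden master mismatch for "${name}"`);
  }
}

The first run records the baseline; every later run fails on any divergence, which is exactly the regression signal a migration needs.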

2. Approval Test Generation

  • Create human-reviewable test outputs
  • Generate diff-based comparison tests
  • Support approval workflows
  • Enable incremental approval
  • Track approval history

3. Edge Case Discovery

  • Analyze code paths for boundary conditions
  • Generate boundary value inputs (sketched after this list)
  • Identify error handling scenarios
  • Map exception paths
  • Create comprehensive test coverage
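
As a rough illustration of boundary-value generation, the sketch below enumerates candidate inputs around a numeric parameter's declared range. A real generator would derive the bounds from types and code analysis; the interface and values here are assumptions.

// Boundary-value candidates for one numeric parameter (sketch)
interface NumericParam {
  name: string;
  min: number;
  max: number;
}

function boundaryValues({ min, max }: NumericParam): number[] {
  return [
    min - 1, // just below the valid range (error path)
    min,     // lower boundary
    min + 1, // just inside the range
    0,       // zero, when it is meaningful for the domain
    max - 1, // just inside the upper boundary
    max,     // upper boundary
    max + 1, // just above the valid range (error path)
  ].filter((v, i, arr) => arr.indexOf(v) === i); // remove duplicates
}

// e.g. boundaryValues({ name: 'purchaseAmount', min: 0, max: 999999.99 })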

4. Input/Output Recording

  • Instrument code for I/O capture (see the sketch after this list)
  • Record API request/response pairs
  • Capture database interactions
  • Log external service calls
  • Store timing and sequence data
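
A minimal sketch of call recording follows, assuming a generic async function is being wrapped. Real instrumentation would hook concrete HTTP or database clients; the JSONL log path is an assumption.

// Record each call's inputs, output, and timing as one JSON line (sketch)
import * as fs from 'fs';

type AsyncFn<A extends unknown[], R> = (...args: A) => Promise<R>;

function recordCalls<A extends unknown[], R>(
  label: string,
  fn: AsyncFn<A, R>,
  logPath = './tests/recordings/calls.jsonl'
): AsyncFn<A, R> {
  return async (...args: A): Promise<R> => {
    const started = Date.now();
    const result = await fn(...args);
    // One line per call preserves both timing and sequence data.
    fs.appendFileSync(
      logPath,
      JSON.stringify({ label, args, result, durationMs: Date.now() - started }) + '\n'
    );
    return result;
  };
}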

5. Behavior Snapshot

  • Capture state at key points
  • Record side effects
  • Document non-deterministic behavior
  • Identify environmental dependencies
  • Map configuration impacts

6. Regression Baseline Establishment

  • Create baseline test suites
  • Define acceptance thresholds
  • Configure tolerance levels
  • Set up CI/CD integration
  • Generate coverage reports

Tool Integrations

This skill can leverage the following external tools when available:

Tool              Purpose                        Integration Method
ApprovalTests     Approval testing framework     Library
Jest snapshots    JavaScript snapshot testing    Framework
pytest-snapshot   Python snapshot testing        Plugin
TextTest          Golden master testing          CLI
Verify            .NET approval testing          Library
Scientist         Safe refactoring library       Library
AI Testing MCP    AI-powered test generation     MCP Server

Usage

Basic Test Generation

# Invoke skill for characterization test generation
# The skill will analyze code and generate tests

# Expected inputs:
# - targetPath: Path to code to characterize
# - testFramework: 'jest' | 'pytest' | 'junit' | 'nunit' | 'auto'
# - outputDir: Directory for generated tests
# - captureMode: 'snapshot' | 'approval' | 'recording'

Generation Workflow

  1. Analysis Phase

    • Identify testable units (functions, methods, endpoints)
    • Map input parameters and types
    • Detect output formats
    • Find external dependencies
  2. Input Discovery Phase

    • Extract existing test inputs
    • Generate boundary values
    • Identify representative cases
    • Create combinatorial inputs
  3. Capture Phase

    • Execute code with inputs
    • Record outputs and side effects
    • Capture state changes
    • Log external interactions
  4. Test Generation Phase

    • Generate test code structure
    • Create golden master files
    • Set up comparison logic
    • Configure CI integration (a sketch tying the four phases together follows)
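
The skeleton below ties the four phases together in self-contained TypeScript. Every function in it is a stand-in assumption: the real skill derives units and inputs through static analysis rather than the hard-coded stubs shown.

// Four-phase generation skeleton (all names and stubs are assumptions)
import * as fs from 'fs';

interface Unit { name: string; run: (input: unknown) => unknown; }

function analyzeTestableUnits(): Unit[] {
  // 1. Analysis: identify functions/methods/endpoints to characterize.
  return [{ name: 'calculateDiscount', run: (input) => input }];
}

function discoverInputs(_unit: Unit): unknown[] {
  // 2. Input discovery: existing test inputs plus generated boundaries.
  return [{ purchaseAmount: 0 }, { purchaseAmount: 100 }];
}

function emitTestFile(unit: Unit, observations: unknown[]): void {
  // 4. Test generation: persist observations as the unit's golden file.
  fs.mkdirSync('./tests/characterization', { recursive: true });
  fs.writeFileSync(
    `./tests/characterization/${unit.name}.golden.json`,
    JSON.stringify(observations, null, 2)
  );
}

for (const unit of analyzeTestableUnits()) {
  const observations = discoverInputs(unit).map((input) => ({
    input,
    output: unit.run(input), // 3. Capture: execute and record the output
  }));
  emitTestFile(unit, observations);
}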

Output Schema

{
  "generationId": "string",
  "timestamp": "ISO8601",
  "target": {
    "path": "string",
    "language": "string",
    "framework": "string",
    "unitsAnalyzed": "number"
  },
  "testsGenerated": {
    "total": "number",
    "byType": {
      "snapshot": "number",
      "approval": "number",
      "recording": "number"
    },
    "coverage": {
      "functions": "number",
      "branches": "number",
      "lines": "number"
    }
  },
  "characterizations": [
    {
      "unit": "string",
      "type": "function|method|endpoint|class",
      "inputs": [
        {
          "name": "string",
          "type": "string",
          "values": ["any"],
          "source": "existing|generated|boundary"
        }
      ],
      "outputs": {
        "type": "string",
        "goldenMasterPath": "string",
        "checksum": "string"
      },
      "testFile": "string",
      "baselineApproved": "boolean"
    }
  ],
  "edgeCases": [
    {
      "unit": "string",
      "case": "string",
      "input": "any",
      "expectedBehavior": "string",
      "covered": "boolean"
    }
  ],
  "dependencies": {
    "external": ["string"],
    "mocked": ["string"],
    "recorded": ["string"]
  },
  "artifacts": {
    "testSuite": "string",
    "goldenMasters": "string",
    "recordings": "string",
    "coverageReport": "string"
  }
}
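
For tooling that consumes this output, the top-level shape can also be written as TypeScript types. The partial typing below copies field names directly from the schema above and omits the characterizations, edgeCases, dependencies, and artifacts sections for brevity.

// Partial TypeScript typing of the output schema (subset only)
interface GenerationResult {
  generationId: string;
  timestamp: string; // ISO8601
  target: {
    path: string;
    language: string;
    framework: string;
    unitsAnalyzed: number;
  };
  testsGenerated: {
    total: number;
    byType: { snapshot: number; approval: number; recording: number };
    coverage: { functions: number; branches: number; lines: number };
  };
}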

Integration with Migration Processes

This skill integrates with the following Code Migration/Modernization processes:

  • migration-testing-strategy: Primary tool for test baseline creation
  • code-refactoring: Ensure behavior preservation during refactoring
  • framework-upgrade: Verify functionality after upgrade
  • monolith-to-microservices: Validate service extraction

Configuration

Skill Configuration File

Create .characterization-tests.json in the project root:

{
  "testFramework": "auto",
  "outputDir": "./tests/characterization",
  "captureMode": "snapshot",
  "goldenMasterDir": "./tests/golden-masters",
  "inputGeneration": {
    "boundary": true,
    "combinatorial": true,
    "maxCombinations": 100,
    "includeNulls": true,
    "includeEmpty": true
  },
  "outputCapture": {
    "format": "json",
    "normalizeWhitespace": true,
    "ignorePaths": ["$.timestamp", "$.requestId"],
    "tolerance": {
      "numeric": 0.001,
      "dateTime": "1s"
    }
  },
  "dependencies": {
    "mockExternal": true,
    "recordMode": false,
    "replayMode": true
  },
  "approval": {
    "requireApproval": true,
    "approvalDir": "./tests/approvals",
    "reportFormat": "markdown"
  },
  "ci": {
    "failOnNewTests": false,
    "updateGoldenOnPass": false,
    "generateCoverageReport": true
  }
}
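
As a sketch of how the outputCapture settings might be applied at comparison time, the TypeScript below drops ignored keys before diffing and compares numbers within the configured tolerance. The JSONPath entries in ignorePaths are simplified to plain key names here.

// Normalize a captured output before comparing it to its golden master (sketch)
function normalize(value: unknown, ignoreKeys: string[]): unknown {
  if (Array.isArray(value)) return value.map((v) => normalize(v, ignoreKeys));
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>)
        .filter(([key]) => !ignoreKeys.includes(key))
        .map(([key, v]) => [key, normalize(v, ignoreKeys)])
    );
  }
  return value;
}

function numbersMatch(actual: number, golden: number, tolerance = 0.001): boolean {
  // Mirrors "tolerance.numeric" from the configuration above.
  return Math.abs(actual - golden) <= tolerance;
}

// e.g. normalize(output, ['timestamp', 'requestId']) before serializing
// and diffing against the stored golden master.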

MCP Server Integration

When AI Testing MCP Server is available:

// Example AI-powered test generation
{
  "tool": "ai_testing_generate",
  "arguments": {
    "target": "./src/services/user.ts",
    "framework": "jest",
    "style": "characterization"
  }
}

Test Patterns

Snapshot Testing (Jest)

// Generated characterization test
// Assumes a userService instance exported from the module under test
// (import path is illustrative).
import { userService } from '../src/services/user';

describe('UserService', () => {
  describe('calculateDiscount', () => {
    it('should match snapshot for standard customer', () => {
      const result = userService.calculateDiscount({
        customerId: 'C001',
        purchaseAmount: 100,
        loyaltyPoints: 500
      });
      expect(result).toMatchSnapshot();
    });

    it('should match snapshot for premium customer', () => {
      const result = userService.calculateDiscount({
        customerId: 'C002',
        purchaseAmount: 100,
        loyaltyPoints: 5000,
        isPremium: true
      });
      expect(result).toMatchSnapshot();
    });

    // Edge cases
    it('should match snapshot for zero amount', () => {
      const result = userService.calculateDiscount({
        customerId: 'C001',
        purchaseAmount: 0,
        loyaltyPoints: 0
      });
      expect(result).toMatchSnapshot();
    });
  });
});
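
When a characterized result contains volatile fields such as generated IDs or timestamps, Jest's property matchers can be passed to toMatchSnapshot so the stored snapshot stays stable. The field names below (calculatedAt, requestId) are assumptions about the result shape.

// Stabilizing a snapshot against non-deterministic fields (field names assumed)
it('should match snapshot while ignoring volatile fields', () => {
  const result = userService.calculateDiscount({
    customerId: 'C001',
    purchaseAmount: 100,
    loyaltyPoints: 500
  });
  expect(result).toMatchSnapshot({
    calculatedAt: expect.any(String), // timestamp varies per run
    requestId: expect.any(String)     // generated ID varies per run
  });
});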

Approval Testing (ApprovalTests)

// Generated approval test (imports shown for completeness; domain classes
// such as UserService come from the code under test)
import org.approvaltests.Approvals;
import org.junit.jupiter.api.Test;

public class UserServiceCharacterizationTest {

    @Test
    public void calculateDiscount_standardCustomer() {
        UserService service = new UserService();
        DiscountResult result = service.calculateDiscount(
            new DiscountRequest("C001", 100.0, 500)
        );
        Approvals.verify(result);
    }

    // Each boundary gets its own test method so each approval file has a
    // distinct, automatically derived name.
    @Test
    public void calculateDiscount_minimumBoundary() {
        UserService service = new UserService();
        Approvals.verify(service.calculateDiscount(
            new DiscountRequest("C001", 0.01, 0)
        ));
    }

    @Test
    public void calculateDiscount_maximumBoundary() {
        UserService service = new UserService();
        Approvals.verify(service.calculateDiscount(
            new DiscountRequest("C001", 999999.99, 999999)
        ));
    }
}

Recording Tests (Python)

# Generated recording-based test
import pytest
from tests.recordings import PlaybackRecorder
from src.services.user_service import UserService  # import path is illustrative

class TestUserServiceCharacterization:

    @pytest.fixture
    def recorder(self):
        return PlaybackRecorder('tests/recordings/user_service')

    def test_get_user_profile_recorded(self, recorder):
        """Replay recorded external API interactions"""
        with recorder.playback('get_user_profile_c001'):
            service = UserService()
            result = service.get_user_profile('C001')

            assert result == recorder.expected_output()

    def test_update_user_settings_recorded(self, recorder):
        """Verify database interactions match recording"""
        with recorder.playback('update_settings_c001'):
            service = UserService()
            result = service.update_settings('C001', {'theme': 'dark'})

            recorder.verify_database_calls()
            assert result == recorder.expected_output()

Edge Case Categories

Boundary Values

  • Minimum/maximum values
  • Zero values
  • Empty strings/arrays
  • Null/undefined values
  • Type boundaries (INT_MAX, etc.)

Error Conditions

  • Invalid inputs
  • Missing required fields
  • Type mismatches
  • Constraint violations
  • Authentication failures

Concurrent Scenarios

  • Race conditions
  • Deadlocks
  • Timeout scenarios
  • Partial failures
  • Retry behaviors

Environmental

  • Configuration variations
  • Timezone differences
  • Locale changes
  • Feature flag states
  • Permission levels

Best Practices

  1. Start Before Changes: Generate characterization tests before any migration work
  2. Capture Everything: Record all outputs, side effects, and state changes
  3. Version Golden Masters: Store golden masters in version control
  4. Review Approvals: Human review of initial golden masters is essential
  5. Handle Non-Determinism: Normalize timestamps, IDs, and random values
  6. Incremental Updates: Update baselines incrementally as changes are approved
  7. CI Integration: Fail builds on unexpected behavior changes

Related Skills

  • test-coverage-analyzer: Analyze coverage gaps
  • migration-validator: Validate migration results
  • static-code-analyzer: Identify testable code paths

Related Agents

  • migration-testing-strategist: Uses this skill for test strategy
  • regression-detector: Uses this skill for regression detection
  • parallel-run-validator: Uses this skill for comparison testing

References

Source

Repository: https://github.com/a5c-ai/babysitter
Skill file: plugins/babysitter/skills/babysit/process/specializations/code-migration-modernization/skills/characterization-test-generator/SKILL.md

Overview

The characterization-test-generator skill creates golden master (approval) tests that capture current behavior before migration, ensuring functional equivalence after changes. It supports edge-case discovery, input/output recording, and regression-baseline establishment to guide safe refactoring across codebases.

How This Skill Works

The skill analyzes code to identify testable units and their inputs and outputs, then executes those units with the discovered inputs to generate test scaffolds and golden files. It integrates with popular testing frameworks and utilities (e.g., Jest snapshots, pytest-snapshot, ApprovalTests) to enable diff-based comparisons and approval workflows across environments.

When to Use It

  • Before migrating a legacy module or service to a new tech stack to preserve existing behavior.
  • When establishing regression baselines for critical APIs or business logic.
  • While recording and preserving external I/O (API calls, DB interactions, service calls) for future reference.
  • During edge-case discovery to surface boundary conditions, error paths, and exception handling.
  • To enable CI/CD integration with golden master approvals and test-coverage reporting.

Quick Start

  1. Provide inputs to the skill such as targetPath, testFramework, outputDir, and captureMode.
  2. Run the generation to analyze the code, discover inputs and outputs, and emit tests and golden files.
  3. Review the generated tests, commit golden masters, and integrate into CI/CD with diff-based approvals.

Best Practices

  • Start with well-scoped, stable units to reduce churn in goldens.
  • Capture representative inputs and boundary values, including edge cases.
  • Version golden master files and review changes with stakeholders before merging.
  • Aim for deterministic tests by controlling environment and non-deterministic factors.
  • Integrate generated tests into CI so regressions are caught automatically.

Example Use Cases

  • Migrating a legacy REST API to a microservice while preserving behavior with golden masters.
  • Creating approval tests for a financial calculation module to prevent drift.
  • Capturing API request/response pairs and side effects from a third-party payment gateway.
  • Establishing regression baselines for a database schema migration.
  • Documenting non-deterministic behavior and environmental dependencies during refactoring.
