What is the RED phase?

The RED phase requires writing tests that define the expected behavior and confirming that they fail (exit code 1). Run the tests in CI mode (CI=true or --run, never watch mode) with a 60-second timeout guard, and record the exit code as evidence of the failure.

How is the 80% coverage threshold enforced?

Coverage is measured across statements, branches, functions, and lines. The minimum overall threshold is 80%. If it isn’t met, you must write additional tests for the gaps and measure again, for a maximum of three convergence iterations.

What happens if tests pass but coverage is below 80%?

You must extend tests to cover gaps until the 80% threshold is reached. Do not proceed to subsequent phases until the coverage gate is satisfied.

tdd-enforcement

npx machina-cli add skill a5c-ai/babysitter/tdd-enforcement --openclaw

TDD Enforcement

Overview

Strict test-driven development enforcement adapted from the Everything Claude Code methodology. Mandates the Red-Green-Refactor cycle with evidence-based verification at each phase.

TDD Process

1. RED Phase - Write Failing Tests

Write tests that define expected behavior
Tests MUST fail (exit code 1)
Use CI=true or --run flag, never watch mode
Apply timeout guards (60s) to prevent hanging
Record exit code as evidence
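
The RED checklist above can be sketched as a small runner. This is a minimal illustration, not part of the skill itself: the `Evidence` record, `run_tests`, and `red_phase_ok` are hypothetical names, and the sketch assumes the project's test runner honors `CI=true` as a signal to run once instead of watching.

```python
import os
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Evidence:
    """One recorded test run (hypothetical record shape)."""
    phase: str
    command: List[str]
    exit_code: Optional[int]  # None means the run hit the timeout guard
    timed_out: bool


def run_tests(phase: str, command: List[str], timeout_s: float = 60.0) -> Evidence:
    """Run a test command once, with CI=true set, and record how it exited."""
    # Assumption: the test runner treats CI=true as "run once, no watch mode".
    env = {**os.environ, "CI": "true"}
    try:
        result = subprocess.run(command, env=env, timeout=timeout_s)
        return Evidence(phase, command, result.returncode, timed_out=False)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing keeps hanging.
        return Evidence(phase, command, None, timed_out=True)


def red_phase_ok(evidence: Evidence) -> bool:
    """RED is satisfied only by a completed run that failed with exit code 1."""
    return not evidence.timed_out and evidence.exit_code == 1


if __name__ == "__main__":
    # Stand-in for a real test command: a child process that exits with code 1.
    evidence = run_tests("RED", [sys.executable, "-c", "raise SystemExit(1)"])
    print(evidence.exit_code, red_phase_ok(evidence))
```

In a real project the command list would be the project's own test invocation. A timed-out run is kept distinct from a failing run here so that a hang is never mistaken for RED evidence.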

2. GREEN Phase - Minimal Implementation

Write the minimal code to make tests pass
Do NOT add features not covered by tests
Do NOT optimize prematurely
Tests MUST pass (exit code 0)
Record exit code as evidence

3. REFACTOR Phase - Quality Improvement

Apply SOLID principles and clean code patterns
Improve naming, reduce coupling
Remove duplication
Run tests after EACH refactoring step
Tests MUST remain passing (exit code 0)
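
One way to picture "run tests after EACH refactoring step" is a loop that applies one step, checks the exit code, and stops at the first step that breaks the suite. The sketch below is illustrative only: `refactor_with_checks` and the `(name, callable)` step shape are assumptions, and `run_tests` stands for any callable that returns the test run's exit code.

```python
from typing import Callable, List, Optional, Tuple

# A refactoring step is a (name, action) pair; both are hypothetical here.
Step = Tuple[str, Callable[[], None]]


def refactor_with_checks(
    steps: List[Step], run_tests: Callable[[], int]
) -> Tuple[List[str], Optional[str]]:
    """Apply refactoring steps one at a time, running the tests after each.

    Returns the names of the steps that kept the suite passing, plus the
    name of the first step that broke it (None if every step passed).
    """
    passed: List[str] = []
    for name, apply_step in steps:
        apply_step()
        if run_tests() != 0:  # tests must remain passing (exit code 0)
            return passed, name
        passed.append(name)
    return passed, None
```

Stopping at the first failing step keeps the breaking change small and easy to locate. What to do next (revert the step or fix it) is left to the caller.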

4. Coverage Gate

Measure coverage: statements, branches, functions, lines
Minimum 80% overall coverage required
Iterate: write additional tests for gaps until threshold met
Maximum 3 convergence iterations
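
The gate can be sketched as a bounded loop. Two points are assumptions, since the text does not pin them down: "80% overall" is read here as each of the four metrics reaching 80%, and one "convergence iteration" is read as one round of adding tests followed by a new measurement. The function names and the `measure` / `add_tests` callables are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

METRICS = ("statements", "branches", "functions", "lines")


def coverage_gaps(coverage: Dict[str, float], threshold: float = 80.0) -> Dict[str, float]:
    """Map each metric that is below the threshold to its shortfall in points."""
    return {
        metric: round(threshold - coverage[metric], 2)
        for metric in METRICS
        if coverage[metric] < threshold
    }


def coverage_gate(
    measure: Callable[[], Dict[str, float]],
    add_tests: Callable[[Dict[str, float]], None],
    threshold: float = 80.0,
    max_iterations: int = 3,
) -> Tuple[bool, List[Dict[str, float]]]:
    """Measure coverage, then add tests and re-measure at most max_iterations times.

    Returns (gate_passed, history), where history holds every measurement
    so the coverage numbers can be kept as evidence.
    """
    history = [measure()]
    for _ in range(max_iterations):
        gaps = coverage_gaps(history[-1], threshold)
        if not gaps:
            return True, history
        add_tests(gaps)  # write additional tests aimed at the reported gaps
        history.append(measure())
    return not coverage_gaps(history[-1], threshold), history
```

Returning the full measurement history, not just the final verdict, lets the coverage numbers from every iteration be recorded alongside the exit codes.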

Rules

Never skip the RED phase
Never accept GREEN without exit code 0
Never use watch mode in CI
Always record evidence (exit codes, coverage numbers)
Enforce 80% coverage threshold
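
These rules lend themselves to an audit over the recorded evidence. The sketch below assumes a hypothetical log format (a list of dicts with `phase`, `exit_code`, `percent`, and an optional `watch_mode` flag). None of these field names come from the skill.

```python
from typing import Any, Dict, List


def check_rules(log: List[Dict[str, Any]], threshold: float = 80.0) -> List[str]:
    """Return a list of rule violations found in an evidence log (empty if none)."""
    violations: List[str] = []
    by_phase = {record["phase"]: record for record in log}  # last record per phase wins

    red = by_phase.get("RED")
    if red is None:
        violations.append("RED phase skipped: no evidence recorded")
    elif red.get("exit_code") != 1:
        violations.append("RED run did not fail with exit code 1")

    green = by_phase.get("GREEN")
    if green is None or green.get("exit_code") != 0:
        violations.append("GREEN not accepted: no run with exit code 0")

    coverage = by_phase.get("COVERAGE")
    if coverage is None or coverage.get("percent") is None:
        violations.append("coverage number not recorded")
    elif coverage["percent"] < threshold:
        violations.append("coverage below threshold")

    if any(record.get("watch_mode") for record in log):
        violations.append("watch mode used")

    return violations
```

An empty result means the log satisfies every rule that the sketch checks. A reviewer could run this kind of check before accepting the work.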

When to Use

All code implementation tasks
Feature development
Bug fixes (write regression test first)

Agents Used

tdd-guide (primary consumer)
code-reviewer (validates test quality)

Source

https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/methodologies/everything-claude-code/skills/tdd-enforcement/SKILL.md

Overview

Strict TDD enforcement inspired by the Everything Claude Code methodology. It mandates the Red-Green-Refactor cycle with evidence-based verification at each phase, including exit codes and coverage metrics.

How This Skill Works

The skill runs three phases in order. RED: write failing tests and record exit code 1. GREEN: implement the minimal code that makes the tests pass and record exit code 0. REFACTOR: improve code quality by applying SOLID principles and reducing coupling, re-running the tests after each change. A Coverage Gate then enforces a minimum of 80% overall test coverage, with up to three iterations of adding tests to reach it.
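
The ordering of the phases can be pictured as a small state machine that only advances when the current phase's gate is met. This is a sketch under assumptions: the `Phase` enum and `next_phase` are hypothetical names, and the coverage figure is treated as a single overall percentage.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    RED = "red"
    GREEN = "green"
    REFACTOR = "refactor"
    COVERAGE = "coverage"
    DONE = "done"


ORDER = [Phase.RED, Phase.GREEN, Phase.REFACTOR, Phase.COVERAGE, Phase.DONE]

# Exit code each test-running phase must record before the cycle may advance.
REQUIRED_EXIT = {Phase.RED: 1, Phase.GREEN: 0, Phase.REFACTOR: 0}


def next_phase(
    current: Phase, exit_code: int, coverage_pct: Optional[float] = None
) -> Phase:
    """Return the next phase if the current gate is satisfied, else stay put."""
    if current is Phase.DONE:
        return Phase.DONE
    if current is Phase.COVERAGE:
        gate_met = exit_code == 0 and coverage_pct is not None and coverage_pct >= 80.0
    else:
        gate_met = exit_code == REQUIRED_EXIT[current]
    return ORDER[ORDER.index(current) + 1] if gate_met else current
```

For example, `next_phase(Phase.RED, 0)` stays in RED, because a passing run is not valid RED evidence.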

When to Use It

All code implementation tasks
Feature development
Bug fixes (write regression test first)
CI pipelines requiring mandatory failing tests and evidence
Projects enforcing a measurable 80% test coverage gate

Quick Start

Step 1: RED Phase – write failing tests that define expected behavior; confirm the test run exits with code 1; use CI=true or --run; apply a 60s timeout; record the exit code as evidence
Step 2: GREEN Phase – implement the minimal code to make the tests pass; confirm the test run exits with code 0; do not add features outside the scope of the tests; record the exit code as evidence
Step 3: REFACTOR + Coverage – apply SOLID principles and clean code patterns; improve naming and reduce duplication; run the tests after each refactoring step; verify the exit code stays 0; measure coverage and iterate up to 3 times until 80% is reached

Best Practices

Never skip the RED phase
Always run tests with CI=true or --run; never use watch mode in CI
Record exit codes and coverage numbers as evidence after each phase
In GREEN, implement only what tests cover; no extra features
In REFACTOR, apply SOLID principles, rename for clarity, reduce coupling, and remove duplication; run tests after every refactor

Example Use Cases

A new feature is implemented by first adding failing tests (RED), then writing the minimal code to pass (GREEN), followed by refactoring for clarity and maintainability (REFACTOR) while verifying 80%+ coverage
A bug fix starts with a regression test that captures the failure; the fix then makes the test pass, and a refactoring step reduces duplication
A codebase enforces an 80% coverage gate in CI, preventing merges until tests pass and coverage meets the threshold
A refactor that improves naming and reduces coupling is followed by additional tests to cover edge cases, keeping the exit code at 0 after every step
A SOLID-compliant refactoring is applied once the tests pass, then validated by repeated test runs and coverage metrics
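
The bug-fix use case can be made concrete with a small, invented example. Everything in it is hypothetical: the `parse_price` functions, the bug (a thousands separator is rejected), and the `run_regression` helper, which reports the result as an exit code in the RED/GREEN style used throughout.

```python
import io
import unittest
from typing import Callable


def parse_price_buggy(text: str) -> float:
    """Original code: raises ValueError on input such as "1,299.50"."""
    return float(text)


def parse_price_fixed(text: str) -> float:
    """Minimal fix: strip thousands separators before converting."""
    return float(text.replace(",", ""))


def run_regression(parse_price: Callable[[str], float]) -> int:
    """Run the regression test against one implementation; return an exit code."""

    class ThousandsSeparatorRegression(unittest.TestCase):
        def test_accepts_thousands_separator(self) -> None:
            self.assertEqual(parse_price("1,299.50"), 1299.50)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ThousandsSeparatorRegression)
    # Send the runner's report to a buffer; only the pass/fail outcome matters here.
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return 0 if result.wasSuccessful() else 1
```

Running `run_regression(parse_price_buggy)` first gives the failing (RED) evidence. Running it again with `parse_price_fixed` gives the passing (GREEN) evidence for the same test.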
