vibe-golden-file-testing
npx machina-cli add skill ash1794/vibe-engineering/golden-file-testing --openclaw
Golden file tests are powerful but brittle. This skill makes them robust.
When to Use This Skill
- Implementing tests that compare output against saved expected output
- Tests that currently break because dates, timestamps, or IDs change
- API response testing with dynamic fields
- CLI output testing
When NOT to Use This Skill
- Simple unit tests with static assertions
- Tests where the exact output IS the requirement (byte-for-byte)
- Performance benchmarks
The Problem
Golden tests break when they contain:
- Dates/timestamps (2026-02-28 → different tomorrow)
- UUIDs/IDs (random each run)
- Hostnames/ports (different per environment)
- File paths (absolute paths differ per machine)
- Durations (took 1.23s → varies by machine)
Steps
- Identify dynamic fields in the output being tested.
- Create a normalizer function:

      // dateRegex, uuidRegex, tsRegex and durationRegex are package-level
      // *regexp.Regexp values compiled with regexp.MustCompile.
      func normalizeOutput(s string) string {
          // Dates: 2026-02-28 → REDACTED_DATE
          s = dateRegex.ReplaceAllString(s, "REDACTED_DATE")
          // UUIDs: 550e8400-... → REDACTED_UUID
          s = uuidRegex.ReplaceAllString(s, "REDACTED_UUID")
          // Timestamps: 1709136000 → REDACTED_TS
          s = tsRegex.ReplaceAllString(s, "REDACTED_TS")
          // Durations: 1.23s → REDACTED_DURATION
          s = durationRegex.ReplaceAllString(s, "REDACTED_DURATION")
          return s
      }

- Apply normalization to BOTH:
  - The actual output (at test time)
  - The golden file (at generation time)
- Generate the golden file with an update flag:

      // normalized is the string returned by normalizeOutput;
      // t is the enclosing *testing.T.
      if os.Getenv("UPDATE_GOLDEN") == "1" {
          if err := os.WriteFile(goldenPath, []byte(normalized), 0644); err != nil {
              t.Fatalf("update golden file: %v", err)
          }
      }

- Document the update command:

      # To update golden files:
      UPDATE_GOLDEN=1 go test ./...
Common Normalizations
| Pattern | Regex | Replacement |
|---|---|---|
| ISO Date | \d{4}-\d{2}-\d{2} | REDACTED_DATE |
| ISO DateTime | \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} | REDACTED_DATETIME |
| UUID | [0-9a-f]{8}-[0-9a-f]{4}-... | REDACTED_UUID |
| Unix timestamp | \b1[6-9]\d{8}\b | REDACTED_TS |
| Duration | \d+\.?\d*[µnm]?s | REDACTED_DURATION |
| Absolute path | /home/\w+/ or C:\\Users\\ | REDACTED_PATH |
| Port | :\d{4,5}\b | :REDACTED_PORT |
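One way to wire the table into code is an ordered rule list, sketched below. The `rule` type and `applyRules` function are hypothetical names. The UUID pattern in the table is truncated, so the full 8-4-4-4-12 hexadecimal form used here is an assumption, and the absolute-path row is left out. Order matters: the DateTime rule has to run before the Date rule, otherwise the Date rule consumes the date part and leaves the time behind.

```go
package main

import "regexp"

// rule pairs a compiled pattern with its placeholder.
type rule struct {
	re          *regexp.Regexp
	replacement string
}

// rules are applied top to bottom, so more specific patterns come first.
var rules = []rule{
	{regexp.MustCompile(`\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}`), "REDACTED_DATETIME"},
	{regexp.MustCompile(`\d{4}-\d{2}-\d{2}`), "REDACTED_DATE"},
	// Assumed full UUID form (8-4-4-4-12 hex groups), case-insensitive.
	{regexp.MustCompile(`(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}`), "REDACTED_UUID"},
	{regexp.MustCompile(`\b1[6-9]\d{8}\b`), "REDACTED_TS"},
	{regexp.MustCompile(`\d+\.?\d*[µnm]?s`), "REDACTED_DURATION"},
	{regexp.MustCompile(`:\d{4,5}\b`), ":REDACTED_PORT"},
}

// applyRules runs every rule over s in order and returns the result.
func applyRules(s string) string {
	for _, r := range rules {
		s = r.re.ReplaceAllString(s, r.replacement)
	}
	return s
}
```

These patterns are deliberately loose. The duration pattern, for example, also matches a digit followed by `s` inside an ordinary word, so it is worth checking the normalized golden file by eye after the first update.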
Output Format
Golden File Setup: [Test Name]
Dynamic fields found: X
Normalizations applied: Y
Golden file: [path]
Update command: UPDATE_GOLDEN=1 [test command]
Source
https://github.com/ash1794/vibe-engineering/blob/master/skills/golden-file-testing/SKILL.md
Overview
Golden file tests are powerful but brittle. This skill makes them robust by normalizing dynamic fields (dates, IDs, hostnames, etc.) before comparisons. It guides you through identifying dynamic fields, implementing a normalizer, and keeping golden snapshots in sync.
How This Skill Works
You identify dynamic fields in the test output, implement a normalizeOutput function that replaces them with placeholders, and apply this normalization to both the actual output and the golden file. Then run tests with an update mechanism (UPDATE_GOLDEN) to regenerate the golden snapshot as needed.
When to Use It
- Implementing tests that compare actual output against saved snapshots
- Tests that fail daily due to changing dates, timestamps, or IDs
- API response tests with dynamic fields
- CLI output tests where file paths or environment details change
- Integration or end-to-end tests that involve non-deterministic data
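For the file-path case, a literal replacement is often simpler than a regex when the test already knows the machine-specific root, such as a temp directory it created. The sketch below assumes exactly that; `normalizePaths` is a hypothetical helper, not part of the skill.

```go
package main

import "strings"

// normalizePaths replaces every occurrence of root (for example a temp
// directory or the repository checkout) with a fixed placeholder.
// Trailing separators on root are ignored so that "/tmp/x" and "/tmp/x/"
// behave the same way.
func normalizePaths(output, root string) string {
	root = strings.TrimRight(root, `/\`)
	if root == "" {
		// Nothing meaningful to replace (root was empty or just "/").
		return output
	}
	return strings.ReplaceAll(output, root, "REDACTED_PATH")
}
```

In a Go test the root could come from `t.TempDir()`, which makes the placeholder independent of the user name and operating system.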
Quick Start
- Step 1: Identify dynamic fields in the test output
- Step 2: Implement normalizeOutput to redact or replace dynamic values, and apply it to both actual output and golden files
- Step 3: Run tests with UPDATE_GOLDEN=1 to refresh snapshots when appropriate
Best Practices
- Identify all dynamic fields early in the test output (dates, UUIDs, hostnames, etc.)
- Create a robust normalizeOutput function with regex-based replacements for common patterns
- Apply normalization to both the actual test output and the golden file
- Use an explicit update flag (UPDATE_GOLDEN) to refresh snapshots on purpose
- Document the update command and keep golden files readable and maintainable
Example Use Cases
- API tests where IDs and timestamps vary between runs, requiring redaction before snapshot comparison
- CLI tests producing absolute file paths that differ across machines
- Tests that report durations or latency that can drift with load or environment
- End-to-end tests with random tokens in JSON payloads
- CI pipelines that refresh golden files with UPDATE_GOLDEN=1 after validating changes