What if multiple verification options exist?

Choose the most specific command that covers the changed code; if uncertain, run the broadest permissible command (full test suite > single file).

What counts as sufficient evidence?

Fresh output showing a successful exit code and no failures, plus explicit results (e.g., 0 failures, passed tests). Do not rely on vague phrases like 'should pass' without output.

Why not rely on past results or reports?

Verification must be fresh every time you claim completion to avoid stale or misleading evidence.

verification-before-completion

Scanned

npx machina-cli add skill CodingCossack/agent-skills-library/verification-before-completion --openclaw

Files (1)

SKILL.md

3.0 KB

Verification Before Completion

NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE

Core Protocol

Evidence before claims, always. If you haven't run the verification command in this message, you cannot claim it passes.

BEFORE any completion claim:
1. IDENTIFY: What verification command proves this claim?
2. RUN: Execute the FULL command (fresh, complete)
3. READ: Full output, check exit code, count failures
4. VERIFY: Does output confirm the claim?
   - NO → State actual status with evidence
   - YES → State claim WITH evidence
5. ONLY THEN: Make the claim

Command Selection

When multiple verification options exist (mono-repo, multiple suites):

Run the most specific command that covers the changed code
When uncertain, run the broadest command (full test suite > single file)
Lint ≠ build ≠ test — each verifies different claims

Evidence Format

✅ Ran: npm test
   Exit: 0
   Result: 47 passed, 0 failed
   "All tests pass."

❌ "Tests should pass now" (no command output)

Verification Requirements by Claim Type

Claim	Required Evidence	Insufficient
Tests pass	Test output: 0 failures	Previous run, "should pass"
Linter clean	Linter output: 0 errors	Partial check, extrapolation
Build succeeds	Build exit code: 0	Linter passing
Bug fixed	Original symptom test passes	Code changed
Regression test	Red-green cycle verified	Single green
Agent completed	VCS diff shows changes	Agent "success" report
Requirements met	Line-by-line checklist	Tests passing

Red Flags — STOP

Words: "should", "probably", "seems to"
Satisfaction before verification: "Great!", "Perfect!", "Done!"
About to commit/push/PR without verification
Trusting agent success reports
Partial verification
ANY wording implying success without verification output

Rationalization Prevention

Excuse	Response
"Should work now"	Run the verification
"I'm confident"	Confidence ≠ evidence
"Just this once"	No exceptions
"Linter passed"	Linter ≠ build
"Agent said success"	Verify independently
"Partial check enough"	Partial proves nothing

Key Patterns

Tests:

✅ [Run test] → [See: 34/34 pass] → "All tests pass"
❌ "Should pass now"

Regression (TDD):

✅ Write → Run (pass) → Revert fix → Run (MUST FAIL) → Restore → Run (pass)
❌ "Wrote regression test" (no red-green)

Requirements:

✅ Re-read plan → Checklist each item → Report gaps or completion
❌ "Tests pass, phase complete"

Agent delegation:

✅ Agent reports → Check VCS diff → Verify changes → Report actual state
❌ Trust agent report

Source

git clone https://github.com/CodingCossack/agent-skills-library/blob/main/skills/verification-before-completion/SKILL.mdView on GitHub

Overview

Enforces an evidence-first workflow before declaring success. It guides you to identify the verification command, run a fresh full check, read the results, and verify the claim before reporting. This reduces false positives across tests, builds, and fixes by requiring tangible evidence.

How This Skill Works

Follow the Core Protocol: 1) IDENTIFY the verification command that proves the claim, 2) RUN the FULL command (fresh, complete), 3) READ the output and exit status, 4) VERIFY that the output confirms the claim, 5) ONLY THEN make the claim. When multiple options exist, prioritize the most specific command for changed code; if uncertain, use the broadest allowed command.

When to Use It

Before claiming tests pass
Before stating a fix is complete
Before reporting tests or builds to stakeholders
Before commits or PRs to ensure verifiable state
When auditing results across mono-repos or multiple suites

Quick Start

Step 1: IDENTIFY the verification command that proves the claim
Step 2: RUN the FULL command (fresh, complete) and capture output
Step 3: READ and VERIFY the output; ONLY THEN report the claim

Best Practices

Always specify the exact verification command used
Run a fresh, complete run rather than relying on cached results
Check exit codes and full output to confirm the claim
Use the most specific command for changed code; fallback to broadest if needed
Document the evidence output alongside the claim

Example Use Cases

✅ Ran: npm test; Exit: 0; 47 passed, 0 failed; All tests pass.
✅ Lint clean: eslint reports 0 errors.
✅ Build succeeds: exit code 0 with no warnings.
✅ Regression test passes after fix; red-green cycle completed.
✅ Agent reports; VCS diff analyzed; changes verified against claim.

Frequently Asked Questions

Add this skill to your agents