test - Testing Workflow
npx machina-cli add skill parcadei/Continuous-Claude-v3/test --openclaw
Run comprehensive test suite with parallel execution.
When to Use
- "Run all tests"
- "Test the feature"
- "Verify everything works"
- "Full test suite"
- Before releases or merges
- After major changes
Workflow Overview
┌─────────────┐     ┌───────────┐
│ diagnostics │ ──▶ │  arbiter  │ ─┐
│ (type check)│     │  (unit)   │  │
└─────────────┘     └───────────┘  │
                                   ├──▶ ┌─────────┐
                    ┌───────────┐  │    │  atlas  │
                    │  arbiter  │ ─┘    │  (e2e)  │
                    │  (integ)  │       └─────────┘
                    └───────────┘

 Pre-flight          Parallel          Sequential
 (~1 second)        fast tests         slow tests
Agent Sequence
| # | Agent | Role | Execution |
|---|---|---|---|
| 1 | arbiter | Unit tests, type checks, linting | Parallel |
| 1 | arbiter | Integration tests | Parallel |
| 2 | atlas | E2E/acceptance tests | After 1 passes |
Why This Order?
- Fast feedback: Unit tests fail fast
- Parallel efficiency: No dependency between unit and integration
- E2E gating: Only run slow E2E tests if faster tests pass
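The ordering above can be sketched with standard-library concurrency primitives. This is a minimal illustration, not the skill's implementation: `run_suite` is a hypothetical stand-in for dispatching to the real agents.

```python
from concurrent.futures import ThreadPoolExecutor

def run_suite(name):
    # Hypothetical stand-in: invoke the real test runner / agent here.
    # Returns True on success for the purposes of this sketch.
    print(f"running {name} tests")
    return True

def test_workflow():
    # Phase 1: unit and integration run in parallel -- no dependency between them.
    with ThreadPoolExecutor(max_workers=2) as pool:
        unit = pool.submit(run_suite, "unit")
        integ = pool.submit(run_suite, "integration")
        phase1_ok = unit.result() and integ.result()

    # Phase 2: slow E2E tests are gated on Phase 1 fully passing.
    if not phase1_ok:
        print("Phase 1 failed; skipping E2E")
        return False
    return run_suite("e2e")
```

The gating is just a conditional on the joined Phase 1 results: E2E never starts if either fast suite fails.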
Execution
Phase 0: Pre-flight Diagnostics (NEW)
Before running tests, check for type errors - they often cause test failures:
tldr diagnostics . --project --format text 2>/dev/null | grep "^E " | head -10
Why diagnostics first?
- Type check is instant (~1s), tests take longer
- Diagnostics show ROOT CAUSE, tests show symptoms
- "Expected int, got str" is clearer than "AttributeError at line 50"
- Catches errors in untested code paths
If errors found: Fix them BEFORE running tests. Type errors usually mean tests will fail anyway.
If clean: Proceed to Phase 1.
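The `grep "^E " | head -10` filter in the command above can be expressed as a small helper for programmatic use. The `E `-prefixed line format is assumed from the example diagnostics output shown later in this document.

```python
def first_type_errors(diagnostics_text, limit=10):
    """Return the first `limit` error lines (prefixed 'E ') from
    `tldr diagnostics` text output -- the programmatic equivalent of
    `grep "^E " | head -10`."""
    errors = [line for line in diagnostics_text.splitlines()
              if line.startswith("E ")]
    return errors[:limit]
```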
Phase 0.5: Change Impact (Optional)
For large test suites, find only affected tests:
tldr change-impact --session
# or for explicit files:
tldr change-impact src/changed_file.py
This returns which tests to run based on what changed. Skip this for small projects or when you want full coverage.
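As a rough illustration of what change impact computes, here is a naive mapping from changed files to test files. It assumes a `tests/test_<module>.py` layout, which is only one common convention; the real `tldr change-impact` analyzes actual dependencies rather than file names.

```python
from pathlib import PurePath

def affected_tests(changed_files):
    """Map changed source files to test files by naming convention
    (a simplistic stand-in for real change-impact analysis)."""
    tests = []
    for f in changed_files:
        p = PurePath(f)
        # Skip non-Python files and files that are already tests.
        if p.suffix == ".py" and not p.name.startswith("test_"):
            tests.append(f"tests/test_{p.stem}.py")
    return tests
```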
Phase 1: Parallel Tests
# Run both in parallel
Task(
subagent_type="arbiter",
prompt="""
Run unit tests for: [SCOPE]
Include:
- Unit tests
- Type checking
- Linting
Report: Pass/fail count, failure details
""",
run_in_background=true
)
Task(
subagent_type="arbiter",
prompt="""
Run integration tests for: [SCOPE]
Include:
- Integration tests
- API tests
- Database tests
Report: Pass/fail count, failure details
""",
run_in_background=true
)
# Wait for both
[Check TaskOutput for both]
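Checking both outputs amounts to verifying that every Phase 1 report is fully green before starting Phase 2. A sketch, assuming reports carry a `passed/total` count in the style of the example transcript below:

```python
import re

def parse_result(report):
    """Parse a report line like '45/45 unit tests passing' into (passed, total)."""
    m = re.search(r"(\d+)/(\d+)", report)
    if not m:
        raise ValueError(f"no pass/total count in: {report!r}")
    return int(m.group(1)), int(m.group(2))

def phase1_passed(reports):
    """Phase 2 (E2E) should only run if every Phase 1 report is fully green."""
    return all(p == t for p, t in map(parse_result, reports))
```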
Phase 2: E2E Tests (If Phase 1 Passes)
Task(
subagent_type="atlas",
prompt="""
Run E2E tests for: [SCOPE]
Include:
- End-to-end flows
- Acceptance tests
- UI tests if applicable
Report: Pass/fail count, screenshots on failure
"""
)
Test Scopes
Full Suite
User: /test
→ All unit + integration + E2E tests
Feature Scope
User: /test authentication
→ Only auth-related tests
Quick Check
User: /test --quick
→ Only unit tests (skip integration and E2E)
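The three scopes can be distinguished by a simple split of the command's arguments. This parser is illustrative only; the actual skill handles its own argument parsing.

```python
def parse_scope(command):
    """Parse a /test invocation into (scope, flags).

    '/test'                -> full suite ('full')
    '/test authentication' -> feature scope
    '/test --quick'        -> full scope with the --quick flag
    """
    parts = command.split()[1:]  # drop the '/test' token
    flags = [p for p in parts if p.startswith("--")]
    words = [p for p in parts if not p.startswith("--")]
    scope = " ".join(words) or "full"
    return scope, flags
```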
Example
User: /test the new payment feature
Claude: Starting /test workflow for payment feature...
Phase 0: Pre-flight diagnostics...
$ tldr diagnostics . --project --format text | grep "^E "
(no type errors found)
Phase 1: Running parallel tests...
┌─────────────────────────────────────────┐
│ arbiter: Running unit tests... │
│ arbiter: Running integration tests... │
└─────────────────────────────────────────┘
arbiter: ✅ 45/45 unit tests passing
arbiter: ✅ 12/12 integration tests passing
Phase 2: Running E2E tests...
atlas: ✅ 8/8 E2E tests passing
Test Summary:
┌─────────────┬─────────┬────────┐
│ Type │ Passed │ Failed │
├─────────────┼─────────┼────────┤
│ Unit │ 45 │ 0 │
│ Integration │ 12 │ 0 │
│ E2E │ 8 │ 0 │
├─────────────┼─────────┼────────┤
│ TOTAL │ 65 │ 0 │
└─────────────┴─────────┴────────┘
All tests passing! ✅
Example with Type Errors
User: /test
Claude: Starting /test workflow...
Phase 0: Pre-flight diagnostics...
$ tldr diagnostics . --project --format text | grep "^E "
E src/payment.py:45:12: Argument of type 'str' not assignable to 'int'
E src/refund.py:23:8: Return type 'None' not assignable to 'float'
Found 2 type errors. Fixing before running tests...
[Claude fixes the type errors]
Re-running diagnostics... clean.
Phase 1: Running parallel tests...
Failure Handling
If Phase 1 fails:
arbiter: ❌ 43/45 tests passing
2 failures:
- test_payment_validation: expected 'invalid' got 'valid'
- test_refund_calculation: off by $0.01
Stopping workflow. Fix failures before running E2E tests.
Flags
--quick: Unit tests only
--no-e2e: Skip E2E tests
--coverage: Include coverage report
--watch: Re-run on file changes
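One way to read the phase-selection flags is as a filter over the workflow's phases. A minimal sketch using this document's phase names; `--coverage` and `--watch` are omitted because they change how phases run, not which ones.

```python
def phases_for(flags):
    """Translate flags into the list of phases to run (illustrative mapping;
    pre-flight diagnostics are kept even for --quick since they take ~1s)."""
    phases = ["diagnostics", "unit", "integration", "e2e"]
    if "--quick" in flags:
        return ["diagnostics", "unit"]
    if "--no-e2e" in flags:
        phases.remove("e2e")
    return phases
```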
Source
https://github.com/parcadei/Continuous-Claude-v3/blob/main/.claude/skills/test/SKILL.md
Overview
Orchestrates a full test run from pre-flight diagnostics through E2E testing. It prioritizes fast feedback with parallel unit and integration tests, then gates slower E2E tests behind them, helping teams release with confidence.
How This Skill Works
Diagnostics run first (Phase 0) to surface type errors. The arbiter executes unit tests, type checks, and linting in parallel with integration tests; atlas runs E2E tests after Phase 1 passes. Optional Phase 0.5 can compute change impact to narrow the test scope.
When to Use It
- Run all tests
- Test the feature
- Verify everything works
- Full test suite
- Before releases or merges
Quick Start
- Step 1: Run tests with a chosen scope, e.g., /test or /test --quick
- Step 2: Monitor Phase 0 (Diagnostics) and Phase 1 outputs for fast feedback
- Step 3: If Phase 1 passes, trigger Phase 2 (E2E) and review the final summary
Best Practices
- Run pre-flight diagnostics first to catch type errors early
- Use Phase 0.5 change impact for large repos to limit tests
- Run unit tests, type checks, and linting in parallel with integration tests
- Gate E2E tests behind the success of fast tests (Phase 1)
- Choose test scopes (Full, Feature, Quick) to balance coverage and speed
Example Use Cases
- Release readiness: /test with full suite before a release
- Feature validation: /test authentication to focus on auth tests
- Quick health check: /test --quick to skip integration and E2E
- Refactor fallout: run Phase 0 diagnostics and Phase 1 to see impacted tests
- E2E validation: after unit/integration pass, run atlas E2E tests