loop-execution-evaluator
Install: npx machina-cli add skill Ibrahim-3d/conductor-orchestrator-superpowers/loop-execution-evaluator --openclaw

Loop Execution Evaluator — Step 4: Dispatcher
This agent does NOT evaluate directly. It determines the track type and dispatches the correct specialized evaluator.
Why Specialized Evaluators?
Different track types need fundamentally different checks:
- A UI track needs design system adherence, visual consistency, responsive checks
- A feature track needs build integrity, type safety, code patterns
- An integration track needs API contracts, auth flows, error recovery
- A business logic track needs product rules, edge cases, state transitions
A generic checklist misses critical issues specific to each type.
Dispatch Logic
Read the track's metadata.json and spec.md to determine the track type, then dispatch:
| Track Type | Keywords in spec/metadata | Evaluator |
|---|---|---|
| UI / Design | "screen", "component", "design system", "layout", "visual", "UI shell" | eval-ui-ux |
| Feature / Code | "implement", "feature", "refactor", "infrastructure", "hook", "store" | eval-code-quality |
| Integration | "Supabase", "Stripe", "Gemini", "API", "auth", "database", "webhook" | eval-integration |
| Business Logic | "generation", "lock", "dependency", "pricing", "tier", "pipeline", "download" | eval-business-logic |
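The table above can be sketched as a simple keyword scan. This is an illustrative sketch only: the function name, return shape, and exact keyword lists are assumptions mirroring the table, not part of the skill itself.

```python
# Sketch of keyword-based dispatch. Keyword lists mirror the dispatch table;
# a real implementation would read spec.md + metadata.json and may weigh
# matches rather than taking any hit.
DISPATCH_TABLE = {
    "eval-ui-ux": ["screen", "component", "design system", "layout", "visual", "ui shell"],
    "eval-code-quality": ["implement", "feature", "refactor", "infrastructure", "hook", "store"],
    "eval-integration": ["supabase", "stripe", "gemini", "api", "auth", "database", "webhook"],
    "eval-business-logic": ["generation", "lock", "dependency", "pricing", "tier", "pipeline", "download"],
}

def select_evaluators(spec_text: str) -> list[str]:
    """Return every evaluator whose keywords appear in the spec/metadata text."""
    text = spec_text.lower()
    return [evaluator
            for evaluator, keywords in DISPATCH_TABLE.items()
            if any(kw in text for kw in keywords)]
```

Note that one spec can match several rows, which is exactly the multi-type case described below the table.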
Multi-Type Tracks
Some tracks need multiple evaluators. For example:
- A generator logic track → eval-business-logic + eval-code-quality
- An auth/DB integration track → eval-integration + eval-code-quality
- A UI shell track → eval-ui-ux only
When multiple evaluators apply, run them all. The track passes only if ALL evaluators pass.
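The all-must-pass rule reduces to one line. A minimal sketch, assuming verdicts are reported as the "PASS"/"FAIL" strings used in this skill's report template:

```python
def aggregate_verdict(verdicts: dict[str, str]) -> str:
    """PASS only if at least one evaluator ran and every one of them passed."""
    return "PASS" if verdicts and all(v == "PASS" for v in verdicts.values()) else "FAIL"
```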
Dispatch Workflow
1. Read track metadata.json + spec.md
2. Determine track type(s)
3. Dispatch evaluator(s):
→ eval-ui-ux (if UI track)
→ eval-code-quality (if code/feature track)
→ eval-integration (if integration track)
→ eval-business-logic (if logic track)
4. Collect results from all dispatched evaluators
5. Aggregate into final verdict
Structural Checks (Always Run)
Regardless of track type, always verify these baseline checks:
| Check | Method |
|---|---|
| plan.md updated | All completed tasks marked [x] with commit SHA and summary |
| Scope alignment | No unplanned work added without documentation |
| No skipped tasks | All [ ] tasks either completed or documented as intentionally deferred |
| Build passes | npm run build exits 0 |
| Business docs in sync | If track made pricing/model/business decisions, verify docs are flagged for Step 5.5 sync |
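The plan.md check from the table can be sketched as a scan for unchecked task lines. The function name and regex are illustrative assumptions; the real check also verifies commit SHA + summary on completed tasks and that npm run build exits 0.

```python
import re

def find_unchecked_tasks(plan_md: str) -> list[str]:
    """Return plan.md task lines still marked [ ].

    Any line returned here must either be completed or be documented as
    intentionally deferred (that judgment is left to the evaluator).
    """
    return re.findall(r"^\s*[-*] \[ \] .+$", plan_md, flags=re.MULTILINE)
```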
Business Doc Sync Check
If the track made any business-impacting changes, verify:
- The executor's summary includes Business Doc Sync Required: Yes
- Affected documents are listed
- This flags the Conductor to run Step 5.5 (Business Doc Sync) before marking complete
What counts as business-impacting:
- Pricing tier, price point, or feature list changes
- AI model, SDK, or cost structure changes
- New package or product tier additions
- Asset pipeline changes (add/remove/modify assets)
- Persona, GTM, or revenue assumption changes
See .claude/skills/business-docs-sync/SKILL.md for the full registry.
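Detecting the flag in the executor's summary is a plain string check. The marker string follows this skill's convention; the function name is an illustrative assumption:

```python
def business_sync_required(executor_summary: str) -> bool:
    """True if the executor flagged business-impacting changes in its summary."""
    return "Business Doc Sync Required: Yes" in executor_summary
```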
Aggregated Verdict
## Execution Evaluation Report
**Track**: [track-id]
**Evaluator**: loop-execution-evaluator (dispatcher)
**Date**: [YYYY-MM-DD]
### Evaluators Dispatched
| Evaluator | Reason | Verdict |
|-----------|--------|---------|
| eval-ui-ux | Track builds P0 screens | PASS ✅ / FAIL ❌ |
| eval-code-quality | Track implements features | PASS ✅ / FAIL ❌ |
### Structural Checks
- plan.md updated: YES / NO
- Scope alignment: YES / NO
- Build passes: YES / NO
- Business doc sync needed: YES / NO (if YES, list affected docs)
### Final Verdict: PASS ✅ / FAIL ❌
All evaluators must PASS for the track to pass.
[If FAIL, aggregate all fix actions from all evaluators]
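Aggregating fix actions on FAIL might look like the sketch below. The result shape is assumed from the evaluators_run entries in the metadata examples in this skill ({"evaluator": ..., "verdict": ..., "issues": [...]}):

```python
def collect_fix_actions(results: list[dict]) -> list[str]:
    """Combine the issues of every FAILING evaluator into one fix list."""
    return [issue
            for r in results
            if r["verdict"] == "FAIL"
            for issue in r["issues"]]
```

The combined list is what gets handed to loop-fixer when any evaluator fails.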
Metadata Checkpoint Updates
The execution evaluator MUST update the track's metadata.json at key points:
On Start
{
"loop_state": {
"current_step": "EVALUATE_EXECUTION",
"step_status": "IN_PROGRESS",
"step_started_at": "[ISO timestamp]",
"checkpoints": {
"EVALUATE_EXECUTION": {
"status": "IN_PROGRESS",
"started_at": "[ISO timestamp]",
"agent": "loop-execution-evaluator"
}
}
}
}
On PASS
{
"loop_state": {
"current_step": "BUSINESS_SYNC",
"step_status": "NOT_STARTED",
"checkpoints": {
"EVALUATE_EXECUTION": {
"status": "PASSED",
"completed_at": "[ISO timestamp]",
"verdict": "PASS",
"evaluators_run": [
{ "evaluator": "eval-code-quality", "verdict": "PASS", "issues": [] },
{ "evaluator": "eval-business-logic", "verdict": "PASS", "issues": [] }
],
"business_sync_required": true
},
"BUSINESS_SYNC": {
"status": "NOT_STARTED",
"required": true
}
}
}
}
On FAIL
{
"loop_state": {
"current_step": "FIX",
"step_status": "NOT_STARTED",
"checkpoints": {
"EVALUATE_EXECUTION": {
"status": "FAILED",
"completed_at": "[ISO timestamp]",
"verdict": "FAIL",
"evaluators_run": [
{ "evaluator": "eval-code-quality", "verdict": "PASS", "issues": [] },
{ "evaluator": "eval-business-logic", "verdict": "FAIL", "issues": ["Business rule violation found"] }
],
"failure_items": [
"Fix business rule enforcement in resolver",
"Add test coverage for edge case"
]
},
"FIX": {
"status": "NOT_STARTED",
"cycle": 1
}
}
}
}
Update Protocol
- Read current metadata.json
- Update loop_state.checkpoints.EVALUATE_EXECUTION with results
- If PASS + business sync needed: Set current_step to BUSINESS_SYNC
- If PASS + no sync needed: Set current_step to COMPLETE
- If FAIL: Set current_step to FIX, increment fix_cycle_count in loop_state
- Write back to metadata.json
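The read-modify-write protocol above can be sketched as follows. Field names mirror the JSON examples in this skill; the function signature and the simplified checkpoint payload are illustrative assumptions:

```python
import json
from pathlib import Path

def update_checkpoint(metadata_path: str, verdict: str,
                      evaluators_run: list, business_sync: bool) -> dict:
    """Apply the update protocol: read metadata.json, record the
    EVALUATE_EXECUTION checkpoint, route current_step, write back."""
    path = Path(metadata_path)
    meta = json.loads(path.read_text())
    loop = meta.setdefault("loop_state", {})
    checkpoints = loop.setdefault("checkpoints", {})
    checkpoints["EVALUATE_EXECUTION"] = {
        "status": "PASSED" if verdict == "PASS" else "FAILED",
        "verdict": verdict,
        "evaluators_run": evaluators_run,
    }
    if verdict == "PASS":
        loop["current_step"] = "BUSINESS_SYNC" if business_sync else "COMPLETE"
    else:
        loop["current_step"] = "FIX"
        loop["fix_cycle_count"] = loop.get("fix_cycle_count", 0) + 1
    loop["step_status"] = "NOT_STARTED"
    path.write_text(json.dumps(meta, indent=2))
    return meta
```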
Handoff
- ALL PASS + No Business Doc Sync → Conductor marks track complete (Step 5)
- ALL PASS + Business Doc Sync Needed → Conductor runs Step 5.5 (Business Doc Sync) before marking complete
- ANY FAIL → Conductor dispatches loop-fixer with combined fix list
Source
https://github.com/Ibrahim-3d/conductor-orchestrator-superpowers/blob/master/skills/loop-execution-evaluator/SKILL.md

Overview
Loop Execution Evaluator acts as the dispatcher for evaluation steps. It reads the track metadata and spec to determine the track type and invokes the appropriate specialized evaluator (UI/UX, code-quality, integration, or business logic). It does not run a generic checklist, ensuring issues are evaluated in the right domain.
How This Skill Works
It reads metadata.json and spec.md to identify track type(s), maps to the corresponding evaluators, runs all applicable evaluators (if multi-type), and aggregates their verdicts into a final result.
When to Use It
- When a track is ready for evaluation after loop-executor
- During a /phase-review or build-check to route checks by track type
- When metadata/spec indicate a UI, integration, feature/infra, or business-logic track
- When a multi-type track requires multiple evaluators
- When you need a consolidated verdict from all applicable evaluators
Quick Start
- Step 1: Read track metadata.json and spec.md to identify track type(s).
- Step 2: Map track type(s) to evaluators (UI/UX, code-quality, integration, business-logic).
- Step 3: Dispatch all applicable evaluators and aggregate their verdicts into the final report.
Best Practices
- Read metadata.json and spec.md to identify track type(s)
- Dispatch every applicable evaluator; don’t skip multi-type checks
- Run and then aggregate verdicts; require all to pass for multi-type tracks
- Maintain structural baseline checks regardless of track type
- Ensure traceability by recording track type, evaluators dispatched, and final verdict
Example Use Cases
- UI shell track triggers eval-ui-ux only
- Generator logic track triggers eval-business-logic + eval-code-quality
- Auth/DB integration track triggers eval-integration + eval-code-quality
- Feature/infrastructure change triggers eval-code-quality
- Phase-review scenario triggers dispatch after loop-executor during /phase-review