agent-implementation-skill
npx machina-cli add skill nestharus/agent-implementation-skill/src --openclaw
Development Workflow
Single entry point for the full development lifecycle. Read this file, determine what phase you're in or what the user needs, then read the relevant sub-file from this directory.
Paths
Everything lives in this skill folder. WORKFLOW_HOME is: !dirname "$(grep -rl '^name: agent-implementation-skill' ~/.claude/skills/*/SKILL.md .claude/skills/*/SKILL.md 2>/dev/null | head -1)" 2>/dev/null
When dispatching scripts or agents, export WORKFLOW_HOME with the path
above. Scripts also self-locate via dirname as a fallback when invoked
directly.
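A minimal sketch of the lookup described above, written as a reusable function rather than the inline command. The name find_workflow_home is an assumption introduced for illustration; the skill itself may resolve the path differently.

```shell
# Sketch: resolve the skill folder by locating its SKILL.md.
# find_workflow_home is a hypothetical helper, not part of the skill.
find_workflow_home() {
  local root skill_md
  # Arguments: one or more skills directories to search, in priority order.
  for root in "$@"; do
    for skill_md in "$root"/*/SKILL.md; do
      [ -f "$skill_md" ] || continue
      if grep -q '^name: agent-implementation-skill' "$skill_md"; then
        dirname "$skill_md"
        return 0
      fi
    done
  done
  return 1
}

# Usage: search the user-level and project-level skill folders, then export.
if WORKFLOW_HOME="$(find_workflow_home "$HOME/.claude/skills" ".claude/skills")"; then
  export WORKFLOW_HOME
else
  echo "agent-implementation-skill not found; WORKFLOW_HOME is empty" >&2
fi
```

Searching the user-level folder first mirrors the order of the paths in the command above.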
$WORKFLOW_HOME/
SKILL.md # this file — entry point
implement.md # multi-model implementation pipeline
research.md # exploration → alignment → proposal
rca.md # root cause analysis
evaluate.md # proposal review
baseline.md # constraint extraction
audit.md # concern-based problem decomposition
constraints.md # constraint discovery
models.md # model selection guide
scripts/
workflow.sh # schedule driver ([wait]/[run]/[done]/[fail])
db.sh # SQLite-backed coordination database
scan.sh # Stage 3 coordinator: dispatches agents to explore codespace and build codemap, then per-section file identification
substrate.sh # Stage 3.5 shim: sets PYTHONPATH, runs python -m substrate
substrate/ # Stage 3.5: shared integration substrate discovery (shards → prune → seed) for greenfield/vacuum sections
section-loop.py # strategic section-loop orchestrator: integration proposals, strategic implementation, cross-section communication, global coordination (Stages 4-5 of implement.md)
tools/
extract-docstring-py # extract Python module docstrings
extract-summary-md # extract YAML frontmatter from markdown
README.md # tool interface spec (for Opus to write new tools)
agents/ # agent role definitions (see agents/*.md for full inventory)
templates/
implement-proposal.md # 10-step implementation schedule
research-cycle.md # 7-step research schedule
rca-cycle.md # 6-step RCA schedule
Workspaces live on native filesystem for performance, separate from project:
- Planspace: ~/.claude/workspaces/<task-slug>/ (schedule, state, log, artifacts, coordination database)
- Codespace: project root or worktree, where source code lives
Clean up planspace when workflow is fully complete (rm -rf the workspace dir).
Your Role
BEFORE DOING ANYTHING ELSE: Determine your role in the pipeline,
then read the corresponding file from $WORKFLOW_HOME/agents/. That
file defines your rules. Do not proceed until you have read it.
Phase Detection
Check these in order:
- User explicitly requested an action → Read the matching file
- Test failures need investigation → rca.md
- Proposal exists, not yet evaluated → evaluate.md
- Proposal evaluated, no baseline → baseline.md
- Baseline exists, implementation needed → implement.md
- No proposal exists → research.md
- Something feels wrong about a change → constraints.md
- Need to pick a model → models.md
- Need concern-based problem decomposition → audit.md
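The ordered checks above can be sketched as a first-match-wins function. This is an illustration only: detect_phase and its five arguments are hypothetical, and it covers just the first six checks, since the last three (something feels wrong, model choice, decomposition) are judgment calls rather than state lookups.

```shell
# Sketch: first-match-wins phase detection.
# Arguments (all hypothetical): requested file (may be empty), then
# yes/no flags for tests_failing, has_proposal, evaluated, has_baseline.
detect_phase() {
  local requested="$1" tests_failing="$2" has_proposal="$3"
  local evaluated="$4" has_baseline="$5"
  if [ -n "$requested" ]; then
    echo "$requested"              # explicit user request wins
  elif [ "$tests_failing" = yes ]; then
    echo "rca.md"
  elif [ "$has_proposal" = yes ] && [ "$evaluated" = no ]; then
    echo "evaluate.md"
  elif [ "$has_proposal" = yes ] && [ "$has_baseline" = no ]; then
    echo "baseline.md"
  elif [ "$has_baseline" = yes ]; then
    echo "implement.md"
  else
    echo "research.md"             # no proposal yet
  fi
}

# Usage: a proposal exists and was evaluated, but no baseline yet.
detect_phase "" no yes yes no
```

Order matters here: failing tests route to rca.md even when a proposal is waiting for evaluation.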
Files
| File | What It Does |
|---|---|
| research.md | Exploration → alignment → proposal → refinement |
| evaluate.md | Proposal alignment review (Accept / Reject / Push Back) |
| baseline.md | Atomize proposal into constraints / patterns / tradeoffs |
| implement.md | Multi-model implementation with worktrees + dynamic scheduling |
| rca.md | Root cause analysis + architectural fix for test failures |
| audit.md | Concern-based problem decomposition + alignment tracing |
| constraints.md | Surface implicit constraints, validate design principles |
| models.md | Model selection guide for multi-model workflows |
Design Philosophy
These principles govern all pipeline behavior. Violations are alignment failures.
- Alignment over audit — Check directional coherence between adjacent layers ("is it solving the right problem?"), never feature coverage against a checklist ("is it done?"). The system is never done.
- Strategy over brute force — Strategy collapses many waves of problems in one go. Brute force leads to countless cycles. Fewer tokens, fewer cycles, same quality.
- Scripts dispatch, agents decide — Scripts do mechanical coordination (dispatch, check, log). Agents do reasoning (explore, understand, decide). Strategic decisions (grouping, relatedness, signal interpretation) belong to agents, not scripts.
- Heuristic exploration, not exhaustive scanning — Build a routing map (codemap), then use it for targeted investigation. Never catalog every file. The cost of occasionally routing wrong is far less than exhaustive scanning.
- Problems, not features — We decompose problems all the way down, then solve tiny problems. Proposals describe strategies, not implementations. We never do feature coverage because we generate as we go.
- Proposals must solve the same problems — Alternative proposals are valid only if they solve the original problems. An optimization or complexity argument is an excuse. Do not introduce constraints the user did not specify.
- Accuracy over shortcuts — zero risk tolerance — Every shortcut or bypass of the pipeline introduces risk. We do not accept any risk. Agents must follow the full pipeline faithfully: explore before proposing, propose before implementing, align before proceeding. Shortcuts are permitted ONLY when the remaining work is so small that no meaningful risk exists (e.g., a single trivial cleanup after everything else is aligned and verified). "This is simple enough to skip a step" is never valid reasoning — simplicity is not the same as zero risk. When in doubt, follow the pipeline.
Terminology Contract
- "Audit" only ever means alignment against stated problems and constraints — never feature coverage against a checklist.
- "Alignment" is directional coherence between adjacent layers: does the work solve the problem it claims to solve?
- "Feature coverage" is explicitly banned as a verification method. Plans describe problems and strategies, not enumerable features.
The Full Lifecycle
Exploration → Alignment → Proposal → Review → Baseline → Implementation → Verification
(research.md) (evaluate.md) (baseline.md) (implement.md) (rca.md)
Phases iterate: Review may loop back to Research. Implementation may trigger tangent research cycles. Verification may reveal architectural issues requiring RCA.
Artifact Flow
[Raw Idea]
↓
[Exploration Notes] ← research.md Phase A
↓
[Alignment Document] ← research.md Phase B
↓
[Proposal] ← research.md Phase C
↓
[Evaluation Report] ← evaluate.md (iterate if REJECT/PUSH BACK)
↓
[Design Baseline] ← baseline.md (constraints/, patterns/, TRADEOFFS.md)
↓
[Section Files → Integration Proposals → Strategic Implementation → Code] ← implement.md
↓
[Tests → Debug → Constraint Check → Lint → Commit] ← implement.md + rca.md
Workflow Orchestration
For multi-step workflows, use the orchestration system instead of running everything from memory.
Dispatch: All Agents via agents
CRITICAL: All step dispatch goes through agents via Bash.
Never use Claude's Task tool to spawn sub-agents — it causes "sibling"
errors and reliability issues. The agent runner automatically unsets
CLAUDECODE so sibling Claude sessions can launch.
# Sequential dispatch — model directly with prompt file
agents --model <model> --file <planspace>/artifacts/step-N-prompt.md \
> <planspace>/artifacts/step-N-output.md 2>&1
# Agent file dispatch — agent instructions prepended to prompt
agents --agent-file "$WORKFLOW_HOME/agents/exception-handler.md" \
--file <planspace>/artifacts/exception-prompt.md
# Parallel dispatch with db.sh coordination
(agents --model gpt-codex-high --file <prompt-A.md> && \
bash "$WORKFLOW_HOME/scripts/db.sh" send <planspace>/run.db orchestrator "done:block-A") &
(agents --model gpt-codex-high --file <prompt-B.md> && \
bash "$WORKFLOW_HOME/scripts/db.sh" send <planspace>/run.db orchestrator "done:block-B") &
bash "$WORKFLOW_HOME/scripts/db.sh" recv <planspace>/run.db orchestrator
bash "$WORKFLOW_HOME/scripts/db.sh" recv <planspace>/run.db orchestrator
# Codemap exploration dispatch (Opus explores the codespace)
agents --model claude-opus --project <codespace> \
--file <planspace>/artifacts/scan-logs/codemap-prompt.md \
> <planspace>/artifacts/codemap.md 2>&1
Note: The examples above show script-level dispatch — the orchestrator
launching step agents. Nested strategic work within step agents (e.g.,
exploration during integration proposals) uses task submission: agents write
structured task-request files, and the dispatcher resolves agent file + model.
See implement.md Stage 4-5 for task submission details.
Schedule Templates
Pre-built schedules in $WORKFLOW_HOME/templates/. Each step specifies its model:
[wait] 1. step-name | model-name -- description (skill-section-reference)
- implement-proposal.md: full 10-step implementation pipeline
- research-cycle.md: research → evaluate → propose → refine
- rca-cycle.md: investigate → plan fix → apply → verify
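As a rough illustration of the schedule line format shown above, the following sketch pulls the status, step number, step name, and model out of one line. parse_schedule_line is a hypothetical helper, and the assumption that step and model names contain no spaces is not stated by the source; workflow.sh may parse schedules differently.

```shell
# Sketch: split one schedule line into "status number step model".
# Assumes step names and model names contain no spaces (an assumption).
parse_schedule_line() {
  printf '%s\n' "$1" | sed -n -E \
    's/^\[(wait|run|done|fail)\] ([0-9]+)\. ([^ |]+) [|] ([^ ]+) -- .*$/\1 \2 \3 \4/p'
}

# Usage: the step name and section reference below are made up for the example.
parse_schedule_line '[wait] 3. build-codemap | claude-opus -- explore the codespace (implement.md)'
```

Lines that do not match the format produce no output, so a caller can skip headings and blank lines in a schedule file.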
Stage 3 Codemap Exploration
Stage 3 dispatches agents to explore and understand the codebase:
- An Opus agent explores the codespace — reads files, follows its curiosity, builds understanding.
- The agent writes <planspace>/artifacts/codemap.md capturing what it discovered.
- Per-section Opus agents use the codemap to identify related files for each section.
- Deep scan dispatches GLM agents to reason about specific file relevance in context.
Control and recovery:
- If codemap.md already exists, reuse it only if the codespace fingerprint is unchanged or the verifier confirms validity; otherwise rebuild.
- If a section already has ## Related Files, validate the list against the current codemap/section content; skip only if unchanged.
- Non-zero codemap exit stops Stage 3 before section exploration.
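The source does not say how the codespace fingerprint is computed, so the sketch below is only one possible shape for the reuse rule: it assumes a fingerprint built from checksums of every file outside .git, stored in a separate file when the codemap was built. Both function names and the fingerprint file are hypothetical, and the verifier branch is left out.

```shell
# Sketch: a hypothetical fingerprint, the checksum of all per-file checksums.
codespace_fingerprint() {
  (cd "$1" && find . -type f ! -path './.git/*' | LC_ALL=C sort |
    while IFS= read -r f; do cksum "$f"; done | cksum)
}

# Sketch: reuse the codemap only when it exists and the recorded
# fingerprint still matches. The verifier path is not modeled here.
# $1 = codespace, $2 = codemap path, $3 = recorded fingerprint file
codemap_is_reusable() {
  [ -f "$2" ] && [ -f "$3" ] &&
    [ "$(cat "$3")" = "$(codespace_fingerprint "$1")" ]
}
```

A caller would rebuild the codemap and rewrite the fingerprint file whenever codemap_is_reusable returns non-zero.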
Model Roles
| Model | Used For |
|---|---|
| claude-opus | Section setup (excerpt extraction), alignment checks (shape/direction), decomposition, codemap exploration, per-section file identification |
| gpt-codex-high | Integration proposals, strategic implementation, coordinated fixes, extraction, investigation, constraint alignment check |
| gpt-codex-xhigh | Deep architectural synthesis, proposal drafting |
| glm | Test running, verification, quick commands, deep file analysis, semantic impact analysis |
Prompt Files
Step agents receive self-contained prompt files (they cannot read
$WORKFLOW_HOME). The orchestrator builds each prompt from:
- Skill section text: copied verbatim from the referenced skill file
- Planspace path: so the agent can read/write state and artifacts
- Codespace path: so the agent knows where source code lives
- Context: relevant content from state.md
- Output contract: what the agent should return on success/failure
Written to: <planspace>/artifacts/step-N-prompt.md
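One way the five parts could be assembled into a prompt file is sketched below. build_prompt, its argument order, and the "Planspace:" / "Codespace:" / "Context:" / "Output contract:" labels are all assumptions for illustration; the sketch also copies state.md whole, whereas the orchestrator selects only the relevant content.

```shell
# Sketch: assemble a self-contained prompt file for one step.
# $1 planspace, $2 codespace, $3 step number,
# $4 file holding the skill section text, $5 output contract text
build_prompt() {
  local planspace="$1" codespace="$2" step="$3" section_file="$4" contract="$5"
  local out="$planspace/artifacts/step-$step-prompt.md"
  mkdir -p "$planspace/artifacts"
  {
    cat "$section_file"                        # skill section text, verbatim
    printf '\nPlanspace: %s\n' "$planspace"
    printf 'Codespace: %s\n' "$codespace"
    printf '\nContext:\n'
    if [ -f "$planspace/state.md" ]; then
      cat "$planspace/state.md"                # simplification: whole file
    fi
    printf '\nOutput contract:\n%s\n' "$contract"
  } > "$out"
  echo "$out"                                  # path of the written prompt
}
```

Because the step agent cannot read $WORKFLOW_HOME, everything it needs has to be inlined this way rather than referenced by path.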
Workspace Structure
Each workflow gets a planspace at ~/.claude/workspaces/<task-slug>/:
- schedule.md: task queue with status markers (copied from template)
- state.md: current position + accumulated facts
- log.md: append-only execution log
- artifacts/: prompt files, output files, working files for steps
  - artifacts/sections/: section excerpts (proposal + alignment excerpts)
  - artifacts/proposals/: integration proposals per section
  - artifacts/snapshots/: post-completion file snapshots per section
  - artifacts/notes/: cross-section consequence notes
  - artifacts/coordination/: global coordinator state and fix prompts
  - artifacts/decisions/: accumulated parent decisions per section (from pause/resume)
- run.db: coordination database (messages, events, agent registry)
- constraints/: discovered constraints (promote later)
- tradeoffs/: discovered tradeoffs (promote later)
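The layout above could be created with a few commands, sketched here as a hypothetical init_planspace helper. It does not create run.db, on the assumption that the database is initialized separately through db.sh init.

```shell
# Sketch: create the planspace layout for a new workflow.
# $1 = planspace root, $2 = schedule template to copy
init_planspace() {
  local root="$1" template="$2"
  mkdir -p "$root/artifacts/sections" "$root/artifacts/proposals" \
           "$root/artifacts/snapshots" "$root/artifacts/notes" \
           "$root/artifacts/coordination" "$root/artifacts/decisions" \
           "$root/constraints" "$root/tradeoffs"
  cp "$template" "$root/schedule.md"     # schedule starts as a template copy
  touch "$root/state.md" "$root/log.md"  # created empty, never truncated
}
```

A typical call would pass ~/.claude/workspaces/<task-slug> as the root and one of the files in $WORKFLOW_HOME/templates/ as the template.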
Coordination System (db.sh)
SQLite-backed coordination for agent messaging. One run.db per pipeline
run — messages are claimed (not consumed), history is preserved, and the
database file is the complete audit trail.
# Initialize the coordination database (idempotent)
bash "$WORKFLOW_HOME/scripts/db.sh" init <planspace>/run.db
# Send a message to an agent
bash "$WORKFLOW_HOME/scripts/db.sh" send <planspace>/run.db <target> [--from <agent>] "message text"
# Block until a message arrives (agent sleeps, no busy-loop)
bash "$WORKFLOW_HOME/scripts/db.sh" recv <planspace>/run.db <name> [timeout_seconds]
# Check pending count (non-blocking)
bash "$WORKFLOW_HOME/scripts/db.sh" check <planspace>/run.db <name>
# Read all pending messages
bash "$WORKFLOW_HOME/scripts/db.sh" drain <planspace>/run.db <name>
# Agent lifecycle
bash "$WORKFLOW_HOME/scripts/db.sh" register <planspace>/run.db <name> [pid]
bash "$WORKFLOW_HOME/scripts/db.sh" unregister <planspace>/run.db <name>
bash "$WORKFLOW_HOME/scripts/db.sh" agents <planspace>/run.db
bash "$WORKFLOW_HOME/scripts/db.sh" cleanup <planspace>/run.db [name]
# Event logging and querying
bash "$WORKFLOW_HOME/scripts/db.sh" log <planspace>/run.db <kind> [tag] [body] [--agent <name>]
bash "$WORKFLOW_HOME/scripts/db.sh" tail <planspace>/run.db [kind] [--since <id>] [--limit <n>]
bash "$WORKFLOW_HOME/scripts/db.sh" query <planspace>/run.db <kind> [--tag <t>] [--agent <a>] [--since <id>] [--limit <n>]
Key patterns:
- Orchestrator blocks on recv waiting for parallel step results
- Step agents send done:<step>:<summary> or fail:<step>:<error> when finished
- Section-loop sends summary:setup:, summary:proposal:, summary:proposal-align:, summary:impl:, summary:impl-align:, status:coordination: messages; complete only on full success; fail:<num>:coordination_exhausted:<summary> on coordination timeout
- Mailbox is required for orchestrator/step coordination boundaries
- Codemap exploration is a single Opus agent that explores the codespace directly
- Agents needing user input send ask:<step>:<question>, then block on their own mailbox
- User or orchestrator can send abort to any agent to trigger graceful shutdown
- agents command shows who's registered and who's waiting; use it to detect stuck agents
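The message prefixes listed above lend themselves to a simple case dispatch on the receiving side. The sketch below is an illustration under that assumption; classify_message and message_step are hypothetical helpers and are not part of db.sh.

```shell
# Sketch: classify a received message by its prefix.
classify_message() {
  case "$1" in
    done:*)    echo "done" ;;
    fail:*)    echo "fail" ;;
    ask:*)     echo "ask" ;;
    summary:*) echo "summary" ;;
    status:*)  echo "status" ;;
    complete)  echo "complete" ;;
    abort)     echo "abort" ;;
    *)         echo "unknown" ;;
  esac
}

# Sketch: extract the second colon-separated field (the step, or the
# section number in a fail:<num>:... message).
message_step() {
  printf '%s\n' "$1" | cut -d: -f2
}

# Usage: route one message as an orchestrator might after a recv.
msg='fail:3:coordination_exhausted:timed out'
echo "$(classify_message "$msg") from step $(message_step "$msg")"
```

An orchestrator loop could call classify_message on each recv result and stop waiting as soon as it sees fail or abort.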
Cross-Cutting Tools
- audit.md — Concern-based problem decomposition + alignment tracing
- constraints.md — Before implementation or when something feels wrong
- models.md — Which external model to use for any given task
Source
git clone https://github.com/nestharus/agent-implementation-skill/blob/main/src/SKILL.md
Overview
Orchestrates a structured, multi-phase software development pipeline across research, evaluation, design baseline, implementation, RCA, and Stage 3 codemap exploration. It coordinates worktrees, dynamic scheduling, and an SQLite-backed agent coordination database to unify work across GPT, GLM, and Claude.
How This Skill Works
A central entry point reads phase-specific files (research.md, evaluate.md, baseline.md, implement.md, rca.md, constraints.md, models.md) and dispatches tasks to agents. It uses a SQLite-backed coordination database to persist state and artifacts, while Stage 3 codemap exploration coordinates across external models (GPT, GLM, Claude) to map code sections and guide implementation.
When to Use It
- Implement features via a structured, multi-phase pipeline with worktrees and SQLite-backed coordination
- Coordinate exploration, evaluation, and implementation across GPT, GLM, and Claude
- Manage dynamic scheduling and cross-model orchestration in software projects
- Perform Stage 3 codemap exploration to identify integration points across codebases
- Apply constraint discovery and RCA within the development workflow to improve reliability
Quick Start
- Step 1: Export WORKFLOW_HOME to the skill directory path (as described in SKILL.md)
- Step 2: Read and confirm your role from $WORKFLOW_HOME/agents/ and follow its rules
- Step 3: Run scripts/workflow.sh to drive the pipeline ([wait]/[run]/[done]/[fail]) and monitor progress
Best Practices
- Clarify phase ownership in each sub-file (research.md, evaluate.md, baseline.md, implement.md, rca.md, etc.)
- Keep Planspace and Codespace clean and synchronized; clean up planspace when the workflow is complete
- Leverage the SQLite-backed coordination database to persist state, tasks, and artifacts
- Follow agent role definitions in agents/*.md to enforce responsibilities
- Automate RCA and constraint discovery early to inform decisions and reduce rework
Example Use Cases
- Launching a feature with cross-model evaluation and codemap-driven implementation
- Coordinating design constraints and model selection across GPT and Claude
- Using the section-loop orchestrator to align sections across multiple codebases
- Working within ~/.claude/workspaces as isolated task plans and logs
- Dispatching Stage 3 codemap exploration to identify hotspots in a legacy codebase