
shipyard-executing-plans

npx machina-cli add skill lgbarn/shipyard/shipyard-executing-plans --openclaw
<!-- TOKEN BUDGET: 300 lines / ~900 tokens -->

Executing Plans

<activation>

When This Skill Activates

  • You have a written implementation plan (from shipyard:shipyard-writing-plans or similar)
  • Plan contains independent tasks suitable for agent dispatch or batch execution
  • You need structured execution with two-stage review gates

Announce at start: "I'm using the executing-plans skill to implement this plan."

Natural Language Triggers

  • "build this", "implement this", "build me this", "execute the plan", "run the plan"
</activation>

Overview

Execute implementation plans by dispatching a fresh builder agent per task, with a two-stage review after each task: spec compliance review first, then code quality review. The skill can also run as batch execution with human checkpoints.

Core principle: Fresh agent per task + two-stage review (spec then quality) = high quality, fast iteration.

digraph when_to_use {
    "Have implementation plan?" [shape=diamond];
    "Tasks mostly independent?" [shape=diamond];
    "Stay in this session?" [shape=diamond];
    "Agent-driven execution" [shape=box];
    "Batch execution with checkpoints" [shape=box];
    "Manual execution or brainstorm first" [shape=box];

    "Have implementation plan?" -> "Tasks mostly independent?" [label="yes"];
    "Have implementation plan?" -> "Manual execution or brainstorm first" [label="no"];
    "Tasks mostly independent?" -> "Stay in this session?" [label="yes"];
    "Tasks mostly independent?" -> "Manual execution or brainstorm first" [label="no - tightly coupled"];
    "Stay in this session?" -> "Agent-driven execution" [label="yes"];
    "Stay in this session?" -> "Batch execution with checkpoints" [label="no - parallel session"];
}
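The decision graph above can be read as a small function. The following is a minimal sketch only: the names `ExecutionMode` and `choose_mode` are hypothetical and not part of Shipyard, and the three boolean inputs stand in for the three diamond nodes.

```python
from enum import Enum


class ExecutionMode(Enum):
    # Values mirror the box labels in the decision graph.
    AGENT_DRIVEN = "Agent-driven execution"
    BATCH = "Batch execution with checkpoints"
    MANUAL = "Manual execution or brainstorm first"


def choose_mode(has_plan: bool, mostly_independent: bool, same_session: bool) -> ExecutionMode:
    """Walk the decision graph top to bottom and return the selected mode."""
    if not has_plan:
        return ExecutionMode.MANUAL
    if not mostly_independent:  # tightly coupled tasks
        return ExecutionMode.MANUAL
    return ExecutionMode.AGENT_DRIVEN if same_session else ExecutionMode.BATCH
```

For example, `choose_mode(True, True, False)` selects batch execution, matching the "no - parallel session" edge.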
<instructions>

The Process

Step 1: Load and Review Plan

  1. Read plan file
  2. Review critically - identify any questions or concerns about the plan
  3. If concerns: Raise them with your human partner before starting
  4. If no concerns: Create tasks via TaskCreate and proceed

Step 2: Execute Tasks

Agent-Driven Mode (preferred):

For each task, dispatch a fresh builder agent:

digraph process {
    rankdir=TB;

    subgraph cluster_per_task {
        label="Per Task";
        "Dispatch builder agent" [shape=box];
        "Builder asks questions?" [shape=diamond];
        "Answer questions, provide context" [shape=box];
        "Builder implements, tests, commits, self-reviews" [shape=box];
        "Dispatch spec reviewer agent" [shape=box];
        "Spec reviewer confirms code matches spec?" [shape=diamond];
        "Builder fixes spec gaps" [shape=box];
        "Dispatch code quality reviewer agent" [shape=box];
        "Code quality reviewer approves?" [shape=diamond];
        "Builder fixes quality issues" [shape=box];
        "Mark task complete" [shape=box];
    }

    "Read plan, extract all tasks, create via TaskCreate" [shape=box];
    "More tasks remain?" [shape=diamond];
    "Dispatch final reviewer for entire implementation" [shape=box];
    // Node shapes are declared on the nodes; a shape attribute on an edge has no effect.
    "Dispatch auditor for security review" [shape=box];
    "Critical security findings?" [shape=diamond];
    "Builder fixes security issues" [shape=box];
    "Dispatch simplifier for complexity review" [shape=box];
    "High priority simplifications?" [shape=diamond];
    "Builder implements simplifications" [shape=box];
    "Use shipyard:git-workflow to complete" [shape=box style=filled fillcolor=lightgreen];

    "Read plan, extract all tasks, create via TaskCreate" -> "Dispatch builder agent";
    "Dispatch builder agent" -> "Builder asks questions?";
    "Builder asks questions?" -> "Answer questions, provide context" [label="yes"];
    "Answer questions, provide context" -> "Dispatch builder agent";
    "Builder asks questions?" -> "Builder implements, tests, commits, self-reviews" [label="no"];
    "Builder implements, tests, commits, self-reviews" -> "Dispatch spec reviewer agent";
    "Dispatch spec reviewer agent" -> "Spec reviewer confirms code matches spec?";
    "Spec reviewer confirms code matches spec?" -> "Builder fixes spec gaps" [label="no"];
    "Builder fixes spec gaps" -> "Dispatch spec reviewer agent" [label="re-review"];
    "Spec reviewer confirms code matches spec?" -> "Dispatch code quality reviewer agent" [label="yes"];
    "Dispatch code quality reviewer agent" -> "Code quality reviewer approves?";
    "Code quality reviewer approves?" -> "Builder fixes quality issues" [label="no"];
    "Builder fixes quality issues" -> "Dispatch code quality reviewer agent" [label="re-review"];
    "Code quality reviewer approves?" -> "Mark task complete" [label="yes"];
    "Mark task complete" -> "More tasks remain?";
    "More tasks remain?" -> "Dispatch builder agent" [label="yes"];
    "More tasks remain?" -> "Dispatch final reviewer for entire implementation" [label="no"];
    "Dispatch final reviewer for entire implementation" -> "Dispatch auditor for security review";
    "Dispatch auditor for security review" -> "Critical security findings?";
    "Critical security findings?" -> "Builder fixes security issues" [label="yes"];
    "Builder fixes security issues" -> "Dispatch auditor for security review" [label="re-audit"];
    "Critical security findings?" -> "Dispatch simplifier for complexity review" [label="no / user defers"];
    "Dispatch simplifier for complexity review" -> "High priority simplifications?";
    "High priority simplifications?" -> "Builder implements simplifications" [label="user chooses fix"];
    "Builder implements simplifications" -> "Dispatch simplifier for complexity review" [label="re-check"];
    "High priority simplifications?" -> "Use shipyard:git-workflow to complete" [label="no / user defers"];
}
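The per-task part of the graph above boils down to one loop: build, then repeat spec review until it passes, then repeat quality review until it passes. The sketch below illustrates that control flow under stated assumptions. `Review`, `run_task`, and the `build`/`spec_review`/`quality_review` callables are hypothetical stand-ins for agent dispatch, not Shipyard APIs, and the `max_rounds` cutoff is an added assumption so that repeated failures surface as an error instead of looping forever.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    approved: bool
    notes: str = ""


# Hypothetical stand-ins for agent dispatch; not Shipyard APIs.
Builder = Callable[[str, str], None]   # (task, reviewer feedback) -> None
Reviewer = Callable[[str], Review]     # task -> Review


def run_task(task: str, build: Builder, spec_review: Reviewer,
             quality_review: Reviewer, max_rounds: int = 5) -> List[str]:
    """Run one task through build, spec review, then quality review.

    Returns the ordered log of stages. Raises RuntimeError if a reviewer
    still rejects after max_rounds, so the caller can stop and ask for help.
    """
    log = ["build"]
    build(task, "")
    # Spec compliance always runs to approval before quality review starts.
    for stage, review in (("spec", spec_review), ("quality", quality_review)):
        for _ in range(max_rounds):
            result = review(task)
            log.append(f"{stage}:{'pass' if result.approved else 'fail'}")
            if result.approved:
                break
            build(task, result.notes)  # the builder fixes, then re-review
            log.append("fix")
        else:
            raise RuntimeError(f"{stage} review not approved after {max_rounds} rounds")
    return log
```

Because the stages are iterated in a fixed order, quality review cannot begin while spec review has open issues.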

Batch Mode (separate session):

Default batch size: 3 tasks, starting with the first 3.

For each task:

  1. Mark as in_progress
  2. Follow each step exactly (plan has bite-sized steps)
  3. Run verifications as specified
  4. Mark as completed

When batch complete:

  • Show what was implemented
  • Show verification output
  • Say: "Ready for feedback."

Based on feedback:

  • Apply changes if needed
  • Execute next batch
  • Repeat until complete
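Batch mode amounts to slicing the task list into fixed-size groups and pausing after each group. A minimal sketch, assuming a default batch size of 3 and a hypothetical helper named `batches`:

```python
from typing import Iterator, List, Sequence


def batches(tasks: Sequence[str], size: int = 3) -> Iterator[List[str]]:
    """Yield consecutive batches; the caller reports and waits after each one."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(tasks), size):
        yield list(tasks[start:start + size])
```

A caller would execute every task in a batch, show the verification output, say "Ready for feedback.", and only request the next batch once feedback has been applied.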

Two-Stage Review Pattern

Stage 1: Spec Compliance Review

  • Does the code match the plan's specification?
  • Are all requirements met?
  • Is there anything extra that wasn't requested?
  • Are verification criteria satisfied?

Stage 2: Code Quality Review

  • Is the code well-structured?
  • Are there any bugs or edge cases missed?
  • Is naming clear and consistent?
  • Are tests comprehensive?

IMPORTANT: Always complete spec compliance before code quality. Wrong order wastes time reviewing quality of code that doesn't meet spec.

Step 3: Post-Completion Quality Gates

After the final reviewer approves the entire implementation, run these quality gates:

Security Audit

Dispatch an auditor agent (subagent_type: "shipyard:auditor") with:

  • Git diff of all files changed during plan execution
  • All task summaries and context
  • Dependency manifests if any dependencies were added/changed
  • Working directory, current branch, and worktree status
  • Follow Model Routing Protocol — resolve model from model_routing.security_audit (default: sonnet). See docs/PROTOCOLS.md for details.

If CRITICAL findings exist:

  1. Display the critical findings to the user
  2. User decides: fix now (dispatch builder with audit feedback) / defer (append to ISSUES.md) / acknowledge and proceed
  3. If fixing, re-run audit after fixes
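The audit outcome can be modeled as a mapping from findings plus the user's decision to the next step. This is an illustrative sketch only: `Finding`, `next_action`, and the returned action strings are hypothetical names, and the only severity level taken from the text above is CRITICAL.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    severity: str  # "CRITICAL" comes from the text; other levels are assumed
    summary: str


def next_action(findings: List[Finding], user_choice: str) -> str:
    """Map audit findings plus the user's decision to the next step."""
    critical = [f for f in findings if f.severity == "CRITICAL"]
    if not critical:
        return "proceed"
    actions = {
        "fix": "dispatch-builder-then-reaudit",
        "defer": "append-to-ISSUES.md-then-proceed",
        "acknowledge": "proceed",
    }
    if user_choice not in actions:
        raise ValueError(f"unknown choice: {user_choice!r}")
    return actions[user_choice]
```

Only the "fix" branch loops back to another audit; the other two branches continue to the simplification review.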

Simplification Review

After the audit, dispatch a simplifier agent (subagent_type: "shipyard:simplifier") with:

  • Git diff of all files changed during plan execution
  • All task summaries
  • Working directory, current branch, and worktree status
  • Follow Model Routing Protocol — resolve model from model_routing.simplification (default: sonnet). See docs/PROTOCOLS.md for details.
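Both quality gates resolve their model from a `model_routing` key with a default of sonnet. The actual protocol lives in docs/PROTOCOLS.md; the sketch below only illustrates the lookup-with-default idea, assuming the configuration has already been loaded into a nested mapping. The function name `resolve_model` and the mapping shape are assumptions.

```python
from typing import Any, Mapping


def resolve_model(config: Mapping[str, Any], role: str, default: str = "sonnet") -> str:
    """Return model_routing.<role> from config, falling back to the default."""
    routing = config.get("model_routing") or {}
    return routing.get(role) or default
```

With an empty configuration, `resolve_model({}, "security_audit")` falls back to `"sonnet"`.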

Present findings with options:

  1. Implement simplifications — dispatch builder with simplification plan
  2. Defer — append to ISSUES.md for future cleanup
  3. Dismiss — acknowledge and proceed

Step 4: Complete Development

After quality gates pass:

  • Announce: "I'm using the git-workflow skill to complete this work."
  • REQUIRED SUB-SKILL: Use shipyard:git-workflow
  • Follow that skill to verify tests, present options, execute choice

Teammate Mode

This section applies when running in a Claude Code Agent Teams context.

As Team Lead (dispatch_mode is team)

When Shipyard created the team via /shipyard:build team mode:

  • Orchestrate teammates via TeamCreate → TaskCreate (pre-assign) → Task(team_name) → TaskList (monitor)
  • Handle shutdown/cleanup via SendMessage(shutdown_request) + TeamDelete
  • Quality gates remain with lead — auditor, simplifier, documenter are dispatched as single-agent Task calls by the lead, not delegated to teammates
  • Monitor progress via TaskList polling until all tasks reach terminal state
  • Cleanup is mandatory — always run shutdown + delete even on errors
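The "cleanup is mandatory" rule maps naturally onto a try/finally structure. In this sketch the four callables are hypothetical placeholders for the TeamCreate, task execution, shutdown request, and TeamDelete steps. They are not real tool signatures.

```python
from typing import Callable


def run_team(create: Callable[[], None], work: Callable[[], None],
             shutdown: Callable[[], None], delete: Callable[[], None]) -> None:
    """Run team work and always shut down and delete the team afterwards."""
    create()
    try:
        work()
    finally:
        # Cleanup is mandatory, so it runs on success and on error alike.
        try:
            shutdown()
        finally:
            delete()
```

The nested `finally` means the delete step still runs even if the shutdown request itself raises.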

As Team Member (SHIPYARD_IS_TEAMMATE=true)

When Shipyard is running inside someone else's team:

  • Execute tasks directly instead of dispatching builder subagents (you ARE the builder)
  • Skip quality gate dispatch (auditor, simplifier) — the lead agent handles these
  • Write results to task metadata instead of STATE.json — the lead reads task list for progress
  • Respect TeammateIdle hook — ensure tests pass before stopping work

In solo mode (neither team-lead nor team-member), this section has no effect — standard subagent dispatch applies.
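The three modes can be distinguished from the two signals named in this section. A minimal sketch, assuming `SHIPYARD_IS_TEAMMATE` is read from the environment and `dispatch_mode` is passed in by the caller. The function name `detect_role` and the precedence when both signals are set are assumptions.

```python
import os
from typing import Mapping, Optional


def detect_role(dispatch_mode: Optional[str], env: Mapping[str, str] = os.environ) -> str:
    """Return "member", "lead", or "solo" from the two signals."""
    # Assumption: the teammate flag wins if both signals are present.
    if env.get("SHIPYARD_IS_TEAMMATE", "").lower() == "true":
        return "member"
    if dispatch_mode == "team":
        return "lead"
    return "solo"
```

A "solo" result corresponds to standard subagent dispatch.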

</instructions>

<examples>

Example: Good vs Bad Execution

<example type="good" title="Proper two-stage review execution">
Task 3: Add retry logic to API client
  1. Dispatch builder agent with full task context from plan
  2. Builder implements, writes tests, commits, self-reviews
  3. Dispatch spec reviewer:
    • "Plan says: retry 3 times with exponential backoff. Code retries 3 times but uses fixed delay."
    • FAIL -- send back to builder
  4. Builder fixes to exponential backoff, re-commits
  5. Dispatch spec reviewer again:
    • "All spec requirements met." PASS
  6. Dispatch code quality reviewer:
    • "Code is clean, tests comprehensive." PASS
  7. Mark task complete, move to Task 4
</example>
<example type="bad" title="Skipping review stages">
Task 3: Add retry logic to API client
  1. Builder implements and commits
  2. "Looks good to me" -- skip spec review, jump to next task
  3. Task 5 fails because Task 3 used fixed delay instead of exponential backoff
  4. Now must revisit Task 3, causing cascading rework
</example>
</examples>

When to Stop and Ask for Help

STOP executing immediately when:

  • Hit a blocker mid-batch (missing dependency, test fails, instruction unclear)
  • Plan has critical gaps preventing starting
  • You don't understand an instruction
  • Verification fails repeatedly

Ask for clarification rather than guessing.

<rules>

Builder Agent Guidelines

Builder agents should:

  • Follow TDD naturally (shipyard:shipyard-tdd)
  • Ask questions before AND during work if unclear
  • Self-review before handing off to reviewers
  • Commit after each task

Red Flags

Never:

  • Skip reviews (spec compliance OR code quality)
  • Proceed with unfixed issues
  • Dispatch multiple builder agents in parallel (conflicts)
  • Make agent read plan file (provide full text instead)
  • Skip scene-setting context (agent needs to understand where task fits)
  • Ignore agent questions (answer before letting them proceed)
  • Accept "close enough" on spec compliance
  • Skip review loops (reviewer found issues = builder fixes = review again)
  • Let builder self-review replace actual review (both are needed)
  • Start code quality review before spec compliance is approved (wrong order)
  • Move to next task while either review has open issues

If builder asks questions:

  • Answer clearly and completely
  • Provide additional context if needed
  • Don't rush them into implementation

If reviewer finds issues:

  • Builder (same agent) fixes them
  • Reviewer reviews again
  • Repeat until approved
  • Don't skip the re-review

If agent fails task:

  • Dispatch fix agent with specific instructions
  • Don't try to fix manually (context pollution)
</rules>

Integration

Required workflow skills:

  • shipyard:shipyard-writing-plans - Creates the plan this skill executes
  • shipyard:git-workflow - Complete development after all tasks

Agents should use:

  • shipyard:shipyard-tdd - Agents follow TDD for each task

Remember

  • Review plan critically first
  • Follow plan steps exactly
  • Don't skip verifications
  • Reference skills when plan says to
  • Between batches: just report and wait
  • Stop when blocked, don't guess

Source

https://github.com/lgbarn/shipyard/blob/main/skills/shipyard-executing-plans/SKILL.md

Overview

Executing plans dispatches a fresh builder agent for each task and applies two-stage reviews: first a spec compliance check, then a code quality review. It supports both per-task execution and batch execution with checkpoints for human oversight.

How This Skill Works

The skill loads the plan, converts it into discrete tasks using TaskCreate, and dispatches a new builder agent for each task. Each task goes through a spec review to verify alignment with the plan, then a code quality review before marking completion. If needed, you can run in batch mode with human checkpoints.

When to Use It

  • You have a written implementation plan and most tasks are independent.
  • You want to dispatch a fresh builder agent per task with a two-stage review (spec then quality).
  • You need either per-task execution or batch execution with checkpoints for human oversight.
  • You are in a workflow that uses builder/reviewer agents or a separate review session.
  • You want to validate plan alignment and ensure quality before finalizing tasks.

Quick Start

  1. Load the plan file and review for questions or concerns.
  2. Create tasks with TaskCreate and dispatch a fresh builder agent for each task.
  3. Run the two-stage review after each task (spec then quality) and mark completion.

Best Practices

  • Load the plan and raise any questions or concerns with a human partner before starting.
  • Break the plan into independent tasks using TaskCreate for clear dispatch boundaries.
  • Always dispatch a fresh builder agent per task to avoid cross-task leakage.
  • Enforce two-stage reviews after each task: spec compliance first, then code quality.
  • Prefer per-task execution, but switch to batch mode with checkpoints when parallel sessions are required.

Example Use Cases

  • Execute a feature rollout by turning a written plan into independent tasks and validating each via spec and quality reviews.
  • Refactor a module by assigning per-task builders and staging code through spec and quality gates.
  • Run a multi-task plan in a separate, parallel session using batch execution, with human checkpoints to monitor progress.
  • Handle tightly coupled tasks by conducting an upfront brainstorming pass before dispatching tasks.
  • Complete a plan with a final review gate after all tasks are done, ensuring overall alignment.
