What is a closed loop in this skill and why is it important?

A closed loop means writing a test, implementing, running tests, and refactoring if needed to verify behavior end-to-end, preventing open-ended work.

How do I manage long-running tasks across sessions?

Document state in claude-progress.txt, use incremental increments with checkpoints, and leverage TaskCreate/TaskUpdate to persist progress and auto-unblock when dependencies finish.

How do subagents help with context limits?

Subagents operate with isolated contexts and return only relevant results to the orchestrator, reducing context fragmentation and enabling true parallelism; use /compact for manual context consolidation when needed.

autonomous-longtask-v2

Scanned

npx machina-cli add skill T-0-co/t-0-spec-kit-ralph/autonomous-longtask-v2 --openclaw

Files (1)

SKILL.md

8.7 KB

Autonomous Long-Task Development

This skill optimizes Claude Code for long, autonomous development tasks—from multi-hour feature implementations to multi-session refactorings.

Core Principles

1. Loop Closing (Test-Driven)

Every code change must close a feedback loop:

1. Write test (define expected behavior)
2. Implement code
3. Run test → loop closed
4. Refactor if needed
5. Re-run test → confidence

Anti-pattern (Open Loop):

"Implement feature X"
→ Code written, no verification
→ Bugs discovered later

Best Practice (Closed Loop):

"Implement feature X with tests.
Run tests after implementation."
→ Immediate verification

2. Incremental Progress with Checkpoints

Never attempt to "one-shot" complex features:

❌ Implement entire feature at once
   → Context runs out mid-implementation
   → Next session inherits chaos

✅ Small, tested increments
   → Each increment works standalone
   → Clean handoff between sessions

Checkpoint patterns:

Use /rewind or Esc Esc for rollbacks
Commit after each working increment
Document state in claude-progress.txt for session handoffs

3. Context Management

Claude has 200K tokens, but:

Subagents have isolated context windows
Long sessions fragment context
Use claude-progress.txt for session handoffs
/compact <focus> for manual compaction

Task System

Use TaskCreate/TaskUpdate for persistent progress tracking (replaces deprecated TodoWrite).

Quick Start

1. TaskCreate: Create task with subject, description, activeForm
2. TaskUpdate: Set dependencies with addBlockedBy
3. TaskUpdate: Mark in_progress when starting
4. TaskUpdate: Mark completed when done
5. TaskList: Find next available work

Task Workflow

pending → in_progress → completed
         ↑
    (start work)

Tasks persist across sessions and auto-unblock when dependencies complete.

For detailed patterns: See references/task-patterns.md

Subagents & Parallelization

Subagents are lightweight Claude instances with isolated context. Only relevant results return to the orchestrator.

Available Agent Types

Agent	Purpose	Tools
`general-purpose`	Complex multi-step tasks	All
`Explore`	Codebase exploration, pattern search	Glob, Grep, Read (no Edit/Write)
`Plan`	Implementation planning	Glob, Grep, Read (no Edit/Write)
`Bash`	Git operations, command execution	Bash only

When to Use Subagents

DO:

Explore unfamiliar codebase areas
Search patterns across many files
Parallelize independent tasks (max 10 parallel)
Isolate high-volume operations (tests, logs)

DON'T:

Read specific known file → Use Read directly
Search in 2-3 files → Use Read directly
Find class definition → Use Glob directly

Parallel Patterns

Background subagents:

"Implement Stripe integration parallel:

Subagent 1 (Backend): Create API endpoint
Subagent 2 (Frontend): Payment form component
Subagent 3 (Tests): Integration tests

Start all three with Task tool (run_in_background: true).
Wait for completion, then integrate."

Git worktrees for true parallelism:

git worktree add ../feature-a feature/a
git worktree add ../feature-b feature/b
# Separate Claude sessions in each worktree

Custom Subagents

Create custom agents in .claude/agents/:

long-task-coordinator.md - orchestrates multi-step work
test-runner.md - runs and fixes tests
code-reviewer.md - reviews before commit

Session Management

Naming Sessions

Use /rename to give sessions descriptive names:

/rename stripe-integration

Resuming Work

claude --continue    # Resume most recent session
claude --resume      # Session picker
claude --from-pr 123 # Resume PR-linked session

Session Picker Shortcuts

Key	Action
`↑/↓`	Navigate sessions
`P`	Preview session
`R`	Rename session
`B`	Filter by branch
`/`	Search

Context Management

Auto-compaction triggers at ~95% capacity
Configure earlier: CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50
Manual: /compact Focus on API changes
Clear between unrelated tasks: /clear

For detailed patterns: See references/session-management.md

Template Prompts

Long-Running Feature Implementation

"Implement [FEATURE] with complete loop-closing.

Requirements:
- [Req 1]
- [Req 2]

Approach:
1. Create tasks with TaskCreate for each major step
2. Set dependencies with TaskUpdate (addBlockedBy)
3. Work through tasks: test → implement → verify

Constraints:
- Each increment must be testable
- Commit after each working step
- Update claude-progress.txt if context fills

Token-Budget: Use full 200K, do NOT stop early.
At context limit: Update claude-progress.txt for handoff.
Parallelize independent tasks with subagents."

Multi-Session Refactoring

"Refactoring Part [N] of [TOTAL].

Previous Sessions:
- See claude-progress.txt
- Git log for prior changes

This Session Goals:
- [Goal 1]
- [Goal 2]

At session end:
- Update claude-progress.txt
- All tests must pass
- Document clear next steps

If context runs low:
- Finish cleanly (no half implementations)
- Write handoff document
- Next session continues seamlessly"

Bug Investigation

"Investigate and fix bug: [DESCRIPTION]

Observed behavior:
- [What happens]
- [When it happens]
- [Error messages]

Investigation Steps:
1. Reproduce locally
2. Check logs
3. Identify root cause
4. Implement fix
5. Add regression test (loop closing!)
6. Verify

Document findings in claude-progress.txt if complex."

Autonomy Levels

High Autonomy (Default for Long Tasks)

"Implement feature X.
Make all implementation decisions based on best practices.
Do NOT stop due to token budget.
Create tests and docs as you work."

Guided Autonomy (For Critical Decisions)

"Implement feature X.
Ask me BEFORE decisions about:
- Database schema changes
- External API selection
- Breaking changes

For everything else: Proceed autonomously."

Common Failure Modes

Problem	Solution
Claude stops too early	Say "Do NOT stop due to token budget" explicitly
Too many clarifying questions	More context upfront, set autonomy level
Code style mismatch	"Follow pattern in [FILE]"
Task too complex	Break into increments, use TaskCreate
Context runs out	claude-progress.txt + clean commits
One-shotting fails	Explicitly require incremental approach
Lost progress between sessions	Use Tasks system (persists), not mental tracking

Verification Patterns

Test Pyramid

          /\
         /E2E\ ← Playwright (critical flows)
        /-----\
       / API   \ ← Integration tests
      /---------\
     /   Unit    \ ← Fastest feedback
    /--------------\

Loop-Closing Checklist

Test exists for the change
Test ran after implementation
Test passed (or failure was addressed)
Commit includes both code and test

For detailed patterns: See references/verification-patterns.md

Token & Cost Considerations

Single agent: ~50K-100K tokens per session
Parallel agents: 3-4x higher token consumption
Trade-off: Higher velocity vs. higher cost

Recommendations:

Quick tasks: Single agent
Complex/long tasks: Subagents justify cost
Claude Max Plan for heavy-duty usage

Quick Reference

START:
1. TaskCreate with task breakdown
2. Define autonomy level
3. Establish loop-closing pattern

DURING:
- Test each increment
- Commit after working steps
- Subagents for parallel work
- TaskUpdate status as you progress

END / HANDOFF:
- Update claude-progress.txt
- All tests green
- Clear next steps documented
- TaskUpdate: mark completed

Sources

Based on:

Source

git clone https://github.com/T-0-co/t-0-spec-kit-ralph/blob/main/.agents/skills/autonomous-longtask-v2/SKILL.mdView on GitHub

Overview

This skill optimizes Claude Code for long, autonomous development tasks, enabling multi-session work, subagents, parallelization, and loop-closing. It is ideal for complex features across multiple files or services, workflows with many dependent steps, and long-running tasks that exceed context limits.

How This Skill Works

Work is broken into small, verifiable increments with closed feedback loops. Use TaskCreate/TaskUpdate to track progress, dependencies, and status across sessions, and document state in claude-progress.txt for handoffs. Leverage subagents and parallelization to accelerate work, and rely on context-management practices like /compact and session handoffs to avoid context loss.

When to Use It

Implementing a multi-file feature across services with many steps
Refactoring or migrating code across sessions that span context limits
Long-running tasks that exceed 30 minutes and require checkpoints
Work requiring parallel exploration and execution by subagents
Complex features involving multiple files/services with dependency-driven progress

Quick Start

Step 1: TaskCreate: Create task with subject, description, activeForm
Step 2: TaskUpdate: Set dependencies with addBlockedBy; mark in_progress when starting
Step 3: TaskList: Upon completion, mark finished and fetch the next available work

Best Practices

Always close feedback loops by writing tests first, then implementing and re-running tests
Break work into small, testable increments and commit after each working increment
Document current state and handoffs in claude-progress.txt for session continuity
Use TaskCreate/TaskUpdate and TaskList to manage persistent work and auto-unblock on dependencies
Use subagents and parallelization for independent tasks; leverage context-management tools like /compact and worktrees

Example Use Cases

Implement Stripe integration by delegating backend, frontend, and tests to parallel subagents; synchronize results after all subagents finish
Refactor a multi-file feature across services with incremental commits and claude-progress.txt handoffs
Add a new authentication flow by first writing tests, then implementing code in increments with automated reruns
Execute a long-running data migration across modules with checkpoints and dependency-driven progression
Explore a large codebase with Read/Glob subagents to surface patterns, then consolidate findings into a cohesive plan

Frequently Asked Questions

Add this skill to your agents