
writing-skills

npx machina-cli add skill wayne930242/Reflexive-Claude-Code/writing-skills --openclaw

Writing Skills

Overview

Writing skills IS Test-Driven Development applied to process documentation.

Write baseline test (watch agent fail without skill), write skill addressing failures, verify agent now complies, refactor to close loopholes.

Core principle: If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.

Violating the letter of the rules is violating the spirit of the rules.

Task Initialization (MANDATORY)

Before ANY action, create task list using TaskCreate:

TaskCreate for EACH task below:
- Subject: "[writing-skills] Task N: <action>"
- ActiveForm: "<doing action>"

Tasks:

  1. Analyze requirements
  2. RED - Run baseline test
  3. GREEN - Write SKILL.md
  4. Add references (if needed)
  5. Validate structure
  6. REFACTOR - Quality review
  7. Test with real usage

Announce: "Created 7 tasks. Starting execution..."

Execution rules:

  1. TaskUpdate status="in_progress" BEFORE starting each task
  2. TaskUpdate status="completed" ONLY after verification passes
  3. If task fails → stay in_progress, diagnose, retry
  4. NEVER skip to next task until current is completed
  5. At end, TaskList to confirm all completed
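
TaskCreate and TaskUpdate are tools in the host environment, not a Python API. As an illustration only, the sketch below models the execution rules above as a small state machine, assuming a hypothetical `execute`/`verify` pair for each task:

```python
# Illustration only: TaskCreate/TaskUpdate are host tools, not a Python API.
# This models the execution rules above as a small state machine.
TASKS = [
    "Analyze requirements",
    "RED - Run baseline test",
    "GREEN - Write SKILL.md",
    "Add references (if needed)",
    "Validate structure",
    "REFACTOR - Quality review",
    "Test with real usage",
]

def run_tasks(execute, verify):
    """execute(n) does the work for task n; verify(n) returns True only when its checks pass."""
    status = {n: "pending" for n in range(1, len(TASKS) + 1)}
    for n, _action in enumerate(TASKS, start=1):
        status[n] = "in_progress"          # rule 1: mark before starting
        while True:
            execute(n)
            if verify(n):                  # rule 2: completed only after verification passes
                status[n] = "completed"
                break
            # rule 3: task failed -> stay in_progress, diagnose, retry
        # rule 4: the loop above never advances past an unfinished task
    assert all(s == "completed" for s in status.values())  # rule 5: confirm all completed
    return status
```

The point the sketch encodes: a task can only leave in_progress through a passing verification, and there is no path that skips a task.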

TDD Mapping for Skills

| TDD Phase | Skill Creation | What You Do |
|---|---|---|
| RED | Baseline test | Run scenario WITHOUT skill, watch agent fail |
| Verify RED | Capture failures | Document exact rationalizations verbatim |
| GREEN | Write skill | Address specific baseline failures |
| Verify GREEN | Pressure test | Run scenario WITH skill, verify compliance |
| REFACTOR | Close loopholes | Find new rationalizations, add counters |

Task 1: Analyze Requirements

Goal: Understand what skill to create and why.

Questions to answer:

  • What capability does this skill teach?
  • What triggers should activate it? (specific phrases, symptoms)
  • Is this reusable across projects, or project-specific?
  • Does a similar skill already exist?

Verification: Can clearly state in one sentence what the skill does and when to use it.

External search (optional): If claude-skills-mcp is available:

mcp__claude-skills-mcp__search_skills query="[capability]"

Analyze existing community skills for patterns to adapt.

If MCP unavailable → skip, continue with internal knowledge.

Task 2: RED - Baseline Test

Goal: Run scenario WITHOUT the skill. Watch agent fail. Document failures.

This is "write failing test first" - you MUST see what agents naturally do wrong.

Process:

  1. Create realistic pressure scenario (see Pressure Scenarios below)
  2. Run scenario in fresh context WITHOUT skill loaded
  3. Document agent's choices and rationalizations verbatim
  4. Identify patterns in failures

Verification: Have documented at least 3 specific failure modes or rationalizations.
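
As a purely hypothetical illustration (invented examples, not real transcript data), a documented baseline run might be captured as structured records like these:

```python
# Hypothetical illustration of documented baseline failures; the scenarios and
# quoted rationalizations are invented examples, not real transcript data.
baseline_failures = [
    {
        "scenario": "time pressure + sunk cost (3 hours in, dinner at 6:30pm)",
        "choice": "B",
        "rationalization": "It already works; I'll add the skipped step later.",
    },
    {
        "scenario": "authority pressure (senior says skip it)",
        "choice": "C",
        "rationalization": "A quick partial fix is good enough for now.",
    },
    {
        "scenario": "exhaustion + manual testing already done",
        "choice": "B",
        "rationalization": "Manual testing already covered the edge cases.",
    },
]
assert len(baseline_failures) >= 3  # the verification criterion above
```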

Pressure Scenarios

Bad scenario (no pressure):

You need to implement a feature. What should you do?

Too academic. Agent just recites best practices.

Good scenario (multiple pressures):

IMPORTANT: This is a real scenario. Choose and act.

You spent 3 hours on this. It's working perfectly.
You manually tested all edge cases. It's 6pm, dinner at 6:30pm.
You just realized you skipped [the practice this skill enforces].

Options:
A) Start over with correct approach
B) Continue as-is, fix later
C) Quick partial fix now

Choose A, B, or C. Be honest.

Pressure types to combine (use 3+):

| Pressure | Example |
|---|---|
| Time | Deadline, end of day |
| Sunk cost | Hours of work already done |
| Authority | Senior says skip it |
| Exhaustion | Already tired |

Task 3: GREEN - Write SKILL.md

Goal: Write skill that addresses the specific baseline failures you documented.

Skill Structure

skill-name/
├── SKILL.md           # Required (<300 lines)
├── scripts/           # Optional: executable tools
└── references/        # Optional: detailed docs

Naming Convention

Gerund form (verb + -ing): writing-skills, processing-pdfs

  • Lowercase, hyphens, numbers only
  • Max 64 characters
  • Avoid: helper, utils, anthropic, claude

SKILL.md Format

---
name: skill-name
description: Use when [trigger]. Use when [symptom]. Use when [context].
argument-hint: [optional-hint]
---

Frontmatter fields:

| Field | Description |
|---|---|
| name | Becomes /slash-command. Lowercase, hyphens only. |
| description | When to use. Claude uses this for auto-invocation. |
| argument-hint | Autocomplete hint, e.g., [issue-number] |
| disable-model-invocation | true = only user can invoke via /name |
| user-invocable | false = hide from / menu, only Claude invokes |
| allowed-tools | Tools Claude can use without permission |
| context | fork = run in isolated subagent |
| agent | Subagent type when context: fork |

Arguments substitution:

  • $ARGUMENTS - all arguments from /skill-name arg1 arg2
  • $ARGUMENTS[0], $ARGUMENTS[1] - by index
  • $0, $1 - shorthand for $ARGUMENTS[N]
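
To make the substitution rules concrete, here is a rough Python sketch of the expansion semantics described above; the actual runtime implementation may differ in edge cases:

```python
import re

def substitute(body: str, args: list[str]) -> str:
    """Rough sketch of the argument substitution semantics described above
    (the real runtime may handle edge cases differently)."""
    # $ARGUMENTS[0], $ARGUMENTS[1], ... and the $0, $1 shorthand
    def by_index(match: re.Match) -> str:
        i = int(match.group(1) or match.group(2))
        return args[i] if i < len(args) else ""
    body = re.sub(r"\$ARGUMENTS\[(\d+)\]|\$(\d+)", by_index, body)
    # bare $ARGUMENTS expands to all arguments joined by spaces
    return body.replace("$ARGUMENTS", " ".join(args))

# /skill-name 42 high  ->  args = ["42", "high"]
print(substitute("Fix issue $0 with priority $1 ($ARGUMENTS)", ["42", "high"]))
# Fix issue 42 with priority high (42 high)
```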

Description rules (CRITICAL):

  • Start with "Use when..." (triggering conditions ONLY)
  • Third person (injected into system prompt)
  • NEVER summarize workflow (causes Claude to skip reading body)
  • 50-500 characters
  • Include specific triggers, symptoms, contexts

<Good>
```yaml
description: Use when creating new skills, modifying existing skills, or improving skill quality. Use when user says "create a skill".
```
Triggering conditions only.
</Good>

<Bad>
```yaml
description: Creates skills using TDD approach with baseline testing, then writing, then validation.
```
Summarizes workflow - Claude will follow description instead of reading skill.
</Bad>

Body Structure

# Skill Name

## Overview
Core principle in 1-2 sentences.
"Violating the letter of the rules is violating the spirit of the rules."

## Task Initialization (MANDATORY)
[Task list with TaskCreate]

## Tasks
### Task 1: [Action]
[Instructions + Verification criteria]

### Task N: Quality Review
[Reviewer subagent call]

## Red Flags
[Self-check table]

## Common Rationalizations
[Excuse | Reality table]

## References
[Links to references/]

Verification

Can answer YES to all:

  • Description starts with "Use when..."
  • Description does NOT summarize workflow
  • Body < 300 lines
  • Has Task Initialization section
  • Has Red Flags section
  • Has verification criteria for each task

Task 4: Add References (if needed)

Goal: Move detailed content to references/ for progressive disclosure.

When to extract:

  • API documentation (100+ lines)
  • Detailed examples
  • Edge case handling
  • Background theory

Keep in SKILL.md:

  • Core workflow
  • Task definitions
  • Red Flags and Rationalizations
  • Quick reference tables

Verification: SKILL.md is < 300 lines. All detailed content has a reference link.

Task 5: Validate Structure

Goal: Verify skill structure is correct.

python3 scripts/validate_skill.py <path/to/skill>

Manual checklist if script unavailable:

  • Frontmatter has name and description
  • Name is gerund form, lowercase, hyphens only
  • Description starts with "Use when..."
  • Body < 300 lines
  • All reference links work

Verification: Validation passes with no errors.
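
If scripts/validate_skill.py is unavailable, a minimal Python stand-in covering only the manual checklist above might look like the following sketch (the real script likely checks more, and the filename validate_sketch.py is hypothetical):

```python
#!/usr/bin/env python3
"""Rough stand-in for scripts/validate_skill.py when it is unavailable.
Checks only the manual checklist above; the real script may do more."""
import re
import sys
from pathlib import Path

def validate(skill_dir: str) -> list[str]:
    errors = []
    text = (Path(skill_dir) / "SKILL.md").read_text(encoding="utf-8")
    parts = text.split("---", 2)
    if len(parts) < 3:
        return ["missing YAML frontmatter"]
    front, body = parts[1], parts[2]
    name = re.search(r"^name:\s*(.+)$", front, re.M)
    desc = re.search(r"^description:\s*(.+)$", front, re.M)
    if not name or not re.fullmatch(r"[a-z0-9-]{1,64}", name.group(1).strip()):
        errors.append("name must be lowercase, hyphens/numbers only, max 64 chars")
    if not desc or not desc.group(1).strip().startswith("Use when"):
        errors.append('description must start with "Use when..."')
    if len(body.splitlines()) >= 300:
        errors.append("body must be under 300 lines")
    return errors

if __name__ == "__main__":
    problems = validate(sys.argv[1])
    print("\n".join(problems) or "OK")
    sys.exit(1 if problems else 0)
```

Run it as python3 validate_sketch.py <path/to/skill>; it prints the failed checks and exits non-zero when any check fails.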

Task 6: REFACTOR - Quality Review

Goal: Have skill reviewed by skill-reviewer subagent.

Task tool:
- subagent_type: "rcc:skill-reviewer"
- prompt: "Review skill at [path/to/skill]"

Outcomes:

  • Pass → Proceed to Task 7
  • Needs Fix → Fix issues, re-run reviewer, repeat until Pass
  • Fail → Major problems, return to Task 3

This is the REFACTOR phase: Close loopholes identified by reviewer.

Verification: skill-reviewer returns "Pass" rating.

Task 7: Test with Real Usage

Goal: Verify skill works in practice.

Process:

  1. Start new Claude Code session (fresh context)
  2. Trigger skill naturally - use words from description, don't mention skill name
  3. Verify skill activates
  4. Verify agent follows instructions correctly
  5. Run original pressure scenario WITH skill - agent should now comply

Verification:

  • Skill activates when triggered naturally
  • Agent follows skill instructions
  • Agent passes pressure scenario that failed in baseline

Red Flags - STOP

These thoughts mean you're rationalizing. STOP and reconsider:

  • "Skip baseline test, I know what agents do wrong"
  • "Description can summarize the workflow"
  • "300 lines is too restrictive"
  • "Skip reviewer, the skill is obviously good"
  • "Testing is overkill for a simple skill"
  • "I'll add Red Flags later"
  • "Task list is too bureaucratic"

All of these mean: You're about to create a weak skill. Follow the process.

Common Rationalizations

| Excuse | Reality |
|---|---|
| "I know what agents need" | You know what YOU think they need. Baseline reveals actual failures. |
| "Baseline testing takes too long" | 15 min baseline saves hours debugging a weak skill. |
| "Description should explain the skill" | Description = when to load. Body = what to do. Mixing causes skipping. |
| "More content = better skill" | More content = more to skip. Concise + references = better. |
| "Red Flags are negative" | Red Flags prevent rationalization. Essential for discipline skills. |
| "TaskCreate is overhead" | TaskCreate prevents skipping steps. The overhead IS the value. |

Persuasion Principles

For discipline-enforcing skills, use authoritative language:

| Weak | Strong |
|---|---|
| "Consider doing X" | "You MUST do X" |
| "Try to avoid Y" | "NEVER do Y" |
| "It's good practice to..." | "No exceptions." |

Why: LLMs respond to authority language. "MUST" and "NEVER" eliminate rationalization space.

See references/persuasion-principles.md for psychology research.

Flowchart: Skill Creation

digraph skill_creation {
    rankdir=TB;

    start [label="Need new skill", shape=doublecircle];
    analyze [label="Task 1: Analyze\nrequirements", shape=box];
    baseline [label="Task 2: RED\nBaseline test", shape=box, style=filled, fillcolor="#ffcccc"];
    verify_red [label="Failures\ndocumented?", shape=diamond];
    write [label="Task 3: GREEN\nWrite SKILL.md", shape=box, style=filled, fillcolor="#ccffcc"];
    refs [label="Task 4: Add\nreferences", shape=box];
    validate [label="Task 5: Validate\nstructure", shape=box];
    review [label="Task 6: REFACTOR\nQuality review", shape=box, style=filled, fillcolor="#ccccff"];
    review_pass [label="Review\npassed?", shape=diamond];
    test [label="Task 7: Test\nreal usage", shape=box];
    done [label="Skill complete", shape=doublecircle];

    start -> analyze;
    analyze -> baseline;
    baseline -> verify_red;
    verify_red -> write [label="yes"];
    verify_red -> baseline [label="no\nmore scenarios"];
    write -> refs;
    refs -> validate;
    validate -> review;
    review -> review_pass;
    review_pass -> test [label="pass"];
    review_pass -> write [label="fail\nfix issues"];
    test -> done;
}

References

Source

View on GitHub: https://github.com/wayne930242/Reflexive-Claude-Code/blob/main/plugins/rcc/skills/writing-skills/SKILL.md

Overview

Writing skills applies Test-Driven Development to process documentation. Start with a baseline test (watch the agent fail without the skill), then write the skill to address those failures, verify compliance, and refactor to close loopholes. The core principle: you must observe the agent failing without the skill to know the skill teaches the right thing.

How This Skill Works

Writing Skills implements a mandatory Task Initialization step using TaskCreate to outline actions and enforce a strict TDD workflow. It follows the TDD mapping (RED, GREEN, REFACTOR, each with a verification step) and execution rules that set task status to in_progress before each task and to completed only after verification passes; if a task fails, it stays in_progress while you diagnose and retry, until all tasks are completed.

When to Use It

  • When creating a new skill
  • When modifying an existing skill
  • When improving skill quality or consistency
  • When you want to extract a reusable pattern across projects
  • When you need to validate a skill against real usage or edge cases

Quick Start

  1. Define the skill scope and trigger phrases, and initialize the 7 TaskCreate tasks with Subject and ActiveForm
  2. Run the RED baseline test in a fresh context without the skill to observe failures and collect verbatim rationalizations
  3. Implement GREEN by writing or adjusting SKILL.md, verify GREEN, then run REFACTOR and test with real usage

Best Practices

  • Define a clear baseline failure for the RED phase to reveal gaps
  • Document all failures verbatim during RED to capture patterns
  • Write the GREEN skill to address specific baseline failures
  • Apply REFACTOR to close loopholes and add counters for new edge cases
  • Follow the Task Initialization and execution rules to ensure disciplined progress

Example Use Cases

  • Create a new 'data-cleaning' skill for input normalization and consistency
  • Extend an existing 'summarization' skill to handle long documents and maintain key points
  • Refactor an 'exception-handling' skill to cover additional error types and messages
  • Adopt TaskCreate-driven workflows for establishing reusable patterns across projects
  • Test skills with real usage scenarios to surface failures and rationalizations
