context-optimizer
Install via machina-cli:
npx machina-cli add skill pablodiegoo/Data-Pro-Skill/context-optimizer --openclaw

Context Optimizer
This skill transforms large, monolithic documents into a modular .agent/ folder structure optimized for AI agent context consumption. The goal is to minimize context window usage while maximizing information accessibility.
Quick Reference
| Content Type | Destination | Naming Convention | When to Use |
|---|---|---|---|
| Core Rules/Facts | memory/ | project_facts.md, conventions.md | Immutable truths, constraints, standards |
| Processes/How-To | workflows/ | deploy.md, review.md | Step-by-step procedures (turbo-enabled) |
| Tasks/Plans | tasks/ | backlog.md, sprint.md | Active work items, implementation plans |
| Reference Docs | references/ | api_docs.md, schema.md | Large docs loaded on-demand |
| Skills | skills/ | <skill-name>/SKILL.md | Reusable capabilities with scripts |
Workflow
Phase 1: Analyze Source Document
Before splitting, understand the document's structure:
# Preview structure without splitting
head -100 <input_file> | grep -E "^#{1,3} "
Identify:
- Hierarchical depth: How many heading levels exist?
- Content density: Are sections long enough to justify separate files?
- Semantic groupings: Which sections belong together?
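The same structural survey can be done in a few lines of Python; the `outline` helper below is illustrative, not part of the bundled scripts:

```python
import re

def outline(text, max_level=3):
    """Summarize heading structure as (level, title, lines_until_next_heading)."""
    lines = text.splitlines()
    sections = []  # (level, title, start_line)
    for i, line in enumerate(lines):
        m = re.match(r"^(#{1,%d})\s+(.*)" % max_level, line)
        if m:
            sections.append((len(m.group(1)), m.group(2), i))
    result = []
    for j, (level, title, start) in enumerate(sections):
        # Section length runs from this heading to the next one (or EOF)
        end = sections[j + 1][2] if j + 1 < len(sections) else len(lines)
        result.append((level, title, end - start))
    return result

doc = "# Spec\nintro\n\n## Setup\nstep 1\nstep 2\n\n## Deploy\nship it\n"
for level, title, length in outline(doc):
    print(f"{'  ' * (level - 1)}{title}: {length} lines")
```

Short sections flagged here are candidates for merging rather than splitting into their own files.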
Phase 2: Decompose with Script
Use the bundled script to split the document:
python3 .agent/skills/context-optimizer/scripts/decompose.py <input_file> -o <output_dir> [options]
Arguments
| Argument | Description | Default |
|---|---|---|
| input_file | Large markdown/text file to split | Required |
| -o, --output | Output directory for chunks | <input>_split/ |
| -l, --level | Header level to split by (1=#, 2=##) | 2 |
| -r, --regex | Custom regex pattern (group 1 = title) | Markdown headers |
| --min | Minimum lines per section | 3 |
Examples
# Split by ## (default)
python3 decompose.py project_spec.md -o .agent/temp_split
# Split by # (top-level only)
python3 decompose.py large_doc.md -o chunks -l 1
# Custom pattern (e.g., numbered sections)
python3 decompose.py report.md -r "^(\d+\.\s+.+)$" -o sections
Output: Creates numbered files (01_section_name.md, 02_...) plus 00_INDEX.md and 00_preamble.md.
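Conceptually, the split is a single pass that opens a new chunk at each matching header. A minimal sketch of that idea (not the bundled decompose.py itself; the `split_by_header` helper and its drop-short-sections behavior are assumptions):

```python
import re

def split_by_header(text, level=2, min_lines=3):
    """Split markdown text into (title, body) chunks at the given header level."""
    pattern = re.compile(r"^#{%d}\s+(.+)$" % level)
    chunks, title, buf = [], "preamble", []
    for line in text.splitlines():
        m = pattern.match(line)
        if m:
            # Close the previous chunk; sections below min_lines are dropped here
            # (the real script may handle undersized sections differently)
            if len(buf) >= min_lines:
                chunks.append((title, "\n".join(buf)))
            title, buf = m.group(1), [line]
        else:
            buf.append(line)
    if len(buf) >= min_lines:
        chunks.append((title, "\n".join(buf)))
    return chunks

doc = "intro a\nintro b\nintro c\n## One\nx\ny\n## Two\nz"
chunks = split_by_header(doc)
```

Everything before the first matching header lands in the preamble chunk, which is why the script emits a separate 00_preamble.md.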
Phase 3: Organize into .agent Structure
After decomposition, manually categorize each chunk:
.agent/
├── memory/ # Persistent context (always loaded)
│ ├── user_global.md # User preferences, patterns
│ ├── project_facts.md # Tech stack, constraints, conventions
│ └── decisions.md # ADRs, architectural decisions
│
├── workflows/ # Step-by-step procedures
│ ├── deploy.md # Deployment process
│ ├── review.md # Code review checklist
│ └── testing.md # Testing procedures
│
├── tasks/ # Active work items
│ ├── backlog.md # Feature backlog
│ ├── current_sprint.md # Active sprint items
│ └── implementation_plan.md # Current implementation plan
│
├── references/ # On-demand documentation
│ ├── api_docs.md # API specifications
│ ├── schema.md # Database/data schemas
│ └── external_libs.md # Third-party library docs
│
└── skills/ # Reusable capabilities
└── <skill-name>/
└── SKILL.md
Phase 4: Optimize Each File
For each categorized file, apply these optimizations:
Memory Files (High Priority)
- Maximum size: ~500 lines (always loaded)
- Format: Bullet points, tables, concise rules
- Avoid: Long explanations, examples (move to references)
Workflow Files
- Format: Numbered steps with clear actions
- Include: // turbo annotations for auto-runnable steps
- Structure: Prerequisites → Steps → Verification
Task Files
- Format: Checkbox lists ([ ], [/], [x])
- Include: Priority, deadlines, dependencies
- Update: Mark items as in-progress/done during work
Reference Files
- Maximum size: Unlimited (loaded on-demand)
- Include: Table of contents for files > 100 lines
- Add: Grep patterns in SKILL.md for large files
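The TOC for large reference files can be generated mechanically. One possible sketch, assuming GitHub-style lowercase-hyphen anchors (the `make_toc` helper is illustrative, not part of the skill):

```python
import re

def make_toc(text):
    """Build a markdown table of contents from ## headings."""
    entries = []
    for line in text.splitlines():
        m = re.match(r"^##\s+(.+)$", line)
        if m:
            title = m.group(1)
            # Approximate GitHub's anchor slugs: lowercase, strip punctuation,
            # spaces become hyphens (exact slug rules vary by renderer)
            anchor = re.sub(r"[^a-z0-9 -]", "", title.lower()).replace(" ", "-")
            entries.append(f"- [{title}](#{anchor})")
    return "\n".join(entries)

toc = make_toc("## API Keys\nbody\n## Rate Limits\nbody")
```

Prepend the result to the reference file so an agent can jump to a section without reading the whole document.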
Phase 5: Cleanup
Remove temporary files and validate structure:
# Remove decomposition output
rm -rf .agent/temp_split
# Validate structure (optional)
find .agent -name "*.md" -exec wc -l {} \; | sort -n
Decision Matrix
Use this matrix to decide where content belongs:
┌─────────────────────────────────────────────────────────────────┐
│ Is it a PROCESS/HOW-TO? │
│ │ │
│ ┌───────────────┴───────────────┐ │
│ ▼ YES ▼ NO │
│ ┌────────────────┐ ┌────────────────┐ │
│ │ workflows/ │ │ Is it ACTIVE │ │
│ │ │ │ work to track? │ │
│ └────────────────┘ └───────┬────────┘ │
│ ┌───────┴───────┐ │
│ ▼ YES ▼ NO │
│ ┌──────────┐ ┌──────────────┐ │
│ │ tasks/ │ │ Is it a RULE │ │
│ │ │ │ or FACT? │ │
│ └──────────┘ └──────┬───────┘ │
│ ┌──────┴──────┐ │
│ ▼ YES ▼ NO │
│ ┌──────────┐ ┌──────────┐ │
│ │ memory/ │ │references/│ │
│ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────────────┘
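The same routing logic, expressed as a small reference function (the flag names mirror the questions in the diagram; the function itself is illustrative):

```python
def categorize(is_process: bool, is_active_work: bool, is_rule_or_fact: bool) -> str:
    """Route a chunk to a folder, mirroring the decision matrix above."""
    if is_process:
        return "workflows/"      # step-by-step procedures
    if is_active_work:
        return "tasks/"          # work items to track
    if is_rule_or_fact:
        return "memory/"         # immutable truths and conventions
    return "references/"         # everything else, loaded on demand
```

Note the order matters: a how-to that also states rules still belongs in workflows/, because the first question wins.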
Best Practices
- Memory files are expensive — Keep them under 500 lines total
- Use references for large docs — They're loaded only when needed
- One concept per file — Easier to update and search
- Add TOC to large files — For files > 100 lines, include a table of contents
- Use consistent naming — snake_case.md for all files
- Delete empty directories — Don't keep placeholder folders
Phase 3 (Alternative): Automated Semantic Grouping
Automatically categorize your chunks into .agent/ folders:
python3 .agent/skills/context-optimizer/scripts/group_sections.py <split_dir> --move
This script analyzes each chunk for keywords and structural markers to suggest whether it belongs in memory/, workflows/, tasks/, or references/.
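The kind of keyword scoring such a script might apply can be sketched as follows (the keyword lists and tie-breaking below are assumptions, not group_sections.py's actual heuristics):

```python
# Hypothetical keyword lists per destination folder
KEYWORDS = {
    "workflows": ["step", "run", "deploy", "procedure", "install"],
    "tasks": ["todo", "backlog", "sprint", "[ ]", "deadline"],
    "memory": ["convention", "rule", "constraint", "must", "never"],
}

def suggest_folder(text):
    """Suggest a destination folder by counting keyword hits; default references/."""
    lowered = text.lower()
    scores = {
        folder: sum(lowered.count(word) for word in words)
        for folder, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No keyword hits at all -> treat as on-demand reference material
    return best if scores[best] > 0 else "references"
```

Treat the output as a suggestion to review, not a final placement — ambiguous chunks still need a human (or agent) decision.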
Resources
| Resource | Purpose |
|---|---|
| scripts/decompose.py | Split markdown by headers or custom regex |
| scripts/group_sections.py | Automatically categorize chunks by semantic analysis |
| references/examples.md | Real-world categorization examples and patterns |
Related Skills
- skill-creator: For creating new skills from decomposed content
- documentation-mastery: For formatting the resulting markdown files
Source
https://github.com/pablodiegoo/Data-Pro-Skill/blob/main/src/datapro/data/skills/context-optimizer/SKILL.md

Overview
Transforms large Markdown docs into a modular .agent folder, optimizing context usage by separating Core Rules/Facts, Processes, Tasks, References, and Skills. This structure makes AI agents faster to reason with and easier to maintain.
How This Skill Works
The skill analyzes the source document to understand its structure and density, then uses the decompose.py script to split it into chunks. After decomposition, you manually organize the chunks into memory/, workflows/, tasks/, references/, and skills/, and add navigation files like 00_INDEX.md and 00_preamble.md for quick access.
When to Use It
- Starting a new project with a large requirements document
- Migrating legacy docs to .agent structure
- Refactoring existing context files for better organization
- Converting PDFs or long READMEs into agent-friendly files
- Optimizing context window usage by splitting monolithic docs into Tasks, Memories, Workflows, and References
Quick Start
- Step 1: Analyze the input document to assess heading depth and content density
- Step 2: Run the decompose script, e.g. python3 .agent/skills/context-optimizer/scripts/decompose.py input.md -o output_dir -l 2
- Step 3: Organize the resulting chunks into .agent/memory, .agent/workflows, .agent/tasks, .agent/references, and .agent/skills; add 00_INDEX.md and 00_preamble.md
Best Practices
- Preview the source to assess hierarchical depth and content density before splitting
- Use Phase 2 decompose with appropriate header level (-l) and optional regex (-r) patterns
- Organize chunks into the canonical .agent folders: memory/, workflows/, tasks/, references/, skills/
- Place immutable truths and conventions into memory/ (e.g., project_facts.md, conventions.md)
- Create 00_INDEX.md and 00_preamble.md to improve navigation and load order
Example Use Cases
- Split a large project spec into memory/project_facts.md, workflows/deploy.md, and tasks/backlog.md for an agent-ready context
- Refactor a sprawling monolithic context into modular files across memory/, workflows/, and references/ for easier maintenance
- Convert a multi-page PDF into agent-friendly files such as memory/conventions.md and references/api_docs.md
- Break down a long README into Tasks, Memories, Workflows, and References to optimize context window usage
- Migrate legacy docs to the .agent structure to support a new agent with clean separation of concerns