
context-engineering

npx machina-cli add skill Byunk/minimal-claude-code/context-engineering --openclaw

Context Engineering

Principles for maximizing LLM effectiveness by treating context as a finite resource.

Core Principle

Find the smallest possible set of high-signal tokens that maximize the likelihood of your desired outcome.

The Context Budget

LLMs have an "attention budget" that depletes with each token. Context rot causes recall accuracy to decrease as token count grows. Every design decision should optimize for signal density.

Quick Reference

| Challenge | Strategy | Reference |
|---|---|---|
| Too many tools | Curate minimal viable set | Tool |
| Ambiguous tool selection | Self-contained, unambiguous tools | Tool |
| Context pollution over time | Compaction and summarization | Agent |
| Long-horizon tasks | External memory and note-taking | Agent |
| Exceeding single context limits | Sub-agent architectures | Multi-Agent |
| MCP server bloat | Token-efficient responses | MCP |
| Measuring effectiveness | End-state evaluation | Evaluation |

Single vs Multi-Agent

Multi-agent systems add roughly 15× token overhead. Use a single agent unless:

| Factor | Single Agent | Multi-Agent |
|---|---|---|
| Parallelization | Sequential steps | Independent subtasks |
| Context size | Fits in window | Exceeds single context |
| Tool complexity | Focused toolset | Many specialized tools |
| Dependencies | Steps depend on each other | Work can be isolated |

Default to a single agent. Add agents only when parallelization or context limits demand it.
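The decision table can be read as a default-to-single-agent check. The function below is a hypothetical heuristic encoding it, not part of the skill itself:

```python
def needs_multi_agent(parallel_subtasks: bool,
                      exceeds_context: bool,
                      many_specialized_tools: bool,
                      work_isolatable: bool) -> bool:
    """Return True only when a multi-agent condition genuinely holds;
    everything else defaults to a single agent."""
    if exceeds_context:
        return True  # the task cannot fit one context window
    # Parallel, isolatable work with a large toolset is the other trigger.
    return parallel_subtasks and work_isolatable and many_specialized_tools
```

Anything that falls through both conditions stays with a single agent and avoids the coordination overhead.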

Decision Checklists

Before Adding to Context

  • Is this the minimum information needed?
  • Can an agent discover this just-in-time instead?
  • Does this justify its token cost?

Tool Design

  • Can a human definitively say which tool to use?
  • Does each tool have a distinct, non-overlapping purpose?
  • Are responses token-efficient with high signal?
  • Do error messages guide toward solutions?
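A sketch of what the checklist looks like in a concrete tool. `search_notes` and its contents are invented for illustration:

```python
def search_notes(query: str, limit: int = 3) -> str:
    """Search saved notes by keyword. For notes ONLY -- use a separate
    file-search tool for source code (distinct, non-overlapping purpose)."""
    notes = {
        "budget": "Attention budget depletes as tokens accumulate.",
        "compaction": "Summarize the conversation near the context limit.",
    }
    hits = [f"{k}: {v}" for k, v in notes.items() if query.lower() in k]
    if not hits:
        # Error text that guides toward a solution rather than a dead end.
        return f"No notes match '{query}'. Try a shorter keyword, e.g. 'budget'."
    return "\n".join(hits[:limit])  # token-efficient: top hits only, no raw dump
```

The docstring states when not to use the tool, the response caps its own size, and the failure message tells the agent what to try next.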

Agent Design

  • Does the system prompt strike the right altitude?
  • Are there mechanisms for compaction when context grows?
  • Is external memory used for long-horizon tracking?
  • Are canonical examples provided instead of exhaustive rules?

Multi-Agent

  • Is the task parallelizable enough to justify coordination overhead?
  • Do sub-agents return condensed summaries (not raw results)?
  • Is there clear separation of concerns between agents?

Key Techniques

Just-in-Time Retrieval

Keep lightweight identifiers (paths, queries, links). Load data dynamically at runtime rather than pre-loading everything upfront.
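A minimal sketch of the idea: only file paths sit in the registry (a few tokens each), and bodies are read at the moment they are needed. Class and method names are illustrative:

```python
import tempfile
from pathlib import Path

class JITContext:
    """Hold lightweight identifiers; fetch full content only when asked."""
    def __init__(self):
        self.refs: list[Path] = []   # paths only, never file bodies

    def register(self, path: str) -> None:
        self.refs.append(Path(path))

    def load(self, name: str) -> str:
        for p in self.refs:
            if p.name == name:
                return p.read_text()  # loaded just-in-time, at runtime
        raise KeyError(f"no registered reference named {name!r}")

# Usage: register a path up front, read the body only when it is needed.
with tempfile.TemporaryDirectory() as d:
    doc = Path(d) / "design.md"
    doc.write_text("Keep identifiers, not payloads.")
    ctx = JITContext()
    ctx.register(str(doc))
    body = ctx.load("design.md")
```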

Progressive Disclosure

Let agents discover context through exploration. File sizes suggest complexity; naming hints at purpose. Each interaction yields context for the next decision.
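A first exploration pass might return only names and sizes, deferring any file reads. A small sketch (file names are made up):

```python
import tempfile
from pathlib import Path

def survey(directory: str) -> list[tuple[str, int]]:
    """Names and sizes only: enough signal to decide what to open
    next, without loading a single file body into context."""
    return sorted((p.name, p.stat().st_size)
                  for p in Path(directory).iterdir() if p.is_file())

# Usage: a large file suggests complexity; a name hints at purpose.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "util.py").write_text("x = 1\n")
    (Path(d) / "core_engine.py").write_text("y = 2\n" * 100)
    listing = survey(d)
```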

Compaction

Summarize conversations nearing limits. Preserve architectural decisions and critical details; discard redundant tool outputs and verbose messages.
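A compaction pass over a chat history could look like the sketch below; the placeholder summary string stands in for a real LLM summarization call:

```python
def compact(messages: list[dict], keep_last: int = 4) -> list[dict]:
    """Once history grows past keep_last turns, replace the older
    turns with a single summary message."""
    if len(messages) <= keep_last:
        return messages
    head, tail = messages[:-keep_last], messages[-keep_last:]
    # A real summarizer would preserve decisions and critical details
    # here and drop redundant tool output.
    summary = {"role": "system",
               "content": f"[summary of {len(head)} earlier messages]"}
    return [summary] + tail
```

Recent turns survive verbatim; everything older collapses into one message.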

Structured Note-Taking

Persist notes to external memory (to-do lists, NOTES.md). Pull back into context when needed. Tracks progress without exhausting working context.
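The NOTES.md pattern reduces to two operations: append a line, and pull matching lines back on demand. A minimal sketch with invented names:

```python
import tempfile
from pathlib import Path

class Notes:
    """Persist progress to an external file (NOTES.md style) and pull
    matching lines back into context only when needed."""
    def __init__(self, path: str):
        self.path = Path(path)

    def jot(self, line: str) -> None:
        with self.path.open("a") as f:
            f.write(line + "\n")

    def recall(self, keyword: str = "") -> list[str]:
        if not self.path.exists():
            return []
        return [l for l in self.path.read_text().splitlines() if keyword in l]

# Usage: progress lives on disk, not in the context window.
with tempfile.TemporaryDirectory() as d:
    notes = Notes(str(Path(d) / "NOTES.md"))
    notes.jot("TODO: refactor retrieval layer")
    notes.jot("DONE: audited toolset")
    todos = notes.recall("TODO")
```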

Sub-Agent Distribution

Delegate focused tasks to specialized agents with clean context windows. Each sub-agent explores extensively but returns only condensed summaries (1000-2000 tokens).
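The key property is that only the condensed summary crosses back into the orchestrator's context. A sketch with a fake agent standing in for a real agent call:

```python
def delegate(subtasks: list[str], run_agent) -> list[str]:
    """Fan tasks out to sub-agents with clean contexts. Each result's
    full 'trace' stays local; only the short 'summary' flows back."""
    summaries = []
    for task in subtasks:
        result = run_agent(task)             # stand-in for a real agent call
        summaries.append(result["summary"])  # condensed report only
    return summaries

# Usage: the verbose trace never enters the caller's context.
def fake_agent(task: str) -> dict:
    return {"trace": "step " * 5000, "summary": f"{task}: done, 3 findings"}

reports = delegate(["audit auth", "scan deps"], fake_agent)
```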

The Golden Rule

Do the simplest thing that works. Start minimal, add complexity only based on observed failure modes.

References

  • Tool - Building self-contained, token-efficient tools
  • Agent - Single agent context management
  • Multi-Agent - Coordinating multiple agents
  • MCP - Model Context Protocol best practices
  • Evaluation - Measuring context engineering effectiveness

Examples

Complete examples from Claude Code:

Tool Descriptions

  • Bash - Boundaries, when NOT to use, good/bad examples
  • Edit - Prerequisites, error guidance, concise design
  • Grep - Exclusivity, parameter examples, output modes

Agent Prompts

  • Explore - Role definition, constraints, strengths
  • Plan - Process steps, output format, boundaries
  • Summarization - Compaction structure, what to preserve

Source

git clone https://github.com/Byunk/minimal-claude-code

View SKILL.md on GitHub: https://github.com/Byunk/minimal-claude-code/blob/main/minimal-claude-code/skills/context-engineering/SKILL.md

Overview

Context engineering treats context as a finite resource to maximize LLM effectiveness. It emphasizes smallest high-signal token sets, token-efficient design, and external memory or sub-agents when needed. This approach suits LLM tools, agents, MCP servers, and multi-agent systems.

How This Skill Works

Identify the smallest high-signal token set that achieves the desired outcome, and manage the attention budget by minimizing redundant context. Use Just-in-Time Retrieval to load data at runtime, Progressive Disclosure to reveal context gradually, and Compaction to prune old outputs. For long-horizon tasks, persist notes externally and distribute work to sub-agents when needed, always preferring a single agent unless parallelism or context limits require multiple agents.

When to Use It

  • Designing LLM tools, agents, MCP servers, or multi-agent systems
  • Dealing with ambiguous or bloated tool sets that need clear boundaries
  • Preventing context pollution as conversations accumulate
  • Managing long-horizon tasks that require external memory or notes
  • When tasks push beyond a single context window and sub-agents are needed

Quick Start

  1. Audit your tools and reduce to a minimal, non-overlapping set
  2. Enable Just-in-Time data retrieval and external memory for long-horizon tasks
  3. Implement periodic compaction and progressive disclosure as context grows
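The tool audit in step 1 can be roughed out mechanically. The word-overlap heuristic below is an assumption for illustration, not from the source, and the toolset is hypothetical:

```python
from itertools import combinations

def audit_tools(tools: dict[str, str]) -> list[tuple[str, str]]:
    """Flag tool pairs whose descriptions share most of their words --
    a crude signal that their purposes overlap."""
    flagged = []
    for a, b in combinations(tools, 2):
        wa, wb = set(tools[a].lower().split()), set(tools[b].lower().split())
        overlap = len(wa & wb) / min(len(wa), len(wb))
        if overlap > 0.5:
            flagged.append((a, b))
    return flagged

# The two search tools below should be merged or given sharper boundaries.
pairs = audit_tools({
    "search_web": "search the web for pages",
    "search_internet": "search the web for sites",
    "write_file": "save text to disk",
})
```

A flagged pair is a candidate for merging, or for rewriting so a human can definitively say which tool to use.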

Best Practices

  • Find the smallest high-signal token set to maximize outcome
  • Use Just-in-Time Retrieval to load data on demand
  • Apply Progressive Disclosure to reveal context incrementally
  • Apply Compaction to summarize near-context limits and preserve essentials
  • Use Structured Note-Taking and Sub-Agent Distribution for long-horizon tasks; default to a single agent

Example Use Cases

  • Designing a minimal LLM toolset with self-contained, non-overlapping tools to avoid context bloat
  • An MCP server that delivers token-efficient responses within a multi-tool workflow
  • A long-horizon research assistant storing progress in NOTES.md and pulling context as needed
  • A suite of sub-agents handling distinct subtasks and returning condensed summaries
  • A context-compaction pass that trims prior messages while preserving decisions
