
prompt-compression

npx machina-cli add skill a5c-ai/babysitter/prompt-compression --openclaw

Prompt Compression Skill

Capabilities

  • Implement token-efficient prompt compression
  • Design context pruning strategies
  • Configure selective context inclusion
  • Implement LLMLingua-style compression
  • Design summary-based compression
  • Create compression quality metrics

Target Processes

  • cost-optimization-llm
  • agent-performance-optimization

Implementation Details

Compression Techniques

  1. LLMLingua: Token-level compression
  2. Summary Compression: LLM-based summarization
  3. Selective Context: Relevant section extraction
  4. Token Pruning: Remove low-importance tokens
  5. Document Filtering: Pre-retrieval filtering
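As an illustration of technique 3 (selective context), here is a minimal sketch that keeps only the document sentences sharing keywords with the query. The stopword list, function names, and overlap threshold are illustrative assumptions, not part of the skill's API.

```python
# Toy selective-context extraction: keep only sentences that share
# keywords with the query. Stopwords and the overlap threshold are
# illustrative placeholders for a real relevance scorer.
STOPWORDS = {"what", "is", "the", "a", "an", "to", "are", "of"}

def keywords(text: str) -> set[str]:
    return {w.lower().strip(".,?") for w in text.split()} - STOPWORDS

def select_relevant(query: str, document: str, min_overlap: int = 1) -> str:
    kept = [s for s in document.split(". ")
            if len(keywords(query) & keywords(s)) >= min_overlap]
    return ". ".join(kept)

doc = ("The refund policy allows returns within 30 days. "
       "Shipping is free over $50. "
       "Refunds go to the original payment method")
result = select_relevant("What is the refund policy?", doc)
```

A production version would replace the keyword overlap with embedding similarity or a reranker, but the shape of the pipeline stays the same.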

Configuration Options

  • Compression ratio targets
  • Quality threshold settings
  • Token budget constraints
  • Compression model selection
  • Evaluation metrics
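These options could be collected into a small configuration object. A hedged sketch follows; the field names, defaults, and model name are assumptions, not the skill's actual schema.

```python
from dataclasses import dataclass

# Hypothetical configuration container for the options listed above.
# Field names, defaults, and the model name are illustrative.
@dataclass
class CompressionConfig:
    target_ratio: float = 0.5        # aim to keep ~50% of input tokens
    quality_threshold: float = 0.9   # reject outputs scoring below this
    token_budget: int = 4000         # hard cap on compressed prompt size
    compression_model: str = "summarizer-small"  # placeholder model name
    metrics: tuple = ("ratio", "answer_accuracy")

cfg = CompressionConfig(target_ratio=0.3)
```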

Best Practices

  • Monitor quality vs compression tradeoff
  • Test with representative prompts
  • Set appropriate compression ratios
  • Validate compressed prompt quality
  • Track cost savings
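To track the ratio and cost-savings practices above, something like the following could work. Whitespace token counting stands in for a real tokenizer such as tiktoken, and the per-1k-token price is a made-up example.

```python
# Report compression ratio and rough cost savings. Whitespace counting
# is a stand-in for a real tokenizer (e.g. tiktoken); the price per
# 1k tokens is an illustrative assumption.
def count_tokens(text: str) -> int:
    return len(text.split())

def compression_report(original: str, compressed: str,
                       cost_per_1k_tokens: float = 0.01) -> dict:
    before, after = count_tokens(original), count_tokens(compressed)
    return {"ratio": after / before,
            "tokens_saved": before - after,
            "cost_saved": (before - after) / 1000 * cost_per_1k_tokens}

report = compression_report("tok " * 1000, "tok " * 400)
```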

Dependencies

  • llmlingua (optional)
  • tiktoken
  • transformers

Source

git clone https://github.com/a5c-ai/babysitter.git
Skill file: plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/prompt-compression/SKILL.md

Overview

Prompt compression reduces prompt size to save tokens and cost while preserving essential meaning. It combines techniques such as LLMLingua-style token-level compression, summary-based compression, and selective context inclusion to trim verbosity without sacrificing performance.

How This Skill Works

The skill applies multiple techniques to compress prompts: token-level LLMLingua, LLM-based summarization, selective context extraction of relevant sections, token pruning, and pre-retrieval document filtering. Configuration options like compression ratio targets and token budgets let you balance quality and cost, while evaluation metrics help you validate compressed prompts.
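For instance, selective inclusion under a token budget could be sketched as greedy packing of context chunks by relevance score. The scores, budget, and names below are illustrative assumptions, and the whitespace count is a proxy for a real tokenizer.

```python
# Greedy budget-constrained context packing: include the highest-scoring
# chunks until the token budget is spent. Scores and budget are toy
# values; len(text.split()) stands in for a real tokenizer count.
def pack_context(chunks: list[tuple[float, str]], budget: int) -> list[str]:
    selected, used = [], 0
    for score, text in sorted(chunks, reverse=True):
        cost = len(text.split())
        if used + cost <= budget:
            selected.append(text)
            used += cost
    return selected

chunks = [(0.9, "refund policy details here"),
          (0.2, "unrelated marketing copy goes here"),
          (0.7, "shipping terms and timelines")]
selected = pack_context(chunks, budget=8)
```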

When to Use It

  • You must operate under strict token budgets to reduce costs.
  • You need faster responses with smaller prompts.
  • Your data sources are long documents and reports.
  • You want to test compression models or adjust quality thresholds.
  • You want to filter documents before retrieval to cut the volume of retrieved data.

Quick Start

  1. Assess the prompt to identify high-token sections.
  2. Choose a compression technique (LLMLingua, summary, selective context) and set ratio targets.
  3. Apply compression, test with representative prompts, and measure the cost impact.
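The steps above can be sketched end to end with a toy token-pruning pass. The filler-word list is a stand-in importance heuristic, not the skill's actual scoring method.

```python
# Toy end-to-end pass: prune low-importance filler tokens, then measure
# the achieved compression ratio. The filler list is a placeholder for
# a learned token-importance model (as in LLMLingua-style pruning).
FILLER = {"please", "just", "the", "a", "an", "very", "really", "basically"}

def prune(prompt: str) -> str:
    return " ".join(w for w in prompt.split() if w.lower() not in FILLER)

original = "Please just summarize the following very long report"
compressed = prune(original)
ratio = len(compressed.split()) / len(original.split())
```

In practice you would compare downstream answer quality on the original versus compressed prompt before adopting a given ratio.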


Example Use Cases

  • Apply LLMLingua-style compression to a multi-turn customer-support prompt to cut tokens.
  • Use summary-based compression to condense long incident reports into short briefs for LLMs.
  • Implement selective context extraction to feed only relevant sections from large policy docs.
  • Perform token pruning to remove low-importance tokens in recurring prompts.
  • Enable document filtering pre-retrieval to reduce retrieved data and costs.
