
toon-format

npx machina-cli add skill msewell/agent-stuff/toon-format --openclaw

TOON Format

What TOON Is

TOON (Token-Oriented Object Notation) is a compact, human-readable encoding of the JSON data model optimized for LLM context windows. It uses YAML-style indentation for objects and CSV-style tabular rows for uniform arrays. On mixed-structure data it uses roughly 40% fewer tokens than JSON.

TOON is a wire format between harness and model — not a replacement for JSON in code.

When to Use TOON

Use TOON when injecting structured data into context and the data has repeated structure:

  • Tool call results that return arrays of objects (database rows, search results, file listings).
  • RAG document metadata and chunks (maximize documents per context window).
  • Few-shot examples with uniform shape.
  • API response payloads with tabular data.

Do not use TOON when:

  • Data is deeply nested and irregular (≤40% tabular eligibility): compact JSON is smaller.
  • Data is flat homogeneous rows with no LLM parsing needed — CSV is 5–10% smaller.
  • The model or downstream consumer expects JSON explicitly.

Decision rule: if the data contains arrays of objects with mostly the same keys, use TOON. Otherwise, use JSON.
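
The decision rule can be sketched as a small shape check. This is an illustrative helper, not part of @toon-format/toon: the names `isUniformObjectArray` and `suggestFormat` are hypothetical, and the check is stricter than the rule itself (it requires identical keys rather than "mostly the same" keys, and only looks at top-level fields).

```typescript
// Hypothetical helper, not part of the TOON SDK.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isPrimitive(v: Json): boolean {
  return v === null || typeof v !== "object";
}

/** True when every element is a plain object with the same keys and only primitive values. */
function isUniformObjectArray(value: Json): boolean {
  if (!Array.isArray(value) || value.length === 0) return false;
  const first = value[0];
  if (first === null || typeof first !== "object" || Array.isArray(first)) return false;
  const keys = Object.keys(first).sort().join("\u0000");
  return value.every(
    (item) =>
      item !== null &&
      typeof item === "object" &&
      !Array.isArray(item) &&
      Object.keys(item).sort().join("\u0000") === keys &&
      Object.values(item).every(isPrimitive)
  );
}

/** Applies the decision rule to the top-level fields of a payload. */
function suggestFormat(payload: { [key: string]: Json }): "toon" | "json" {
  return Object.values(payload).some(isUniformObjectArray) ? "toon" : "json";
}
```

For real payloads, measuring both encodings is more reliable than a shape heuristic like this one.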

How to Encode Data as TOON

Encoding Tool Results

When a tool returns structured JSON, encode it as TOON before injecting into the conversation. Use the TypeScript SDK:

import { encode } from '@toon-format/toon'

const toon = encode(toolResult)

Wrap in a toon code fence when embedding in a message:

```toon
<encoded content>
```
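
A minimal sketch of the wrapping step, assuming the encoded text is already in hand. `wrapInToonFence` is a hypothetical helper name; the fence is built with `repeat` only so the snippet itself stays easy to embed.

```typescript
const FENCE = "`".repeat(3); // three backticks

/** Wraps already-encoded TOON text in a toon code fence (hypothetical helper). */
function wrapInToonFence(encoded: string): string {
  const body = encoded.replace(/\s+$/, ""); // drop trailing whitespace and newlines
  return `${FENCE}toon\n${body}\n${FENCE}`;
}
```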

Encoding RAG Context

Encode retrieved document metadata and content as TOON to fit more documents in the context window:

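import { encode } from '@toon-format/toon'

// retrievedDocs: the documents returned by your retriever
const contextPayload = encode({
  documents: retrievedDocs.map(doc => ({
    id: doc.id, score: doc.score, source: doc.source, content: doc.content
  }))
})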

Encoding Few-Shot Examples

Use TOON for few-shot examples with uniform structure to save tokens:

const examples = encode({
  examples: [
    { input: 'extract invoice fields', output: 'vendor,amount,date' },
    { input: 'summarize meeting notes', output: 'attendees,decisions,action_items' }
  ]
})

How to Produce Valid TOON Output

When generating TOON yourself (not just consuming it), follow these rules:

  1. Show the header, let the model fill rows. Provide the tabular header with field names:

    results[N]{id,title,confidence}:
    

    Replace N with the actual row count.

  2. Use 2-space indentation. No tabs for nesting, no trailing whitespace.

  3. Match [N] to actual row count exactly. A mismatch signals truncation.

  4. Quote strings only when necessary. Quote a value only when it contains a comma, a colon, or leading/trailing whitespace; when it is empty; or when it would otherwise be read as a reserved word or a number (true, false, null, 123).

  5. Wrap output in a toon code fence so consumers can extract it reliably.
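
Rules 1 and 4 above can be sketched in code. This is a simplified illustration: `needsQuotes`, `formatString`, and `tabularHeader` are hypothetical names, the quoting check covers only the cases listed here plus quotes and backslashes, and the full TOON spec may require quoting in further cases (for example, newlines are not handled).

```typescript
// Simplified sketch of the quoting rule; not the SDK's implementation.
const RESERVED = new Set(["true", "false", "null"]);
const NUMERIC = /^-?\d+(\.\d+)?([eE][+-]?\d+)?$/;

function needsQuotes(value: string, delimiter = ","): boolean {
  return (
    value === "" ||
    value !== value.trim() || // leading or trailing whitespace
    RESERVED.has(value) ||
    NUMERIC.test(value) || // a string that looks like a number
    value.includes(delimiter) ||
    value.includes(":") ||
    value.includes('"') ||
    value.includes("\\")
  );
}

function formatString(value: string, delimiter = ","): string {
  if (!needsQuotes(value, delimiter)) return value;
  const escaped = value.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `"${escaped}"`;
}

/** Builds a tabular header such as results[2]{id,title,confidence}: */
function tabularHeader(name: string, rowCount: number, fields: string[]): string {
  return `${name}[${rowCount}]{${fields.join(",")}}:`;
}
```

Deriving `rowCount` from the actual array length, rather than typing it by hand, keeps rule 3 satisfied automatically.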

Core Syntax Quick Reference

Objects use key: value with 2-space indent, no braces:

user:
  id: 123
  name: Ada Lovelace

Primitive arrays — inline with count:

tags[3]: admin,ops,dev

Tabular arrays — field names declared once, values as rows:

users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,engineer
  3,Carol,viewer
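
To make the tabular layout concrete, here is a hand-rolled sketch that produces it for flat, uniform rows. It is only an illustration of the format: `encodeTabular` is a hypothetical function, it assumes no value needs quoting, and in practice the SDK's `encode` should do this work.

```typescript
type Primitive = string | number | boolean | null;
type Row = Record<string, Primitive>;

/** Encodes a uniform array of flat objects as a TOON tabular array.
 *  Assumes every row has the same keys and no value needs quoting. */
function encodeTabular(name: string, rows: Row[]): string {
  const [first] = rows;
  if (first === undefined) return `${name}[0]:`;
  const fields = Object.keys(first);
  const header = `${name}[${rows.length}]{${fields.join(",")}}:`;
  // One 2-space-indented line per row, values in header order.
  const lines = rows.map((row) => "  " + fields.map((f) => String(row[f])).join(","));
  return [header, ...lines].join("\n");
}

const users = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "engineer" },
  { id: 3, name: "Carol", role: "viewer" },
];

console.log(encodeTabular("users", users));
```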

List arrays — for non-uniform objects (different key sets):

events[2]:
  - type: login
    ts: 1700000000
  - type: error
    ts: 1700000042
    code: 503

Validating TOON

Always decode model-generated TOON with strict mode (the default):

import { decode } from '@toon-format/toon'

const parsed = decode(toonString) // strict by default

Strip code fences before decoding. Handle parse errors explicitly — do not silently accept partial data.
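
A sketch of those two steps, fence stripping and explicit error handling. The decoder is passed in as a parameter so the snippet stays self-contained; in real use you would pass `decode` from '@toon-format/toon'. `stripToonFence`, `safeDecode`, and `DecodeResult` are hypothetical names.

```typescript
type DecodeResult<T> = { ok: true; value: T } | { ok: false; error: string };

/** Returns the body of the first toon (or bare) code fence, or the trimmed input if none is found. */
function stripToonFence(text: string): string {
  const match = text.match(/```(?:toon)?[ \t]*\r?\n([\s\S]*?)\r?\n?```/);
  return match?.[1] ?? text.trim();
}

/** Decodes model output and reports failures as values instead of swallowing them. */
function safeDecode<T>(raw: string, decodeFn: (toon: string) => T): DecodeResult<T> {
  try {
    return { ok: true, value: decodeFn(stripToonFence(raw)) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning a tagged result forces the caller to branch on `ok`, which makes it harder to pass partial data along by accident.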

Strict mode catches: array count mismatches (truncation), indentation violations, missing/extra columns, and escaping errors.
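
To illustrate what a count mismatch looks like, the sketch below checks a single comma-delimited tabular block by hand. It is a hypothetical helper (`checkRowCount`) that shows one of the checks named above; it does not replace strict decoding and ignores nested or non-tabular content.

```typescript
/** Compares the declared [N] of one tabular block with its number of indented rows. */
function checkRowCount(block: string): { declared: number; actual: number; ok: boolean } {
  const lines = block.split("\n");
  const header = (lines[0] ?? "").trim();
  const match = /^\S+\[(\d+)\]\{[^}]*\}:$/.exec(header);
  if (!match) throw new Error("not a tabular array header");
  const declared = Number(match[1]);
  // Rows are the non-empty indented lines after the header.
  const actual = lines.slice(1).filter((line) => /^\s+\S/.test(line)).length;
  return { declared, actual, ok: declared === actual };
}
```

A block that declares `users[3]{id,name}:` but carries only two rows yields `ok: false`, which is the truncation signal.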

Advanced Options

Delimiter Selection

  • Comma (default): general use.
  • Tab (\t): numeric-heavy data, fewer quotes needed.
  • Pipe (|): data with frequent commas in strings.

const toon = encode(data, { delimiter: '\t' })
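
One way to choose automatically is to count how often each candidate delimiter appears inside string values and pick the least frequent. This is a heuristic sketch, not SDK behavior: `pickDelimiter` is a hypothetical name, and it addresses only delimiter collisions, not the numeric-heavy case.

```typescript
type Delimiter = "," | "\t" | "|";

/** Picks the delimiter that occurs least often inside string values.
 *  Ties go to the earlier candidate, so comma stays the default. */
function pickDelimiter(rows: Record<string, unknown>[]): Delimiter {
  const candidates: Delimiter[] = [",", "|", "\t"];
  let best: Delimiter = ",";
  let bestCount = Infinity;
  for (const d of candidates) {
    let count = 0;
    for (const row of rows) {
      for (const v of Object.values(row)) {
        if (typeof v === "string") count += v.split(d).length - 1;
      }
    }
    if (count < bestCount) {
      best = d;
      bestCount = count;
    }
  }
  return best;
}
```

The result can be passed as the `delimiter` option shown above.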

Key Folding

Collapse single-key wrapper chains into dotted paths to reduce nesting overhead:

const folded = encode(data, { keyFolding: 'safe' })
// response.data.metadata.items[2]: a,b

Use when API responses have deep single-key wrappers. Disable when structural clarity matters more than savings.
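
The folding idea can be sketched as a walk down single-key wrappers. This is a simplified illustration with a hypothetical function name (`foldSingleKeyChain`); the rule that only identifier-like keys are folded is an assumption about what "safe" implies, and the SDK's actual behavior may differ.

```typescript
type JsonValue = null | boolean | number | string | JsonValue[] | { [key: string]: JsonValue };

/** Collapses a chain of single-key object wrappers into a dotted path. */
function foldSingleKeyChain(value: JsonValue): { path: string; value: JsonValue } {
  const segments: string[] = [];
  let current: JsonValue = value;
  while (current !== null && typeof current === "object" && !Array.isArray(current)) {
    const entries = Object.entries(current);
    if (entries.length !== 1) break; // not a single-key wrapper
    const [key, next] = entries[0] as [string, JsonValue];
    // Assumption: stop at keys that would make a dotted path ambiguous.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(key)) break;
    segments.push(key);
    current = next;
  }
  return { path: segments.join("."), value: current };
}
```

Applied to `{ response: { data: { metadata: { items: ["a", "b"] } } } }`, it returns the path `response.data.metadata.items` with the array as the remaining value.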

Token Estimation

Use the CLI --stats flag to quantify savings before committing:

npx @toon-format/cli input.json --stats
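
When the CLI is not available, a character-count comparison gives a crude first impression. This is only a proxy: `estimateSavingsPercent` is a hypothetical helper, and character counts do not track real tokenizer output closely, so treat the number as a rough hint.

```typescript
/** Crude proxy: compares character counts, not real token counts. */
function estimateSavingsPercent(jsonText: string, toonText: string): number {
  if (jsonText.length === 0) return 0;
  return Math.round((1 - toonText.length / jsonText.length) * 100);
}
```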

Anti-Patterns

  • Do not use TOON for deeply nested, irregular data — it may be larger than JSON.
  • Do not describe TOON syntax in prompts — show examples instead.
  • Do not skip strict-mode decoding in agent loops.
  • Do not use oversized few-shot examples (keep to 2–5 rows).
  • Do not forget to quote ambiguous values ("null", "true", "123" when they are strings).
  • Do not assume TOON is always faster to process — benchmark latency separately from token count.

References

  • references/toon-format-best-practices.md

Source

https://github.com/msewell/agent-stuff/blob/main/skills/toon-format/SKILL.md

Overview

TOON is a compact, human-readable encoding of JSON optimized for LLM context windows, using YAML-style indentation for objects and CSV-style rows for arrays, with ~40% fewer tokens on mixed-structure data. It acts as a wire format between harness and model, not a replacement for JSON in code.

How This Skill Works

Data is encoded to TOON using the TypeScript SDK (@toon-format/toon) and then embedded in a toon code fence for reliable consumption. This approach is ideal for injecting structured results, RAG metadata, and uniform few-shot examples into prompts while preserving structure and reducing token usage.

When to Use It

  • Injecting structured data into context when results are arrays of objects with common keys
  • Encoding tool results that return structured JSON before embedding
  • Preparing RAG document metadata and chunks to maximize context window
  • Producing structured output from LLMs or tools in TOON format
  • Advising on TOON integration decisions and best-fit use cases

Quick Start

  1. Check data shape; if it’s an array of objects with similar keys, TOON is a good fit
  2. Encode the data with the TypeScript SDK: const toon = encode(data)
  3. Wrap the result in a toon code fence and insert into your prompt

Best Practices

  • Follow the decision rule: use TOON when data contains arrays of objects with mostly the same keys; otherwise use JSON
  • Encode tool results with the TypeScript SDK before injecting into context
  • Always wrap TOON output in a toon code fence for reliable extraction
  • Use 2-space indentation, no tabs or trailing whitespace
  • Declare tabular headers and keep the [N] row count exactly matching the data

Example Use Cases

  • Encoding a database query result (rows of objects) into TOON before appending to LLM context
  • Encoding RAG document metadata and content to fit more documents in the context window
  • Providing uniform-structure few-shot examples to save tokens
  • Encoding an API response payload that includes tabular data
  • Embedding an encoded TOON block in a chat message for downstream parsing
