
perf-theory-tester

npx machina-cli add skill ComposioHQ/awesome-claude-plugins/theory-tester --openclaw
Files (1)
SKILL.md
731 B

perf-theory-tester

Test hypotheses using controlled experiments.

Follow docs/perf-requirements.md as the canonical contract.

Required Steps

  1. Confirm baseline is clean.
  2. Apply a single change tied to the hypothesis.
  3. Run 2+ validation passes.
  4. Revert to baseline before the next experiment.
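
The four steps above can be sketched as a small driver. This is an illustrative sketch, not the skill's implementation: `apply_change`, `benchmark`, and `revert` are hypothetical caller-supplied callables.

```python
from statistics import median

def run_experiment(apply_change, benchmark, revert, passes=2):
    """Apply one change, benchmark it `passes` times, then revert.

    `apply_change`, `benchmark`, and `revert` are caller-supplied
    callables; `benchmark` returns a single metric (e.g. latency in ms).
    The baseline is assumed clean before this is called (step 1).
    """
    if passes < 2:
        raise ValueError("the skill requires 2+ validation passes")
    apply_change()                                       # step 2: one change
    try:
        results = [benchmark() for _ in range(passes)]   # step 3: 2+ passes
    finally:
        revert()                                         # step 4: back to baseline
    return results, median(results)
```

The `finally` block guarantees the revert happens even if a benchmark pass fails, so the next experiment always starts from baseline.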

Output Format

hypothesis: <id>
change: <summary>
delta: <metrics>
verdict: accept|reject|inconclusive
evidence:
  - command: <benchmark command>
  - files: <changed files>
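
As an illustration, the fields above could be filled in by a small formatter. All field values here are invented examples, not output from the skill itself.

```python
def format_report(hypothesis, change, delta, verdict, command, files):
    """Render an experiment result in the skill's output format."""
    assert verdict in ("accept", "reject", "inconclusive")
    return (
        f"hypothesis: {hypothesis}\n"
        f"change: {change}\n"
        f"delta: {delta}\n"
        f"verdict: {verdict}\n"
        "evidence:\n"
        f"  - command: {command}\n"
        f"  - files: {', '.join(files)}"
    )

# Hypothetical example values:
print(format_report("H-042", "memoize hot-path lookup",
                    "p50 latency -12%", "accept",
                    "pytest benches/ -k lookup", ["src/cache.py"]))
```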

Constraints

  • One change per experiment.
  • No parallel benchmarks.
  • Record evidence for each run.

Source

git clone https://github.com/ComposioHQ/awesome-claude-plugins.git
(the skill file lives at perf/skills/theory-tester/SKILL.md)

Overview

perf-theory-tester helps you test performance hypotheses with controlled experiments. It follows the canonical contract in docs/perf-requirements.md and emphasizes a clean baseline, a single change, 2+ validation passes, and reverting to baseline before the next test. Output is a structured report with hypothesis, change, delta, verdict, and evidence.

How This Skill Works

Prepare a clean baseline, apply a single change tied to your hypothesis, run at least two validation passes, then revert to baseline before the next experiment. Each run records the command and changed files; the tool emits a standardized output including hypothesis id, delta metrics, and a verdict (accept, reject, or inconclusive).
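
One way to turn delta metrics into the verdict described above is a threshold rule. The thresholds here are illustrative assumptions, not part of the skill's contract:

```python
def verdict(baseline, candidate, improve_threshold=0.05, noise_threshold=0.02):
    """Classify a candidate metric against baseline (lower is better).

    Illustrative rule: a >=5% improvement accepts, a regression beyond
    assumed run-to-run noise rejects, anything in between is inconclusive.
    """
    delta = (candidate - baseline) / baseline
    if delta <= -improve_threshold:
        return "accept"
    if delta >= noise_threshold:
        return "reject"
    return "inconclusive"
```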

When to Use It

  • Validating a performance hypothesis with a controlled change
  • Measuring impact of a single code or config alteration
  • Ensuring reproducibility with 2+ validation passes
  • Documenting evidence for performance decisions
  • Isolating experiments by reverting to baseline before each test

Quick Start

  1. Confirm the baseline is clean.
  2. Apply a single change tied to the hypothesis and run 2+ validation passes.
  3. Revert to baseline before the next experiment.

Best Practices

  • Start with a clean baseline before each experiment
  • Apply only one change per experiment to avoid confounding effects
  • Run 2+ validation passes to reduce flakiness
  • Do not run benchmarks in parallel
  • Record evidence for each run using the standardized output format
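
The "2+ validation passes to reduce flakiness" practice can be backed by a simple stability check before trusting a delta. The 5% spread tolerance is an assumption for illustration:

```python
from statistics import median

def passes_are_stable(samples, max_rel_spread=0.05):
    """Return True if repeated benchmark passes agree within a tolerance.

    Spread is (max - min) / median; a noisy run should be retried
    rather than recorded as evidence.
    """
    if len(samples) < 2:
        raise ValueError("need 2+ validation passes")
    return (max(samples) - min(samples)) / median(samples) <= max_rel_spread
```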

Example Use Cases

  • Test latency impact of a cache optimization in a hot path
  • Evaluate performance effect of a new indexing strategy
  • Measure throughput change after a concurrency tweak
  • Assess memory footprint variation after a refactor
  • Validate I/O throughput improvements from an async path
