
perf-benchmarker

npx machina-cli add skill ComposioHQ/awesome-claude-plugins/benchmark --openclaw
Files (1)
SKILL.md
1.0 KB

perf-benchmarker

Run sequential benchmarks with strict duration rules.

Follow docs/perf-requirements.md as the canonical contract.

Required Rules

  • Benchmarks MUST run sequentially (never parallel).
  • Minimum duration: 60s per run (30s only for binary search).
  • Warmup: 10s minimum before measurement.
  • Re-run anomalies.
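
The rules above can be codified as a small pre-flight guard. This is a minimal sketch; the `check_run` helper and its parameter names are illustrative, not part of the skill:

```python
def check_run(duration_s: float, warmup_s: float,
              parallel: bool = False, binary_search: bool = False) -> list[str]:
    """Return a list of rule violations for a proposed benchmark run."""
    violations = []
    if parallel:
        violations.append("benchmarks must run sequentially, never in parallel")
    # 60s minimum per run; 30s is allowed only during a binary-search phase.
    min_duration = 30 if binary_search else 60
    if duration_s < min_duration:
        violations.append(f"duration {duration_s}s is below the {min_duration}s minimum")
    if warmup_s < 10:
        violations.append(f"warmup {warmup_s}s is below the 10s minimum")
    return violations
```

An empty list means the run satisfies the contract and may proceed.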

Output Format

command: <benchmark command>
duration: <seconds>
warmup: <seconds>
results: <metrics summary>
notes: <anomalies or reruns>
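
A completed report might look like the following (all values illustrative):

```
command: ./bench --scenario=all
duration: 60
warmup: 10
results: low 120ms p50 latency, high 450ms p50 latency
notes: none
```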

Output Contract

Benchmarks MUST emit a JSON metrics block between markers:

PERF_METRICS_START
{"scenarios":{"low":{"latency_ms":120},"high":{"latency_ms":450}}}
PERF_METRICS_END
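
A consumer can recover the JSON between the markers with a few lines of parsing. A minimal sketch, assuming the block appears at most once in the captured output:

```python
import json
import re

def extract_metrics(output: str) -> dict:
    """Pull the JSON metrics block emitted between the PERF_METRICS markers."""
    # Non-greedy match expands until the closing marker, so nested braces are kept.
    match = re.search(r"PERF_METRICS_START\s*(\{.*?\})\s*PERF_METRICS_END",
                      output, re.DOTALL)
    if match is None:
        raise ValueError("no PERF_METRICS block found in benchmark output")
    return json.loads(match.group(1))
```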

Constraints

  • No short runs unless binary-search phase.
  • Do not change code while benchmarking.

Source

https://github.com/ComposioHQ/awesome-claude-plugins/blob/master/perf/skills/benchmark/SKILL.md

Overview

perf-benchmarker runs benchmarks sequentially to establish baselines and detect regressions under a canonical duration contract. It enforces minimum run durations and a warmup window, and disallows parallel benchmarks so that performance characteristics are measured in isolation.

How This Skill Works

It executes a single benchmark command at a time, enforces sequential runs, and applies duration rules: minimum 60 seconds per run (30 seconds for binary search) with at least 10 seconds of warmup. Benchmarks must emit a PERF_METRICS block between PERF_METRICS_START and PERF_METRICS_END, which the tool captures and surfaces in the results output.

When to Use It

  • Establishing a baseline for a new release by running sequential benchmarks over a fixed duration.
  • Validating regressions after code changes by re-running benchmarks and comparing metrics.
  • Measuring isolated latency or throughput with a strict warmup and minimum duration.
  • Conducting binary-search style tuning with a 30-second run, when applicable.
  • Following the canonical perf contract described in docs/perf-requirements.md during benchmarking.
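
For the regression-validation case, comparing two captured metrics blocks can be a per-scenario latency diff. A sketch under stated assumptions: the 5% threshold and the `find_regressions` helper are illustrative choices, not part of the contract:

```python
def find_regressions(baseline: dict, candidate: dict,
                     threshold: float = 0.05) -> dict[str, float]:
    """Return scenarios whose latency grew by more than `threshold` (fractional)."""
    regressions = {}
    for name, base in baseline["scenarios"].items():
        cand = candidate["scenarios"].get(name)
        if cand is None:
            continue  # scenario missing from the new run; re-run before comparing
        change = (cand["latency_ms"] - base["latency_ms"]) / base["latency_ms"]
        if change > threshold:
            regressions[name] = change
    return regressions
```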

Quick Start

  1. Run the benchmark command using perf-benchmarker <command> [duration].
  2. Ensure a warmup of at least 10 seconds precedes measurement and that runs are not parallel.
  3. Read the PERF_METRICS_START/END block and the nested JSON metrics for results.
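
The steps above can be sketched end to end. This is a minimal sketch, not the skill's own implementation; a real run must still satisfy the 60-second minimum:

```python
import subprocess
import sys
import time

def run_benchmark(cmd: list[str], min_duration_s: float = 60) -> str:
    """Run one benchmark command sequentially and return its captured stdout."""
    start = time.monotonic()
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    elapsed = time.monotonic() - start
    if elapsed < min_duration_s:
        # The contract requires a 60s minimum (30s only for binary search).
        print(f"warning: run lasted {elapsed:.1f}s, below the {min_duration_s}s minimum",
              file=sys.stderr)
    return result.stdout
```

The captured stdout is then scanned for the PERF_METRICS_START/END markers per the output contract.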

Best Practices

  • Ensure benchmarks run strictly sequentially, never in parallel.
  • Maintain the minimum duration: 60s per run, or 30s for binary search phases.
  • Include at least 10 seconds of warmup before measurement.
  • Re-run anomalies to confirm outlier behavior and stabilize results.
  • Do not modify code while benchmarking to preserve measurements.

Example Use Cases

  • Benchmark API latency on a new deployment and compare against the previous baseline using 60s+ runs.
  • Evaluate a data pipeline change by running successive benchmarks to detect regressions.
  • Tune a performance parameter with a binary-search style pass, using the shorter 30-second minimum allowed for that phase.
  • Capture a metrics JSON block from PERF_METRICS_START to PERF_METRICS_END for analysis.
  • Document results by reporting the command, duration, warmup, and metrics in a consistent format.
