perf-benchmarker
npx machina-cli add skill ComposioHQ/awesome-claude-plugins/benchmark --openclaw
Run sequential benchmarks with strict duration rules.
Follow docs/perf-requirements.md as the canonical contract.
Required Rules
- Benchmarks MUST run sequentially (never parallel).
- Minimum duration: 60s per run (30s only for binary search).
- Warmup: 10s minimum before measurement.
- Re-run anomalies.
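The duration rules above can be expressed as a pre-flight check before any run starts. `check_contract` below is a hypothetical helper sketched for illustration, not part of the skill itself:

```shell
# Hypothetical pre-flight check for the duration contract:
# 60s minimum per run (30s during binary search), 10s minimum warmup.
check_contract() {
  local duration="$1" warmup="$2" phase="${3:-normal}"
  local min=60
  [ "$phase" = "binary-search" ] && min=30
  if [ "$duration" -lt "$min" ]; then
    echo "duration ${duration}s below ${min}s minimum for ${phase} runs" >&2
    return 1
  fi
  if [ "$warmup" -lt 10 ]; then
    echo "warmup ${warmup}s below 10s minimum" >&2
    return 1
  fi
  echo "ok: ${duration}s run, ${warmup}s warmup (${phase})"
}
```

A runner would call this once per benchmark, refusing to start the measured window until the check passes.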
Output Format
command: <benchmark command>
duration: <seconds>
warmup: <seconds>
results: <metrics summary>
notes: <anomalies or reruns>
Output Contract
Benchmarks MUST emit a JSON metrics block between markers:
PERF_METRICS_START
{"scenarios":{"low":{"latency_ms":120},"high":{"latency_ms":450}}}
PERF_METRICS_END
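The JSON between the markers can be pulled out of captured output with a small sed pipeline. This is a sketch; the log filename and helper name are arbitrary:

```shell
# Print only the JSON between the PERF_METRICS markers, dropping the
# marker lines themselves. $1 is a file containing benchmark output.
extract_metrics() {
  sed -n '/^PERF_METRICS_START$/,/^PERF_METRICS_END$/p' "$1" | sed '1d;$d'
}
```

Typical usage would be `extract_metrics bench.log > metrics.json` after capturing a run with `tee bench.log`.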
Constraints
- No short runs except during the binary-search phase.
- Do not change code while benchmarking.
Source
https://github.com/ComposioHQ/awesome-claude-plugins/blob/master/perf/skills/benchmark/SKILL.md
Overview
perf-benchmarker runs benchmarks sequentially to establish baselines and detect regressions under a canonical duration contract. It enforces minimum run durations and a warmup window, and disallows parallel benchmarks so that each run's performance characteristics are measured in isolation.
How This Skill Works
It executes a single benchmark command at a time, enforces sequential runs, and applies duration rules: minimum 60 seconds per run (30 seconds for binary search) with at least 10 seconds of warmup. Benchmarks must emit a PERF_METRICS block between PERF_METRICS_START and PERF_METRICS_END, which the tool captures and surfaces in the results output.
When to Use It
- Establishing a baseline for a new release by running sequential benchmarks over a fixed duration.
- Validating regressions after code changes by re-running benchmarks and comparing metrics.
- Measuring isolated latency or throughput with a strict warmup and minimum duration.
- Conducting binary-search style tuning with a 30-second run, when applicable.
- Following the canonical perf contract described in docs/perf-requirements.md during benchmarking.
Quick Start
- Step 1: Run the benchmark command using perf-benchmarker <command> [duration].
- Step 2: Ensure a warmup of at least 10 seconds precedes measurement and that runs are not parallel.
- Step 3: Read the PERF_METRICS_START/END block and the nested JSON metrics for results.
Best Practices
- Ensure benchmarks run strictly sequentially, never in parallel.
- Maintain the minimum duration: 60s per run, or 30s for binary search phases.
- Include a 10-second warmup before measurement.
- Re-run anomalies to confirm outlier behavior and stabilize results.
- Do not modify code while benchmarking to preserve measurements.
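"Re-run anomalies" can be made mechanical with a simple deviation test between samples. The 20% threshold below is an illustrative choice, not part of the contract:

```shell
# Return 0 (anomaly: re-run) when two latency samples, in ms, differ by
# more than 20% of the smaller sample; uses integer arithmetic only.
needs_rerun() {
  local a="$1" b="$2"
  local diff=$(( a > b ? a - b : b - a ))
  local base=$(( a < b ? a : b ))
  [ $(( diff * 5 )) -gt "$base" ]
}
```

A run whose latency trips this check against the previous run would be repeated and only the stable result reported, with the re-run noted in the notes field.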
Example Use Cases
- Benchmark API latency on a new deployment and compare against the previous baseline using 60s+ runs.
- Evaluate a data pipeline change by running successive benchmarks to detect regressions.
- Tune a performance parameter with a binary-search style pass, keeping each run within 30s.
- Capture a metrics JSON block from PERF_METRICS_START to PERF_METRICS_END for analysis.
- Document results by reporting the command, duration, warmup, and metrics in a consistent format.
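The binary-search tuning pass above can be sketched as follows. `run_probe` is a hypothetical hook that runs one 30-second probe at a given concurrency and prints its latency in ms; it is not part of the skill:

```shell
# Find the highest concurrency whose measured latency stays at or below
# target_ms (returns lo when nothing qualifies). Each probe is assumed
# to honour the 30s binary-search minimum duration.
binary_search() {
  local lo="$1" hi="$2" target_ms="$3" best="$1"
  while [ "$lo" -le "$hi" ]; do
    local mid=$(( (lo + hi) / 2 ))
    if [ "$(run_probe "$mid")" -le "$target_ms" ]; then
      best="$mid"; lo=$(( mid + 1 ))   # under target: search higher
    else
      hi=$(( mid - 1 ))                # over target: search lower
    fi
  done
  echo "$best"
}
```

Because each probe is a full sequential run, the search over a range of size n costs roughly log2(n) probes of 30 seconds each.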