performance-baseline-capturer
npx machina-cli add skill a5c-ai/babysitter/performance-baseline-capturer --openclaw

Performance Baseline Capturer Skill
Captures comprehensive performance baselines before migration to enable post-migration regression comparison and SLA verification.
Purpose
Enable performance benchmarking for:
- Response time measurement
- Throughput baseline
- Resource utilization tracking
- Load test execution
- Percentile calculation
- Regression threshold setting
Capabilities
1. Response Time Measurement
- Capture response times
- Measure latency percentiles
- Track by endpoint
- Document SLA targets
2. Throughput Baseline
- Measure requests per second
- Track concurrent users
- Document peak capacity
- Establish limits
3. Resource Utilization Tracking
- Monitor CPU usage
- Track memory consumption
- Measure disk I/O
- Record network usage
4. Load Test Execution
- Run baseline load tests
- Execute stress tests
- Perform soak tests
- Document results
5. Percentile Calculation
- Calculate P50/P90/P95/P99
- Track distribution
- Identify outliers
- Set thresholds
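The percentile metrics above (P50/P90/P95/P99) can be computed directly from raw per-request latency samples. A minimal sketch in Python using the nearest-rank method; the function names are illustrative, not part of the skill's API:

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value such that
    at least p% of all samples are at or below it."""
    ranked = sorted(samples)
    # Nearest-rank index: ceil(p/100 * n) - 1, clamped to valid indices.
    k = max(0, min(len(ranked) - 1, -(-p * len(ranked) // 100) - 1))
    return ranked[k]

def summarize_latencies(samples_ms):
    """Reduce raw per-request latencies (ms) to the baseline metrics."""
    return {
        "p50": percentile(samples_ms, 50),
        "p90": percentile(samples_ms, 90),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
        "mean": statistics.fmean(samples_ms),
    }

# A small sample run; note how the two outliers dominate the tail percentiles.
latencies = [12, 15, 11, 90, 14, 13, 250, 16, 12, 18]
print(summarize_latencies(latencies))
```

With so few samples, P95 and P99 collapse onto the single worst request, which is why real baselines should aggregate thousands of samples per endpoint.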
6. Regression Threshold Setting
- Define acceptable ranges
- Set alert thresholds
- Document tolerances
- Create comparison criteria
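Regression comparison criteria like those above can be expressed as per-metric tolerances against the baseline. A hedged sketch, assuming tolerances are given as allowed fractional increases (the metric names and structure are illustrative):

```python
def check_regression(baseline, current, tolerances):
    """Compare current metrics against the baseline. Each tolerance is the
    allowed fractional increase (e.g. 0.10 means up to 10% slower is OK)."""
    violations = []
    for metric, allowed in tolerances.items():
        limit = baseline[metric] * (1 + allowed)
        if current[metric] > limit:
            violations.append(f"{metric}: {current[metric]} exceeds limit {limit:.1f}")
    return violations

baseline = {"p95": 180.0, "p99": 420.0}
tolerances = {"p95": 0.10, "p99": 0.20}  # 10% and 20% regression budgets

# Within budget on both metrics -> no violations reported.
print(check_regression(baseline, {"p95": 175.0, "p99": 500.0}, tolerances))
```

Absolute ceilings (e.g. "P99 must never exceed 1 s") can be layered on top of the relative budgets for SLA-bound endpoints.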
Tool Integrations
| Tool | Purpose | Integration Method |
|---|---|---|
| JMeter | Load testing | CLI |
| Gatling | Performance testing | CLI |
| k6 | Modern load testing | CLI |
| Locust | Python load testing | CLI |
| Artillery | Node.js testing | CLI |
| wrk | HTTP benchmarking | CLI |
Output Schema
{
"baselineId": "string",
"timestamp": "ISO8601",
"environment": {
"name": "string",
"resources": {}
},
"metrics": {
"responseTime": {
"p50": "number",
"p90": "number",
"p95": "number",
"p99": "number",
"mean": "number"
},
"throughput": {
"requestsPerSecond": "number",
"peakRps": "number",
"concurrentUsers": "number"
},
"resources": {
"cpu": {},
"memory": {},
"disk": {},
"network": {}
}
},
"thresholds": {
"responseTime": {},
"throughput": {},
"errors": {}
}
}
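A baseline record matching the Output Schema can be assembled from collected metrics. A minimal sketch; the helper name and example values are illustrative, only the key layout follows the schema above:

```python
import json
from datetime import datetime, timezone

def build_baseline(baseline_id, env_name, response_time, throughput, resources, thresholds):
    # Assemble a record shaped like the Output Schema.
    return {
        "baselineId": baseline_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "environment": {"name": env_name, "resources": {}},
        "metrics": {
            "responseTime": response_time,
            "throughput": throughput,
            "resources": resources,
        },
        "thresholds": thresholds,
    }

record = build_baseline(
    "baseline-2024-q3-checkout",            # hypothetical baseline ID
    "staging",
    {"p50": 42, "p90": 95, "p95": 120, "p99": 310, "mean": 58.4},
    {"requestsPerSecond": 850, "peakRps": 1200, "concurrentUsers": 400},
    {"cpu": {}, "memory": {}, "disk": {}, "network": {}},
    {"responseTime": {"p95Max": 132}, "throughput": {}, "errors": {}},
)
print(json.dumps(record, indent=2))
```

Serializing to JSON keeps the record diffable against future post-migration captures.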
Integration with Migration Processes
- migration-testing-strategy: Baseline establishment
- performance-optimization-migration: Performance tracking
Related Skills
- migration-validator: Post-migration comparison
- test-coverage-analyzer: Test planning
Related Agents
- performance-validation-agent: Performance verification
- migration-testing-strategist: Test planning
Source
git clone https://github.com/a5c-ai/babysitter
Skill file: plugins/babysitter/skills/babysit/process/specializations/code-migration-modernization/skills/performance-baseline-capturer/SKILL.md
Overview
Captures comprehensive performance baselines prior to migration to enable post-migration regression checks and SLA verification. It records response times, throughput, resource utilization, load-test results, and percentile distributions to establish a repeatable benchmark.
How This Skill Works
The skill performs baseline measurements across response time, throughput, and resource utilization, then runs baseline load tests (plus related stress and soak tests) using CLI tools such as JMeter, Gatling, k6, Locust, Artillery, or wrk. It computes percentiles (P50, P90, P95, P99), tracks metrics per endpoint, and outputs the results in a structured schema for comparison against post-migration measurements.
When to Use It
- Before migrating to a new version or architecture to establish a performance benchmark
- During migration planning to verify SLA targets and capacity needs
- When setting regression thresholds for post-migration comparisons
- For capacity planning and documenting peak-load capacity as part of the baseline
- To create a repeatable baseline that supports future regression testing
Quick Start
- Step 1: Define baseline scope (endpoints, workloads, tools) and target SLAs
- Step 2: Run baseline tests using a CLI tool (JMeter, Gatling, k6, Locust, Artillery, or wrk) and collect metrics
- Step 3: Calculate P50/P90/P95/P99, review thresholds, and store results in a structured Output Schema
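For Steps 2 and 3, when k6 is the chosen tool, its `--summary-export` JSON can be mapped onto the Output Schema. A hedged sketch: the field names shown (`http_req_duration`, `med`, `p(90)`, `rate`) reflect k6's summary-export format and should be verified against your k6 version:

```python
import json

def k6_to_baseline_metrics(summary):
    """Map a parsed k6 --summary-export document onto the
    responseTime/throughput portion of the Output Schema.
    Load one with: summary = json.load(open("summary.json"))."""
    duration = summary["metrics"]["http_req_duration"]
    reqs = summary["metrics"]["http_reqs"]
    return {
        "responseTime": {
            "p50": duration.get("med"),
            "p90": duration.get("p(90)"),
            "p95": duration.get("p(95)"),
            "mean": duration.get("avg"),
        },
        "throughput": {"requestsPerSecond": reqs.get("rate")},
    }

# Synthetic summary in the shape k6 writes with --summary-export:
sample = {
    "metrics": {
        "http_req_duration": {"avg": 58.4, "med": 42.0, "p(90)": 95.0, "p(95)": 120.0},
        "http_reqs": {"count": 51000, "rate": 850.0},
    }
}
print(k6_to_baseline_metrics(sample))
```

P99 is not in k6's default trend stats; add it via `summaryTrendStats` in the k6 options if your thresholds need it.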
Best Practices
- Define exact endpoints, workloads, and SLA targets to baseline
- Use a consistent environment and timing across baseline runs
- Execute multiple iterations and capture percentile metrics (P50, P90, P95, P99)
- Document thresholds, tolerances, and criteria for regression
- Integrate baseline capture into the migration workflow for repeatability
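When executing multiple iterations, one conservative way to combine them is to take the slowest observed value per metric, so thresholds are set against a worst-case baseline. A small sketch of that idea (the combining rule is one reasonable choice, not the skill's prescribed method):

```python
def conservative_baseline(runs):
    """Combine several baseline runs by keeping, per metric, the slowest
    (largest) observed value -- a conservative basis for thresholds."""
    combined = {}
    for run in runs:
        for metric, value in run.items():
            combined[metric] = max(value, combined.get(metric, value))
    return combined

runs = [
    {"p50": 40, "p95": 118, "p99": 300},
    {"p50": 43, "p95": 125, "p99": 290},
    {"p50": 41, "p95": 121, "p99": 330},
]
print(conservative_baseline(runs))
```

Pooling raw samples across runs before computing percentiles is an alternative that weights each request equally rather than each run.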
Example Use Cases
- Baseline capture before API gateway modernization to compare pre/post changes
- E-commerce checkout microservice migration with SLA verification
- Migration of a reporting service with regression testing against baseline
- Web app modernization baseline for capacity planning and target setting
- Cloud-based migration with pre-migration performance benchmarks for post-migration comparison