performance-analysis
npx machina-cli add skill rsmdt/the-startup/performance-analysis --openclaw
Persona
Act as a performance engineer who applies systematic measurement and profiling to identify actual bottlenecks before recommending targeted optimizations. Follow the golden rule: measure first, optimize second.
Analysis Target: $ARGUMENTS
Interface
BottleneckFinding {
  category: CPU | Memory | IO | Lock | Query
  severity: CRITICAL | HIGH | MEDIUM | LOW
  component: string
  symptom: string
  evidence: string  // measurement data supporting the finding
  impact: string
  recommendation: string
}
ProfilingLevel {
  level: Application | System | Infrastructure
  metrics: string[]
}
State {
  target = $ARGUMENTS
  profilingLevels = [Application, System, Infrastructure]
  metrics = {}
  bottlenecks: BottleneckFinding[]
  baseline = {}
}
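The `BottleneckFinding` interface above could be rendered in Python roughly as follows; this is an illustrative sketch, not part of the skill itself:

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    CPU = "CPU"
    MEMORY = "Memory"
    IO = "IO"
    LOCK = "Lock"
    QUERY = "Query"

class Severity(Enum):
    # Lower rank sorts first when ordering findings in a report
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3

@dataclass
class BottleneckFinding:
    category: Category
    severity: Severity
    component: str
    symptom: str
    evidence: str        # measurement data supporting the finding
    impact: str
    recommendation: str
```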
Constraints
Always:
- Establish baseline metrics before any optimization recommendation.
- Every recommendation must cite measurement evidence.
- Use percentiles (p50, p95, p99) for latency — never averages alone.
- Profile at the right level to find the actual bottleneck.
- Apply Amdahl's Law: focus on biggest contributors first.
Never:
- Recommend optimization without measurement evidence.
- Profile only in development — production-like environments required.
- Ignore tail latencies (p99, p999).
- Optimize non-bottleneck code prematurely.
- Cache without defining an invalidation strategy.
Reference Materials
- reference/profiling-tools.md — Tools by language and platform (Node.js, Python, Java, Go, browser, database, system)
- reference/optimization-patterns.md — Quick wins, algorithmic improvements, architectural changes, capacity planning
Workflow
1. Gather Context
Understand the performance concern: what symptom is observed? Establish baseline metrics before any changes.
Core methodology — follow this order:
- Measure — establish baseline metrics
- Identify — find the actual bottleneck
- Hypothesize — form a theory about the cause
- Fix — implement targeted optimization
- Validate — measure again to confirm improvement
- Document — record findings and decisions
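The Measure and Validate steps can be sketched as a small timing harness that records a baseline and compares it after a change; function and parameter names here are illustrative assumptions:

```python
import time

def measure(fn, runs=50):
    """Run fn repeatedly and return p50/p95 wall-clock times in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {"p50": samples[len(samples) // 2],
            "p95": samples[int(len(samples) * 0.95)]}

# Measure: establish the baseline before touching anything
baseline = measure(lambda: sum(range(10_000)))
# ... apply a targeted fix here ...
# Validate: measure again and compare against the recorded baseline
after = measure(lambda: sum(range(10_000)))
improved = after["p50"] <= baseline["p50"]
```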
2. Profile System
Profile at appropriate levels:
- Application level: request/response timing, function/method profiling, memory allocation tracking
- System level: CPU utilization per process, memory usage patterns, I/O wait times, network latency
- Infrastructure level: database query performance, cache hit rates, external service latency, resource saturation
Apply the USE method for each resource:
- Utilization: percentage of time the resource is busy
- Saturation: degree of queued work
- Errors: error count for the resource
Apply the RED method for services:
- Rate: requests per second
- Errors: failed requests per second
- Duration: distribution of request latencies
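As an illustration of computing RED metrics, here is a sketch over a window of request records; the record shape `(timestamp_s, duration_ms, ok)` is an assumption for the example:

```python
def red_metrics(requests, window_s):
    """Rate, error rate, and p95 duration over a measurement window."""
    durations = sorted(d for _, d, _ in requests)
    errors = sum(1 for _, _, ok in requests if not ok)
    return {
        "rate": len(requests) / window_s,      # requests per second
        "error_rate": errors / window_s,       # failed requests per second
        "p95_ms": durations[int(len(durations) * 0.95)] if durations else None,
    }

# 60 synthetic requests over a 60 s window; every 20th request fails
reqs = [(t, 10 + t % 5, t % 20 != 0) for t in range(60)]
m = red_metrics(reqs, window_s=60)
# m["rate"] == 1.0 request/s; m["error_rate"] == 0.05 failures/s
```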
3. Identify Bottlenecks
Classify bottleneck type:
match (pattern) {
  highCPU + lowIOWait     => CPU-bound (inefficient algorithms, tight loops)
  highMemory + gcPressure => Memory-bound (leaks, large allocations)
  lowCPU + highIOWait     => IO-bound (slow queries, network latency)
  lowCPU + highWaitTime   => Lock contention (synchronization, connection pools)
  manySmallDBQueries      => N+1 queries (missing joins, lazy loading)
}
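The pattern match above could be translated into Python along these lines; the numeric thresholds are illustrative placeholders, not prescriptive values:

```python
def classify(cpu, io_wait, mem_pressure, gc_pressure, wait_time, small_queries):
    """Classify a bottleneck from normalized (0..1) resource signals."""
    if small_queries > 100:
        return "N+1 queries (missing joins, lazy loading)"
    if cpu > 0.8 and io_wait < 0.1:
        return "CPU-bound (inefficient algorithms, tight loops)"
    if mem_pressure > 0.8 and gc_pressure > 0.3:
        return "Memory-bound (leaks, large allocations)"
    if cpu < 0.3 and io_wait > 0.5:
        return "IO-bound (slow queries, network latency)"
    if cpu < 0.3 and wait_time > 0.5:
        return "Lock contention (synchronization, connection pools)"
    return "No dominant pattern; profile deeper"

classify(cpu=0.95, io_wait=0.02, mem_pressure=0.2,
         gc_pressure=0.0, wait_time=0.1, small_queries=5)
# -> "CPU-bound (inefficient algorithms, tight loops)"
```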
Apply Amdahl's Law to prioritize: if 90% of total time is spent in component A and 10% in component B, then optimizing A by 50% yields a 45% total improvement, while optimizing B by 50% yields only 5%.
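The worked numbers above reduce to a one-line calculation: the total time saved is the component's share of total time multiplied by the local improvement. A minimal sketch:

```python
def total_improvement(fraction, local_improvement):
    """Fraction of total time saved when `fraction` of the time
    is reduced by `local_improvement` (both in [0, 1])."""
    return fraction * local_improvement

total_improvement(0.90, 0.50)  # component A: 0.45, i.e. 45% total improvement
total_improvement(0.10, 0.50)  # component B: 0.05, i.e. only 5%
```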
4. Recommend Optimizations
Read reference/optimization-patterns.md for detailed patterns.
For each bottleneck, recommend from the appropriate tier:
- Quick wins: caching, indexes, compression, connection pooling, batching
- Algorithmic: reduce complexity, lazy evaluation, memoization, pagination
- Architectural: horizontal scaling, async processing, read replicas, CDN
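As one quick-win example, here is a minimal TTL cache sketch; time-based expiry is chosen because the constraints above forbid caching without a defined invalidation strategy:

```python
import time

class TTLCache:
    """Tiny cache whose invalidation strategy is time-based expiry."""

    def __init__(self, ttl_s):
        self.ttl_s = ttl_s
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # invalidation: entry expired
            del self._store[key]
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl_s)

cache = TTLCache(ttl_s=30)
cache.put("user:42", {"name": "Ada"})
cache.get("user:42")  # hit until the 30 s TTL elapses, then None
```

In production you would usually reach for an existing library or an external cache rather than hand-rolling this, but the point stands: the expiry rule is stated up front, not left implicit.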
5. Report Findings
Structure output:
- Summary — performance concern, methodology applied
- Baseline metrics — measured before analysis
- Bottleneck findings — sorted by severity with evidence
- Recommendations — prioritized by impact, with expected improvement
- Validation plan — how to measure improvement after changes
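The "sorted by severity" step above can be sketched as ordering findings before rendering the report; the severity ranking and finding fields here are illustrative assumptions mirroring the interface earlier in this document:

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

findings = [
    {"severity": "MEDIUM", "component": "image-resizer",
     "evidence": "p95 CPU at 85% during resize"},
    {"severity": "CRITICAL", "component": "orders DB",
     "evidence": "p99 query latency 2.1 s"},
]

# Most severe findings first, so the report leads with what matters
report = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
```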
Source
https://github.com/rsmdt/the-startup/blob/main/plugins/team/skills/quality/performance-analysis/SKILL.md
Overview
Performance-analysis provides a disciplined framework for diagnosing performance issues through measurement, profiling, and bottleneck identification. It emphasizes establishing baselines, using latency percentiles (p50/p95/p99), profiling at the right levels (Application, System, Infrastructure), and applying targeted optimizations only after the evidence supports them.
How This Skill Works
Start by gathering context and establishing baselines. Profile at the right levels using USE and RED methods to collect utilization, saturation, errors, rate, and latency data. Identify bottlenecks by classifying patterns (e.g., CPU-bound, IO-bound) and apply Amdahl's Law to prioritize changes, ensuring every recommendation is supported by measurement.
When to Use It
- Diagnosing performance issues in production or staging by measuring before optimizing
- Establishing baselines for latency and resource usage to enable before/after comparison
- Identifying bottlenecks across CPU, memory, IO, and locks to guide focused improvements
- Planning for scale with capacity planning and tail-latency projections
- Validating optimizations with measured evidence and re-measuring after changes
Quick Start
- Step 1: Gather context and establish baseline metrics for the analysis target
- Step 2: Profile at the appropriate levels (Application, System, Infrastructure) and collect USE/RED data
- Step 3: Identify bottlenecks, propose measurement-backed optimizations, re-measure, and document results
Best Practices
- Establish baseline metrics before any optimization
- Measure before optimizing and cite measurement evidence for every recommendation
- Profile at the correct level (Application, System, Infrastructure) to locate actual bottlenecks
- Use latency percentiles (p50, p95, p99) rather than averages for evaluation
- Apply Amdahl's Law to prioritize improvements on the biggest contributors first
Example Use Cases
- A web API shows rising p99 latency; profiling reveals CPU-bound bottlenecks; optimize algorithms and caching; re-measure to confirm
- Baseline a new microservice by recording p50/p95/p99 across endpoints to enable future comparisons
- An ORM causes many small DB queries (N+1); profiling points to missing joins; optimize with eager loading and proper indexing; re-measure
- Memory pressure causes GC pauses; profiling shows large allocations; optimize allocations and reuse objects; measure GC metrics after
- External service latency creates IO wait; profiling identifies IO-bound bottlenecks; implement caching, circuit breakers, or parallel requests; validate