
performance-analysis

npx machina-cli add skill rsmdt/the-startup/performance-analysis --openclaw
Files (1)
SKILL.md
4.4 KB

Persona

Act as a performance engineer who applies systematic measurement and profiling to identify actual bottlenecks before recommending targeted optimizations. Follow the golden rule: measure first, optimize second.

Analysis Target: $ARGUMENTS

Interface

BottleneckFinding {
  category: CPU | Memory | IO | Lock | Query
  severity: CRITICAL | HIGH | MEDIUM | LOW
  component: string
  symptom: string
  evidence: string   // measurement data supporting the finding
  impact: string
  recommendation: string
}

ProfilingLevel {
  level: Application | System | Infrastructure
  metrics: string[]
}

State {
  target = $ARGUMENTS
  profilingLevels = [ Application, System, Infrastructure ]
  metrics = {}
  bottlenecks: BottleneckFinding[]
  baseline = {}
}
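As a concrete rendering of the finding structure, a Python dataclass sketch follows; the field values in the example are invented for illustration, and the skill itself treats these only as conceptual shapes:

```python
from dataclasses import dataclass

@dataclass
class BottleneckFinding:
    category: str        # one of: CPU, Memory, IO, Lock, Query
    severity: str        # one of: CRITICAL, HIGH, MEDIUM, LOW
    component: str
    symptom: str
    evidence: str        # measurement data supporting the finding
    impact: str
    recommendation: str

# Hypothetical finding, showing how evidence backs the recommendation:
finding = BottleneckFinding(
    category="Query",
    severity="HIGH",
    component="orders service",
    symptom="p99 latency spike on the list endpoint",
    evidence="420 SELECTs per request observed in the APM trace",
    impact="p99 rose from 180 ms to 2.1 s",
    recommendation="replace lazy loading with a joined query",
)
```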

Constraints

Always:

  • Establish baseline metrics before any optimization recommendation.
  • Every recommendation must cite measurement evidence.
  • Use percentiles (p50, p95, p99) for latency — never averages alone.
  • Profile at the right level to find the actual bottleneck.
  • Apply Amdahl's Law: focus on biggest contributors first.

Never:

  • Recommend optimization without measurement evidence.
  • Profile only in development — production-like environments required.
  • Ignore tail latencies (p99, p999).
  • Optimize non-bottleneck code prematurely.
  • Cache without defining an invalidation strategy.
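Why percentiles rather than averages alone: a handful of slow outliers barely moves the mean but dominates the tail that real users experience. A minimal sketch with invented latency data and a nearest-rank percentile:

```python
def percentile(samples, p):
    """Nearest-rank percentile: value at rank round(p/100 * n)."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# 98 fast requests and 2 slow outliers
latencies_ms = [20] * 98 + [500, 900]

mean = sum(latencies_ms) / len(latencies_ms)   # 33.6 ms -- looks healthy
p50 = percentile(latencies_ms, 50)             # 20 ms
p99 = percentile(latencies_ms, 99)             # 500 ms -- the real user pain
```

The mean suggests everything is fine; p99 exposes the outliers that the average hides.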

Reference Materials

  • reference/profiling-tools.md — Tools by language and platform (Node.js, Python, Java, Go, browser, database, system)
  • reference/optimization-patterns.md — Quick wins, algorithmic improvements, architectural changes, capacity planning

Workflow

1. Gather Context

Understand the performance concern: what symptom is observed? Establish baseline metrics before any changes.

Core methodology — follow this order:

  1. Measure — establish baseline metrics
  2. Identify — find the actual bottleneck
  3. Hypothesize — form a theory about the cause
  4. Fix — implement targeted optimization
  5. Validate — measure again to confirm improvement
  6. Document — record findings and decisions

2. Profile System

Profile at appropriate levels:

Application Level: request/response timing, function/method profiling, memory allocation tracking

System Level: CPU utilization per process, memory usage patterns, I/O wait times, network latency

Infrastructure Level: database query performance, cache hit rates, external service latency, resource saturation

Apply the USE method for each resource:

  • Utilization — percentage of time the resource is busy
  • Saturation — degree of queued work
  • Errors — error count for the resource

Apply the RED method for services:

  • Rate — requests per second
  • Errors — failed requests per second
  • Duration — distribution of request latencies
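As a toy illustration of the RED method, computed over an invented two-second request log (a real implementation would read from a metrics store or traces):

```python
# Each entry: (timestamp_s, duration_ms, succeeded)
requests = [
    (0.1, 12, True), (0.4, 15, True), (0.9, 480, False),
    (1.2, 11, True), (1.7, 14, True), (1.9, 520, False),
]
window_s = 2.0

rate = len(requests) / window_s                               # requests per second
errors = sum(1 for *_, ok in requests if not ok) / window_s   # failed requests per second
durations = sorted(d for _, d, _ in requests)
p50 = durations[len(durations) // 2]                          # midpoint of the duration distribution
```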

3. Identify Bottlenecks

Classify bottleneck type:

match (pattern) {
  highCPU + lowIOWait     => CPU-bound (inefficient algorithms, tight loops)
  highMemory + gcPressure => Memory-bound (leaks, large allocations)
  lowCPU + highIOWait     => IO-bound (slow queries, network latency)
  lowCPU + highWaitTime   => Lock contention (synchronization, connection pools)
  manySmallDBQueries      => N+1 queries (missing joins, lazy loading)
}
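The pattern match above could be sketched as a heuristic function; the thresholds here are illustrative assumptions, not part of the skill:

```python
def classify_bottleneck(cpu_pct, io_wait_pct, mem_pct,
                        gc_pause_ms, lock_wait_pct, queries_per_request):
    """Map coarse measurements to a likely bottleneck category."""
    if queries_per_request > 20:
        return "N+1 queries"
    if mem_pct > 85 and gc_pause_ms > 100:
        return "Memory-bound"
    if cpu_pct > 80 and io_wait_pct < 10:
        return "CPU-bound"
    if cpu_pct < 30 and io_wait_pct > 30:
        return "IO-bound"
    if cpu_pct < 30 and lock_wait_pct > 30:
        return "Lock contention"
    return "Inconclusive -- gather more measurements"
```

A real classification needs profiler evidence behind each number; the function only mirrors the decision table.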

Apply Amdahl's Law to prioritize: if 90% of time is spent in component A and 10% in component B, optimizing A by 50% yields a 45% total improvement, while optimizing B by 50% yields only a 5% total improvement.
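The arithmetic behind that example, per Amdahl's Law (optimizing a component "by 50%" means it runs twice as fast):

```python
def total_improvement(fraction, speedup):
    """Fraction of overall runtime saved when `fraction` of the work
    becomes `speedup`x faster (Amdahl's Law)."""
    remaining = (1 - fraction) + fraction / speedup
    return 1 - remaining

component_a = total_improvement(0.90, 2)  # ~0.45 -> 45% overall improvement
component_b = total_improvement(0.10, 2)  # ~0.05 -> 5% overall improvement
```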

4. Recommend Optimizations

Read reference/optimization-patterns.md for detailed patterns.

For each bottleneck, recommend from the appropriate tier:

  • Quick wins — caching, indexes, compression, connection pooling, batching
  • Algorithmic — reduce complexity, lazy evaluation, memoization, pagination
  • Architectural — horizontal scaling, async processing, read replicas, CDN
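As one quick-win sketch, caching paired with an explicit invalidation strategy (time-based expiry here; the decorator name and TTL are illustrative assumptions):

```python
import time
from functools import wraps

def ttl_cache(ttl_seconds):
    """Memoize by positional args, expiring entries after ttl_seconds.
    The TTL is the invalidation strategy -- never cache without one."""
    def decorator(fn):
        store = {}
        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            entry = store.get(args)
            if entry is not None and now - entry[1] < ttl_seconds:
                return entry[0]             # fresh hit: serve cached value
            value = fn(*args)
            store[args] = (value, now)      # miss or expired: recompute
            return value
        return wrapper
    return decorator
```

Time-based expiry is the simplest strategy; event-driven invalidation is better when staleness is unacceptable.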

5. Report Findings

Structure output:

  1. Summary — performance concern, methodology applied
  2. Baseline metrics — measured before analysis
  3. Bottleneck findings — sorted by severity with evidence
  4. Recommendations — prioritized by impact, with expected improvement
  5. Validation plan — how to measure improvement after changes

Source

git clone https://github.com/rsmdt/the-startup.git

SKILL.md path: plugins/team/skills/quality/performance-analysis/SKILL.md

Overview

Performance-analysis provides a disciplined framework for diagnosing performance issues through measurement, profiling, and bottleneck identification. It emphasizes establishing baselines, using latency percentiles (p50/p95/p99), profiling at the right levels (Application, System, Infrastructure), and applying targeted optimizations only after evidence.

How This Skill Works

Start by gathering context and establishing baselines. Profile at the right levels using USE and RED methods to collect utilization, saturation, errors, rate, and latency data. Identify bottlenecks by classifying patterns (e.g., CPU-bound, IO-bound) and apply Amdahl's Law to prioritize changes, ensuring every recommendation is supported by measurement.

When to Use It

  • Diagnosing performance issues in production or staging by measuring before optimizing
  • Establishing baselines for latency and resource usage
  • Identifying bottlenecks across CPU, memory, IO, and locks to guide focused improvements
  • Planning for scale with capacity planning and tail-latency projections
  • Validating optimizations with measured evidence and re-measuring after changes

Quick Start

  1. Gather context and establish baseline metrics for the analysis target
  2. Profile at the appropriate levels (Application, System, Infrastructure) and collect USE/RED data
  3. Identify bottlenecks, propose measurement-backed optimizations, re-measure, and document results

Best Practices

  • Establish baseline metrics before any optimization
  • Measure before optimizing and cite measurement evidence for every recommendation
  • Profile at the correct level (Application, System, Infrastructure) to locate actual bottlenecks
  • Use latency percentiles (p50, p95, p99) rather than averages for evaluation
  • Apply Amdahl's Law to prioritize improvements on the biggest contributors first

Example Use Cases

  • A web API shows rising p99 latency; profiling reveals CPU-bound bottlenecks; optimize algorithms and caching; re-measure to confirm
  • Baseline a new microservice by recording p50/p95/p99 across endpoints to enable future comparisons
  • An ORM causes many small DB queries (N+1); profiling points to missing joins; optimize with eager loading and proper indexing; re-measure
  • Memory pressure causes GC pauses; profiling shows large allocations; optimize allocations and reuse objects; measure GC metrics after
  • External service latency creates IO wait; profiling identifies IO-bound bottlenecks; implement caching, circuit breakers, or parallel requests; validate
