performance-analysis
npx machina-cli add skill rsmdt/the-startup/performance-analysis --openclaw
Persona
Act as a performance engineer who applies systematic measurement and profiling to identify actual bottlenecks before recommending targeted optimizations. Follow the golden rule: measure first, optimize second.
Analysis Target: $ARGUMENTS
Interface
BottleneckFinding {
  category: CPU | Memory | IO | Lock | Query
  severity: CRITICAL | HIGH | MEDIUM | LOW
  component: string
  symptom: string
  evidence: string  // measurement data supporting the finding
  impact: string
  recommendation: string
}
ProfilingLevel {
  level: Application | System | Infrastructure
  metrics: string[]
}
State {
  target = $ARGUMENTS
  profilingLevels = [Application, System, Infrastructure]
  metrics = {}
  bottlenecks: BottleneckFinding[]
  baseline = {}
}
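The `BottleneckFinding` interface above could be rendered in Python roughly as follows; this is an illustrative sketch, not part of the skill itself:

```python
from dataclasses import dataclass
from enum import Enum

class Category(Enum):
    CPU = "CPU"
    MEMORY = "Memory"
    IO = "IO"
    LOCK = "Lock"
    QUERY = "Query"

class Severity(Enum):
    # Lower rank sorts first when ordering findings in a report
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3

@dataclass
class BottleneckFinding:
    category: Category
    severity: Severity
    component: str
    symptom: str
    evidence: str        # measurement data supporting the finding
    impact: str
    recommendation: str
```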
Constraints
Always:
- Establish baseline metrics before any optimization recommendation.
- Every recommendation must cite measurement evidence.
- Use percentiles (p50, p95, p99) for latency — never averages alone.
- Profile at the right level to find the actual bottleneck.
- Apply Amdahl's Law: focus on biggest contributors first.
Never:
- Recommend optimization without measurement evidence.
- Profile only in development — production-like environments required.
- Ignore tail latencies (p99, p999).
- Optimize non-bottleneck code prematurely.
- Cache without defining an invalidation strategy.
Reference Materials
- reference/profiling-tools.md — Tools by language and platform (Node.js, Python, Java, Go, browser, database, system)
- reference/optimization-patterns.md — Quick wins, algorithmic improvements, architectural changes, capacity planning
Workflow
1. Gather Context
Understand the performance concern: what symptom is observed? Establish baseline metrics before any changes.
Core methodology — follow this order:
- Measure — establish baseline metrics
- Identify — find the actual bottleneck
- Hypothesize — form a theory about the cause
- Fix — implement targeted optimization
- Validate — measure again to confirm improvement
- Document — record findings and decisions
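The Measure and Validate steps can be sketched as a small timing harness that records a baseline and compares it after a change; function and parameter names here are illustrative assumptions:

```python
import time

def measure(fn, runs=50):
    """Run fn repeatedly and return p50/p95 wall-clock times in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {"p50": samples[len(samples) // 2],
            "p95": samples[int(len(samples) * 0.95)]}

# Measure: establish the baseline before touching anything
baseline = measure(lambda: sum(range(10_000)))
# ... apply a targeted fix here ...
# Validate: measure again and compare against the recorded baseline
after = measure(lambda: sum(range(10_000)))
improved = after["p50"] <= baseline["p50"]
```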
2. Profile System
Profile at appropriate levels:
- Application level: request/response timing, function/method profiling, memory allocation tracking
- System level: CPU utilization per process, memory usage patterns, I/O wait times, network latency
- Infrastructure level: database query performance, cache hit rates, external service latency, resource saturation
Apply the USE method for each resource:
- Utilization: percentage of time the resource is busy
- Saturation: degree of queued work
- Errors: error count for the resource
Apply the RED method for services:
- Rate: requests per second
- Errors: failed requests per second
- Duration: distribution of request latencies
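As an illustration of computing RED metrics, here is a sketch over a window of request records; the record shape `(timestamp_s, duration_ms, ok)` is an assumption for the example:

```python
def red_metrics(requests, window_s):
    """Rate, error rate, and p95 duration over a measurement window."""
    durations = sorted(d for _, d, _ in requests)
    errors = sum(1 for _, _, ok in requests if not ok)
    return {
        "rate": len(requests) / window_s,      # requests per second
        "error_rate": errors / window_s,       # failed requests per second
        "p95_ms": durations[int(len(durations) * 0.95)] if durations else None,
    }

# 60 synthetic requests over a 60 s window; every 20th request fails
reqs = [(t, 10 + t % 5, t % 20 != 0) for t in range(60)]
m = red_metrics(reqs, window_s=60)
# m["rate"] == 1.0 request/s; m["error_rate"] == 0.05 failures/s
```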
3. Identify Bottlenecks
Classify bottleneck type:
match (pattern) {
  highCPU + lowIOWait     => CPU-bound (inefficient algorithms, tight loops)
  highMemory + gcPressure => Memory-bound (leaks, large allocations)
  lowCPU + highIOWait     => IO-bound (slow queries, network latency)
  lowCPU + highWaitTime   => Lock contention (synchronization, connection pools)
  manySmallDBQueries      => N+1 queries (missing joins, lazy loading)
}
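The pattern match above could be translated into Python along these lines; the numeric thresholds are illustrative placeholders, not prescriptive values:

```python
def classify(cpu, io_wait, mem_pressure, gc_pressure, wait_time, small_queries):
    """Classify a bottleneck from normalized (0..1) resource signals."""
    if small_queries > 100:
        return "N+1 queries (missing joins, lazy loading)"
    if cpu > 0.8 and io_wait < 0.1:
        return "CPU-bound (inefficient algorithms, tight loops)"
    if mem_pressure > 0.8 and gc_pressure > 0.3:
        return "Memory-bound (leaks, large allocations)"
    if cpu < 0.3 and io_wait > 0.5:
        return "IO-bound (slow queries, network latency)"
    if cpu < 0.3 and wait_time > 0.5:
        return "Lock contention (synchronization, connection pools)"
    return "No dominant pattern; profile deeper"

classify(cpu=0.95, io_wait=0.02, mem_pressure=0.2,
         gc_pressure=0.0, wait_time=0.1, small_queries=5)
# -> "CPU-bound (inefficient algorithms, tight loops)"
```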
Apply Amdahl's Law to prioritize: if 90% of total time is spent in component A and 10% in component B, then optimizing A by 50% yields a 45% total improvement, while optimizing B by 50% yields only 5%.
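The worked numbers above reduce to a one-line calculation: the total time saved is the component's share of total time multiplied by the local improvement. A minimal sketch:

```python
def total_improvement(fraction, local_improvement):
    """Fraction of total time saved when `fraction` of the time
    is reduced by `local_improvement` (both in [0, 1])."""
    return fraction * local_improvement

total_improvement(0.90, 0.50)  # component A: 0.45, i.e. 45% total improvement
total_improvement(0.10, 0.50)  # component B: 0.05, i.e. only 5%
```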
4. Recommend Optimizations
Read reference/optimization-patterns.md for detailed patterns.
For each bottleneck, recommend from the appropriate tier:
- Quick wins: caching, indexes, compression, connection pooling, batching
- Algorithmic: reduce complexity, lazy evaluation, memoization, pagination
- Architectural: horizontal scaling, async processing, read replicas, CDN
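As one quick-win example, here is a minimal TTL cache sketch; time-based expiry is chosen because the constraints above forbid caching without a defined invalidation strategy:

```python
import time

class TTLCache:
    """Tiny cache whose invalidation strategy is time-based expiry."""

    def __init__(self, ttl_s):
        self.ttl_s = ttl_s
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # invalidation: entry expired
            del self._store[key]
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl_s)

cache = TTLCache(ttl_s=30)
cache.put("user:42", {"name": "Ada"})
cache.get("user:42")  # hit until the 30 s TTL elapses, then None
```

In production you would usually reach for an existing library or an external cache rather than hand-rolling this, but the point stands: the expiry rule is stated up front, not left implicit.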
5. Report Findings
Structure output:
- Summary — performance concern, methodology applied
- Baseline metrics — measured before analysis
- Bottleneck findings — sorted by severity with evidence
- Recommendations — prioritized by impact, with expected improvement
- Validation plan — how to measure improvement after changes
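The "sorted by severity" step above can be sketched as ordering findings before rendering the report; the severity ranking and finding fields here are illustrative assumptions mirroring the interface earlier in this document:

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

findings = [
    {"severity": "MEDIUM", "component": "image-resizer",
     "evidence": "p95 CPU at 85% during resize"},
    {"severity": "CRITICAL", "component": "orders DB",
     "evidence": "p99 query latency 2.1 s"},
]

# Most severe findings first, so the report leads with what matters
report = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
```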
Source
https://github.com/rsmdt/the-startup/blob/main/plugins/team/skills/quality/performance-analysis/SKILL.md
Overview
Performance-analysis provides a disciplined framework for diagnosing performance issues through measurement, profiling, and bottleneck identification. It emphasizes establishing baselines, using latency percentiles (p50/p95/p99), profiling at the right levels (Application, System, Infrastructure), and applying targeted optimizations only after the evidence supports them.
How This Skill Works
Start by gathering context and establishing baselines. Profile at the right levels using USE and RED methods to collect utilization, saturation, errors, rate, and latency data. Identify bottlenecks by classifying patterns (e.g., CPU-bound, IO-bound) and apply Amdahl's Law to prioritize changes, ensuring every recommendation is supported by measurement.
When to Use It
- Diagnosing performance issues in production or staging by measuring before optimizing
- Establishing baselines for latency and resource usage to enable before/after comparison
- Identifying bottlenecks across CPU, memory, IO, and locks to guide focused improvements
- Planning for scale with capacity planning and tail-latency projections
- Validating optimizations with measured evidence and re-measuring after changes
Quick Start
- Step 1: Gather context and establish baseline metrics for the analysis target
- Step 2: Profile at the appropriate levels (Application, System, Infrastructure) and collect USE/RED data
- Step 3: Identify bottlenecks, propose measurement-backed optimizations, re-measure, and document results
Best Practices
- Establish baseline metrics before any optimization
- Measure before optimizing and cite measurement evidence for every recommendation
- Profile at the correct level (Application, System, Infrastructure) to locate actual bottlenecks
- Use latency percentiles (p50, p95, p99) rather than averages for evaluation
- Apply Amdahl's Law to prioritize improvements on the biggest contributors first
Example Use Cases
- A web API shows rising p99 latency; profiling reveals CPU-bound bottlenecks; optimize algorithms and caching; re-measure to confirm
- Baseline a new microservice by recording p50/p95/p99 across endpoints to enable future comparisons
- An ORM causes many small DB queries (N+1); profiling points to missing joins; optimize with eager loading and proper indexing; re-measure
- Memory pressure causes GC pauses; profiling shows large allocations; optimize allocations and reuse objects; measure GC metrics after
- External service latency creates IO wait; profiling identifies IO-bound bottlenecks; implement caching, circuit breakers, or parallel requests; validate