guardian
Guardian Skill - QA, Security & Performance Gatekeeper
Install with: npx machina-cli add skill karim-bhalwani/agent-skills-collection/guardian --openclaw
Overview
The Guardian skill ensures the integrity, security, and performance of the codebase. It acts as a mandatory gate before code moves to production.
Focus Areas
1. Quality Assurance (QA)
- Testing Pyramid: Ensure 70-80% Unit Tests, 15-20% Integration, 5-10% E2E.
- Edge Case Hunting: Systematically check empty inputs, boundaries, race conditions, and external failures.
- Deterministic Tests: No flaky tests allowed. Control time, network, and randomness.
2. Security Auditing
- OWASP Top 10: Audit for Access Control, Injection, Cryptographic Failures, etc.
- Threat Modeling: Use STRIDE analysis for security-critical features.
- Supply Chain: Scan dependencies for CVEs using pip-audit or safety.
- Secrets: Zero tolerance for hardcoded credentials.
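As a rough illustration of the zero-tolerance secrets rule, a reviewer can scan source lines for common credential shapes. The patterns below are illustrative only; a real audit would use a dedicated scanner such as detect-secrets or trufflehog:

```python
import re

# Illustrative patterns only; real scanners use entropy analysis
# and far broader rule sets than this sketch.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"(?i)(password|secret|token)\s*=\s*['\"][^'\"]{8,}['\"]"),
]


def find_hardcoded_secrets(source: str) -> list[str]:
    """Return the lines of `source` that match a known secret pattern."""
    hits = []
    for line in source.splitlines():
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append(line.strip())
    return hits
```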
3. Code Review
- SOLID & DRY: Verify adherence to object-oriented principles and logic consolidation.
- Readability: Assess cognitive load and naming clarity.
- Pattern Adherence: Ensure compliance with project-specific development standards.
4. Performance Optimization
- Measure First: No optimization without profiling (cProfile, py-spy, memory_profiler).
- Bottleneck Focus: Target the 20% of code causing 80% of slowdown.
- Latency Targets: Monitor p50/p95/p99 percentiles.
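The measure-first rule can be applied with the standard library's cProfile alone; `slow_concat` below is a hypothetical hot spot used only to demonstrate the workflow:

```python
import cProfile
import io
import pstats


def slow_concat(n: int) -> str:
    """Hypothetical hot spot: quadratic string building."""
    out = ""
    for i in range(n):
        out += str(i)
    return out


def profile_report(func, *args) -> str:
    """Run func under cProfile; return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
    return buf.getvalue()


# Top entries sorted by cumulative time; the hot function shows up by name.
print(profile_report(slow_concat, 10_000))
```

For sampling a live process without instrumenting it, py-spy fills the same role from outside the interpreter.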
Mandatory Report Structure
Every "Guardian" review should produce a report:
- Summary: Overall health assessment and risk level.
- Strengths: Explicit acknowledgement of well-designed patterns.
- Findings: Categorized by Severity (Critical, High, Medium, Low, Nit).
- Remediation: Actionable code examples for every finding.
- Gate Status: Pass/Fail/Needs Work.
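The report structure above can be sketched as a small data model; the field names mirror the list and are an illustration, not a mandated schema:

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    NIT = "Nit"


@dataclass
class Finding:
    title: str
    severity: Severity
    remediation: str  # actionable fix, ideally with a code example


@dataclass
class GuardianReport:
    summary: str
    strengths: list[str]
    findings: list[Finding] = field(default_factory=list)

    @property
    def gate_status(self) -> str:
        # Critical findings fail the gate; High findings need rework.
        severities = {f.severity for f in self.findings}
        if Severity.CRITICAL in severities:
            return "Fail"
        if Severity.HIGH in severities:
            return "Needs Work"
        return "Pass"
```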
Outputs & Deliverables
- Primary Output: Comprehensive review report with findings and remediation steps
- Success Criteria: All critical and high-severity issues resolved
- Quality Gate: Code meets security, performance, and quality standards
Standards & Best Practices
Security Auditing
- OWASP Top 10: Comprehensive coverage of injection, authentication, authorization
- Threat Modeling: STRIDE analysis for security-critical features
- Supply Chain: Dependency vulnerability scanning and updates
Quality Assurance
- Testing Pyramid: 70-80% unit tests, 15-20% integration, 5-10% E2E
- Deterministic Tests: No flaky tests - control time, network, and randomness
- Edge Cases: Systematic testing of empty inputs, boundaries, race conditions
Performance Optimization
- Measure First: Profile before optimizing (cProfile, memory_profiler)
- Bottleneck Focus: Target the 20% of code causing 80% of slowdown
- Latency Targets: Monitor p50/p95/p99 percentiles
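Percentile targets can be computed from raw latency samples with the standard library; the sample data below is synthetic:

```python
import statistics


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from raw latency samples (milliseconds)."""
    # quantiles with n=100 returns the 1st..99th percentile cut points.
    q = statistics.quantiles(samples_ms, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}


# Synthetic uniform latencies: 1ms..1000ms.
samples = [float(i) for i in range(1, 1001)]
p = latency_percentiles(samples)
```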
When to Use
- Before merging any Pull Request.
- After the implementer finishes a feature.
- When performance issues or security vulnerabilities are suspected.
Constraints
- NO implementation code. Only review and recommendations.
- NO architectural changes. Governance and validation only.
- Critical findings must block merge until resolved.
- High-severity issues should be documented and tracked.
Common Pitfalls
- Shallow Testing: Checking only happy paths misses edge cases. Test boundaries, empty inputs, race conditions, and external failures.
- Ignoring Flaky Tests: Flaky tests erode trust in the test suite. Isolate randomness, control time/network, make tests deterministic.
- Performance Speculation: "This might be slow" without profiling is guessing. Always measure first (cProfile, py-spy, memory_profiler).
- Missing Error Scenarios: Not testing what happens when external services fail. Circuit breakers and fallbacks are mandatory.
- One-Off Security Reviews: Security is ongoing, not a checkbox. Scan dependencies on every run, audit for new CVEs, update regularly.
- Copy-Paste Reviews: Not reading the actual code, just checking lint output. Read the code; understand the logic.
Integration Points
| Phase | Input From | Output To | Context |
|---|---|---|---|
| Development | implementer code | Review report | Code ready for quality/security gate |
| Findings | Implementation details | implementer | Remediation guidance with code examples |
| Security | Dependency lists | Supply chain scan | Identify CVEs and outdated packages |
| Performance | Slow query/execution reports | Profiling analysis | Bottleneck identification and optimization |
| Approval | All findings resolved | Merge gate | Code approved for production |
Source
git clone https://github.com/karim-bhalwani/agent-skills-collection.git
The skill definition lives at skills/guardian/SKILL.md in the cloned repository.
Overview
The Guardian skill ensures the integrity, security, and performance of the codebase and acts as a mandatory gate before production. It consolidates QA rigor, security auditing, and performance profiling into a structured review with actionable remediation, validating implementation quality while guarding against regressions.
How This Skill Works
Guardian enforces the Testing Pyramid, conducts security auditing (OWASP Top 10, STRIDE threat modeling), and performs dependency scanning (pip-audit or safety). It also evaluates code quality through SOLID/DRY checks, readability, and pattern adherence, aided by profiling tools (cProfile, py-spy, memory_profiler) to locate bottlenecks. The output is a comprehensive review report with summary, strengths, findings by severity, remediation steps, and a gate status (Pass/Fail/Needs Work).
When to Use It
- Before merging any Pull Request.
- After the implementer finishes a feature.
- When performance issues or security vulnerabilities are suspected.
- During code reviews to ensure security, quality, and performance before release.
- When preparing for production readiness and governance validation.
Quick Start
- Step 1: Run Guardian on the PR to establish scope, including QA, security, and performance checks.
- Step 2: Execute deterministic unit/integration tests, perform edge-case hunting, and inspect for hardcoded secrets.
- Step 3: Profile performance (cProfile/py-spy) and deliver a comprehensive report with remediation steps and gate status.
Best Practices
- Enforce the Testing Pyramid: 70-80% unit tests, 15-20% integration, 5-10% E2E testing.
- Practice deterministic tests: control time, network, and randomness to avoid flaky failures.
- Edge-case hunting: systematically test empty inputs, boundaries, race conditions, and external failures.
- Security-first reviews: cover OWASP Top 10, apply STRIDE for threat modeling, and scan dependencies for CVEs.
- Measure before you optimize: profile with cProfile/py-spy/memory_profiler, target critical bottlenecks causing 80% of slowdown.
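Edge-case hunting often takes the form of a table-driven test that pins the boundaries down explicitly; `parse_port` here is a hypothetical helper, not part of the skill:

```python
def parse_port(value: str) -> int:
    """Parse a TCP port string, rejecting empty and out-of-range input."""
    if not value or not value.strip().isdigit():
        raise ValueError(f"invalid port: {value!r}")
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# Each boundary and failure mode is a row, so gaps are visible at a glance.
EDGE_CASES = [
    ("", ValueError),       # empty input
    ("0", ValueError),      # below lower boundary
    ("1", 1),               # lower boundary (inclusive)
    ("65535", 65535),       # upper boundary (inclusive)
    ("65536", ValueError),  # above upper boundary
    ("-1", ValueError),     # negative input
]


def run_edge_cases():
    for raw, expected in EDGE_CASES:
        if isinstance(expected, type) and issubclass(expected, Exception):
            try:
                parse_port(raw)
            except expected:
                continue
            raise AssertionError(f"{raw!r} should have raised {expected}")
        assert parse_port(raw) == expected
```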
Example Use Cases
- A PR is blocked due to a flaky test identified by Guardian, leading to a deterministic test pass strategy before merge.
- Guardian detects missing access controls via OWASP checks and prompts a security-backed refactor with proper authorization.
- Profiling reveals a bottleneck; optimization is guided by py-spy results and memory_profiler data to reduce latency.
- Dependency CVEs are discovered with safety/pip-audit and updated before production.
- Code reviews highlight SOLID/DRY improvements and readability enhancements, raising overall quality metrics.