guardian
Guardian Skill - QA, Security & Performance Gatekeeper
Install with: npx machina-cli add skill karim-bhalwani/agent-skills-collection/guardian --openclaw
Overview
The Guardian skill ensures the integrity, security, and performance of the codebase. It acts as a mandatory gate before code moves to production.
Focus Areas
1. Quality Assurance (QA)
- Testing Pyramid: Ensure 70-80% Unit Tests, 15-20% Integration, 5-10% E2E.
- Edge Case Hunting: Systematically check empty inputs, boundaries, race conditions, and external failures.
- Deterministic Tests: No flaky tests allowed. Control time, network, and randomness.
2. Security Auditing
- OWASP Top 10: Audit for Access Control, Injection, Cryptographic Failures, etc.
- Threat Modeling: Use STRIDE analysis for security-critical features.
- Supply Chain: Scan dependencies for CVEs using pip-audit or safety.
- Secrets: Zero tolerance for hardcoded credentials.
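As a rough illustration of the zero-tolerance secrets rule, a reviewer can scan source lines for common credential shapes. The patterns below are illustrative only; a real audit would use a dedicated scanner such as detect-secrets or trufflehog:

```python
import re

# Illustrative patterns only; real scanners use entropy analysis
# and far broader rule sets than this sketch.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"(?i)(password|secret|token)\s*=\s*['\"][^'\"]{8,}['\"]"),
]


def find_hardcoded_secrets(source: str) -> list[str]:
    """Return the lines of `source` that match a known secret pattern."""
    hits = []
    for line in source.splitlines():
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append(line.strip())
    return hits
```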
3. Code Review
- SOLID & DRY: Verify adherence to object-oriented principles and logic consolidation.
- Readability: Assess cognitive load and naming clarity.
- Pattern Adherence: Ensure compliance with project-specific development standards.
4. Performance Optimization
- Measure First: No optimization without profiling (cProfile, py-spy, memory_profiler).
- Bottleneck Focus: Target the 20% of code causing 80% of slowdown.
- Latency Targets: Monitor p50/p95/p99 percentiles.
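The measure-first rule can be applied with the standard library's cProfile alone; `slow_concat` below is a hypothetical hot spot used only to demonstrate the workflow:

```python
import cProfile
import io
import pstats


def slow_concat(n: int) -> str:
    """Hypothetical hot spot: quadratic string building."""
    out = ""
    for i in range(n):
        out += str(i)
    return out


def profile_report(func, *args) -> str:
    """Run func under cProfile; return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
    return buf.getvalue()


# Top entries sorted by cumulative time; the hot function shows up by name.
print(profile_report(slow_concat, 10_000))
```

For sampling a live process without instrumenting it, py-spy fills the same role from outside the interpreter.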
Mandatory Report Structure
Every "Guardian" review should produce a report:
- Summary: Overall health assessment and risk level.
- Strengths: Explicit acknowledgement of well-designed patterns.
- Findings: Categorized by Severity (Critical, High, Medium, Low, Nit).
- Remediation: Actionable code examples for every finding.
- Gate Status: Pass/Fail/Needs Work.
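The report structure above can be sketched as a small data model; the field names mirror the list and are an illustration, not a mandated schema:

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    NIT = "Nit"


@dataclass
class Finding:
    title: str
    severity: Severity
    remediation: str  # actionable fix, ideally with a code example


@dataclass
class GuardianReport:
    summary: str
    strengths: list[str]
    findings: list[Finding] = field(default_factory=list)

    @property
    def gate_status(self) -> str:
        # Critical findings fail the gate; High findings need rework.
        severities = {f.severity for f in self.findings}
        if Severity.CRITICAL in severities:
            return "Fail"
        if Severity.HIGH in severities:
            return "Needs Work"
        return "Pass"
```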
Outputs & Deliverables
- Primary Output: Comprehensive review report with findings and remediation steps
- Success Criteria: All critical and high-severity issues resolved
- Quality Gate: Code meets security, performance, and quality standards
Standards & Best Practices
Security Auditing
- OWASP Top 10: Comprehensive coverage of injection, authentication, authorization
- Threat Modeling: STRIDE analysis for security-critical features
- Supply Chain: Dependency vulnerability scanning and updates
Quality Assurance
- Testing Pyramid: 70-80% unit tests, 15-20% integration, 5-10% E2E
- Deterministic Tests: No flaky tests - control time, network, and randomness
- Edge Cases: Systematic testing of empty inputs, boundaries, race conditions
Performance Optimization
- Measure First: Profile before optimizing (cProfile, memory_profiler)
- Bottleneck Focus: Target the 20% of code causing 80% of slowdown
- Latency Targets: Monitor p50/p95/p99 percentiles
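Percentile targets can be computed from raw latency samples with the standard library; the sample data below is synthetic:

```python
import statistics


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from raw latency samples (milliseconds)."""
    # quantiles with n=100 returns the 1st..99th percentile cut points.
    q = statistics.quantiles(samples_ms, n=100)
    return {"p50": q[49], "p95": q[94], "p99": q[98]}


# Synthetic uniform latencies: 1ms..1000ms.
samples = [float(i) for i in range(1, 1001)]
p = latency_percentiles(samples)
```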
When to Use
- Before merging any Pull Request.
- After the implementer finishes a feature.
- When performance issues or security vulnerabilities are suspected.
Constraints
- NO implementation code. Only review and recommendations.
- NO architectural changes. Governance and validation only.
- Critical findings must block merge until resolved.
- High-severity issues should be documented and tracked.
Common Pitfalls
- Shallow Testing: Checking only happy paths misses edge cases. Test boundaries, empty inputs, race conditions, and external failures.
- Ignoring Flaky Tests: Flaky tests erode trust in the test suite. Isolate randomness, control time/network, make tests deterministic.
- Performance Speculation: "This might be slow" without profiling is guessing. Always measure first (cProfile, py-spy, memory_profiler).
- Missing Error Scenarios: Not testing what happens when external services fail. Circuit breakers and fallbacks are mandatory.
- One-Off Security Reviews: Security is ongoing, not a checkbox. Scan dependencies on every run, audit for new CVEs, update regularly.
- Copy-Paste Reviews: Not reading the actual code, just checking lint output. Read the code; understand the logic.
Integration Points
| Phase | Input From | Output To | Context |
|---|---|---|---|
| Development | implementer code | Review report | Code ready for quality/security gate |
| Findings | Implementation details | implementer | Remediation guidance with code examples |
| Security | Dependency lists | Supply chain scan | Identify CVEs and outdated packages |
| Performance | Slow query/execution reports | Profiling analysis | Bottleneck identification and optimization |
| Approval | All findings resolved | Merge gate | Code approved for production |
Source
git clone https://github.com/karim-bhalwani/agent-skills-collection.git
The skill definition lives at skills/guardian/SKILL.md in the cloned repository.
Overview
The Guardian skill ensures the integrity, security, and performance of the codebase and acts as a mandatory gate before production. It consolidates QA rigor, security auditing, and performance profiling into a structured review with actionable remediation, validating implementation quality while guarding against regressions.
How This Skill Works
Guardian enforces the Testing Pyramid, conducts security auditing (OWASP Top 10, STRIDE threat modeling), and performs dependency scanning (pip-audit or safety). It also evaluates code quality through SOLID/DRY checks, readability, and pattern adherence, aided by profiling tools (cProfile, py-spy, memory_profiler) to locate bottlenecks. The output is a comprehensive review report with summary, strengths, findings by severity, remediation steps, and a gate status (Pass/Fail/Needs Work).
When to Use It
- Before merging any Pull Request.
- After the implementer finishes a feature.
- When performance issues or security vulnerabilities are suspected.
- During code reviews to ensure security, quality, and performance before release.
- When preparing for production readiness and governance validation.
Quick Start
- Step 1: Run Guardian on the PR to establish scope, including QA, security, and performance checks.
- Step 2: Execute deterministic unit/integration tests, perform edge-case hunting, and inspect for hardcoded secrets.
- Step 3: Profile performance (cProfile/py-spy) and deliver a comprehensive report with remediation steps and gate status.
Best Practices
- Enforce the Testing Pyramid: 70-80% unit tests, 15-20% integration, 5-10% E2E testing.
- Practice deterministic tests: control time, network, and randomness to avoid flaky failures.
- Edge-case hunting: systematically test empty inputs, boundaries, race conditions, and external failures.
- Security-first reviews: cover OWASP Top 10, apply STRIDE for threat modeling, and scan dependencies for CVEs.
- Measure before you optimize: profile with cProfile/py-spy/memory_profiler, target critical bottlenecks causing 80% of slowdown.
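Edge-case hunting often takes the form of a table-driven test that pins the boundaries down explicitly; `parse_port` here is a hypothetical helper, not part of the skill:

```python
def parse_port(value: str) -> int:
    """Parse a TCP port string, rejecting empty and out-of-range input."""
    if not value or not value.strip().isdigit():
        raise ValueError(f"invalid port: {value!r}")
    port = int(value)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


# Each boundary and failure mode is a row, so gaps are visible at a glance.
EDGE_CASES = [
    ("", ValueError),       # empty input
    ("0", ValueError),      # below lower boundary
    ("1", 1),               # lower boundary (inclusive)
    ("65535", 65535),       # upper boundary (inclusive)
    ("65536", ValueError),  # above upper boundary
    ("-1", ValueError),     # negative input
]


def run_edge_cases():
    for raw, expected in EDGE_CASES:
        if isinstance(expected, type) and issubclass(expected, Exception):
            try:
                parse_port(raw)
            except expected:
                continue
            raise AssertionError(f"{raw!r} should have raised {expected}")
        assert parse_port(raw) == expected
```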
Example Use Cases
- A PR is blocked due to a flaky test identified by Guardian, leading to a deterministic test pass strategy before merge.
- Guardian detects missing access controls via OWASP checks and prompts a security-backed refactor with proper authorization.
- Profiling reveals a bottleneck; optimization is guided by py-spy results and memory_profiler data to reduce latency.
- Dependency CVEs are discovered with safety/pip-audit and updated before production.
- Code reviews highlight SOLID/DRY improvements and readability enhancements, raising overall quality metrics.