A framework to audit, benchmark, and validate Claude Code hooks before production deployment.

When should I use hooks-eval?

Use it before deploying hooks to production, to audit existing hooks for security vulnerabilities, performance problems, and compliance gaps.

How does hooks-eval relate to other skills?

It complements hook-scope-guide for scope decisions and hook-authoring or validate-plugin for setup and structure checks.

npx machina-cli add skill athola/claude-night-market/hooks-eval --openclaw

Hooks Evaluation Framework

Overview

This skill provides a detailed framework for evaluating, auditing, and implementing Claude Code hooks across all scopes (plugin, project, global). It covers both JSON-based hooks and programmatic (Python SDK) hooks.

Key Capabilities

Security Analysis: Vulnerability scanning, dangerous pattern detection, injection prevention
Performance Analysis: Execution time benchmarking, resource usage, optimization
Compliance Checking: Structure validation, documentation requirements, best practices
SDK Integration: Python SDK hook types, callbacks, matchers, and patterns

Core Components

Component	Purpose
Hook Types Reference	Complete SDK hook event types and signatures
Evaluation Criteria	Scoring system and quality gates
Security Patterns	Common vulnerabilities and mitigations
Performance Benchmarks	Thresholds and optimization guidance

Quick Reference

Hook Event Types

from typing import Literal

HookEvent = Literal[
    "PreToolUse",       # Before tool execution
    "PostToolUse",      # After tool execution
    "UserPromptSubmit", # When user submits prompt
    "Stop",             # When stopping execution
    "SubagentStop",     # When a subagent stops
    "TeammateIdle",     # When teammate agent becomes idle (2.1.33+)
    "TaskCompleted",    # When a task finishes execution (2.1.33+)
    "PreCompact"        # Before message compaction
]

Note: Python SDK does not support SessionStart, SessionEnd, or Notification hooks due to setup limitations. However, plugins can define SessionStart hooks via hooks.json using shell commands (e.g., leyline's detect-git-platform.sh).

Plugin-Level hooks.json

Plugins can declare hooks via "hooks": "./hooks/hooks.json" in plugin.json. The evaluator validates:

Referenced hooks.json exists and is valid JSON
Shell commands referenced in hooks exist and are executable
Hook matchers use valid event types
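
The three checks above can be sketched in Python. This is a minimal illustration, not the evaluator's actual implementation: the function name `validate_hooks_json`, the assumed file layout (`{"hooks": {"<Event>": [{"matcher": ..., "hooks": [{"command": ...}]}]}}`), and the set of accepted event names are all assumptions made for the example.

```python
import json
import os
import shlex
import shutil
from pathlib import Path

# Assumed set of valid names: the SDK events listed above plus the
# shell-only events mentioned in the note (SessionStart, SessionEnd, Notification).
VALID_EVENTS = {
    "PreToolUse", "PostToolUse", "UserPromptSubmit", "Stop", "SubagentStop",
    "TeammateIdle", "TaskCompleted", "PreCompact",
    "SessionStart", "SessionEnd", "Notification",
}


def validate_hooks_json(path: str) -> list[str]:
    """Return a list of problems found; an empty list means no findings."""
    hooks_file = Path(path)
    if not hooks_file.is_file():
        return [f"{path}: file does not exist"]
    try:
        config = json.loads(hooks_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path}: invalid JSON ({exc.msg} at line {exc.lineno})"]
    if not isinstance(config, dict):
        return [f"{path}: top-level value must be an object"]

    problems = []
    for event, entries in config.get("hooks", {}).items():
        if event not in VALID_EVENTS:
            problems.append(f"unknown event type: {event}")
        for entry in entries:
            for hook in entry.get("hooks", []):
                parts = shlex.split(hook.get("command", ""))
                if not parts:
                    problems.append(f"{event}: empty command")
                    continue
                program = parts[0]
                # Look on PATH first, then relative to the hooks.json directory.
                candidate = hooks_file.parent / program
                found = shutil.which(program) or (
                    candidate.is_file() and os.access(candidate, os.X_OK)
                )
                if not found:
                    problems.append(
                        f"{event}: command not found or not executable: {program}"
                    )
    return problems
```

Calling `validate_hooks_json("hooks/hooks.json")` returns the findings as plain strings, so they can be printed or turned into a report. The sketch does not expand environment variables inside commands, which a fuller check would need to handle.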

Hook Callback Signature

async def my_hook(
    input_data: dict[str, Any],    # Hook-specific input
    tool_use_id: str | None,       # Tool ID (for tool hooks)
    context: HookContext           # Additional context
) -> dict[str, Any]:               # Return decision/messages
    ...

Return Values

return {
    "decision": "block",           # Optional: block the action
    "systemMessage": "...",        # Optional: add to transcript
    "hookSpecificOutput": {...}    # Optional: hook-specific data
}

Quality Scoring (100 points)

Category	Points	Focus
Security	30	Vulnerabilities, injection, validation
Performance	25	Execution time, memory, I/O
Compliance	20	Structure, documentation, error handling
Reliability	15	Timeouts, idempotency, degradation
Maintainability	10	Code structure, modularity
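
As a worked example of the table, the sketch below combines per-category results into a single score out of 100. The weights come from the table; the function name `total_score` and the convention of passing each category as a fraction between 0.0 and 1.0 are assumptions, since the detailed rubric lives in modules/evaluation-criteria.md.

```python
# Category weights taken from the scoring table above.
WEIGHTS = {
    "security": 30,
    "performance": 25,
    "compliance": 20,
    "reliability": 15,
    "maintainability": 10,
}


def total_score(fractions: dict[str, float]) -> float:
    """Combine per-category results (each 0.0 to 1.0) into points out of 100."""
    missing = WEIGHTS.keys() - fractions.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    total = 0.0
    for category, weight in WEIGHTS.items():
        fraction = fractions[category]
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"{category}: fraction must be between 0.0 and 1.0")
        total += weight * fraction
    return total
```

For instance, a hook that earns half of the security points and full marks elsewhere scores 15 + 25 + 20 + 15 + 10 = 85.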

Detailed Resources

SDK Hook Types: See modules/sdk-hook-types.md for complete Python SDK type definitions, patterns, and examples
Evaluation Criteria: See modules/evaluation-criteria.md for detailed scoring rubric and quality gates
Security Patterns: See modules/sdk-hook-types.md for vulnerability detection and mitigation
Performance Guide: See modules/evaluation-criteria.md for benchmarking and optimization

Basic Evaluation Workflow

# 1. Run detailed evaluation
/hooks-eval --detailed

# 2. Focus on security issues
/hooks-eval --security-only --format sarif

# 3. Benchmark performance
/hooks-eval --performance-baseline

# 4. Check compliance
/hooks-eval --compliance-report

Verification: Run the command with --help flag to verify availability.

Integration with Other Tools

# Complete plugin evaluation pipeline
/hooks-eval --detailed          # Evaluate all hooks
/analyze-hook hooks/specific.py      # Deep-dive on one hook
/validate-plugin .                   # Validate overall structure

Verification: Run the command with --help flag to verify availability.

Related Skills

abstract:hook-scope-guide - Decide where to place hooks (plugin/project/global)
abstract:hook-authoring - Write hook rules and patterns
abstract:validate-plugin - Validate complete plugin structure

Troubleshooting

Common Issues

Hook not firing: Verify that the hook pattern matches the event, and check the hook logs for errors.

Syntax errors: Validate JSON/Python syntax before deployment.

Permission denied: Check hook file permissions and ownership.
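
The syntax and permission checks can be run ahead of deployment with a few lines of standard-library Python. This is a hedged sketch: the function name `precheck_hook_file` is hypothetical, and treating every file that is not `.json` or `.py` as a shell hook that must be executable is an assumption.

```python
import ast
import json
import os
from pathlib import Path


def precheck_hook_file(path: str) -> list[str]:
    """Report syntax and permission problems for a single hook file."""
    hook_file = Path(path)
    if not hook_file.is_file():
        return [f"{path}: not found"]
    if not os.access(hook_file, os.R_OK):
        return [f"{path}: not readable"]

    problems = []
    text = hook_file.read_text(encoding="utf-8")
    if hook_file.suffix == ".json":
        try:
            json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append(f"{path}: JSON error at line {exc.lineno}: {exc.msg}")
    elif hook_file.suffix == ".py":
        try:
            ast.parse(text, filename=path)
        except SyntaxError as exc:
            problems.append(f"{path}: Python syntax error at line {exc.lineno}")
    elif not os.access(hook_file, os.X_OK):
        # Assumption: any other file is a shell command hook and must be executable.
        problems.append(f"{path}: not executable")
    return problems
```

Running this over every file referenced by a plugin catches the "Syntax errors" and "Permission denied" cases before a hook is ever triggered.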

Source

https://github.com/athola/claude-night-market/blob/master/plugins/abstract/skills/hooks-eval/SKILL.md

Overview

This skill provides a structured framework to audit, implement, and secure Claude Code hooks across plugin, project, and global scopes. It covers security analysis, performance benchmarking, and compliance validation, with Python SDK integration for hook types, callbacks, and matchers. Use it before deploying hooks to production to catch vulnerabilities and optimize behavior.

How This Skill Works

It combines a Hook Types Reference, Evaluation Criteria, Security Patterns, and Performance Benchmarks to guide auditing and implementation. Evaluation uses Python SDK hooks, callbacks, and matchers to analyze input, output, and resource usage, and produces concrete results such as security scans and performance analyses. It integrates with hook-scope-guide for scope-specific checks and enforces compliance standards.

When to Use It

Auditing existing hooks for security vulnerabilities before production deployment
Benchmarking hook execution time and resource usage against targets
Implementing or validating hooks using the Python SDK (hook types, callbacks, matchers)
Validating hooks for compliance, structure, and documentation requirements
Cross-checking with hook-scope-guide and related tooling during integration
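
The first use case, auditing for security vulnerabilities, often starts with a simple pattern scan over hook source code. The sketch below is illustrative only: the skill's actual rule set is not shown in this document, so the patterns and the name `scan_source` are assumptions. A line-based regex scan like this can produce false positives and is no substitute for review.

```python
import re

# Illustrative patterns only; not the skill's real rule set.
DANGEROUS_PATTERNS = {
    "shell=True in subprocess call": re.compile(r"shell\s*=\s*True"),
    "os.system call": re.compile(r"\bos\.system\s*\("),
    "eval or exec call": re.compile(r"\b(?:eval|exec)\s*\("),
}


def scan_source(source: str) -> list[tuple[int, str]]:
    """Return (line_number, finding) pairs for each suspicious line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in DANGEROUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings
```

Each finding carries a line number, so results can be mapped back to the hook file when writing up an audit.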

Quick Start

Step 1: Identify the hooks to audit (plugin, project, or global scope) and prepare inputs
Step 2: Run the evaluation workflow using Python SDK hooks and matchers to collect data
Step 3: Review results, remediate issues, and document findings before production

Best Practices

Run security scans and pattern checks on all hook events
Verify hook callback signatures and consistent return structures
Benchmark performance across representative workloads and scenarios
Validate documentation, structure, and compliance requirements for each hook
Use the Python SDK hooks with defined matchers and avoid ad hoc implementations
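
Verifying callback signatures, the second practice above, can be automated with the standard `inspect` module. This is a minimal sketch under the assumption that a valid hook is an `async def` taking exactly three parameters, as in the quick-reference signature; the helper name `check_hook_signature` is hypothetical.

```python
import inspect
from typing import Any, Callable

# Three parameters per the quick reference: input_data, tool_use_id, context.
EXPECTED_PARAM_COUNT = 3


def check_hook_signature(hook: Callable[..., Any]) -> list[str]:
    """Return problems with a hook callback's shape; empty means it looks fine."""
    problems = []
    if not inspect.iscoroutinefunction(hook):
        problems.append("hook must be defined with 'async def'")
    param_count = len(inspect.signature(hook).parameters)
    if param_count != EXPECTED_PARAM_COUNT:
        problems.append(
            f"expected {EXPECTED_PARAM_COUNT} parameters, got {param_count}"
        )
    return problems
```

The check looks only at the callable's shape, not at what it returns, so it pairs naturally with a test that calls the hook and inspects the returned dictionary.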

Example Use Cases

Auditing PreToolUse and PostToolUse hooks across a plugin to detect unsafe patterns
Benchmarking a hook that calls an external service to ensure latency stays under thresholds
Implementing a new hook with the Python SDK and verifying its signature
Validating a plugin's hooks.json, ensuring referenced commands exist and are executable
Applying security patterns to detect injection risks in hook events
