patrol-monitoring
npx machina-cli add skill a5c-ai/babysitter/patrol-monitoring --openclaw
Patrol Monitoring
Overview
Continuous monitoring using Gas Town's Deacon/Witness pattern. The Deacon supervises overall health, the Witness manages per-rig agent lifecycle, and the Boot (Dog) watches the Deacon itself.
When to Use
- During active convoy execution
- When agents may become stuck or unresponsive
- For long-running multi-agent workflows
- When automated recovery is desired
Process
- Health check all active agents and convoys
- Detect stuck or unresponsive agents via heartbeats
- Recover by restarting, reassigning, or escalating as needed
- Report patrol findings with trend analysis
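The detection step above can be sketched as a simple staleness check on heartbeat timestamps. This is a minimal illustration, not the skill's actual implementation: `AgentStatus`, `find_stuck_agents`, and the 120-second timeout are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold: treat an agent as stuck after 120 s of silence.
HEARTBEAT_TIMEOUT_S = 120.0


@dataclass
class AgentStatus:
    agent_id: str
    last_heartbeat: float  # seconds, measured on the same clock as `now`


def find_stuck_agents(agents, now, timeout_s=HEARTBEAT_TIMEOUT_S):
    """Return the IDs of agents whose last heartbeat is older than timeout_s."""
    return [a.agent_id for a in agents if now - a.last_heartbeat > timeout_s]


agents = [AgentStatus("worker-a", 100.0), AgentStatus("worker-b", 10.0)]
stuck = find_stuck_agents(agents, now=150.0)
print(stuck)
```

In a real patrol loop, `now` would typically come from a monotonic clock such as `time.monotonic()` so that wall-clock adjustments do not produce false positives.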
Monitoring Roles
- Deacon: Daemon supervisor, monitors overall health
- Witness: Per-rig lifecycle manager for workers
- Boot (Dog): Watches the Deacon every 5 minutes
Recovery Modes
- restart: Restart the stuck agent session
- reassign: Move beads to a different agent
- escalate: Alert human for manual intervention
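One way to choose among the three modes is a fixed ladder: restart first, reassign if restarts keep failing, escalate when no other agent is available. The source lists the modes but does not specify a selection policy, so the ordering, the `choose_recovery` function, and the `max_restarts` limit below are assumptions made for illustration.

```python
from enum import Enum


class RecoveryMode(Enum):
    RESTART = "restart"    # restart the stuck agent session
    REASSIGN = "reassign"  # move beads to a different agent
    ESCALATE = "escalate"  # alert a human for manual intervention


def choose_recovery(failed_restarts, spare_agent_available, max_restarts=2):
    """Pick a recovery mode using an assumed restart -> reassign -> escalate ladder."""
    if failed_restarts < max_restarts:
        return RecoveryMode.RESTART
    if spare_agent_available:
        return RecoveryMode.REASSIGN
    return RecoveryMode.ESCALATE


print(choose_recovery(failed_restarts=0, spare_agent_available=False).value)
print(choose_recovery(failed_restarts=2, spare_agent_available=True).value)
print(choose_recovery(failed_restarts=2, spare_agent_available=False).value)
```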
Tool Use
Invoke via babysitter process: methodologies/gastown/gastown-patrol
Source
git clone https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/methodologies/gastown/skills/patrol-monitoring/SKILL.md
Overview
Patrol Monitoring uses Gas Town's Deacon/Witness pattern to supervise agent health, detect stuck or unresponsive workers, and automatically recover. The Deacon oversees overall health, the Witness handles per-rig lifecycles, and the Boot (Dog) watches the Deacon to ensure liveness.
How This Skill Works
The Deacon runs as a daemon supervisor to monitor overall health, while the Witness manages per-rig worker lifecycles. The Boot (Dog) checks the Deacon every 5 minutes. When issues are detected, Recovery Modes (restart, reassign, escalate) are applied and patrol findings are reported with trend analysis.
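The Boot (Dog) check described above amounts to a watchdog on the supervisor itself. The sketch below assumes the Deacon records a "last seen" timestamp and that Boot reports a missed check-in once a full 5-minute interval has passed without an update. The function names and returned status strings are hypothetical, and the source does not say what Boot does when the Deacon is unresponsive.

```python
# From the text: Boot checks the Deacon every 5 minutes.
BOOT_CHECK_INTERVAL_S = 5 * 60


def deacon_is_alive(deacon_last_seen, now, grace_s=BOOT_CHECK_INTERVAL_S):
    """True if the Deacon has checked in within the grace period."""
    return now - deacon_last_seen <= grace_s


def boot_check(deacon_last_seen, now):
    """Return an illustrative status string for one Boot patrol of the Deacon."""
    if deacon_is_alive(deacon_last_seen, now):
        return "deacon-ok"
    return "deacon-unresponsive"


print(boot_check(deacon_last_seen=0.0, now=200.0))
print(boot_check(deacon_last_seen=0.0, now=400.0))
```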
When to Use It
- During active convoy execution
- When agents may become stuck or unresponsive
- For long-running multi-agent workflows
- When automated recovery is desired
- For real-time health monitoring of active agents
Quick Start
- Step 1: Invoke the patrol via the babysitter process: methodologies/gastown/gastown-patrol
- Step 2: Run continuous health checks on all active agents and convoys with heartbeat monitoring
- Step 3: If issues are detected, apply Recovery Modes (restart, reassign, escalate) and generate a patrol report with trend analysis
Best Practices
- Define clear health metrics and heartbeat intervals for all agents and convoys.
- Assign a dedicated Witness per rig to manage lifecycle events and recoveries.
- Configure the Boot (Dog) to poll the Deacon at a fixed cadence (e.g., every 5 minutes).
- Predefine Recovery Modes (restart, reassign, escalate) and automate their application when thresholds are crossed.
- Generate patrol reports with trend analysis and use them to drive proactive improvements.
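These practices can be captured in a small, validated configuration object. The following is a sketch only: `PatrolConfig`, its field names, and the default heartbeat values are hypothetical and do not reflect the skill's real configuration schema. Only the 5-minute Boot cadence and the three recovery mode names come from the text.

```python
from dataclasses import dataclass

KNOWN_MODES = {"restart", "reassign", "escalate"}


@dataclass(frozen=True)
class PatrolConfig:
    heartbeat_interval_s: float = 30.0    # hypothetical default
    heartbeat_timeout_s: float = 120.0    # hypothetical default
    boot_poll_interval_s: float = 300.0   # every 5 minutes, as in the text
    recovery_order: tuple = ("restart", "reassign", "escalate")

    def validate(self):
        """Return a list of human-readable problems; an empty list means valid."""
        problems = []
        if self.heartbeat_timeout_s <= self.heartbeat_interval_s:
            problems.append("heartbeat_timeout_s must exceed heartbeat_interval_s")
        if self.boot_poll_interval_s <= 0:
            problems.append("boot_poll_interval_s must be positive")
        unknown = set(self.recovery_order) - KNOWN_MODES
        if unknown:
            problems.append(f"unknown recovery modes: {sorted(unknown)}")
        return problems


print(PatrolConfig().validate())
print(PatrolConfig(heartbeat_timeout_s=10.0).validate())
```

Validating up front keeps a misconfigured timeout (shorter than the heartbeat interval) from flagging every healthy agent as stuck.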
Example Use Cases
- During a data-collection convoy, the Deacon monitors all workers; if a worker misses a heartbeat, the Witness initiates a controlled restart.
- A stuck agent in a long-running workflow is detected via missing heartbeats and automatically restarted to restore progress.
- If an agent cannot complete a task, the system reassigns the beads to a different agent to maintain throughput.
- Repeated failures trigger escalation, alerting humans for manual intervention while automated recovery continues.
- Patrol reports highlight deteriorating health across convoys, enabling preemptive scaling or maintenance before failures occur.
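The trend analysis mentioned in the last use case could be as simple as comparing recent patrols with earlier ones. The sketch below assumes each patrol records a count of stuck agents and compares the mean of the latest window with the mean of the window before it. The `health_trend` function, the window size, and the labels are illustrative assumptions, not the report format the skill produces.

```python
def health_trend(stuck_counts, window=3):
    """Classify the trend in stuck-agent counts across successive patrols.

    Compares the mean of the last `window` patrols with the mean of the
    `window` patrols before them.
    """
    if len(stuck_counts) < 2 * window:
        return "insufficient-data"
    recent = sum(stuck_counts[-window:]) / window
    previous = sum(stuck_counts[-2 * window:-window]) / window
    if recent > previous:
        return "deteriorating"
    if recent < previous:
        return "improving"
    return "stable"


print(health_trend([0, 1, 0, 2, 3, 4]))
```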