agentic-devops
npx machina-cli add skill cacheforge-ai/cacheforge-skills/agentic-devops --openclaw
When to use this skill
Use this skill when the user wants to:
- Run system diagnostics or health checks
- Manage Docker containers (status, logs, health, compose)
- Inspect running processes, ports, or resource hogs
- Analyze log files for errors, patterns, or frequency
- Check HTTP endpoint availability or port status
- Get a quick one-command system overview
Commands
Quick Diagnostics (start here)
# Full system health report — CPU, memory, disk, Docker, ports, errors, top processes
python3 skills/agentic-devops/devops.py diag
Docker Operations
# Container status overview
python3 skills/agentic-devops/devops.py docker status
# Tail container logs with pattern filtering
python3 skills/agentic-devops/devops.py docker logs <container> --tail 100 --grep "error|warn"
# Docker health summary (running, stopped, unhealthy)
python3 skills/agentic-devops/devops.py docker health
# Docker Compose service status
python3 skills/agentic-devops/devops.py docker compose-status --file docker-compose.yml
Process Management
# List processes sorted by resource usage
python3 skills/agentic-devops/devops.py proc list --sort cpu
# Show ports in use
python3 skills/agentic-devops/devops.py proc ports
# Detect zombie processes
python3 skills/agentic-devops/devops.py proc zombies
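To make the zombie check concrete, here is a minimal sketch of how defunct processes could be picked out of `ps` output using only the standard library. It is not taken from devops.py: the helper name `find_zombies` and the three-column `ps` layout are assumptions made for this illustration.

```python
def find_zombies(ps_output):
    """Return (pid, command) pairs whose state starts with 'Z' (hypothetical helper)."""
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.strip().split(None, 2)  # pid, state, command
        if len(parts) == 3 and parts[1].startswith("Z"):
            zombies.append((int(parts[0]), parts[2]))
    return zombies


# Hand-written sample in the shape of a pid/stat/command listing (assumed format).
sample = """  PID STAT COMMAND
    1 Ss   systemd
 4242 Z    defunct-worker
 4300 R+   ps
"""
print(find_zombies(sample))
```

On Linux, for example, the same function could be fed the output of `ps -A -o pid,stat,comm` captured with `subprocess.run`.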
Log Analysis
# Analyze log file for error patterns
python3 skills/agentic-devops/devops.py logs analyze /var/log/syslog --pattern "error|fail|critical"
# Tail log file with highlighted patterns
python3 skills/agentic-devops/devops.py logs tail /var/log/app.log --highlight "ERROR|WARN"
# Frequency analysis of log patterns
python3 skills/agentic-devops/devops.py logs frequency /var/log/app.log --top 20
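As a rough illustration of what frequency analysis over a log file involves, the sketch below counts matching lines with the standard library. It is not the implementation in devops.py: the function name `top_patterns`, its parameters, and the digit-normalization step are assumptions made for this example.

```python
import re
from collections import Counter


def top_patterns(lines, pattern=r"error|fail|critical", top=20):
    """Count the most frequent log messages matching `pattern` (hypothetical helper)."""
    matcher = re.compile(pattern, re.IGNORECASE)
    counts = Counter()
    for line in lines:
        if matcher.search(line):
            # Collapse digit runs so messages that differ only by
            # timestamps or IDs are grouped together.
            counts[re.sub(r"\d+", "N", line.strip())] += 1
    return counts.most_common(top)


# Invented sample lines, for illustration only.
sample = [
    "2024-01-01 12:00:01 ERROR db timeout after 30s",
    "2024-01-01 12:00:05 ERROR db timeout after 31s",
    "2024-01-01 12:00:09 INFO request served",
    "2024-01-01 12:01:00 CRITICAL disk full",
]
for message, count in top_patterns(sample):
    print(count, message)
```

Normalizing digits before counting is one design choice among several; without it, every timestamped line would count as a distinct message and the ranking would be flat.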
Health Checks
# Check HTTP endpoint health
python3 skills/agentic-devops/devops.py health check https://myapp.com/healthz
# Scan specific ports
python3 skills/agentic-devops/devops.py health ports 80,443,8080,5432
# System resource health (CPU, memory, disk)
python3 skills/agentic-devops/devops.py health system
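A port scan of this kind can be approximated with plain TCP connection attempts. The sketch below shows one way to do that with the standard library; the helper names `check_port` and `scan_ports`, the default timeout, and the use of localhost are assumptions, not details of devops.py.

```python
import socket


def check_port(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def scan_ports(host, ports, timeout=1.0):
    """Map each port to 'open' or 'closed'."""
    return {
        port: "open" if check_port(host, port, timeout) else "closed"
        for port in ports
    }


if __name__ == "__main__":
    # Port list taken from the command example; the host is an assumption.
    for port, state in scan_ports("127.0.0.1", [80, 443, 8080, 5432]).items():
        print(f"{port}: {state}")
```

A successful connect only shows that something is listening; it says nothing about whether the service behind the port is healthy, which is why the HTTP check is a separate command.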
Requirements
- Python 3.8+ (stdlib only, no external dependencies)
- Docker CLI (optional — Docker sections degrade gracefully if not installed)
- Standard Unix utilities (ps, ss/netstat)
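The optional Docker dependency suggests a guard along the following lines. This is a sketch of one way to degrade gracefully, assuming the tool shells out to the Docker CLI; the helper name `docker_status` and the choice to return `None` are hypothetical.

```python
import shutil
import subprocess


def docker_status():
    """Return `docker ps` output lines, or None when Docker is unavailable (hypothetical helper)."""
    if shutil.which("docker") is None:
        return None  # Docker CLI not installed: the caller skips the Docker section.
    try:
        result = subprocess.run(
            ["docker", "ps", "--all", "--format", "{{.Names}}\t{{.Status}}"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None  # CLI present, but the daemon is unreachable or the command failed.
    return result.stdout.splitlines()


containers = docker_status()
if containers is None:
    print("Docker: not available, skipping")
else:
    print(f"Docker: {len(containers)} container(s)")
```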
Source
# Skill definition: https://github.com/cacheforge-ai/cacheforge-skills/blob/main/skills/agentic-devops/SKILL.md
git clone https://github.com/cacheforge-ai/cacheforge-skills
Overview
This skill provides a set of commands for system diagnostics, Docker management, process inspection, log analysis, and health monitoring. It is aimed at engineers who operate production systems and need a quick read on system health.
How This Skill Works
It runs on Python 3.8+ using only the standard library. All commands go through a single devops.py CLI that exposes the diag, docker, proc, logs, and health subcommands to collect metrics, inspect containers, and check endpoint health.
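A single-entry CLI with nested subcommands can be laid out with `argparse`. The sketch below covers only a subset of the commands and is an assumption about structure, not the actual parser in devops.py; the `dest` names and defaults are chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser for a subset of the subcommands (assumed layout, for illustration)."""
    parser = argparse.ArgumentParser(prog="devops.py")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("diag", help="full system health report")

    docker = sub.add_parser("docker", help="container operations")
    docker_sub = docker.add_subparsers(dest="action", required=True)
    docker_sub.add_parser("status")
    logs = docker_sub.add_parser("logs")
    logs.add_argument("container")
    logs.add_argument("--tail", type=int, default=100)
    logs.add_argument("--grep")

    health = sub.add_parser("health", help="endpoint, port, and system checks")
    health_sub = health.add_subparsers(dest="action", required=True)
    check = health_sub.add_parser("check")
    check.add_argument("url")

    return parser


args = build_parser().parse_args(
    ["docker", "logs", "web", "--tail", "50", "--grep", "error|warn"]
)
print(args.command, args.action, args.container, args.tail, args.grep)
```

Keeping each group (`docker`, `health`, and so on) behind its own subparser means new actions can be added without touching the others.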
Quick Start
- Step 1: Run a full diagnostic: python3 skills/agentic-devops/devops.py diag
- Step 2: Check Docker status: python3 skills/agentic-devops/devops.py docker status
- Step 3: Run a health check on an endpoint: python3 skills/agentic-devops/devops.py health check https://myapp.com/healthz
Best Practices
- Start with a full diag to capture a baseline before any change
- Use docker health and docker compose-status to check the state of containerized services
- When analyzing logs, use pattern filtering and frequency analysis to prioritize issues
- Use proc commands to identify resource hogs and port usage
- Regularly run health checks on critical endpoints to detect regressions
Example Use Cases
- Run a full system health report to surface CPU, memory, disk, Docker, ports, errors, and top processes
- Check the status of all Docker containers and identify unhealthy or stopped services
- Tail application logs with pattern filtering to surface errors or warnings
- Inspect open ports to verify service exposure and detect conflicts
- Perform an HTTP health check on a critical endpoint to verify availability