
agent-observability

npx machina-cli add skill BagelHole/DevOps-Security-Agent-Skills/agent-observability --openclaw

Agent Observability

Monitor AI agent behavior with logs, traces, metrics, and cost telemetry.

Track Core Signals

  • Request latency (p50/p95/p99)
  • Token usage (prompt/completion/cached)
  • Tool call success and failure rates
  • Cost per task and per customer
  • Hallucination and retry frequency
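A minimal sketch of tracking these signals in process, using only the standard library. The `SignalTracker` class and its field names are illustrative; a real deployment would export these counters to a metrics backend rather than keep them in memory.

```python
import statistics

class SignalTracker:
    """Illustrative in-memory tracker for the core agent signals above."""

    def __init__(self):
        self.latencies_ms = []
        self.tool_calls = {"success": 0, "failure": 0}
        self.tokens = {"prompt": 0, "completion": 0, "cached": 0}

    def record_request(self, latency_ms, prompt_tokens, completion_tokens, cached_tokens=0):
        self.latencies_ms.append(latency_ms)
        self.tokens["prompt"] += prompt_tokens
        self.tokens["completion"] += completion_tokens
        self.tokens["cached"] += cached_tokens

    def record_tool_call(self, ok):
        self.tool_calls["success" if ok else "failure"] += 1

    def latency_percentiles(self):
        # quantiles(n=100) returns 99 cut points; indices 49/94/98 are p50/p95/p99
        q = statistics.quantiles(self.latencies_ms, n=100)
        return {"p50": q[49], "p95": q[94], "p99": q[98]}
```

Cost per task or per customer can be derived from the token counters and a per-model price table.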

Implementation Pattern

  1. Add trace IDs to every user request.
  2. Capture each LLM call and tool call as child spans.
  3. Emit structured logs with model, temperature, and response status.
  4. Create SLOs for success rate and median response time.
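Steps 1–3 can be sketched with the standard library alone. The names `start_trace` and `span` are illustrative, not part of any tracing library; production systems would typically use an instrumentation SDK such as OpenTelemetry instead.

```python
import contextvars
import json
import time
import uuid

# Step 1: one trace ID per user request, carried via a context variable.
trace_id_var = contextvars.ContextVar("trace_id", default=None)

def start_trace():
    trace_id = uuid.uuid4().hex
    trace_id_var.set(trace_id)
    return trace_id

class span:
    """Step 2: wrap each LLM call or tool call in a child span."""

    def __init__(self, name, **attrs):
        self.name, self.attrs = name, attrs

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Step 3: emit one structured JSON log line with trace context and status.
        self.record = {
            "trace_id": trace_id_var.get(),
            "span": self.name,
            "duration_ms": round((time.monotonic() - self.start) * 1000, 2),
            "status": "error" if exc_type else "ok",
            **self.attrs,
        }
        print(json.dumps(self.record))
        return False  # never swallow the exception

# Usage: one trace per request, one span per model or tool call.
start_trace()
with span("llm.call", model="gpt-4o", temperature=0.2):
    pass  # call the model here
```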

Best Practices

  • Redact PII before exporting traces.
  • Keep a replayable request envelope for incident review.
  • Alert on abnormal token spikes and tool error bursts.
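The first practice, redacting PII before export, can be sketched as a regex pass over free-text span fields. The two patterns below cover only emails and US-style phone numbers; a real redaction pass should be extended to your own PII inventory.

```python
import re

# Illustrative patterns only: emails and US-style phone numbers.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<email>"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "<phone>"),
]

def redact(text: str) -> str:
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

def redact_span(span_record: dict) -> dict:
    # Scrub string fields before the trace leaves the service boundary.
    return {k: redact(v) if isinstance(v, str) else v
            for k, v in span_record.items()}
```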

Source

git clone https://github.com/BagelHole/DevOps-Security-Agent-Skills
# skill file: devops/ai/agent-observability/SKILL.md

Overview

Agent observability provides end-to-end visibility into AI agent behavior by tracing requests and tracking token usage, latency, and cost. It enables reliable operation, faster debugging, and informed incident response.

How This Skill Works

Attach a trace ID to every user request and propagate it into model inputs. Capture each LLM call and tool call as a child span, emitting structured logs with model, temperature, and response status. Define SLOs for success rate and median latency to drive reliability work.
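The structured log record described above might look like the sketch below; the field names are illustrative, not a fixed schema. Emitting one JSON object per line keeps the logs machine-parseable by downstream pipelines.

```python
import json

def llm_call_record(trace_id, model, temperature, status, latency_ms):
    """Build one structured log record for a single LLM call (illustrative schema)."""
    return {
        "trace_id": trace_id,
        "model": model,
        "temperature": temperature,
        "status": status,
        "latency_ms": latency_ms,
    }

# One JSON object per line, so log pipelines can parse each call event.
print(json.dumps(llm_call_record("abc123", "gpt-4o-mini", 0.0, "ok", 812)))
```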

When to Use It

  • Diagnose slow or failing AI agent responses
  • Understand token usage and cost per task or per customer
  • Monitor tool call reliability, retries, and failures
  • Detect hallucinations and abnormal latency spikes
  • Perform post-incident reviews with replayable request envelopes

Quick Start

  1. Add trace IDs to every incoming user request
  2. Capture each LLM call and tool interaction as child spans and emit structured logs
  3. Create SLOs for median latency and success rate, and build dashboards
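Step 3 can be sketched as a periodic SLO check over recorded outcomes and latencies. The two targets below (99% success rate, 2 s median latency) are hypothetical values, not recommendations; tune them to your own objectives.

```python
import statistics

# Hypothetical SLO targets -- set these to your own objectives.
SLO_SUCCESS_RATE = 0.99   # fraction of requests that must succeed
SLO_MEDIAN_MS = 2000      # median response time budget, in ms

def slo_report(outcomes, latencies_ms):
    """outcomes: list of bools (request succeeded?); latencies_ms: list of floats."""
    success_rate = sum(outcomes) / len(outcomes)
    median_ms = statistics.median(latencies_ms)
    return {
        "success_rate": success_rate,
        "median_ms": median_ms,
        "success_slo_met": success_rate >= SLO_SUCCESS_RATE,
        "latency_slo_met": median_ms <= SLO_MEDIAN_MS,
    }
```

A dashboard would render this report over rolling windows and alert when either flag goes false.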

Best Practices

  • Redact PII before exporting traces
  • Keep a replayable request envelope for incident review
  • Alert on abnormal token spikes and bursts of tool errors
  • Instrument LLM and tool calls with structured logs and spans
  • Define and monitor SLOs for success rate and median response time
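Alerting on abnormal token spikes can be sketched as a rolling-baseline check: flag any request whose token count far exceeds the recent mean. The window size and spike factor below are illustrative defaults, not tuned values.

```python
from collections import deque

class TokenSpikeDetector:
    """Flag requests whose token usage exceeds `factor` x the rolling mean."""

    def __init__(self, window=100, factor=5.0, min_samples=10):
        self.history = deque(maxlen=window)  # recent per-request token counts
        self.factor = factor
        self.min_samples = min_samples       # don't alert before a baseline exists

    def observe(self, total_tokens):
        spike = (
            len(self.history) >= self.min_samples
            and total_tokens > self.factor * (sum(self.history) / len(self.history))
        )
        self.history.append(total_tokens)
        return spike  # True -> fire an alert
```

The same rolling-baseline pattern applies to bursts of tool errors, with error counts per interval in place of token counts.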

Example Use Cases

  • An AI assistant tracks p95 latency and per-task token costs to optimize pricing and performance
  • Incident review includes a replayable envelope showing request, model config, and outcomes
  • Costs are surfaced per customer, helping teams identify expensive workflows
  • Tool-call success/failure rates are monitored to reduce user-visible failures
  • Redacted traces are exported to a centralized observability platform for audits
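Surfacing cost per customer, as in the third use case above, reduces to aggregating token counts against a price table. The prices below are placeholder numbers, not real model pricing.

```python
from collections import defaultdict

# Placeholder prices in USD per 1K tokens -- real pricing varies by model.
PRICES = {"prompt": 0.0005, "completion": 0.0015}

def aggregate_costs(events):
    """events: iterable of (customer_id, prompt_tokens, completion_tokens)."""
    costs = defaultdict(float)
    for customer, prompt_toks, completion_toks in events:
        costs[customer] += (prompt_toks / 1000) * PRICES["prompt"]
        costs[customer] += (completion_toks / 1000) * PRICES["completion"]
    return dict(costs)
```

Sorting the result by cost makes the most expensive customers or workflows immediately visible.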
