datadog-cli
npx machina-cli add skill softaworks/agent-toolkit/datadog-cli --openclaw
Datadog CLI
A CLI tool for AI agents to debug and triage using Datadog logs and metrics.
Required Reading
You MUST read the relevant reference docs (logs, metrics, query syntax, workflows, dashboards) before using any command.
Setup
Environment Variables (Required)
export DD_API_KEY="your-api-key"
export DD_APP_KEY="your-app-key"
Get keys from: https://app.datadoghq.com/organization-settings/api-keys
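Because every command depends on both keys, a small guard can fail fast when either one is missing. The following is a minimal sketch in POSIX shell; the function name `dd_require_keys` is hypothetical and is not part of the CLI.

```shell
# Hypothetical helper (not part of datadog-cli): fail fast when a key is missing.
dd_require_keys() {
  missing=0
  for name in DD_API_KEY DD_APP_KEY; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when both keys are set, 1 otherwise.
  return "$missing"
}

# Usage: dd_require_keys && npx @leoflores/datadog-cli errors --from 1h --pretty
```

The guard only checks that the variables are non-empty; it does not verify that the keys are valid for your Datadog organization.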
Running the CLI
npx @leoflores/datadog-cli <command>
For non-US Datadog sites, use --site flag:
npx @leoflores/datadog-cli logs search --query "*" --site datadoghq.eu
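To avoid repeating `--site` on every call, the command line can be assembled by a small wrapper. This is a sketch under stated assumptions: `ddcli`, `DD_CLI_SITE`, and `DD_CLI_DRY_RUN` are hypothetical names used by this wrapper only, and nothing in this document says the CLI itself reads a site from the environment.

```shell
# Hypothetical wrapper: appends --site when DD_CLI_SITE is set.
# DD_CLI_SITE and DD_CLI_DRY_RUN are conventions of this wrapper, not of the CLI.
ddcli() {
  if [ -n "${DD_CLI_SITE:-}" ]; then
    set -- "$@" --site "$DD_CLI_SITE"
  fi
  if [ "${DD_CLI_DRY_RUN:-0}" = "1" ]; then
    # Dry run: print the command instead of executing it.
    printf '%s\n' "npx @leoflores/datadog-cli $*"
  else
    npx @leoflores/datadog-cli "$@"
  fi
}
```

With `DD_CLI_SITE=datadoghq.eu` exported, `ddcli logs search --query "*"` runs the same command as the example above. The dry-run output is flattened for display, so it loses the original quoting.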
Commands Overview
| Command | Description |
|---|---|
| logs search | Search logs with filters |
| logs tail | Stream logs in real-time |
| logs trace | Find logs for a distributed trace |
| logs context | Get logs before/after a timestamp |
| logs patterns | Group similar log messages |
| logs compare | Compare log counts between periods |
| logs multi | Run multiple queries in parallel |
| logs agg | Aggregate logs by facet |
| metrics query | Query timeseries metrics |
| errors | Quick error summary by service/type |
| services | List services with log activity |
| dashboards | Manage dashboards (CRUD) |
| dashboard-lists | Manage dashboard lists |
Quick Examples
Search Errors
npx @leoflores/datadog-cli logs search --query "status:error" --from 1h --pretty
Tail Logs (Real-time)
npx @leoflores/datadog-cli logs tail --query "service:api status:error" --pretty
Error Summary
npx @leoflores/datadog-cli errors --from 1h --pretty
Trace Correlation
npx @leoflores/datadog-cli logs trace --id "abc123def456" --pretty
Query Metrics
npx @leoflores/datadog-cli metrics query --query "avg:system.cpu.user{*}" --from 1h --pretty
Compare Periods
npx @leoflores/datadog-cli logs compare --query "status:error" --period 1h --pretty
Global Flags
| Flag | Description |
|---|---|
| --pretty | Human-readable output with colors |
| --output <file> | Export results to JSON file |
| --site <site> | Datadog site (e.g., datadoghq.eu) |
Time Formats
- Relative: 30m, 1h, 6h, 24h, 7d
- ISO 8601: 2024-01-15T10:30:00Z
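When time values come from user input or another script, it can help to check them before invoking the CLI. The sketch below covers only the two formats listed above; `dd_valid_time` is a hypothetical helper, and the CLI may well accept other units or ISO 8601 variants that this check would reject.

```shell
# Hypothetical validator covering only the two formats documented here.
dd_valid_time() {
  # Relative: digits followed by m, h, or d (e.g. 30m, 1h, 7d).
  if printf '%s\n' "$1" | grep -Eq '^[0-9]+[mhd]$'; then
    return 0
  fi
  # ISO 8601 in UTC with second precision (e.g. 2024-01-15T10:30:00Z).
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'
}

# Usage: dd_valid_time "$from" && npx @leoflores/datadog-cli errors --from "$from" --pretty
```

The check is purely syntactic: it does not confirm that a date exists or that the window falls within your log retention period.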
Incident Triage Workflow
# 1. Quick error overview
npx @leoflores/datadog-cli errors --from 1h --pretty
# 2. Is this new? Compare to previous period
npx @leoflores/datadog-cli logs compare --query "status:error" --period 1h --pretty
# 3. Find error patterns
npx @leoflores/datadog-cli logs patterns --query "status:error" --from 1h --pretty
# 4. Narrow down by service
npx @leoflores/datadog-cli logs search --query "status:error service:api" --from 1h --pretty
# 5. Get context around a timestamp
npx @leoflores/datadog-cli logs context --timestamp "2024-01-15T10:30:00Z" --service api --pretty
# 6. Follow the distributed trace
npx @leoflores/datadog-cli logs trace --id "TRACE_ID" --pretty
See workflows.md for more debugging workflows.
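Steps 1 to 3 of the workflow above need no incident-specific input, so they can be bundled into one function. This is a sketch only: `triage_overview`, `_triage_run`, and `TRIAGE_DRY_RUN` are hypothetical names, and it assumes `--period` accepts the same relative values as `--from`, as the `1h` examples suggest.

```shell
# Hypothetical runner: executes the CLI, or prints the command when
# TRIAGE_DRY_RUN=1 (useful for reviewing what would be run).
_triage_run() {
  if [ "${TRIAGE_DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "npx @leoflores/datadog-cli $*"
  else
    npx @leoflores/datadog-cli "$@"
  fi
}

# Hypothetical helper: run the input-free triage steps for one time window.
triage_overview() {
  window="${1:-1h}"
  # 1. Quick error overview
  _triage_run errors --from "$window" --pretty
  # 2. Compare to the previous period
  _triage_run logs compare --query "status:error" --period "$window" --pretty
  # 3. Find error patterns
  _triage_run logs patterns --query "status:error" --from "$window" --pretty
}
```

Calling `triage_overview 6h` widens all three steps to the same window. Steps 4 to 6 stay manual because they depend on the service, timestamp, and trace ID discovered in the earlier output.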
Source
https://github.com/softaworks/agent-toolkit/blob/main/skills/datadog-cli/SKILL.md
Overview
Datadog CLI lets AI agents debug and triage using Datadog logs and metrics. It supports searching logs, querying metrics, finding the logs for a distributed trace, and managing dashboards. Use it to diagnose production issues faster.
How This Skill Works
Run npx @leoflores/datadog-cli <command> after exporting DD_API_KEY and DD_APP_KEY. The CLI exposes log commands (search, tail, trace, context, patterns, compare, multi, agg), metrics query, errors, services, and dashboard management (dashboards, dashboard-lists). For non-US sites, pass --site, and consult the reference docs for query syntax.
When to Use It
- Triage a production incident quickly by querying logs and traces
- Compare error or event counts between time windows to detect anomalies
- Investigate a service by filtering logs and correlating with metrics
- Trace a specific request by using a trace ID to fetch related logs
- Prepare and monitor dashboards during an incident for real-time visibility
Quick Start
- Step 1: Export DD_API_KEY and DD_APP_KEY from your Datadog account
- Step 2: Run a sample command, e.g., npx @leoflores/datadog-cli logs search --query 'status:error' --from 1h --pretty
- Step 3: Optionally add --site for non-US regions or --output to save results
Best Practices
- Read the relevant reference docs (logs, metrics, query syntax, workflows, dashboards) before running commands
- Start with high-signal commands like errors or logs compare to surface issues
- Use --pretty for human-friendly output and --output to export results
- Narrow results with time windows (--from, --period) and filters (service, status)
- Use --site for non-US Datadog sites and export credentials securely
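One way to keep keys out of shell history and committed scripts is to load them from a file that only the current user can read. The sketch below assumes a hypothetical file, `~/.config/datadog-cli/env`, containing `DD_API_KEY=...` and `DD_APP_KEY=...` lines; neither the path nor the `dd_load_keys` helper is part of the CLI, which does not read this file itself.

```shell
# Hypothetical helper: load DD_API_KEY / DD_APP_KEY from a private file.
dd_load_keys() {
  keyfile="${1:-$HOME/.config/datadog-cli/env}"
  if [ ! -r "$keyfile" ]; then
    echo "cannot read $keyfile" >&2
    return 1
  fi
  # Restrict the file to its owner before reading it.
  chmod 600 "$keyfile" || return 1
  # set -a exports every variable assigned while the file is sourced.
  set -a
  . "$keyfile"
  set +a
}

# Usage: dd_load_keys && npx @leoflores/datadog-cli errors --from 1h --pretty
```

Pass an absolute path (or one containing a slash) so that the `.` builtin does not search `PATH` for the file. Because the file is sourced as shell code, it should contain only variable assignments.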
Example Use Cases
- npx @leoflores/datadog-cli logs search --query 'status:error' --from 1h --pretty
- npx @leoflores/datadog-cli logs tail --query 'service:api status:error' --pretty
- npx @leoflores/datadog-cli errors --from 1h --pretty
- npx @leoflores/datadog-cli logs trace --id 'TRACE_ID' --pretty
- npx @leoflores/datadog-cli metrics query --query 'avg:system.cpu.user{*}' --from 1h --pretty