datadog-automation
npx machina-cli add skill davepoon/buildwithclaude/datadog-automation --openclaw
Datadog Automation via Rube MCP
Automate Datadog monitoring and observability operations through Composio's Datadog toolkit via Rube MCP.
Toolkit docs: composio.dev/toolkits/datadog
Prerequisites
- Rube MCP must be connected (RUBE_SEARCH_TOOLS available)
- Active Datadog connection via `RUBE_MANAGE_CONNECTIONS` with toolkit `datadog`
- Always call `RUBE_SEARCH_TOOLS` first to get current tool schemas
Setup
Get Rube MCP: Add https://rube.app/mcp as an MCP server in your client configuration. No API keys needed — just add the endpoint and it works.
- Verify Rube MCP is available by confirming `RUBE_SEARCH_TOOLS` responds
- Call `RUBE_MANAGE_CONNECTIONS` with toolkit `datadog`
- If connection is not ACTIVE, follow the returned auth link to complete Datadog authentication
- Confirm connection status shows ACTIVE before running any workflows
Core Workflows
1. Query and Explore Metrics
When to use: User wants to query metric data or list available metrics
Tool sequence:
- `DATADOG_LIST_METRICS` - List available metric names [Optional]
- `DATADOG_QUERY_METRICS` - Query metric time series data [Required]
Key parameters:
- `query`: Datadog metric query string (e.g., `avg:system.cpu.user{host:web01}`)
- `from`: Start timestamp (Unix epoch seconds)
- `to`: End timestamp (Unix epoch seconds)
- `q`: Search string for listing metrics
Pitfalls:
- Query syntax follows Datadog's metric query format: `aggregation:metric_name{tag_filters}`
- `from` and `to` are Unix epoch timestamps in seconds, not milliseconds
- Valid aggregations: `avg`, `sum`, `min`, `max`, `count`
- Tag filters use curly braces: `{host:web01,env:prod}`
- Time range should not exceed Datadog's retention limits for the metric type
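As a rough illustration of the query format and the seconds-based time window described above, the arguments for `DATADOG_QUERY_METRICS` could be assembled as follows. The helper names `build_metric_query` and `last_hours` are hypothetical, and the exact argument schema should still be confirmed with `RUBE_SEARCH_TOOLS`.

```python
import time
from typing import Dict, Optional, Tuple

VALID_AGGREGATIONS = {"avg", "sum", "min", "max", "count"}


def build_metric_query(aggregation: str, metric: str, tags: Dict[str, str]) -> str:
    """Compose aggregation:metric_name{tag_filters} (hypothetical helper)."""
    if aggregation not in VALID_AGGREGATIONS:
        raise ValueError(f"unsupported aggregation: {aggregation}")
    # Comma-separated key:value pairs inside curly braces; '*' selects everything.
    scope = ",".join(f"{key}:{value}" for key, value in tags.items()) or "*"
    return f"{aggregation}:{metric}{{{scope}}}"


def last_hours(hours: int, now: Optional[float] = None) -> Tuple[int, int]:
    """Return (from, to) as Unix epoch seconds, not milliseconds."""
    end = int(now if now is not None else time.time())
    return end - hours * 3600, end


# A fixed 'now' keeps the example reproducible; omit it to use the current time.
start, end = last_hours(4, now=1_700_000_000)
arguments = {
    "query": build_metric_query("avg", "system.cpu.user", {"host": "web01"}),
    "from": start,
    "to": end,
}
print(arguments)
```

Validating the aggregation locally catches typos before a tool call is spent on a query Datadog would reject.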
2. Search and Analyze Logs
When to use: User wants to search log entries or list log indexes
Tool sequence:
- `DATADOG_LIST_LOG_INDEXES` - List available log indexes [Optional]
- `DATADOG_SEARCH_LOGS` - Search logs with query and filters [Required]
Key parameters:
- `query`: Log search query using Datadog log query syntax
- `from`: Start time (ISO 8601 or Unix timestamp)
- `to`: End time (ISO 8601 or Unix timestamp)
- `sort`: Sort order ('asc' or 'desc')
- `limit`: Number of log entries to return
Pitfalls:
- Log queries use Datadog's log search syntax: `service:web status:error`
- Search is limited to retained logs within the configured retention period
- Large result sets require pagination; check for cursor/page tokens
- Log indexes control routing and retention; filter by index if known
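A minimal sketch of how `DATADOG_SEARCH_LOGS` arguments might be put together, assuming the tool accepts ISO 8601 strings for `from` and `to` as listed above. `build_log_search` is a hypothetical helper, not part of the toolkit.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict


def build_log_search(filters: Dict[str, str], window: timedelta,
                     now: datetime, limit: int = 100) -> dict:
    """Build a log search argument dict (hypothetical helper)."""
    # Space-separated field:value pairs, e.g. "service:web status:error".
    query = " ".join(f"{field}:{value}" for field, value in filters.items())
    return {
        "query": query,
        "from": (now - window).isoformat(),  # ISO 8601 with UTC offset
        "to": now.isoformat(),
        "sort": "desc",                      # newest entries first
        "limit": limit,
    }


# A fixed timestamp keeps the example reproducible.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
log_args = build_log_search({"service": "web", "status": "error"},
                            timedelta(hours=24), now, limit=50)
print(log_args)
```

Using timezone-aware datetimes avoids ambiguity about which zone the search window refers to.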
3. Manage Monitors
When to use: User wants to create, update, mute, or inspect monitors
Tool sequence:
- `DATADOG_LIST_MONITORS` - List all monitors with filters [Required]
- `DATADOG_GET_MONITOR` - Get specific monitor details [Optional]
- `DATADOG_CREATE_MONITOR` - Create a new monitor [Optional]
- `DATADOG_UPDATE_MONITOR` - Update monitor configuration [Optional]
- `DATADOG_MUTE_MONITOR` - Silence a monitor temporarily [Optional]
- `DATADOG_UNMUTE_MONITOR` - Re-enable a muted monitor [Optional]
Key parameters:
- `monitor_id`: Numeric monitor ID
- `name`: Monitor display name
- `type`: Monitor type ('metric alert', 'service check', 'log alert', 'query alert', etc.)
- `query`: Monitor query defining the alert condition
- `message`: Notification message with @mentions
- `tags`: Array of tag strings
- `thresholds`: Alert threshold values (`critical`, `warning`, `ok`)
Pitfalls:
- Monitor `type` must match the query type; mismatches cause creation failures
- `message` supports @mentions for notifications (e.g., `@slack-channel`, `@pagerduty`)
- Thresholds vary by monitor type; metric monitors need `critical` at minimum
- Muting a monitor suppresses notifications but the monitor still evaluates
- Monitor IDs are numeric integers
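The pitfalls above can be guarded against before calling `DATADOG_CREATE_MONITOR`. The sketch below builds a metric alert payload with a mandatory `critical` threshold. `build_metric_monitor` is a hypothetical helper, and placing `thresholds` at the top level of the arguments is an assumption; the schema returned by `RUBE_SEARCH_TOOLS` is authoritative.

```python
from typing import List, Optional


def build_metric_monitor(name: str, metric_query: str, window: str,
                         critical: float, message: str,
                         tags: Optional[List[str]] = None,
                         warning: Optional[float] = None) -> dict:
    """Build arguments for a 'metric alert' monitor (hypothetical helper)."""
    if warning is not None and warning >= critical:
        raise ValueError("warning must be below critical for a '>' comparison")
    thresholds = {"critical": critical}  # metric monitors need critical at minimum
    if warning is not None:
        thresholds["warning"] = warning
    return {
        "name": name,
        "type": "metric alert",  # must match the query type
        # Shape: avg(<window>):<metric query> > <critical>
        "query": f"avg({window}):{metric_query} > {critical}",
        "message": message,
        "tags": list(tags or []),
        "thresholds": thresholds,  # assumed location; check the tool schema
    }


monitor_args = build_metric_monitor(
    name="High CPU on prod",
    metric_query="avg:system.cpu.user{env:prod}",
    window="last_5m",
    critical=90,
    message="CPU above 90% for 5 minutes @slack-channel",
    tags=["env:prod"],
    warning=80,
)
print(monitor_args["query"])
```

Deriving the query's comparison value from `critical` keeps the two from drifting apart when the threshold is changed later.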
4. Manage Dashboards
When to use: User wants to list, view, update, or delete dashboards
Tool sequence:
- `DATADOG_LIST_DASHBOARDS` - List all dashboards [Required]
- `DATADOG_GET_DASHBOARD` - Get full dashboard definition [Optional]
- `DATADOG_UPDATE_DASHBOARD` - Update dashboard layout or widgets [Optional]
- `DATADOG_DELETE_DASHBOARD` - Remove a dashboard (irreversible) [Optional]
Key parameters:
- `dashboard_id`: Dashboard identifier string
- `title`: Dashboard title
- `layout_type`: 'ordered' (grid) or 'free' (freeform positioning)
- `widgets`: Array of widget definition objects
- `description`: Dashboard description
Pitfalls:
- Dashboard IDs are alphanumeric strings (e.g., 'abc-def-ghi'), not numeric
- `layout_type` cannot be changed after creation; must recreate the dashboard
- Widget definitions are complex nested objects; get existing dashboard first to understand structure
- DELETE is permanent; there is no undo
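One way to follow the "get first, then update" advice is to copy the definition returned by `DATADOG_GET_DASHBOARD`, change only what is needed, and send the result to `DATADOG_UPDATE_DASHBOARD`. In this sketch the `existing` dictionary stands in for a real response, and its field names (including `id`) and the helper `prepare_dashboard_update` are illustrative assumptions.

```python
import copy


def prepare_dashboard_update(existing: dict, new_title: str, extra_widget: dict) -> dict:
    """Derive update arguments from an existing definition (hypothetical helper)."""
    updated = copy.deepcopy(existing)  # never mutate the fetched definition
    return {
        "dashboard_id": updated["id"],          # alphanumeric string, not an integer
        "title": new_title,
        "layout_type": updated["layout_type"],  # carried over; it cannot be changed
        "widgets": updated.get("widgets", []) + [extra_widget],
    }


# Stand-in for a DATADOG_GET_DASHBOARD response (shape is assumed).
existing = {
    "id": "abc-def-ghi",
    "title": "Web tier",
    "layout_type": "ordered",
    "widgets": [{"definition": {"type": "note", "content": "Runbook"}}],
}
update_args = prepare_dashboard_update(
    existing,
    new_title="Web tier (prod)",
    extra_widget={"definition": {"type": "note", "content": "On-call notes"}},
)
print(update_args["title"], len(update_args["widgets"]))
```

Sending the full widget list back, rather than only the new widget, is the cautious choice when it is unclear whether the update replaces or merges widgets.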
5. Create Events and Manage Downtimes
When to use: User wants to post events or schedule maintenance downtimes
Tool sequence:
- `DATADOG_LIST_EVENTS` - List existing events [Optional]
- `DATADOG_CREATE_EVENT` - Post a new event [Required]
- `DATADOG_CREATE_DOWNTIME` - Schedule a maintenance downtime [Optional]
Key parameters for events:
- `title`: Event title
- `text`: Event body text (supports markdown)
- `alert_type`: Event severity ('error', 'warning', 'info', 'success')
- `tags`: Array of tag strings
Key parameters for downtimes:
- `scope`: Tag scope for the downtime (e.g., `host:web01`)
- `start`: Start time (Unix epoch)
- `end`: End time (Unix epoch; omit for indefinite)
- `message`: Downtime description
- `monitor_id`: Specific monitor to downtime (optional, omit for scope-based)
Pitfalls:
- Event `text` supports Datadog's markdown format including @mentions
- Downtimes scope uses tag syntax: `host:web01,env:staging`
- Omitting `end` creates an indefinite downtime; always set an end time for maintenance
- Downtime `monitor_id` narrows to a single monitor; scope applies to all matching monitors
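To avoid an accidental indefinite downtime, the arguments for `DATADOG_CREATE_DOWNTIME` can be built so that `end` is always present. This is a sketch under the parameter names listed above; `build_downtime` is a hypothetical helper.

```python
from typing import Dict, Optional


def build_downtime(scope_tags: Dict[str, str], start: int, duration_minutes: int,
                   message: str, monitor_id: Optional[int] = None) -> dict:
    """Build downtime arguments with a mandatory end time (hypothetical helper)."""
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    args = {
        # Tag syntax: comma-separated key:value pairs, e.g. host:web01,env:staging
        "scope": ",".join(f"{key}:{value}" for key, value in scope_tags.items()),
        "start": start,                        # Unix epoch seconds
        "end": start + duration_minutes * 60,  # always set, never indefinite
        "message": message,
    }
    if monitor_id is not None:
        args["monitor_id"] = int(monitor_id)   # narrows to a single monitor
    return args


downtime_args = build_downtime(
    {"host": "web01", "env": "staging"},
    start=1_700_000_000,
    duration_minutes=30,
    message="Kernel patching on web01",
)
print(downtime_args)
```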
6. Manage Hosts and Traces
When to use: User wants to list infrastructure hosts or inspect distributed traces
Tool sequence:
- `DATADOG_LIST_HOSTS` - List all reporting hosts [Required]
- `DATADOG_GET_TRACE_BY_ID` - Get a specific distributed trace [Optional]
Key parameters:
- `filter`: Host search filter string
- `sort_field`: Sort hosts by field (e.g., 'name', 'apps', 'cpu')
- `sort_dir`: Sort direction ('asc' or 'desc')
- `trace_id`: Distributed trace ID for trace lookup
Pitfalls:
- Host list includes all hosts reporting to Datadog within the retention window
- Trace IDs are long numeric strings; ensure exact match
- Hosts that stop reporting are retained for a configured period before removal
Common Patterns
Monitor Query Syntax
Metric alerts:
avg(last_5m):avg:system.cpu.user{env:prod} > 90
Log alerts:
logs("service:web status:error").index("main").rollup("count").last("5m") > 10
Tag Filtering
- Tags use `key:value` format: `host:web01`, `env:prod`, `service:api`
- Multiple tags: `{host:web01,env:prod}` (AND logic)
- Wildcard: `host:web*`
Pagination
- Use `page` and `page_size` or offset-based pagination depending on endpoint
- Check response for total count to determine if more pages exist
- Continue until all results are retrieved
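The page-based variant of this loop can be sketched as below. The response fields `items` and `total` are assumptions made for illustration, and `fake_fetch` is a stub standing in for a real tool call; actual field names depend on the endpoint.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int, int], dict], page_size: int = 2) -> List[str]:
    """Collect every item by requesting pages until the total is reached."""
    items: List[str] = []
    page = 0
    while True:
        response = fetch_page(page, page_size)
        items.extend(response["items"])
        # Stop on an empty page or once the reported total has been collected.
        if not response["items"] or len(items) >= response["total"]:
            break
        page += 1
    return items


# Stub data source standing in for a paginated list endpoint.
DATA = [f"monitor-{i}" for i in range(5)]


def fake_fetch(page: int, page_size: int) -> dict:
    start = page * page_size
    return {"items": DATA[start:start + page_size], "total": len(DATA)}


all_items = fetch_all(fake_fetch)
print(all_items)
```

The empty-page check is a safety net in case the reported total is stale or missing items, so the loop cannot spin forever.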
Known Pitfalls
Timestamps:
- Most endpoints use Unix epoch seconds (not milliseconds)
- Some endpoints accept ISO 8601; check tool schema
- Time ranges should be reasonable (not years of data)
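Because the seconds versus milliseconds mix-up is easy to make, input values can be normalized before they are passed to a tool. The helper `to_epoch_seconds` below is hypothetical, and its millisecond detection is a heuristic based on magnitude, not a Datadog rule.

```python
from datetime import datetime, timezone
from typing import Union


def to_epoch_seconds(value: Union[int, float, str, datetime]) -> int:
    """Normalize a timestamp to Unix epoch seconds (hypothetical helper)."""
    if isinstance(value, datetime):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        return int(value.timestamp())
    if isinstance(value, str):
        # ISO 8601 with an explicit offset, e.g. 2024-01-01T00:00:00+00:00
        return to_epoch_seconds(datetime.fromisoformat(value))
    number = int(value)
    # Heuristic: values this large are almost certainly milliseconds.
    if number >= 100_000_000_000:
        number //= 1000
    return number


print(to_epoch_seconds(1_700_000_000_000))            # milliseconds in
print(to_epoch_seconds("2024-01-01T00:00:00+00:00"))  # ISO 8601 in
```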
Query Syntax:
- Metric queries: `aggregation:metric{tags}`
- Log queries: `field:value` pairs
- Monitor queries vary by type; check Datadog documentation
Rate Limits:
- Datadog API has per-endpoint rate limits
- Implement backoff on 429 responses
- Batch operations where possible
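Backoff on 429 responses could look roughly like the following exponential backoff with jitter. `RateLimited` is a hypothetical exception standing in for however a 429 surfaces in your client, and `flaky` is a stub that fails twice before succeeding.

```python
import random
import time
from typing import Callable, List


class RateLimited(Exception):
    """Hypothetical marker for an HTTP 429 response."""


def call_with_backoff(call: Callable[[], str], max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry a call on RateLimited, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Exponential delay plus jitter so parallel clients do not retry in lockstep.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise RuntimeError("max_attempts must be at least 1")


attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited("429 Too Many Requests")
    return "ok"


delays: List[float] = []
result = call_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, len(delays))
```

Injecting the `sleep` function keeps the retry logic testable without real waiting.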
Quick Reference
| Task | Tool Slug | Key Params |
|---|---|---|
| Query metrics | DATADOG_QUERY_METRICS | query, from, to |
| List metrics | DATADOG_LIST_METRICS | q |
| Search logs | DATADOG_SEARCH_LOGS | query, from, to, limit |
| List log indexes | DATADOG_LIST_LOG_INDEXES | (none) |
| List monitors | DATADOG_LIST_MONITORS | tags |
| Get monitor | DATADOG_GET_MONITOR | monitor_id |
| Create monitor | DATADOG_CREATE_MONITOR | name, type, query, message |
| Update monitor | DATADOG_UPDATE_MONITOR | monitor_id |
| Mute monitor | DATADOG_MUTE_MONITOR | monitor_id |
| Unmute monitor | DATADOG_UNMUTE_MONITOR | monitor_id |
| List dashboards | DATADOG_LIST_DASHBOARDS | (none) |
| Get dashboard | DATADOG_GET_DASHBOARD | dashboard_id |
| Update dashboard | DATADOG_UPDATE_DASHBOARD | dashboard_id, title, widgets |
| Delete dashboard | DATADOG_DELETE_DASHBOARD | dashboard_id |
| List events | DATADOG_LIST_EVENTS | start, end |
| Create event | DATADOG_CREATE_EVENT | title, text, alert_type |
| Create downtime | DATADOG_CREATE_DOWNTIME | scope, start, end |
| List hosts | DATADOG_LIST_HOSTS | filter, sort_field |
| Get trace | DATADOG_GET_TRACE_BY_ID | trace_id |
Powered by Composio
Source: https://github.com/davepoon/buildwithclaude/blob/main/plugins/all-skills/skills/datadog-automation/SKILL.md
Overview
Automate Datadog monitoring and observability tasks using Composio's Rube MCP toolkit. It covers querying metrics, searching logs, managing monitors and dashboards, and creating events and downtimes. Always start by querying Rube MCP schemas to ensure compatibility.
How This Skill Works
Connect Rube MCP to Datadog and use a sequence of tools (e.g., DATADOG_LIST_METRICS, DATADOG_QUERY_METRICS, DATADOG_LIST_LOG_INDEXES, DATADOG_SEARCH_LOGS, DATADOG_LIST_MONITORS, DATADOG_CREATE_MONITOR, etc.). Begin with RUBE_SEARCH_TOOLS to fetch current schemas, then establish an ACTIVE Datadog connection via RUBE_MANAGE_CONNECTIONS before running workflows.
When to Use It
- Query metric data or list available Datadog metrics
- Search logs or list log indexes to diagnose issues
- Create, update, mute, or inspect monitors (alerts)
- Create Datadog events and downtimes for maintenance or incidents
- Validate and maintain active Datadog connections before automation
Quick Start
- Step 1: Verify Rube MCP is available by calling RUBE_SEARCH_TOOLS, then confirm an ACTIVE Datadog connection via RUBE_MANAGE_CONNECTIONS
- Step 2: Use the appropriate DATADOG_* tools (e.g., DATADOG_LIST_METRICS, DATADOG_QUERY_METRICS, DATADOG_SEARCH_LOGS, DATADOG_LIST_MONITORS) as your workflow
- Step 3: Validate results, adjust queries or monitor configurations, and ensure the Datadog connection remains ACTIVE
Best Practices
- Always call RUBE_SEARCH_TOOLS first to get current tool schemas
- Ensure the Datadog connection is ACTIVE via RUBE_MANAGE_CONNECTIONS before workflows
- Follow Datadog query syntax for metrics and logs (e.g., avg:metric{tag})
- Handle large log results with pagination and respect retention limits
- Match monitor types to query types and use @mentions in messages when notifying
Example Use Cases
- Query average CPU usage for host:web01 over the last 4 hours: DATADOG_QUERY_METRICS with query 'avg:system.cpu.user{host:web01}' and from/to timestamps in Unix epoch seconds
- Search logs for 'service:web status:error' within a 24-hour window using DATADOG_SEARCH_LOGS
- Create a metric alert monitor for high latency, specifying type 'metric alert' and a query with thresholds
- Mute a monitor before a deployment to avoid alert noise, using DATADOG_MUTE_MONITOR
- Create a downtime window for a service during a maintenance window and verify it is active