azure-diagnostics
npx machina-cli add skill microsoft/GitHub-Copilot-for-Azure/azure-diagnostics --openclaw
Azure Diagnostics
AUTHORITATIVE GUIDANCE — MANDATORY COMPLIANCE
This document is the official source for debugging and troubleshooting Azure production issues. Follow these instructions to diagnose and resolve common Azure service problems systematically.
Triggers
Activate this skill when the user wants to:
- Debug or troubleshoot production issues
- Diagnose errors in Azure services
- Analyze application logs or metrics
- Fix image pull, cold start, or health probe issues
- Investigate why Azure resources are failing
- Find root cause of application errors
- Troubleshoot Azure Function Apps (invocation failures, timeouts, binding errors)
- Find the App Insights or Log Analytics workspace linked to a Function App
Rules
- Start with the systematic diagnosis flow
- Use AppLens (MCP) for AI-powered diagnostics when available
- Check resource health before deep-diving into logs
- Select the appropriate troubleshooting guide based on service type
- Document findings and attempted remediation steps
Quick Diagnosis Flow
- Identify symptoms: what is failing?
- Check resource health: is the Azure platform healthy?
- Review logs: what do the logs show?
- Analyze metrics: are there performance patterns?
- Investigate recent changes: what changed?
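The flow above can be sketched as a small shell wrapper that runs the checks in order. This is an illustrative sketch only: `quick_diagnose` and its arguments are hypothetical names, the `az` commands are the ones listed under Common Diagnostic Commands, and the symptom, log, and metric steps are left as prompts because they depend on the service. It also assumes the resource exposes `properties.provisioningState`, which most but not all resource types do.

```shell
#!/usr/bin/env bash
# Sketch only: quick_diagnose is a hypothetical helper, not part of the Azure CLI.
# Usage: quick_diagnose <resource-id> <resource-group>
quick_diagnose() {
  local resource_id="${1:-}" resource_group="${2:-}"

  if [ -z "$resource_id" ] || [ -z "$resource_group" ]; then
    echo "usage: quick_diagnose <resource-id> <resource-group>" >&2
    return 2
  fi

  echo "== 1. Identify symptoms: note the failing operation and error text =="

  echo "== 2. Check resource state =="
  az resource show --ids "$resource_id" \
    --query "{name:name, type:type, state:properties.provisioningState}" -o json

  echo "== 3/4. Review logs and metrics: follow the service-specific guide =="

  echo "== 5. Investigate recent changes =="
  az monitor activity-log list -g "$resource_group" --max-events 20 -o table
}
```

Keeping the steps in one function makes the order explicit and gives a single place to record what was checked.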
Troubleshooting Guides by Service
| Service | Common Issues | Reference |
|---|---|---|
| Container Apps | Image pull failures, cold starts, health probes, port mismatches | container-apps/ |
| Function Apps | App details, invocation failures, timeouts, binding errors, cold starts, missing app settings | functions/ |
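For the Function Apps row, a common first step is finding where the app sends its telemetry. The sketch below is one possible approach, not the method the guides prescribe: the helper names are hypothetical, it assumes the app stores its connection string in the `APPLICATIONINSIGHTS_CONNECTION_STRING` app setting, and it assumes a workspace-based Application Insights component (`az monitor app-insights` may require the `application-insights` CLI extension).

```shell
# Sketch only: all three helpers are hypothetical names.

# Print the App Insights connection string configured on a Function App.
function_app_insights_connection() {
  local app="$1" resource_group="$2"
  az functionapp config appsettings list --name "$app" -g "$resource_group" \
    --query "[?name=='APPLICATIONINSIGHTS_CONNECTION_STRING'].value | [0]" -o tsv
}

# Extract the instrumentation key from a connection string of the form
# "InstrumentationKey=<key>;IngestionEndpoint=...".
instrumentation_key_from() {
  printf '%s\n' "$1" | tr ';' '\n' | sed -n 's/^InstrumentationKey=//p'
}

# Print the Log Analytics workspace resource ID behind an App Insights component.
app_insights_workspace() {
  local component="$1" resource_group="$2"
  az monitor app-insights component show --app "$component" -g "$resource_group" \
    --query "workspaceResourceId" -o tsv
}
```

The instrumentation key can then be matched against the components in the resource group to identify which one the Function App reports to.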
Quick Reference
Common Diagnostic Commands
# Show resource details (includes provisioning state; not the Resource Health availability status)
az resource show --ids RESOURCE_ID
# View activity log
az monitor activity-log list -g RG --max-events 20
# Container Apps logs
az containerapp logs show --name APP -g RG --follow
# Function App logs (query App Insights traces)
az monitor app-insights query --apps APP-INSIGHTS -g RG \
--analytics-query "traces | where timestamp > ago(1h) | order by timestamp desc | take 50"
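The trace query above can be adapted to summarize exceptions, which often narrows the search faster than reading raw traces. The following is a minimal sketch under assumptions: `build_exceptions_query` and `recent_exceptions` are hypothetical helper names, and the query assumes the classic Application Insights `exceptions` table with its `type` and `outerMessage` columns.

```shell
# Sketch only: both helpers are hypothetical names.

# Build a KQL query that groups recent exceptions by type and message.
build_exceptions_query() {
  local window="${1:-1h}"
  printf 'exceptions | where timestamp > ago(%s) | summarize count() by type, outerMessage | order by count_ desc | take 20' "$window"
}

# Run the query against an App Insights component.
# Usage: recent_exceptions <app-insights-name> <resource-group> [window]
recent_exceptions() {
  local app="$1" resource_group="$2" window="${3:-1h}"
  az monitor app-insights query --apps "$app" -g "$resource_group" \
    --analytics-query "$(build_exceptions_query "$window")"
}
```

For example, `recent_exceptions APP-INSIGHTS RG 6h` would widen the window to six hours.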
AppLens (MCP Tools)
For AI-powered diagnostics, use:
mcp_azure_mcp_applens
  intent: "diagnose issues with <resource-name>"
  command: "diagnose"
  parameters:
    resourceId: "<resource-id>"
Provides:
- Automated issue detection
- Root cause analysis
- Remediation recommendations
Azure Monitor (MCP Tools)
For querying logs and metrics:
mcp_azure_mcp_monitor
  intent: "query logs for <resource-name>"
  command: "logs_query"
  parameters:
    workspaceId: "<workspace-id>"
    query: "<KQL-query>"
See kql-queries.md for common diagnostic queries.
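When the MCP tool is not available, a similar workspace query can be run from the CLI with `az monitor log-analytics query`. The sketch below is illustrative and is not taken from kql-queries.md: `container_app_errors` is a hypothetical helper, and it assumes the Container Apps environment sends console logs to Log Analytics, where they typically appear in the `ContainerAppConsoleLogs_CL` table.

```shell
# Sketch only: container_app_errors is a hypothetical helper.
# workspace_id is the workspace GUID (customer ID), not the ARM resource ID.
# Usage: container_app_errors <workspace-guid> <app-name> [window]
container_app_errors() {
  local workspace_id="$1" app_name="$2" window="${3:-1h}"
  local query
  query="ContainerAppConsoleLogs_CL
| where TimeGenerated > ago(${window})
| where ContainerAppName_s == '${app_name}'
| where Log_s contains 'error'
| project TimeGenerated, Log_s
| order by TimeGenerated desc
| take 50"
  az monitor log-analytics query -w "$workspace_id" --analytics-query "$query" -o table
}
```

The `contains` operator is case-insensitive, so this matches "Error" and "ERROR" as well.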
Check Azure Resource Health
Using MCP
mcp_azure_mcp_resourcehealth
  intent: "check health status of <resource-name>"
  command: "get"
  parameters:
    resourceId: "<resource-id>"
Using CLI
# Show resource details (includes provisioning state; not the Resource Health availability status)
az resource show --ids RESOURCE_ID
# Check recent activity
az monitor activity-log list -g RG --max-events 20
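`az resource show` reports the resource's properties, not its platform availability. One way to read the availability status from the CLI is to call the Resource Health REST API through `az rest`. This is a sketch under assumptions: the helper names are hypothetical, the API version `2020-05-01` is assumed to be available in your cloud, and the URL targets the public Azure Resource Manager endpoint.

```shell
# Sketch only: both helpers are hypothetical names.

# Print the availability state (for example Available or Unavailable)
# for a resource, given its full resource ID starting with /subscriptions/.
resource_health_state() {
  local resource_id="$1" api_version="${2:-2020-05-01}"
  az rest --method get \
    --url "https://management.azure.com${resource_id}/providers/Microsoft.ResourceHealth/availabilityStatuses/current?api-version=${api_version}" \
    --query "properties.availabilityState" -o tsv
}

# List only failed operations among the most recent activity log events.
failed_operations() {
  local resource_group="$1" max_events="${2:-20}"
  az monitor activity-log list -g "$resource_group" --max-events "$max_events" \
    --query "[?status.value=='Failed'].{time:eventTimestamp, operation:operationName.localizedValue, caller:caller}" \
    -o table
}
```

The `--query` filter in `failed_operations` runs client-side after `--max-events` is applied, so raise the event count if recent failures are not showing up.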
References
Source
git clone https://github.com/microsoft/GitHub-Copilot-for-Azure
Skill file: https://github.com/microsoft/GitHub-Copilot-for-Azure/blob/main/plugin/skills/azure-diagnostics/SKILL.md
Overview
Official guide to diagnose and troubleshoot Azure production issues across Container Apps and Function Apps. It covers log analysis with KQL, health checks, and common resolutions for image pulls, cold starts, health probes, and invocation failures. The workflow emphasizes a systematic diagnosis, checking resource health first, and documenting findings and remediation steps.
How This Skill Works
Begin with a systematic diagnosis flow: identify symptoms, check resource health, review logs, analyze metrics, and assess recent changes. Use AppLens (MCP) for AI-powered diagnostics when available, verify health before digging into logs, then follow the service-specific troubleshooting guides for Container Apps or Function Apps. Document findings and remediation steps.
When to Use It
- Debug or troubleshoot production Azure issues
- Diagnose image pull failures in Container Apps
- Investigate cold starts and health probe failures
- Analyze logs with KQL to find root cause
- Resolve Azure Function App invocation failures
Quick Start
- Step 1: Identify symptoms and collect basic telemetry
- Step 2: Check resource health and review relevant logs
- Step 3: Follow the container or function-specific guide and document results
Best Practices
- Follow the Quick Diagnosis Flow: identify symptoms, check resource health, review logs, analyze metrics, and review changes
- Use AppLens (MCP) for AI-powered diagnostics when available
- Always check resource health before diving into logs
- Choose the service-specific troubleshooting guide (Container Apps or Function Apps) to stay focused
- Document findings and remediation steps for traceability
Example Use Cases
- Determine why a Function App invocation fails and identify a missing app setting
- Resolve image pull failures in Container Apps due to registry authentication
- Investigate cold start delays in Functions and adjust startup configuration
- Diagnose health probe failures causing container restarts
- Use KQL to surface application logs and pinpoint root cause
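The missing app setting case from the list above can be checked with a short script. This is a hedged sketch rather than the procedure from the functions/ guide: `check_app_settings` is a hypothetical helper, and the list of required setting names is whatever you pass in.

```shell
# Sketch only: check_app_settings is a hypothetical helper.
# Prints each required setting that is absent from the Function App.
# Returns 0 if all are present, 1 if any are missing, 2 if the CLI call fails.
# Usage: check_app_settings <app> <resource-group> <setting-name>...
check_app_settings() {
  local app="$1" resource_group="$2"
  shift 2
  local present name missing=0
  present="$(az functionapp config appsettings list --name "$app" -g "$resource_group" \
    --query "[].name" -o tsv)" || return 2
  for name in "$@"; do
    # Match whole lines only, so a setting name that is a prefix of another does not count.
    if ! printf '%s\n' "$present" | grep -qxF "$name"; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_app_settings APP RG AzureWebJobsStorage FUNCTIONS_WORKER_RUNTIME` lists whichever of the two settings is not defined. Apps that use identity-based connections may define `AzureWebJobsStorage__accountName` instead, so adjust the names to match your configuration.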