# Azure AI Gateway (`azure-aigateway`)
Configure Azure API Management (APIM) as an AI Gateway for governing AI models, MCP tools, and agents.
To deploy APIM, use the azure-prepare skill. See APIM deployment guide.
## When to Use This Skill
| Category | Triggers |
|---|---|
| Model Governance | "semantic caching", "token limits", "load balance AI", "track token usage" |
| Tool Governance | "rate limit MCP", "protect my tools", "configure my tool", "convert API to MCP" |
| Agent Governance | "content safety", "jailbreak detection", "filter harmful content" |
| Configuration | "add Azure OpenAI backend", "configure my model", "add AI Foundry model" |
| Testing | "test AI gateway", "call OpenAI through gateway" |
## Quick Reference

| Policy | Purpose | Details |
|---|---|---|
| `azure-openai-token-limit` | Cost control | Model Policies |
| `azure-openai-semantic-cache-lookup/store` | 60-80% cost savings | Model Policies |
| `azure-openai-emit-token-metric` | Observability | Model Policies |
| `llm-content-safety` | Safety & compliance | Agent Policies |
| `rate-limit-by-key` | MCP/tool protection | Tool Policies |
## Get Gateway Details

```bash
# Get gateway URL
az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv

# List backends (AI models)
az apim backend list --service-name <apim-name> --resource-group <rg> \
  --query "[].{id:name, url:url}" -o table

# Get subscription key
az apim subscription keys list \
  --service-name <apim-name> --resource-group <rg> --subscription-id <sub-id>
```
## Test AI Endpoint

```bash
GATEWAY_URL=$(az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv)

curl -X POST "${GATEWAY_URL}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: <key>" \
  -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'
```
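The gateway returns a standard chat-completions JSON body. A quick way to pull out the assistant reply and the token count that metering policies such as `azure-openai-token-limit` charge against your quota, sketched here against a sample response rather than a live call:

```bash
# Sample response body (chat-completions shape; the values are made up)
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"Hello! How can I help?"}}],"usage":{"prompt_tokens":9,"completion_tokens":7,"total_tokens":16}}'

# Assistant reply text
echo "$RESPONSE" | jq -r '.choices[0].message.content'

# Total tokens consumed by the call
echo "$RESPONSE" | jq -r '.usage.total_tokens'
```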
## Common Tasks

### Add AI Backend

See references/patterns.md for full steps.

```bash
# Discover AI resources
az cognitiveservices account list --query "[?kind=='OpenAI']" -o table

# Create backend
az apim backend create --service-name <apim> --resource-group <rg> \
  --backend-id openai-backend --protocol http --url "https://<aoai>.openai.azure.com/openai"

# Grant access (managed identity)
az role assignment create --assignee <apim-principal-id> \
  --role "Cognitive Services User" --scope <aoai-resource-id>
```
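Once the backend exists, it is referenced from policy. A minimal sketch, assuming the `openai-backend` id from the command above and a system-assigned identity on the APIM instance:

```xml
<inbound>
    <base />
    <!-- Authenticate to Azure OpenAI with APIM's managed identity -->
    <authentication-managed-identity resource="https://cognitiveservices.azure.com" />
    <!-- Route requests to the backend created above -->
    <set-backend-service backend-id="openai-backend" />
</inbound>
```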
### Apply AI Governance Policy

Recommended policy order in `<inbound>`:

1. **Authentication** - managed identity to the backend
2. **Semantic Cache Lookup** - check the cache before calling AI
3. **Token Limits** - cost control
4. **Content Safety** - filter harmful content
5. **Backend Selection** - load balancing
6. **Metrics** - token usage tracking
See references/policies.md for complete example.
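The ordering above can be sketched as a single policy document. This is an illustrative sketch, not the skill's reference policy: the backend ids (`embeddings-backend`, `content-safety-backend`, `openai-backend`) and all limits are placeholder values.

```xml
<policies>
    <inbound>
        <base />
        <!-- 1. Authentication: managed identity to the backend -->
        <authentication-managed-identity resource="https://cognitiveservices.azure.com" />
        <!-- 2. Semantic cache lookup: serve repeat questions without calling the model -->
        <azure-openai-semantic-cache-lookup score-threshold="0.8"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned" />
        <!-- 3. Token limits: cap spend per subscription -->
        <azure-openai-token-limit tokens-per-minute="10000"
            counter-key="@(context.Subscription.Id)"
            estimate-prompt-tokens="true" />
        <!-- 4. Content safety: block harmful prompts and jailbreak attempts -->
        <llm-content-safety backend-id="content-safety-backend" shield-prompt="true" />
        <!-- 5. Backend selection: route to the AI backend (or a load-balanced pool) -->
        <set-backend-service backend-id="openai-backend" />
        <!-- 6. Metrics: emit token usage -->
        <azure-openai-emit-token-metric>
            <dimension name="Subscription ID" />
        </azure-openai-emit-token-metric>
    </inbound>
    <outbound>
        <base />
        <!-- Store responses so later lookups can hit the cache (duration in seconds) -->
        <azure-openai-semantic-cache-store duration="120" />
    </outbound>
</policies>
```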
## Troubleshooting
| Issue | Solution |
|---|---|
| Token limit 429 | Increase tokens-per-minute or add load balancing |
| No cache hits | Lower score-threshold to 0.7 |
| Content false positives | Increase category thresholds (5-6) |
| Backend auth 401 | Grant APIM "Cognitive Services User" role |
See references/troubleshooting.md for details.
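The cache and content-safety fixes in the table map to attribute tweaks like these (backend ids are assumed placeholders; the values are starting points, not tuned recommendations):

```xml
<!-- No cache hits: relax the similarity threshold -->
<azure-openai-semantic-cache-lookup score-threshold="0.7"
    embeddings-backend-id="embeddings-backend"
    embeddings-backend-auth="system-assigned" />

<!-- Content false positives: raise per-category severity thresholds -->
<llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
    <categories output-type="EightSeverityLevels">
        <category name="Hate" threshold="5" />
        <category name="Violence" threshold="6" />
    </categories>
</llm-content-safety>
```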
## References
- Detailed Policies - Full policy examples
- Configuration Patterns - Step-by-step patterns
- Troubleshooting - Common issues
- AI-Gateway Samples
- GenAI Gateway Docs
## SDK Quick References
- Content Safety: Python | TypeScript
- API Management: Python | .NET
## Source

[View SKILL.md on GitHub](https://github.com/microsoft/GitHub-Copilot-for-Azure/blob/main/plugin/skills/azure-aigateway/SKILL.md)

## Overview
Configure Azure API Management as an AI Gateway to govern AI models, MCP tools, and agents. This skill enables semantic caching, token limit enforcement, content safety, load balancing, and AI model governance, with easy integration of backends like Azure OpenAI and AI Foundry.
## How This Skill Works
APIM serves as the gateway that routes requests to AI model backends, applying inbound policies in a defined order. It relies on policy blocks such as `azure-openai-token-limit`, `azure-openai-semantic-cache-lookup/store`, `llm-content-safety`, and rate limiting to enforce governance and observability, and can also expose existing APIs as MCP tools where needed.
## When to Use It
- Model Governance: semantic caching, token limits, token usage tracking
- Tool Governance: rate limit MCP, protect tools, convert API to MCP
- Agent Governance: content safety, jailbreak detection, filter harmful content
- Configuration: add Azure OpenAI backend, configure my model, add AI Foundry model
- Testing: test AI gateway, call OpenAI through gateway
## Quick Start
1. Prepare the APIM instance and prerequisites (use azure-prepare for deployment; the az CLI must be installed).
2. Add an AI backend (`az apim backend create --service-name <apim> ... --url https://<aoai>.openai.azure.com/openai`) and grant it access.
3. Apply governance policies (inbound order: Authentication, Semantic Cache Lookup, Token Limits, Content Safety, Backend Selection) and test using the gateway URL with curl.
## Best Practices
- Authenticate to backends with a managed identity as the first inbound policy
- Apply Semantic Cache Lookup early to reduce backend calls and cost
- Enforce Token Limits to control latency and spend
- Layer Content Safety (llm-content-safety) before responses
- Plan Backend Selection and Load Balancing to distribute traffic
## Example Use Cases
- Govern an OpenAI deployment behind APIM, leveraging semantic caching to save 60-80% on model calls
- Protect MCP tools with rate limiting using rate-limit-by-key policies
- Filter jailbreak attempts with llm-content-safety and monitor token metrics with azure-openai-emit-token-metric
- Add an Azure OpenAI backend and grant access to the APIM via a managed identity
- Test the AI gateway by calling OpenAI through the gateway URL and inspecting responses
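The MCP tool-protection case above can be sketched with a `rate-limit-by-key` policy; the limits here are illustrative, not values prescribed by this skill:

```xml
<inbound>
    <base />
    <!-- Allow each subscription 30 tool calls per 60-second window -->
    <rate-limit-by-key calls="30" renewal-period="60"
        counter-key="@(context.Subscription.Id)" />
</inbound>
```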