# Azure AI Gateway (`azure-aigateway`)
Configure Azure API Management (APIM) as an AI Gateway for governing AI models, MCP tools, and agents.
To deploy APIM, use the azure-prepare skill. See APIM deployment guide.
## When to Use This Skill
| Category | Triggers |
|---|---|
| Model Governance | "semantic caching", "token limits", "load balance AI", "track token usage" |
| Tool Governance | "rate limit MCP", "protect my tools", "configure my tool", "convert API to MCP" |
| Agent Governance | "content safety", "jailbreak detection", "filter harmful content" |
| Configuration | "add Azure OpenAI backend", "configure my model", "add AI Foundry model" |
| Testing | "test AI gateway", "call OpenAI through gateway" |
## Quick Reference

| Policy | Purpose | Details |
|---|---|---|
| `azure-openai-token-limit` | Cost control | Model Policies |
| `azure-openai-semantic-cache-lookup/store` | 60-80% cost savings | Model Policies |
| `azure-openai-emit-token-metric` | Observability | Model Policies |
| `llm-content-safety` | Safety & compliance | Agent Policies |
| `rate-limit-by-key` | MCP/tool protection | Tool Policies |
## Get Gateway Details

```bash
# Get gateway URL
az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv

# List backends (AI models)
az apim backend list --service-name <apim-name> --resource-group <rg> \
  --query "[].{id:name, url:url}" -o table

# Get subscription key
az apim subscription keys list \
  --service-name <apim-name> --resource-group <rg> --subscription-id <sub-id>
```
## Test AI Endpoint

```bash
GATEWAY_URL=$(az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv)

curl -X POST "${GATEWAY_URL}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: <key>" \
  -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'
```
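The gateway returns a standard chat-completions JSON body. A quick way to pull out the assistant reply and the token count that metering policies such as `azure-openai-token-limit` charge against your quota, sketched here against a sample response rather than a live call:

```bash
# Sample response body (chat-completions shape; the values are made up)
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"Hello! How can I help?"}}],"usage":{"prompt_tokens":9,"completion_tokens":7,"total_tokens":16}}'

# Assistant reply text
echo "$RESPONSE" | jq -r '.choices[0].message.content'

# Total tokens consumed by the call
echo "$RESPONSE" | jq -r '.usage.total_tokens'
```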
## Common Tasks

### Add AI Backend

See references/patterns.md for full steps.

```bash
# Discover AI resources
az cognitiveservices account list --query "[?kind=='OpenAI']" -o table

# Create backend
az apim backend create --service-name <apim> --resource-group <rg> \
  --backend-id openai-backend --protocol http --url "https://<aoai>.openai.azure.com/openai"

# Grant access (managed identity)
az role assignment create --assignee <apim-principal-id> \
  --role "Cognitive Services User" --scope <aoai-resource-id>
```
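Once the backend exists, it is referenced from policy. A minimal sketch, assuming the `openai-backend` id from the command above and a system-assigned identity on the APIM instance:

```xml
<inbound>
    <base />
    <!-- Authenticate to Azure OpenAI with APIM's managed identity -->
    <authentication-managed-identity resource="https://cognitiveservices.azure.com" />
    <!-- Route requests to the backend created above -->
    <set-backend-service backend-id="openai-backend" />
</inbound>
```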
### Apply AI Governance Policy

Recommended policy order in `<inbound>`:

1. **Authentication** - managed identity to the backend
2. **Semantic Cache Lookup** - check the cache before calling AI
3. **Token Limits** - cost control
4. **Content Safety** - filter harmful content
5. **Backend Selection** - load balancing
6. **Metrics** - token usage tracking
See references/policies.md for complete example.
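The ordering above can be sketched as a single policy document. This is an illustrative sketch, not the skill's reference policy: the backend ids (`embeddings-backend`, `content-safety-backend`, `openai-backend`) and all limits are placeholder values.

```xml
<policies>
    <inbound>
        <base />
        <!-- 1. Authentication: managed identity to the backend -->
        <authentication-managed-identity resource="https://cognitiveservices.azure.com" />
        <!-- 2. Semantic cache lookup: serve repeat questions without calling the model -->
        <azure-openai-semantic-cache-lookup score-threshold="0.8"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned" />
        <!-- 3. Token limits: cap spend per subscription -->
        <azure-openai-token-limit tokens-per-minute="10000"
            counter-key="@(context.Subscription.Id)"
            estimate-prompt-tokens="true" />
        <!-- 4. Content safety: block harmful prompts and jailbreak attempts -->
        <llm-content-safety backend-id="content-safety-backend" shield-prompt="true" />
        <!-- 5. Backend selection: route to the AI backend (or a load-balanced pool) -->
        <set-backend-service backend-id="openai-backend" />
        <!-- 6. Metrics: emit token usage -->
        <azure-openai-emit-token-metric>
            <dimension name="Subscription ID" />
        </azure-openai-emit-token-metric>
    </inbound>
    <outbound>
        <base />
        <!-- Store responses so later lookups can hit the cache (duration in seconds) -->
        <azure-openai-semantic-cache-store duration="120" />
    </outbound>
</policies>
```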
## Troubleshooting
| Issue | Solution |
|---|---|
| Token limit 429 | Increase tokens-per-minute or add load balancing |
| No cache hits | Lower score-threshold to 0.7 |
| Content false positives | Increase category thresholds (5-6) |
| Backend auth 401 | Grant APIM "Cognitive Services User" role |
See references/troubleshooting.md for details.
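The cache and content-safety fixes in the table map to attribute tweaks like these (backend ids are assumed placeholders; the values are starting points, not tuned recommendations):

```xml
<!-- No cache hits: relax the similarity threshold -->
<azure-openai-semantic-cache-lookup score-threshold="0.7"
    embeddings-backend-id="embeddings-backend"
    embeddings-backend-auth="system-assigned" />

<!-- Content false positives: raise per-category severity thresholds -->
<llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
    <categories output-type="EightSeverityLevels">
        <category name="Hate" threshold="5" />
        <category name="Violence" threshold="6" />
    </categories>
</llm-content-safety>
```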
## References
- Detailed Policies - Full policy examples
- Configuration Patterns - Step-by-step patterns
- Troubleshooting - Common issues
- AI-Gateway Samples
- GenAI Gateway Docs
## SDK Quick References
- Content Safety: Python | TypeScript
- API Management: Python | .NET
## Source

[View SKILL.md on GitHub](https://github.com/microsoft/GitHub-Copilot-for-Azure/blob/main/plugin/skills/azure-aigateway/SKILL.md)

## Overview
Configure Azure API Management as an AI Gateway to govern AI models, MCP tools, and agents. This skill enables semantic caching, token limit enforcement, content safety, load balancing, and AI model governance, with easy integration of backends like Azure OpenAI and AI Foundry.
## How This Skill Works
APIM serves as the gateway that routes requests to AI model backends, applying inbound policies in a defined order. It relies on policy blocks such as `azure-openai-token-limit`, `azure-openai-semantic-cache-lookup/store`, `llm-content-safety`, and rate limiting to enforce governance and observability, and can also expose existing APIs as MCP tools where needed.
## When to Use It
- Model Governance: semantic caching, token limits, token usage tracking
- Tool Governance: rate limit MCP, protect tools, convert API to MCP
- Agent Governance: content safety, jailbreak detection, filter harmful content
- Configuration: add Azure OpenAI backend, configure my model, add AI Foundry model
- Testing: test AI gateway, call OpenAI through gateway
## Quick Start
1. Prepare the APIM instance and prerequisites (use azure-prepare for deployment; the az CLI must be installed).
2. Add an AI backend (`az apim backend create --service-name <apim> ... --url https://<aoai>.openai.azure.com/openai`) and grant it access.
3. Apply governance policies (inbound order: Authentication, Semantic Cache Lookup, Token Limits, Content Safety, Backend Selection) and test using the gateway URL with curl.
## Best Practices
- Authenticate to backends with a managed identity as the first inbound policy
- Apply Semantic Cache Lookup early to reduce backend calls and cost
- Enforce Token Limits to control latency and spend
- Layer Content Safety (llm-content-safety) before responses
- Plan Backend Selection and Load Balancing to distribute traffic
## Example Use Cases
- Govern an OpenAI deployment behind APIM, leveraging semantic caching to save 60-80% on model calls
- Protect MCP tools with rate limiting using rate-limit-by-key policies
- Filter jailbreak attempts with llm-content-safety and monitor token metrics with azure-openai-emit-token-metric
- Add an Azure OpenAI backend and grant access to the APIM via a managed identity
- Test the AI gateway by calling OpenAI through the gateway URL and inspecting responses
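The MCP tool-protection case above can be sketched with a `rate-limit-by-key` policy; the limits here are illustrative, not values prescribed by this skill:

```xml
<inbound>
    <base />
    <!-- Allow each subscription 30 tool calls per 60-second window -->
    <rate-limit-by-key calls="30" renewal-period="60"
        counter-key="@(context.Subscription.Id)" />
</inbound>
```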