
opentelemetry-llm

npx machina-cli add skill a5c-ai/babysitter/opentelemetry-llm --openclaw
Files (1): SKILL.md (1.3 KB)

OpenTelemetry LLM Skill

Capabilities

  • Configure OpenTelemetry SDK for LLM apps
  • Implement LLM-specific instrumentation
  • Set up trace exporters (Jaeger, OTLP)
  • Design semantic conventions for LLM calls
  • Configure span attributes for AI workloads
  • Implement context propagation

Target Processes

  • llm-observability-monitoring
  • agent-deployment-pipeline

Implementation Details

Core Components

  1. TracerProvider: SDK configuration
  2. SpanProcessor: Batch/simple processors
  3. Exporters: Jaeger, OTLP, Console
  4. Instrumentation: Auto and manual

LLM Semantic Conventions

  • gen_ai.system (e.g., OpenAI, Anthropic)
  • gen_ai.request.model
  • gen_ai.request.max_tokens
  • gen_ai.response.finish_reason
  • gen_ai.usage.prompt_tokens

Configuration Options

  • Exporter selection
  • Sampling strategies
  • Resource attributes
  • Span limits
  • Context propagation

Best Practices

  • Consistent attribute naming
  • Appropriate sampling
  • Record errors and exceptions on spans
  • Propagate context across services

Dependencies

  • opentelemetry-sdk
  • opentelemetry-exporter-*
  • openinference (optional)

Source

git clone https://github.com/a5c-ai/babysitter.git
Skill file: plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/opentelemetry-llm/SKILL.md

Overview

This skill enables full observability for LLM-based applications by configuring the OpenTelemetry SDK, applying LLM-specific instrumentation, and exporting traces to Jaeger, OTLP, or the Console. It defines semantic conventions for LLM calls and ensures context propagation across services to improve debugging and performance insights.

How This Skill Works

Set up a TracerProvider and SpanProcessor, attach exporters (Jaeger, OTLP, Console), and enable both auto and manual instrumentation for LLM interactions. Implement LLM semantic conventions using attributes like gen_ai.system, gen_ai.request.model, and gen_ai.usage.prompt_tokens, and propagate context across service boundaries for end-to-end tracing.

When to Use It

  • Instrument an AI chatbot or agent pipeline to collect end-to-end traces across multiple services
  • Enforce standardized LLM attributes with gen_ai.* conventions across teams
  • Deploy in production with Jaeger or OTLP exporters for centralized observability
  • Debug latency, retries, and errors in AI workloads with detailed traces
  • Coordinate tracing across orchestrators, workers, and post-processing steps

Quick Start

  1. Step 1: Install opentelemetry-sdk and exporter packages (Jaeger/OTLP/Console)
  2. Step 2: Initialize TracerProvider, add a SpanProcessor, and register the chosen exporter
  3. Step 3: Enable auto/manual LLM instrumentation and apply gen_ai.* attributes; ensure context propagation
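The steps above can be sketched as shell commands (the package names are real OpenTelemetry distributions; the service name and collector endpoint are placeholder assumptions):

```shell
# Step 1: SDK plus an exporter (OTLP here; swap for Jaeger/Console as needed)
pip install opentelemetry-sdk opentelemetry-exporter-otlp

# Steps 2-3 can also be driven by standard OTel environment variables
# instead of code-level configuration:
export OTEL_SERVICE_NAME=llm-chat                          # placeholder service name
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317   # placeholder collector
```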

Best Practices

  • Consistent attribute naming for all LLM traces (gen_ai.*)
  • Appropriate sampling to balance visibility and overhead
  • Use error-focused traces to surface failures in LLM calls
  • Propagate context across services to maintain trace continuity
  • Design and enforce LLM semantic conventions (gen_ai.*) across the stack

Example Use Cases

  • Instrument a chatbot service with OpenTelemetry and export traces to Jaeger, using gen_ai.request.model and gen_ai.usage.prompt_tokens
  • Trace an LLM workflow that flows from orchestrator to worker to post-processor with OTLP exporter
  • Configure a Kubernetes deployment to send traces to a centralized collector via OTLP
  • Apply batch span processing for high-throughput LLM traffic while maintaining trace quality
  • Enforce gen_ai semantic attributes across all LLM calls in a multi-service platform

