langsmith-tracing
npx machina-cli add skill a5c-ai/babysitter/langsmith-tracing --openclaw
Configure LangSmith observability and tracing for LLM applications built with LangChain and LangGraph frameworks.
Overview
LangSmith is the managed observability suite by LangChain that provides:
- Dashboards and alerting for LLM applications
- Human-in-the-loop evaluation capabilities
- Deep LangChain/LangGraph integration
- Run Tree model for nested traces
- MCP connectivity for Claude and VS Code
Capabilities
Core Tracing Setup
- Initialize LangSmith client and API configuration
- Configure project/workspace settings
- Set up trace collection and sampling
- Enable debug logging for agent execution
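Tracing initialization hinges on a handful of environment variables; a small stdlib-only guard (a hypothetical helper, not part of the langsmith SDK) can verify they are set before the application starts:

```python
import os

# Variables LangSmith's auto-tracing reads from the environment
REQUIRED_VARS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")

def tracing_configured(env=os.environ):
    """Return True when every variable needed for tracing is set and non-empty."""
    return all(env.get(var) for var in REQUIRED_VARS)
```

Calling this at startup lets you fail fast (or log a warning) instead of silently running without traces.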
Integration Patterns
- LangChain chain tracing with automatic instrumentation
- LangGraph workflow state tracking
- Custom span creation for non-LangChain code
- Parent-child trace relationships
Debugging Features
- Fetch execution traces for analysis
- Query run history and metadata
- Export traces for offline analysis
- Compare runs across different versions
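Exporting traces for offline analysis can be as simple as flattening run objects to JSON Lines. This sketch assumes only the `id`, `name`, and `total_tokens` attributes used elsewhere in this document:

```python
import json

def export_runs(runs, path):
    # Write one JSON object per run for offline analysis
    with open(path, "w") as f:
        for run in runs:
            record = {
                "id": str(run.id),
                "name": run.name,
                "total_tokens": run.total_tokens,
            }
            f.write(json.dumps(record) + "\n")
```

The resulting JSONL file can be loaded into pandas, a notebook, or any log-analysis tool.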
Usage
Environment Setup
```shell
# Set required environment variables
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<project-name>
```
Python Integration
```python
from langsmith import Client, traceable
from langchain.callbacks.tracers import LangChainTracer

# Initialize client
client = Client()

# Use @traceable decorator for custom functions
@traceable(name="custom_operation")
def my_function(input_data):
    result = input_data  # Your logic here
    return result

# Initialize tracer for LangChain
tracer = LangChainTracer(project_name="my-project")

# Use with LangChain chains (assumes an existing `chain`)
chain.invoke(input, config={"callbacks": [tracer]})
```
Trace Retrieval
```python
from datetime import datetime, timedelta

# Fetch traces from LangSmith
runs = client.list_runs(
    project_name="my-project",
    start_time=datetime.now() - timedelta(hours=24),
    execution_order=1,  # Root runs only
    error=False,  # Successful runs only
)

for run in runs:
    print(f"Run ID: {run.id}")
    print(f"Latency: {run.latency_p99}")
    print(f"Tokens: {run.total_tokens}")
```
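Per-run latency can also be derived directly from each run's `start_time`/`end_time` timestamps (attributes on the LangSmith Run model), rather than relying on aggregate latency fields:

```python
def run_latency_seconds(run):
    """Wall-clock duration of one run, from its start/end timestamps."""
    return (run.end_time - run.start_time).total_seconds()
```

This works on any object carrying those two datetime attributes, which also makes it easy to unit-test without network access.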
Task Definition
When used in a babysitter process, this skill produces:
```javascript
const langsmithTracingTask = defineTask({
  name: 'langsmith-tracing-setup',
  description: 'Configure LangSmith tracing for the application',
  inputs: {
    projectName: { type: 'string', required: true },
    apiKeyEnvVar: { type: 'string', default: 'LANGCHAIN_API_KEY' },
    samplingRate: { type: 'number', default: 1.0 },
    enableDebug: { type: 'boolean', default: false }
  },
  outputs: {
    configured: { type: 'boolean' },
    projectUrl: { type: 'string' },
    artifacts: { type: 'array' }
  },
  async run(inputs, taskCtx) {
    return {
      kind: 'skill',
      title: `Configure LangSmith tracing for ${inputs.projectName}`,
      skill: {
        name: 'langsmith-tracing',
        context: {
          projectName: inputs.projectName,
          apiKeyEnvVar: inputs.apiKeyEnvVar,
          samplingRate: inputs.samplingRate,
          enableDebug: inputs.enableDebug,
          instructions: [
            'Verify LangSmith API credentials are available',
            'Create or validate project configuration',
            'Set up tracing instrumentation in codebase',
            'Configure sampling rate and debug settings',
            'Verify traces are being captured correctly'
          ]
        }
      },
      io: {
        inputJsonPath: `tasks/${taskCtx.effectId}/input.json`,
        outputJsonPath: `tasks/${taskCtx.effectId}/result.json`
      }
    };
  }
});
```
Applicable Processes
- llm-observability-monitoring
- agent-evaluation-framework
- react-agent-implementation
- conversation-quality-testing
- regression-testing-agent
External Dependencies
- LangSmith account and API key
- LangChain Python library
- langsmith Python package
References
Related Skills
- SK-OBS-002 langfuse-integration
- SK-OBS-003 phoenix-arize-setup
- SK-OBS-004 opentelemetry-llm
Related Agents
- AG-OPS-004 observability-engineer
- AG-SAF-004 agent-evaluator
Source
https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/langsmith-tracing/SKILL.md
Overview
LangSmith tracing provides observability and debugging for LLM apps built with LangChain and LangGraph. It enables trace collection, dashboards, and run histories to monitor and diagnose agent execution. The setup supports automatic instrumentation, custom spans, and debugging across LangChain/LangGraph pipelines.
How This Skill Works
Initialize a LangSmith client with your API key and project, enable tracing via environment variables, and instrument LangChain with a LangChainTracer. Use the @traceable decorator for custom functions and attach the tracer to your LangChain chains so traces, spans, and run metadata are captured for analysis and debugging.
When to Use It
- You need end-to-end observability for LangChain chains and LangGraph workflows.
- You want to debug agent behavior using execution traces, latency, and token usage.
- You are maintaining multiple model or code versions and need to compare runs.
- You need to instrument non-LangChain code with custom spans to capture relevant context.
- You want to fetch, export, or analyze traces offline or across environments.
Quick Start
- Step 1: Set environment variables:

```shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<project-name>
```

- Step 2: Initialize the client and tracer in Python, and decorate custom functions:

```python
from langsmith import Client, traceable
from langchain.callbacks.tracers import LangChainTracer

client = Client()

@traceable(name="custom_operation")
def my_function(input_data):
    result = input_data  # your logic here
    return result

tracer = LangChainTracer(project_name="my-project")
```

- Step 3: Attach the tracer to chains and retrieve traces:

```python
from datetime import datetime, timedelta

chain.invoke(input, config={"callbacks": [tracer]})

runs = client.list_runs(
    project_name="my-project",
    start_time=datetime.now() - timedelta(hours=24),
    execution_order=1,
    error=False,
)
for run in runs:
    print(run.id, run.latency_p99, run.total_tokens)
```
Best Practices
- Set LANGCHAIN_TRACING_V2, LANGCHAIN_API_KEY, and LANGCHAIN_PROJECT to enable and scope tracing.
- Use LangChainTracer and the @traceable decorator to instrument custom functions and chain components.
- Instrument non-LangChain code with custom spans and maintain parent-child relationships for accurate traces.
- Regularly query and compare runs (latency, tokens, metadata) across versions to catch regressions.
- Export traces for offline analysis and integrate trace data into CI/CD or monitoring dashboards.
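The regression-catching practice above can be automated with a simple threshold check; this stdlib-only sketch (names hypothetical) flags a candidate version whose mean latency exceeds the baseline by more than a tolerance:

```python
from statistics import mean

def latency_regressed(baseline_secs, candidate_secs, tolerance=0.10):
    """True if the candidate's mean latency exceeds baseline by > tolerance."""
    return mean(candidate_secs) > mean(baseline_secs) * (1 + tolerance)
```

Fed with per-run latencies pulled from two LangSmith projects (or two time windows), this makes a cheap CI gate before promoting a new model or prompt version.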
Example Use Cases
- Debugging a LangChain chain by inspecting trace latency and token counts to identify bottlenecks.
- Comparing two model versions by reviewing run latency and metadata to choose the better performing version.
- Instrumenting a custom data processing step with @traceable to surface its impact in traces.
- Viewing run history in LangSmith dashboards to monitor agent behavior over time.
- Exporting traces to an offline analysis tool to correlate with user feedback and QA findings.