
langfuse-integration

npx machina-cli add skill a5c-ai/babysitter/langfuse-integration --openclaw
Files (1): SKILL.md (1.2 KB)

LangFuse Integration Skill

Capabilities

  • Set up LangFuse tracing for LLM calls
  • Configure cost tracking and analytics
  • Implement prompt management
  • Set up evaluation datasets
  • Design custom trace metadata
  • Create dashboards and alerts

Target Processes

  • llm-observability-monitoring
  • cost-optimization-llm

Implementation Details

Core Features

  1. Tracing: Track LLM calls, chains, and agents
  2. Prompts: Version and manage prompts
  3. Analytics: Usage, latency, cost metrics
  4. Datasets: Evaluation and testing data
  5. Scores: Track output quality
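As an illustration of the kind of data a trace captures (latency, token usage, and cost), here is a minimal sketch; the field names and prices are hypothetical, not the LangFuse schema:

```python
from dataclasses import dataclass, field

@dataclass
class TraceRecord:
    # Hypothetical record illustrating what a tracing backend collects;
    # field names are illustrative only, not LangFuse's actual schema.
    name: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: float = 0.0
    metadata: dict = field(default_factory=dict)

def estimate_cost(input_tokens: int, output_tokens: int,
                  in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Per-token cost estimate; the per-1k prices are placeholders."""
    return (input_tokens / 1000) * in_price_per_1k \
         + (output_tokens / 1000) * out_price_per_1k

record = TraceRecord(
    name="chat-completion",
    model="gpt-4o-mini",
    input_tokens=1200,
    output_tokens=300,
    latency_ms=850.0,
    cost_usd=estimate_cost(1200, 300, 0.15, 0.60),
)
print(round(record.cost_usd, 4))
```

A real integration gets these numbers from the provider's usage response rather than computing them by hand.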

Integration Methods

  • LangChain callback handler
  • Direct SDK integration
  • OpenAI drop-in replacement
  • Decorator-based tracing
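To illustrate the decorator-based method, here is a simplified stand-in for what a tracing decorator does (this is a pattern sketch, not LangFuse's actual API, which also handles nesting, trace IDs, and flushing):

```python
import functools
import time

TRACES = []  # stand-in for a tracing backend

def traced(name: str):
    """Simplified tracing decorator: records the call name and latency.
    Illustrative only; LangFuse's real decorator does much more."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                TRACES.append({
                    "name": name,
                    "latency_ms": (time.perf_counter() - start) * 1000,
                })
        return inner
    return wrap

@traced("summarize")
def summarize(text: str) -> str:
    # Placeholder for a real LLM call.
    return text[:20]

summarize("LangFuse traces LLM calls end to end.")
print(TRACES[0]["name"])
```

The decorator approach is attractive because it leaves the business logic untouched; the other methods (callback handler, direct SDK, drop-in client) trade that transparency for more control.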

Configuration Options

  • Public/secret keys
  • Host URL (cloud or self-hosted)
  • Sampling rate
  • Metadata configuration
  • User tracking
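In the Python SDK these options are commonly supplied via environment variables. LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST are the documented variable names; the sampling variable may differ by SDK version, so treat it as an assumption:

```shell
# Keys from your Langfuse project settings page
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."

# Cloud default shown; point at your own URL when self-hosting
export LANGFUSE_HOST="https://cloud.langfuse.com"

# Fraction of traces to record (variable name may vary by SDK version)
export LANGFUSE_SAMPLE_RATE="0.5"
```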

Best Practices

  • Consistent trace naming
  • Meaningful metadata
  • Regular prompt versioning
  • Set up alerting

Dependencies

  • langfuse
  • langchain (for callback integration)

Source

git clone https://github.com/a5c-ai/babysitter.git

The skill definition lives in the repository at plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/langfuse-integration/SKILL.md, or view it on GitHub: https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/ai-agents-conversational/skills/langfuse-integration/SKILL.md

Overview

LangFuse Integration provides end-to-end observability for LLM workflows by tracing calls across models, chains, and agents, while capturing analytics and cost data. It supports prompt versioning, custom trace metadata, evaluation datasets, dashboards, and alerts, helping you optimize performance and spend. The integration can be added via LangChain callbacks, the LangFuse SDK, an OpenAI drop-in, or decorator-based tracing.

How This Skill Works

Install LangFuse; configure your public/secret keys, host URL, sampling rate, and metadata; then enable tracing in your LLM pipeline. Wire the integration into your flow using a LangChain callback handler, the direct LangFuse SDK, an OpenAI drop-in replacement, or a decorator. LangFuse then collects usage, latency, and cost metrics, evaluation data, and prompt metadata to power dashboards and alerts.
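The sampling rate mentioned above can be understood as a simple probabilistic gate on whether a given call is traced; this sketch illustrates the idea and is not LangFuse's internal logic:

```python
import random

def should_trace(sample_rate: float, rng: random.Random) -> bool:
    """Trace roughly `sample_rate` of calls (rate in [0.0, 1.0])."""
    return rng.random() < sample_rate

rng = random.Random(42)  # seeded so the run is reproducible
traced_count = sum(should_trace(0.25, rng) for _ in range(1000))
print(traced_count)  # close to 250 of 1000 calls
```

Sampling keeps observability costs bounded in high-volume pipelines at the price of losing some individual traces.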

When to Use It

  • When you need end-to-end tracing of LLM calls, chains, and agents across your apps
  • When you must monitor usage, latency, and costs to optimize budgets
  • When you manage prompts and want to version and track them alongside traces
  • When you want evaluation datasets and scores to measure output quality over time
  • When you need dashboards and alerts to detect anomalies and trigger actions

Quick Start

  1. Step 1: Install the LangFuse packages and configure your public/secret API keys, host URL, sampling rate, metadata, and user tracking
  2. Step 2: Choose an integration method (LangChain callback, direct SDK, OpenAI drop-in, or decorator) and wire it into your LLM flow
  3. Step 3: Validate traces and dashboards, version prompts, and ensure alerts are firing as expected

Best Practices

  • Use consistent trace naming across LLM calls, chains, and agents
  • Attach meaningful, structured metadata to each trace
  • Regularly version prompts and link versions to traces
  • Set up alerting for latency, error rates, and cost anomalies
  • Test changes in staging with evaluation datasets before production
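For the consistent-naming practice, one option is a small helper that enforces a naming convention; the app/component/action scheme below is an illustrative convention, not a LangFuse requirement:

```python
def trace_name(app: str, component: str, action: str) -> str:
    """Build a dot-delimited trace name, e.g. 'shop.rag.retrieve'.
    The app/component/action scheme is an illustrative convention."""
    parts = [app, component, action]
    if not all(p and p.replace("-", "").isalnum() for p in parts):
        raise ValueError("trace name parts must be alphanumeric (hyphens allowed)")
    return ".".join(p.lower() for p in parts)

print(trace_name("Shop", "RAG", "retrieve"))  # → shop.rag.retrieve
```

Validating names at the call site catches drift early, which keeps dashboards and alert rules that filter on trace names reliable.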

Example Use Cases

  • Diagnose latency bottlenecks by cross-referencing traces for LLM calls, chains, and prompts
  • Compare two prompts via versioning and analytics to improve response quality and cost
  • Run evaluation datasets to track scores and detect drift in model outputs
  • Build dashboards showing per-team cost and usage with alerts on spikes
  • Retrofit existing apps with OpenAI drop-in tracing to observe performance without major refactors
