How do I enable read-only mode?

Pass the --read-only flag (langfuse-mcp --read-only) or set the environment variable LANGFUSE_MCP_READ_ONLY=true. Either option disables the write tools: create_text_prompt, create_chat_prompt, update_prompt_labels, create_dataset, create_dataset_item, and delete_dataset_item.

What commands should I start with to debug errors?

Start with fetch_traces(age=60) for the last hour, or fetch_traces(age=1440) for the last 24 hours, to locate traces. Then use find_exceptions to surface errors and get_exception_details(trace_id) for the full stacktrace and context.

How do I verify a successful MCP setup?

Restart the CLI, then verify with /mcp (Claude) or codex mcp list (Codex). Run a quick test like fetch_traces(age=60) to confirm data flow.

langfuse

npx machina-cli add skill avivsinai/langfuse-mcp/langfuse --openclaw

Langfuse Skill

Debug your AI systems through Langfuse observability.

Triggers: langfuse, traces, debug AI, find exceptions, set up langfuse, what went wrong, why is it slow, datasets, evaluation sets

Setup

Step 1: Get credentials from https://cloud.langfuse.com → Settings → API Keys

If self-hosted, use your instance URL for LANGFUSE_HOST and create keys there.

Step 2: Install MCP (pick one):

# Claude Code (project-scoped, shared via .mcp.json)
claude mcp add \
  --scope project \
  --env LANGFUSE_PUBLIC_KEY=pk-... \
  --env LANGFUSE_SECRET_KEY=sk-... \
  --env LANGFUSE_HOST=https://cloud.langfuse.com \
  langfuse -- uvx --python 3.11 langfuse-mcp

# Codex CLI (user-scoped, stored in ~/.codex/config.toml)
codex mcp add langfuse \
  --env LANGFUSE_PUBLIC_KEY=pk-... \
  --env LANGFUSE_SECRET_KEY=sk-... \
  --env LANGFUSE_HOST=https://cloud.langfuse.com \
  -- uvx --python 3.11 langfuse-mcp

Step 3: Restart CLI, verify with /mcp (Claude) or codex mcp list (Codex)

Step 4: Test: fetch_traces(age=60)

Read-Only Mode

For safer observability without risk of modifying prompts or datasets, enable read-only mode:

# CLI flag
langfuse-mcp --read-only

# Or environment variable
LANGFUSE_MCP_READ_ONLY=true

This disables write tools: create_text_prompt, create_chat_prompt, update_prompt_labels, create_dataset, create_dataset_item, delete_dataset_item.
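The effect of read-only mode can be pictured as a filter over the tool list. The sketch below is illustrative only: `available_tools` is a hypothetical helper, not part of langfuse-mcp, and the only names taken from this document are the six write tools listed above.

```python
# Write tools named in this document; read-only mode disables exactly these.
WRITE_TOOLS = frozenset({
    "create_text_prompt",
    "create_chat_prompt",
    "update_prompt_labels",
    "create_dataset",
    "create_dataset_item",
    "delete_dataset_item",
})


def available_tools(all_tools, read_only):
    """Return the tools that stay usable (hypothetical helper, for illustration)."""
    if not read_only:
        return list(all_tools)
    # Keep the original order, drop anything that can modify prompts or datasets.
    return [tool for tool in all_tools if tool not in WRITE_TOOLS]


if __name__ == "__main__":
    tools = ["fetch_traces", "list_prompts", "create_dataset", "update_prompt_labels"]
    print(available_tools(tools, read_only=True))
```

All read tools such as fetch_traces and list_prompts remain available, so the debugging playbooks below work unchanged in read-only mode.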

For manual .mcp.json setup or troubleshooting, see references/setup.md.
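For orientation, the Claude Code command in Step 2 roughly corresponds to a project-level .mcp.json entry like the one generated below. This is a sketch under assumptions: it assumes the common `mcpServers` layout with `command`, `args`, and `env` keys, and `build_mcp_entry` is a hypothetical helper. references/setup.md remains the authority for the exact file format.

```python
import json


def build_mcp_entry(public_key, secret_key,
                    host="https://cloud.langfuse.com", read_only=False):
    """Build a dict shaped like an assumed .mcp.json file for langfuse-mcp."""
    env = {
        "LANGFUSE_PUBLIC_KEY": public_key,
        "LANGFUSE_SECRET_KEY": secret_key,
        "LANGFUSE_HOST": host,
    }
    if read_only:
        # Environment-variable form of read-only mode, as described above.
        env["LANGFUSE_MCP_READ_ONLY"] = "true"
    return {
        "mcpServers": {
            "langfuse": {
                # Mirrors the launch command from Step 2.
                "command": "uvx",
                "args": ["--python", "3.11", "langfuse-mcp"],
                "env": env,
            }
        }
    }


if __name__ == "__main__":
    # Placeholder keys, exactly as elided in the setup commands.
    print(json.dumps(build_mcp_entry("pk-...", "sk-..."), indent=2))
```

Because a project-scoped .mcp.json is shared, consider whether real keys belong in the file before committing it.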

Playbooks

"Where are the errors?"

find_exceptions(age=1440, group_by="file")

→ Shows error counts by file. Pick the worst offender.

find_exceptions_in_file(filepath="src/ai/chat.py", age=1440)

→ Lists specific exceptions. Grab a trace_id.

get_exception_details(trace_id="...")

→ Full stacktrace and context.
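Picking the worst offender from the grouped result is a simple maximum over counts. The sketch below assumes a hypothetical response shape, a list of records with `group` and `count` keys; the real schema is documented in references/tool-reference.md and may differ.

```python
def worst_offender(groups):
    """Return the group (e.g. file path) with the most exceptions, or None.

    `groups` is an assumed shape: [{"group": "<file>", "count": <int>}, ...].
    """
    if not groups:
        return None
    # max() keeps the first entry on ties, so ordering in the response is respected.
    return max(groups, key=lambda g: g["count"])["group"]


if __name__ == "__main__":
    sample = [
        {"group": "src/ai/chat.py", "count": 12},
        {"group": "src/ai/tools.py", "count": 3},
    ]
    print(worst_offender(sample))
```

The returned path is what you would pass as `filepath` to find_exceptions_in_file in the next step.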

"What happened in this interaction?"

fetch_traces(age=60, user_id="...")

→ Find the trace. Note the trace_id.

If you don't know the user_id, start with:

fetch_traces(age=60)

fetch_trace(trace_id="...", include_observations=true)

→ See all LLM calls in the trace.

fetch_observation(observation_id="...")

→ Inspect a specific generation's input/output.

"Why is it slow?"

fetch_observations(age=60, type="GENERATION")

→ Find recent LLM calls. Look for high latency.

fetch_observation(observation_id="...")

→ Check token counts, model, timing.
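"Look for high latency" can be made concrete by ranking observations by duration. This is a minimal sketch that assumes each observation carries `id`, `startTime`, and `endTime` fields as ISO 8601 strings; those field names are assumptions, so check references/tool-reference.md for the actual response schema.

```python
from datetime import datetime


def latency_seconds(observation):
    """Duration of one observation, from assumed ISO 8601 start/end timestamps."""
    start = datetime.fromisoformat(observation["startTime"])
    end = datetime.fromisoformat(observation["endTime"])
    return (end - start).total_seconds()


def slowest(observations, n=3):
    """Return the n slowest observations, slowest first."""
    return sorted(observations, key=latency_seconds, reverse=True)[:n]


if __name__ == "__main__":
    sample = [
        {"id": "obs-a", "startTime": "2024-01-01T00:00:00+00:00",
         "endTime": "2024-01-01T00:00:00.500000+00:00"},
        {"id": "obs-b", "startTime": "2024-01-01T00:00:00+00:00",
         "endTime": "2024-01-01T00:00:02.500000+00:00"},
    ]
    for obs in slowest(sample):
        print(obs["id"], latency_seconds(obs))
```

The ids at the top of the ranking are the ones worth passing to fetch_observation for token counts, model, and timing.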

"What's this user experiencing?"

get_user_sessions(user_id="...", age=1440)

→ List their sessions.

get_session_details(session_id="...")

→ See all traces in the session.

"Manage datasets"

list_datasets()

→ See all datasets.

get_dataset(name="evaluation-set-v1")

→ Get dataset details.

list_dataset_items(dataset_name="evaluation-set-v1", page=1, limit=10)

→ Browse items in the dataset.

create_dataset(name="qa-test-cases", description="QA evaluation set")

→ Create a new dataset.

create_dataset_item(
  dataset_name="qa-test-cases",
  input={"question": "What is 2+2?"},
  expected_output={"answer": "4"}
)

→ Add test cases.

create_dataset_item(
  dataset_name="qa-test-cases",
  item_id="item_123",
  input={"question": "What is 3+3?"},
  expected_output={"answer": "6"}
)

→ Upsert: updates existing item by id or creates if missing.
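The upsert behaviour described above can be modelled locally with a dictionary keyed by item id. This is only a sketch of the semantics, not the server's implementation: `upsert_item` and the generated id format are hypothetical.

```python
import uuid


def upsert_item(store, item_input, expected_output, item_id=None):
    """Create or update a dataset item in a local dict; return the item id.

    With an item_id, an existing item is overwritten or a new one is created
    under that id. Without one, a fresh (hypothetical) id is generated.
    """
    key = item_id or f"item_{uuid.uuid4().hex[:8]}"
    store[key] = {"input": item_input, "expected_output": expected_output}
    return key


if __name__ == "__main__":
    dataset = {}
    upsert_item(dataset, {"question": "What is 3+3?"}, {"answer": "5"}, item_id="item_123")
    # Same id again: the item is updated in place rather than duplicated.
    upsert_item(dataset, {"question": "What is 3+3?"}, {"answer": "6"}, item_id="item_123")
    print(len(dataset), dataset["item_123"]["expected_output"])
```

The practical consequence is that re-running a script that passes stable item_id values does not create duplicate test cases.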

"Manage prompts"

list_prompts()

→ See all prompts with labels.

get_prompt(name="...", label="production")

→ Fetch current production version.

create_text_prompt(name="...", prompt="...", labels=["staging"])

→ Create new version in staging.

update_prompt_labels(name="...", version=N, labels=["production"])

→ Promote to production. (Rollback = re-apply label to older version)
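Promotion and rollback are the same operation: moving a label to a different version. The sketch below models that locally, assuming each label points at exactly one version of a prompt; `move_label` is a hypothetical helper and the version numbers are made up for illustration.

```python
def move_label(labels, label, version):
    """Return a new label map with `label` pointing at `version`.

    `labels` maps label name -> version number (assumed one version per label).
    """
    updated = dict(labels)  # copy so the previous state stays available
    updated[label] = version
    return updated


if __name__ == "__main__":
    before = {"production": 3, "staging": 4}
    promoted = move_label(before, "production", 4)     # promote version 4
    rolled_back = move_label(promoted, "production", 3)  # rollback to version 3
    print(promoted, rolled_back)
```

Noting the version that currently holds the production label before promoting makes the rollback call trivial to construct.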

Quick Reference

Task	Tool
List traces	`fetch_traces(age=N)`
Get trace details	`fetch_trace(trace_id="...", include_observations=true)`
List LLM calls	`fetch_observations(age=N, type="GENERATION")`
Get observation	`fetch_observation(observation_id="...")`
Error count	`get_error_count(age=N)`
Find exceptions	`find_exceptions(age=N, group_by="file")`
List sessions	`fetch_sessions(age=N)`
User sessions	`get_user_sessions(user_id="...", age=N)`
List prompts	`list_prompts()`
Get prompt	`get_prompt(name="...", label="production")`
List datasets	`list_datasets()`
Get dataset	`get_dataset(name="...")`
List dataset items	`list_dataset_items(dataset_name="...", limit=N)`
Create/update dataset item	`create_dataset_item(dataset_name="...", item_id="...")`

age = minutes to look back (max 10080 = 7 days)
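Since `age` is expressed in minutes, a small conversion helper avoids arithmetic slips. The sketch below is a convenience, not part of langfuse-mcp: `age_minutes` is a hypothetical name, and clamping to the 7-day maximum is a choice made here, since how the server treats larger values is not stated in this document.

```python
MAX_AGE_MINUTES = 10080  # 7 days, the documented maximum


def age_minutes(hours=0, days=0):
    """Convert a look-back window to minutes, clamped to the 7-day maximum."""
    minutes = int(hours * 60 + days * 24 * 60)
    if minutes <= 0:
        raise ValueError("look-back window must be positive")
    return min(minutes, MAX_AGE_MINUTES)


if __name__ == "__main__":
    print(age_minutes(hours=1))   # window used in the quick test
    print(age_minutes(days=1))    # window used in the error playbook
    print(age_minutes(days=30))   # clamped to the maximum
```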

References

references/tool-reference.md — Full parameter docs, filter semantics, response schemas
references/setup.md — Manual setup, troubleshooting, advanced configuration

Source

https://github.com/avivsinai/langfuse-mcp/blob/main/skills/langfuse/SKILL.md

Overview

The Langfuse skill lets you debug AI traces and inspect sessions through Langfuse MCP. It covers tracing, exception analysis, and prompt and dataset management, along with MCP setup for both Claude Code and Codex CLI.

How This Skill Works

Provide Langfuse API keys and host, install Langfuse MCP (Claude Code or Codex CLI), then restart the CLI. Use the built-in playbooks to fetch traces, locate exceptions, inspect observations, and manage prompts or datasets, with an optional read-only mode to prevent unintended changes.

When to Use It

When you need to locate errors across traces (find_exceptions, get_exception_details).
When you want to inspect a specific trace or observation (fetch_traces, fetch_trace, fetch_observation).
When diagnosing slow responses or high latency (fetch_observations with type=GENERATION and timing details).
When reviewing a user's sessions and interactions (get_user_sessions, get_session_details).
When managing prompts, datasets, or labels in MCP (list_prompts, get_prompt, create_dataset, create_dataset_item).

Quick Start

Step 1: Get credentials from Langfuse Settings → API Keys and set LANGFUSE_HOST if self-hosted.
Step 2: Install MCP using Claude Code or Codex CLI as shown in SKILL.md (env vars for keys and host).
Step 3: Restart the CLI and test with fetch_traces(age=60) to verify setup.

Best Practices

Enable read-only mode when you only need to observe, so prompts and datasets cannot be modified by accident.
Verify LANGFUSE_HOST and API keys before running MCP commands.
Start with fetch_traces to locate trace_id, then drill into fetch_trace and fetch_observation.
Use get_session_details to understand a session's flow before making changes.
Test changes in a non-production context, and confirm the MCP server is still registered with /mcp (Claude) or codex mcp list (Codex).

Example Use Cases

Use find_exceptions(age=1440, group_by='file') to identify the worst offender by file.
Fetch traces for a user: fetch_traces(age=60, user_id='user-123') and grab trace_id for deeper inspection.
Get a full stacktrace with get_exception_details(trace_id='trace-abc').
Investigate latency with fetch_observations(age=60, type='GENERATION').
Manage prompts: list_prompts(), get_prompt(name='...', label='production').

Frequently Asked Questions
