houtini-lm
Offload tasks from Claude to your local LLM with Houtini-LM - uses an OpenAI-compatible API for LM Studio and Ollama compatibility. Save tokens by offloading grunt work from your API usage - the tool descriptions help Claude decide what work to assign and why.
claude mcp add --transport stdio houtini-lm --env LM_STUDIO_URL="http://localhost:1234" -- npx -y @houtini/lm
How to use
This MCP server, named houtini-lm, lets Claude Code delegate routine, bounded tasks to a local LLM while Claude handles planning, multi-file changes, and orchestration. The local model can generate boilerplate, code explanations, test stubs, mock data, type definitions, and format conversions. The available tools include the chat workflow for tasks, the custom_prompt and code_task configurations for structured prompts, and the discover/list_models endpoints for inspecting the local model setup. By pointing Claude at a locally hosted model server (e.g., LM Studio or any supported OpenAI-compatible API), houtini-lm offloads repetitive, well-defined work, reducing token costs and keeping sensitive data on your network. Claude can kick off the local model for specific subtasks and then fold the results back into the larger task.
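As a sketch of what such delegation boils down to on the wire, the snippet below frames a bounded task as a request to the local server's OpenAI-compatible chat completions route (which LM Studio and Ollama both expose). The model name, system prompt, and the run_task helper are illustrative assumptions, not part of houtini-lm's own API:

```python
import json
import os
import urllib.request

def build_payload(task: str, model: str = "local-model") -> dict:
    """Shape a bounded task as an OpenAI-style chat completion request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You handle small, well-defined coding chores."},
            {"role": "user", "content": task},
        ],
        "stream": False,
    }

def run_task(task: str, base_url: str = None) -> str:
    """Send one bounded task to the local OpenAI-compatible server and return the reply."""
    base = (base_url or os.environ.get("LM_STUDIO_URL", "http://localhost:1234")).rstrip("/")
    req = urllib.request.Request(
        base + "/v1/chat/completions",
        data=json.dumps(build_payload(task)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires a running local server):
# run_task("Write a TypeScript type for a user with id, name, email.")
```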
How to install
Prerequisites:
- Node.js and npm/yarn installed on your machine
- Access to a compatible local LLM server (e.g., LM Studio, Ollama, vLLM, or a locally hosted OpenAI-compatible API)
Step 1: Install the MCP package for houtini-lm
- If you already have Claude MCP tooling installed, run: claude mcp add houtini-lm -- npx -y @houtini/lm
- Alternatively, install the package globally if you manage MCPs manually: npm install -g @houtini/lm
Step 2: Run the MCP server via npx (auto-installs the package)
- The recommended setup registers the server with Claude using the provided configuration example:
claude mcp add houtini-lm -e LM_STUDIO_URL=http://localhost:1234 -- npx -y @houtini/lm
Step 3: Verify the local model endpoint
- Ensure your local LLM server is running (default LM Studio URL is http://localhost:1234).
- If using a different URL, set LM_STUDIO_URL accordingly or adjust the environment to point to the correct endpoint.
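To confirm the endpoint is reachable before wiring it into Claude, you can query the OpenAI-compatible model-listing route. This is a minimal sketch assuming the standard /v1/models route that LM Studio and similar servers expose:

```python
import json
import os
import urllib.error
import urllib.request

def models_endpoint(base_url: str) -> str:
    """Build the OpenAI-compatible model-listing URL from the base URL."""
    return base_url.rstrip("/") + "/v1/models"

def check_server(base_url: str = None) -> list:
    """Return the list of model ids the server reports, or [] if unreachable."""
    base = base_url or os.environ.get("LM_STUDIO_URL", "http://localhost:1234")
    try:
        with urllib.request.urlopen(models_endpoint(base), timeout=5) as resp:
            data = json.load(resp)
        return [m["id"] for m in data.get("data", [])]
    except (urllib.error.URLError, OSError):
        return []

# Usage: an empty list means the server is down or the URL is wrong.
# print(check_server())
```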
Step 4: Optional configuration for desktop or environment
- For desktop config, you can mirror this in claude_desktop_config.json with the same command/args and env vars.
- Example snippet:
{
  "mcpServers": {
    "houtini-lm": {
      "command": "npx",
      "args": ["-y", "@houtini/lm"],
      "env": {
        "LM_STUDIO_URL": "http://localhost:1234"
      }
    }
  }
}
Additional notes
Tips and caveats:
- Default LM_STUDIO_URL is http://localhost:1234; override with LM_STUDIO_URL if your local model is hosted elsewhere.
- This MCP server is designed for bounded tasks that don’t require deep reasoning or multi-step tool use; reserve Claude’s orchestration for higher-level planning.
- The Tools section (chat, custom_prompt, code_task, discover, list_models) helps you structure prompts and inspect the local model payloads and capabilities.
- Inference is streamed; if generation runs long, you may receive partial results marked with a TRUNCATED footer.
- If you encounter token mismatches or context window issues, adjust LM_CONTEXT_WINDOW and LM_STUDIO_MODEL env vars as needed.
- Ensure network access between Claude, the MCP runner, and your local LLM server is allowed (firewalls, localhost bindings, and port exposure).
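The environment variables mentioned in the tips above can all be set in one registration command. This is a sketch only: the context window and model name values here are placeholders for your own setup, not defaults.

```shell
claude mcp add houtini-lm \
  -e LM_STUDIO_URL=http://localhost:1234 \
  -e LM_CONTEXT_WINDOW=8192 \
  -e LM_STUDIO_MODEL=your-local-model-name \
  -- npx -y @houtini/lm
```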
Related MCP Servers
gemini-kit
🚀 19 AI Agents + 44 Commands for Gemini CLI - Code 10x faster with auto planning, testing, review & security
automagik-genie
🧞 Automagik Genie – bootstrap, update, and roll back AI agent workspaces with a single CLI + MCP toolkit.
mcp-python-interpreter
MCP Python Interpreter: run python code. Python-mcp-server, mcp-python-server, Code Executor
ollama-bridge
Extend the Ollama API with dynamic AI tool integration from multiple MCP (Model Context Protocol) servers. Fully compatible, transparent, and developer-friendly, ideal for building powerful local LLM applications, AI agents, and custom chatbots
mcpcat-python-sdk
MCPcat is an analytics platform for MCP server owners 🐱.
mcpresso
TypeScript framework to build robust, agent-ready MCP servers around your APIs.