
mcp-local-llm

MCP server for delegating mechanical tasks to local LLMs via Ollama. Claude does the thinking, your local model does the grunt work.

Installation
Run this command in your terminal to add the MCP server to Claude Code.
claude mcp add --transport stdio aplaceforallmystuff-mcp-local-llm \
  --env LOCAL_LLM_MODEL="qwen2.5-coder:7b" \
  --env LOCAL_LLM_BASE_URL="http://localhost:11434/v1" \
  --env LOCAL_LLM_MAX_TOKENS="2048" \
  --env LOCAL_LLM_TEMPERATURE="0.7" \
  -- node /path/to/mcp-local-llm/dist/index.js
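
You can confirm the registration with the Claude Code CLI (the server name must match the one used above):

claude mcp list
claude mcp get aplaceforallmystuff-mcp-local-llm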

How to use

mcp-local-llm provides a local, cost-optimized layer for delegating mechanical or bulk text tasks to a local large language model backend. Claude Code remains the decision-maker for what to delegate, while the local model handles high-volume work such as summarization, classification, extraction, and drafting. The server exposes the following tools, callable from Claude Code or any MCP-compatible client:

  • local_summarize: bulk text summarization
  • local_draft: initial content generation
  • local_classify: tagging and sorting
  • local_extract: structured data extraction
  • local_transform: formatting and style changes
  • local_complete: raw completions
  • local_status: verify connectivity and available models

This separation helps reduce API usage costs while keeping Claude in charge of quality control and decision-making.
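
In practice you usually just ask Claude Code to delegate and it picks the tool, but at the protocol level each tool is invoked with a standard MCP tools/call request. The sketch below assumes the argument names (text, instructions) for illustration only; the actual input schema is defined by the server and may differ:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "local_summarize",
    "arguments": {
      "text": "<long document to summarize>",
      "instructions": "Condense to five bullet points"
    }
  }
}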

How to install

Prerequisites:

  1. Ollama installed and running (local LLM backend).
  2. Node.js 18+ installed.
  3. Claude Code (or any MCP-compatible client).

Installation steps:

  1. Install Ollama and pull a model (example):
    • brew install ollama
    • ollama serve
    • ollama pull qwen2.5-coder:7b
  2. Clone the repository and install dependencies:
    • git clone <repository URL>
    • cd mcp-local-llm && npm install
  3. Build the project:
    • npm run build
  4. Optionally run the server by hand as a smoke test (when added over stdio, Claude Code launches it automatically):
    • node dist/index.js
  5. Add the MCP server to Claude Code or your MCP client:
    • In Claude Code: claude mcp add local-llm -s user -- node /path/to/mcp-local-llm/dist/index.js
    • Or edit ~/.claude.json to include:
      {
        "mcpServers": {
          "local-llm": {
            "command": "node",
            "args": ["/path/to/mcp-local-llm/dist/index.js"]
          }
        }
      }
  6. Verify connectivity:
    • Use the local_status tool in Claude Code to confirm Ollama connection and available models.
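
If local_status reports a problem, a few quick checks from the terminal usually narrow it down (the port assumes the default Ollama setup; /v1/models is Ollama's OpenAI-compatible model listing):

ollama list
curl http://localhost:11434/v1/models
claude mcp list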

Additional notes

Environment variables are optional; the defaults work with a standard Ollama setup. If you prefer a Docker-based OpenAI-compatible backend, you can use the Docker Model Runner and point LOCAL_LLM_BASE_URL at its TCP endpoint. Common issues:

  • Ollama not running: check with ollama list.
  • Model mismatch: LOCAL_LLM_MODEL must match a model you have pulled.
  • Output length or style: adjust LOCAL_LLM_MAX_TOKENS and LOCAL_LLM_TEMPERATURE to fit your workload.

The Delegation Philosophy section explains which tasks are best suited for local delegation versus Claude's core capabilities.
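
If you need to change these settings after installation, the env block of the server entry in ~/.claude.json is a reasonable place for them. A minimal sketch, assuming the manual-config server name from the steps above; the values mirror the install command rather than being required:

{
  "mcpServers": {
    "local-llm": {
      "command": "node",
      "args": ["/path/to/mcp-local-llm/dist/index.js"],
      "env": {
        "LOCAL_LLM_MODEL": "qwen2.5-coder:7b",
        "LOCAL_LLM_BASE_URL": "http://localhost:11434/v1",
        "LOCAL_LLM_MAX_TOKENS": "2048",
        "LOCAL_LLM_TEMPERATURE": "0.7"
      }
    }
  }
}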
