
llama

An MCP server that lets Claude use llama.cpp. Talks to llama-server over its HTTP API.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio openconstruct-llama-mcp-server node /home/jerr/llama-mcp/dist/index.js

How to use

This MCP server acts as a bridge between Claude Desktop and your local LLM instance exposed via llama-server. It enables LibreModel-style conversations to flow through the MCP protocol, giving you full control over model parameters such as temperature, max_tokens, top_p, and top_k, while also providing health checks, testing tools, and usage metrics.

To use it, ensure llama-server is running and that Claude Desktop is configured to connect to the MCP server (see the installation steps below). The server exposes three tools: chat for ongoing conversations, quick_test for quick capability checks, and health_check for monitoring status. Once configured, Claude Desktop sends MCP messages to the server, which translates them into llama-server API calls and returns the responses back to Claude Desktop.
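As a sketch of that translation, a chat tool call with those parameters would presumably be turned into a request like the following against llama-server's OpenAI-compatible endpoint (the endpoint path and parameter values here are assumptions, not taken from the server's source; adjust to your setup):

```shell
# Hypothetical payload the MCP server might build from a chat tool call.
payload='{"messages":[{"role":"user","content":"Hello"}],"temperature":0.7,"max_tokens":128,"top_p":0.9,"top_k":40}'

# Sanity-check that the payload is well-formed JSON.
echo "$payload" | python3 -m json.tool

# With llama-server running locally, the bridge would POST it roughly like:
# curl -s http://localhost:8080/v1/chat/completions \
#   -H 'Content-Type: application/json' -d "$payload"
```

The sampling parameters (temperature, max_tokens, top_p, top_k) map one-to-one onto llama-server's request fields, which is what makes the pass-through control possible.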

How to install

Prerequisites:

  • Node.js installed (recommended LTS version)
  • Access to a running llama-server instance (with your model loaded)
  • Claude Desktop configured to point at the MCP server (per environment setup)
  1. Install the MCP server package
  • Run in your project directory:
npm install @openconstruct/llama-mcp-server
  2. Build or prepare the server (if a build step is required by the package)
npm run build
  3. Run the llama MCP server
  • Start the server (adjust the path as needed to your built index):
node dist/index.js
  4. Verify llama-server is up and accessible at the configured URL
  • Ensure llama-server is running with your model loaded, for example:
./llama-server -m lm37.gguf -c 2048 --port 8080
  5. Configure Claude Desktop to use the MCP server
  • Edit ~/.config/claude/claude_desktop_config.json and ensure you have:
{
  "mcpServers": {
    "libremodel": {
      "command": "node",
      "args": ["/home/jerr/llama-mcp/dist/index.js"]
    }
  }
}
  6. Restart Claude Desktop and test the connection using the built-in tools (chat, quick_test, health_check).

Additional notes


  • Environment variable: LLAMA_SERVER_URL should point to your llama-server API (default http://localhost:8080).
  • If you see "Cannot reach LLama server", verify that llama-server is running and that the port matches LLAMA_SERVER_URL. Check firewall rules.
  • After changing configuration, restart Claude Desktop to ensure MCP server changes are picked up.
  • The npm package name to install is @openconstruct/llama-mcp-server. The server in this setup is configured under the libremodel key; you can rename the MCP server entry if you want multiple instances.
  • For troubleshooting, ensure the dist/index.js path used in the Node command is the correct built artifact and that permissions allow execution.
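If llama-server runs on a non-default host or port, you can set LLAMA_SERVER_URL per MCP server entry via the env field of the Claude Desktop config (a sketch; the port value here is a made-up example, and this assumes the server reads the variable at startup):

```json
{
  "mcpServers": {
    "libremodel": {
      "command": "node",
      "args": ["/home/jerr/llama-mcp/dist/index.js"],
      "env": { "LLAMA_SERVER_URL": "http://localhost:9090" }
    }
  }
}
```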
