mcp-litellm
MCP server from OpenCnid/mcp-server-litellm
claude mcp add --transport stdio opencnid-mcp-server-litellm python -m mcp_server_litellm \
  --env LITELLM_CACHE="Path to cache directory for LiteLLM (optional)" \
  --env LITELLM_MODEL="Model to use with LiteLLM (e.g., 'gpt-3.5-turbo')" \
  --env OPENAI_API_KEY="OpenAI API key if using OpenAI models via LiteLLM"
How to use
This MCP server routes text completion requests through LiteLLM, which can dispatch them to OpenAI models when configured. Once started, the litellm MCP endpoint accepts completion requests from clients, with LiteLLM managing model selection, batching, and rate limits behind the scenes. Typical usage is to send a completion prompt to the MCP service and receive a structured completion response. If you provide an OpenAI API key, LiteLLM can proxy requests to OpenAI models, enabling capabilities such as streaming tokens and system prompts. Ensure the environment variables are set for your desired model and API access before starting the server.
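As a rough sketch of the request shape, a completion request could be assembled from the same environment variables the server reads. The helper name and its default model are assumptions for illustration, not part of the package:

```python
import os

def build_completion_request(prompt: str) -> dict:
    """Hypothetical helper: assemble keyword arguments for a
    litellm.completion() call, reading LITELLM_MODEL from the
    environment and falling back to a common default."""
    model = os.environ.get("LITELLM_MODEL", "gpt-3.5-turbo")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Example (requires OPENAI_API_KEY when targeting OpenAI models):
# import litellm
# response = litellm.completion(**build_completion_request("Say hello"))
```

The request uses the OpenAI-style `messages` list that LiteLLM accepts across backends, so the same shape works whether the configured model is an OpenAI model or an alternative provider.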
How to install
Prerequisites:
- Python 3.8+ installed on your system
- Pip available in your environment
Install the MCP server package:
pip install mcp-server-litellm
Optionally create and activate a virtual environment:
python -m venv venv
# Windows
venv\Scripts\activate.bat
# Unix/macOS
source venv/bin/activate
Configure and run the MCP server (see mcp_config for options):
# Example using the default module invocation via -m
python -m mcp_server_litellm
Ensure required environment variables are set (see mcp_config).
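For example, a minimal startup on Unix/macOS might look like the following (the model name is an example and the key value is a placeholder to substitute with your own):

```shell
# Placeholders — set your own model and API key before starting
export LITELLM_MODEL="gpt-3.5-turbo"
export OPENAI_API_KEY="<your-openai-api-key>"
python -m mcp_server_litellm
```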
Additional notes
- If you intend to use OpenAI models, you must provide OPENAI_API_KEY in the environment. LiteLLM can route requests to OpenAI or operate with local/alternative backends depending on configuration.
- Set LITELLM_MODEL to select the desired model (e.g., a specific GPT-3.5/4 variant) if supported by your LiteLLM installation.
- Monitor resource usage (CPU/GPU, memory) when using large models and enable batching if supported for throughput.
- If you encounter authentication or connectivity issues with the OpenAI API, verify API keys and network access from the host running the MCP server.
- Use the MCP tooling to validate prompts, tune parameters (temperature, max_tokens), and handle streaming responses if supported by LiteLLM.
- Ensure you are using compatible versions of LiteLLM, the MCP server package, and Python dependencies.
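If streaming is supported by your configuration, OpenAI-style responses arrive as incremental chunks carrying text deltas. A minimal sketch of collecting them — assuming plain-dict chunks in the OpenAI delta format for illustration (actual response objects may use attribute access instead):

```python
def consume_stream(chunks) -> str:
    """Sketch: concatenate text deltas from an OpenAI-style streaming
    response, here modeled as plain dicts for illustration."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            parts.append(delta)
    return "".join(parts)

# Fake chunks standing in for a real streaming response:
fake = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
    {"choices": [{"delta": {}}]},  # final chunk often carries no content
]
print(consume_stream(fake))  # → Hello
```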
Related MCP Servers
PowerShell.MCP
The universal MCP server for Claude Code and other MCP-compatible clients. One installation gives AI access to 10,000+ PowerShell modules and any CLI tool. You and AI collaborate in the same console with full transparency. Supports Windows, Linux, and macOS.
mcp-updater
Automatically analyze and update Model Context Protocol (MCP) servers for Claude Desktop
fabric-claude-extension
Microsoft Fabric MCP Server - Claude Desktop Extension
gemini-manager
A PowerShell script to manage Model Context Protocol (MCP) servers for the Gemini CLI. It allows adding, removing, enabling/disabling, and listing server configurations.
McProminenceServer
MCP server from JohnnieW4lker/McProminenceServer