
lemonade

Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our Discord: https://discord.gg/5xXzkMu8Zk

Installation
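Run this command in your terminal to add the MCP server to Claude Code. The LEMONADE_CONFIG value is a placeholder: replace it with the path to your config file, or drop the --env line (and the trailing backslash before it) if you do not need one.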
claude mcp add --transport stdio lemonade-sdk-lemonade python -m lemonade_server run Gemma-3-4b-it-GGUF \
  --env LEMONADE_CONFIG="path/to/config.json (optional)"

How to use

Lemonade is a local inference platform that hosts and serves optimized LLMs, image generation, and speech generation directly on your machine. Its CLI provides commands to run pre-bundled models, browse and download models, and manage the available backends:

  • run starts a specific model
  • list shows the available models
  • pull downloads a model
  • recipes inspects your local backends

A typical workflow is to start the server, which provides a web UI and API endpoints, list the available models, and pull the ones you want for offline use. From there you can chat with Gemma or generate images and speech through the built-in interfaces. This makes it easy to experiment with local, GPU-accelerated inference and to integrate Lemonade into other tools and apps.
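As a rough illustration of driving the CLI from another tool, the sketch below builds the argument list for the subcommands named above (run, list, pull, recipes) and hands it to subprocess. The helper names build_argv and run_cli are hypothetical, and the sketch assumes the lemonade-server executable is on your PATH. It does not cover any flags the CLI may accept.

```python
import shutil
import subprocess

# Subcommands named in the description above; anything else is rejected.
KNOWN_SUBCOMMANDS = {"run", "list", "pull", "recipes"}


def build_argv(subcommand, *args, executable="lemonade-server"):
    """Return the argv list for a Lemonade CLI call (hypothetical helper)."""
    if subcommand not in KNOWN_SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand!r}")
    return [executable, subcommand, *args]


def run_cli(subcommand, *args):
    """Run the CLI and raise if it is missing or exits with an error."""
    argv = build_argv(subcommand, *args)
    # Assumption: the executable is installed and discoverable on PATH.
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} was not found on PATH")
    return subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Show the command that would be executed, without running it.
    print(" ".join(build_argv("pull", "Gemma-3-4b-it-GGUF")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with model IDs. Note that run_cli("run", ...) would block for as long as the server is running.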

How to install

Prerequisites:

  • A supported OS (Linux, Windows, macOS) and a Python environment, version 3.10 through 3.13
  • Optional: GPU support drivers if you plan to use GPU acceleration
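Before installing, it can help to confirm that the active interpreter falls inside the supported range. The sketch below is a minimal check under the assumption that the 3.10 through 3.13 range listed above is inclusive at both ends. The function name python_is_supported is hypothetical.

```python
import sys

# Supported range taken from the prerequisites above (assumed inclusive).
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 13)


def python_is_supported(version=None):
    """Return True if (major, minor) lies between 3.10 and 3.13 inclusive."""
    # Only major and minor matter; the patch level is ignored.
    major_minor = tuple(version or sys.version_info)[:2]
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {current} supported: {python_is_supported()}")
```

Run it with the interpreter you intend to use for the virtual environment, since a system may have several Python versions installed.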

Install steps:

  1. Install Python and pip (if not already installed).
  2. Create a virtual environment (recommended):
    • python -m venv lemonade-venv
    • source lemonade-venv/bin/activate (Linux/macOS)
    • lemonade-venv\Scripts\activate (Windows)
  3. Install the Lemonade server package (example using pipx or pip):
    • Using pip: pip install lemonade-server
    • Or using pipx (isolated env): pipx install lemonade-server
  4. Verify installation:
    • lemonade-server --version
  5. Run Lemonade with a model (example from README):
    • lemonade-server run Gemma-3-4b-it-GGUF
  6. Optional: configure environment variables or supply a config file as needed for your environment.

Additional notes

Tips and notes:

  • The Lemonade CLI can pull models locally via lemonade-server pull <model-id> and list models with lemonade-server list.
  • For offline or repeatable environments, consider using Model Manager to download and cache models ahead of time.
  • If GPU acceleration is used, ensure the appropriate drivers and CUDA toolkit are installed and compatible with your hardware.
  • Environment variables (such as LEMONADE_CONFIG) can be used to customize paths, model caches, or backend settings. Review the docs for available options.
  • If you encounter port or binding issues, check that no other process is occupying the required port and that firewall rules permit local access.
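For the port check mentioned in the last tip, a small probe like the one below can tell you whether something is already bound to a given local port. This is only a sketch: the port number 8000 in the example is an assumption, so substitute whatever port your server is configured to use. The function name port_is_free is hypothetical.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if host:port can be bound, i.e. no other process holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Binding failed: the port is in use or not permitted.
            return False
    return True


if __name__ == "__main__":
    port = 8000  # assumption: replace with the port your server uses
    state = "free" if port_is_free(port) else "in use"
    print(f"Port {port} is {state}")
```

The probe only covers the port conflict. It releases the socket immediately, so it will not interfere with a server started afterwards, and firewall rules still need to be checked separately.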
