
open-responses

Wraps any OpenAI-compatible API as a Responses API with MCP support, so it works with Codex. It adds any missing stateful features and is compatible with Ollama and vLLM.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio teabranch-open-responses-server uvx open-responses-server

How to use

open-responses-server is a plug-and-play MCP server that bridges any AI backend (such as Ollama, vLLM, Groq, or OpenAI itself) to OpenAI’s Responses API and ChatCompletions interface. It handles stateful chat, tool calls, and can be extended to support file search, code interpretation, and other hosted tools behind a familiar OpenAI API. This lets you run Codex-like or other OpenAI API clients against your own models or backends, while maintaining compatibility with the OpenAI API surface.
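In practice, a client that would normally target api.openai.com simply swaps its base URL for this adapter. A minimal standard-library sketch of posting to the Responses endpoint (the port and mock API key follow the sample configuration shown later; the model name "llama3" is a placeholder for whatever your backend serves):

```python
import json
import urllib.request

# Hypothetical local endpoints; adjust to your deployment.
ADAPTER_URL = "http://localhost:8080"
API_KEY = "sk-mockapikey123456789"  # mock key, as in the sample config

payload = {
    "model": "llama3",               # placeholder: any model your backend serves
    "input": "Say hello in one word.",
}
req = urllib.request.Request(
    f"{ADAPTER_URL}/v1/responses",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# With the adapter and a backend running, urllib.request.urlopen(req)
# would return the Responses API JSON.
```

The same swap works for SDKs that accept a base-URL override, which is what lets Codex-style clients talk to self-hosted models.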

To use it, first ensure your backend is reachable at OPENAI_BASE_URL_INTERNAL and that this server exposes its own API at OPENAI_BASE_URL. You’ll typically provide a mock or real API key via OPENAI_API_KEY. The MCP layer (configured via MCP_SERVERS_CONFIG_PATH) routes incoming requests to your chosen backend, enabling you to test and deploy OpenAI-compatible clients against self-hosted models. You can customize logging, host binding, and ports using the provided environment variables, and you can configure the MCP servers to point at different backends as needed.
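The file referenced by MCP_SERVERS_CONFIG_PATH lists the MCP servers the adapter should route tool calls to. The exact schema is defined by the project, but a hypothetical sketch following the common MCP servers-config shape (server name, launch command, and arguments are all placeholders) might look like:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/data"]
    }
  }
}
```

Check the project README for the authoritative format before deploying.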

How to install

Prerequisites:

  • Python 3.8+ and pip
  • uv (Python package and environment manager) or your preferred Python environment
  • Access to PyPI or the repository source for open-responses-server

Install from PyPI:

pip install open-responses-server

Install from source (development workflow):

pip install uv
uv venv
uv pip install .
uv pip install -e ".[dev]"  # dev dependencies

Run the server (examples shown in the README):

# Using CLI tool (after installation)
otc start

# Or directly from source
uv run src/open_responses_server/cli.py start

Docker deployment (example):

docker run -p 8080:8080 \
  -e OPENAI_BASE_URL_INTERNAL=http://your-llm-api:8000 \
  -e OPENAI_BASE_URL=http://localhost:8080 \
  -e OPENAI_API_KEY=your-api-key \
  ghcr.io/teabranch/open-responses-server:latest

Configure and verify:

OPENAI_BASE_URL_INTERNAL=http://localhost:11434
OPENAI_BASE_URL=http://localhost:8080
OPENAI_API_KEY=sk-mockapikey123456789
MCP_SERVERS_CONFIG_PATH=./mcps.json
API_ADAPTER_HOST=0.0.0.0
API_ADAPTER_PORT=8080

Verify:

curl http://localhost:8080/v1/models
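The same check can be scripted with only the Python standard library; the endpoint path and bearer header follow the usual OpenAI conventions, and the values below mirror the sample configuration above (adjust them for your deployment):

```python
import urllib.request

# Values from the sample configuration above; adjust for your deployment.
BASE_URL = "http://localhost:8080"
API_KEY = "sk-mockapikey123456789"

def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request used to verify the adapter is up."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = build_models_request(BASE_URL, API_KEY)
print(req.full_url)
```

Opening the request with urllib.request.urlopen(req) should return a JSON model list once the server is running.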

Additional notes

Tips and notes:

  • This project acts as a bridge layer, so you can point different backends (Ollama, vLLM, Groq, etc.) behind the OpenAI API surface.
  • The MCP_SERVERS_CONFIG_PATH environment variable controls where the server reads its MCP configuration; you can define multiple endpoints backing different backends.
  • If using Docker, ensure the internal base URL (OPENAI_BASE_URL_INTERNAL) points to your LLM runner and OPENAI_BASE_URL points to this adapter.
  • Logs can be controlled with LOG_LEVEL and LOG_FILE_PATH.
  • For development, the server supports an interactive configuration via otc configure (as noted in the README).
  • This project is Python-based; no npm package is required.
