open-responses
Wraps any OpenAI-compatible API as the Responses API with MCP support, so it works with Codex. Adds any missing stateful features. Compatible with Ollama and vLLM.
To add it to Claude Code as an MCP server:
claude mcp add --transport stdio teabranch-open-responses-server uvx open-responses-server
How to use
open-responses-server is a plug-and-play MCP server that bridges any AI backend (such as Ollama, vLLM, Groq, or OpenAI itself) to OpenAI's Responses API and Chat Completions interface. It handles stateful chat and tool calls, and can be extended to support file search, code interpretation, and other hosted tools behind a familiar OpenAI API. This lets you run Codex-like or other OpenAI API clients against your own models or backends while maintaining compatibility with the OpenAI API surface.
To use it, first ensure your backend is reachable at OPENAI_BASE_URL_INTERNAL and that this server exposes its own API at OPENAI_BASE_URL. You’ll typically provide a mock or real API key via OPENAI_API_KEY. The MCP layer (configured via MCP_SERVERS_CONFIG_PATH) routes incoming requests to your chosen backend, enabling you to test and deploy OpenAI-compatible clients against self-hosted models. You can customize logging, host binding, and ports using the provided environment variables, and you can configure the MCP servers to point at different backends as needed.
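As an illustration, the file referenced by MCP_SERVERS_CONFIG_PATH might look like the sketch below. The exact schema is not documented in this README, so the `mcpServers` key, server name, and fields are assumptions based on common MCP configuration conventions:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
    }
  }
}
```

Each entry names an MCP server and the command used to launch it; the adapter can then route tool calls from incoming requests to these servers.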
How to install
Prerequisites:
- Python 3.8+ and pip
- uv (Python package and environment manager) or your preferred Python environment tooling
- Access to PyPI or the repository source for open-responses-server
Install from PyPI:
pip install open-responses-server
Install from source (development workflow):
pip install uv
uv venv
uv pip install .
uv pip install -e ".[dev]" # dev dependencies
Run the server (examples shown in the README):
# Using CLI tool (after installation)
otc start
# Or directly from source
uv run src/open_responses_server/cli.py start
Docker deployment (example):
docker run -p 8080:8080 \
-e OPENAI_BASE_URL_INTERNAL=http://your-llm-api:8000 \
-e OPENAI_BASE_URL=http://localhost:8080 \
-e OPENAI_API_KEY=your-api-key \
ghcr.io/teabranch/open-responses-server:latest
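For a reproducible setup, the docker run command above can be expressed as a docker-compose sketch. This pairs the adapter with an Ollama backend; the service names and the ollama image are illustrative assumptions, not part of this project:

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
  open-responses:
    image: ghcr.io/teabranch/open-responses-server:latest
    ports:
      - "8080:8080"
    environment:
      # Internal URL points at the backend; external URL is this adapter.
      OPENAI_BASE_URL_INTERNAL: http://ollama:11434
      OPENAI_BASE_URL: http://localhost:8080
      OPENAI_API_KEY: sk-mockapikey123456789
    depends_on:
      - ollama
```

Note that inside the compose network the backend is reachable by its service name (http://ollama:11434), not localhost.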
Configure and verify:
OPENAI_BASE_URL_INTERNAL=http://localhost:11434
OPENAI_BASE_URL=http://localhost:8080
OPENAI_API_KEY=sk-mockapikey123456789
MCP_SERVERS_CONFIG_PATH=./mcps.json
API_ADAPTER_HOST=0.0.0.0
API_ADAPTER_PORT=8080
Verify:
curl http://localhost:8080/v1/models
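Once the models endpoint responds, any OpenAI-compatible client can point at the adapter. A minimal Python sketch using only the standard library (the model name and prompt are placeholders; this assumes the adapter exposes the standard /v1/chat/completions route):

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, prompt):
    """Build a standard OpenAI-style chat completions request
    aimed at the adapter's base URL."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

# Example (requires the adapter running locally):
# req = build_chat_request("http://localhost:8080",
#                          "sk-mockapikey123456789", "llama3", "Say hello")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same base URL and mock API key work for any SDK that lets you override the OpenAI endpoint.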
Additional notes
Tips and notes:
- This project acts as a bridge layer, so you can point different backends (Ollama, vLLM, Groq, etc.) behind the OpenAI API surface.
- The MCP_SERVERS_CONFIG_PATH environment variable controls where the server reads its MCP configuration; you can define multiple endpoints backing different backends.
- If using Docker, ensure the internal base URL (OPENAI_BASE_URL_INTERNAL) points to your LLM runner and OPENAI_BASE_URL points to this adapter.
- Logs can be controlled with LOG_LEVEL and LOG_FILE_PATH.
- For development, the server supports an interactive configuration via otc configure (as noted in the README).
- This project is Python-based; there is no npm package requirement.
Related MCP Servers
MCP-Bridge
A middleware that provides an OpenAI-compatible endpoint capable of calling MCP tools.
mcp-client-for-ollama
A text-based user interface (TUI) client for interacting with MCP servers using Ollama. Features include agent mode, multi-server, model switching, streaming responses, tool management, human-in-the-loop, thinking mode, model params config, MCP prompts, custom system prompt and saved preferences. Built for developers working with local LLMs.
mcp-gateway
A plugin-based gateway that orchestrates other MCPs and lets developers build enterprise-grade agents on top of it.
mcp-client-go
An MCP client for Go (Golang) that integrates multiple Model Context Protocol (MCP) servers.
zerodha
Zerodha MCP Server & Client - AI Agent (w/Agno & w/Google ADK)
openai-agent-dotnet
Sample to create an AI Agent using OpenAI models with any MCP server running on Azure Container Apps