FastAPI-BitNet

Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker.

Installation
Run this command in your terminal to add the MCP server to Claude Code:

claude mcp add --transport stdio grctest-fastapi-bitnet docker run -d --name ai_container -p 8080:8080 fastapi_bitnet

How to use

FastAPI-BitNet provides a REST API for managing and interacting with llama.cpp-based BitNet model instances. The server exposes endpoints to start, stop, and monitor multiple persistent BitNet sessions, run batch operations, and send interactive prompts to running models. It also supports model benchmarking, resource-capacity estimation, and integration with VS Code Copilot via the MCP protocol.

Once the server is up, you can explore and test the API through the auto-generated docs at /docs (Swagger UI) and /redoc (ReDoc). The API is designed to let you programmatically launch and control BitNet instances, issue prompts, collect cleaned responses, and orchestrate multiple sessions in parallel for benchmarking or automated testing workflows.
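As a rough illustration of programmatic use, here is a minimal Python sketch using only the standard library. The /prompt route and the session_id/prompt payload fields are assumptions for illustration, not confirmed routes; check the Swagger UI at /docs for the actual endpoints and schemas.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # adjust if you mapped a different host port


def build_prompt_request(session_id: str, prompt: str) -> urllib.request.Request:
    """Build a JSON POST request for a hypothetical /prompt endpoint.

    The path and payload fields here are placeholders; consult /docs
    for the real route names and request schemas.
    """
    body = json.dumps({"session_id": session_id, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With a running server you would pass the built request to send() and inspect the returned dictionary; looping over several session IDs is one way to drive parallel sessions for benchmarking.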

How to install

Prerequisites:

  • Docker Desktop installed and running
  • Optional Python environment for local development (Python 3.10+)
  • Git to clone the repository (if you are starting from source)

Option A: Run with Docker (recommended)

  1. Build the Docker image (if you have a Dockerfile in the repo):
docker build -t fastapi_bitnet .
  2. Run the container (maps port 8080 from the container to the host):
docker run -d --name ai_container -p 8080:8080 fastapi_bitnet
  3. Verify the server is running by visiting http://localhost:8080/docs or http://localhost:8080/redoc
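If you prefer to verify from a script rather than the browser, a small Python check like the following works with only the standard library; it assumes the default port mapping shown above.

```python
import urllib.error
import urllib.request


def docs_url(host: str = "localhost", port: int = 8080) -> str:
    """URL of the auto-generated Swagger UI for a given host/port mapping."""
    return f"http://{host}:{port}/docs"


def server_is_up(host: str = "localhost", port: int = 8080, timeout: float = 3.0) -> bool:
    """Return True if the server answers on /docs, False otherwise."""
    try:
        with urllib.request.urlopen(docs_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If you remapped the host port (for example -p 8081:8080), pass port=8081 to both helpers.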

Option B: Run locally with Uvicorn (for development only)

  1. Create and activate a Python environment (optional but recommended):
conda create -n bitnet python=3.11
conda activate bitnet
  2. Install dependencies (adjust to your project setup):
pip install fastapi uvicorn pydantic
  3. Run the app directly (adjust the module path as needed):
uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload
  4. Access the API docs at http://127.0.0.1:8080/docs

Note: The repository may provide a Dockerfile or a Python package entrypoint. If your setup uses a different entrypoint or startup script, adapt the commands accordingly.

Additional notes

Tips and caveats:

  • The API exposes multiple endpoints to manage BitNet sessions (start, stop, status) and to perform batch operations. Use the API docs to discover exact routes and payload schemas.
  • When running in Docker, ensure that the container has access to the model files and any required GPU or CPU resources as configured in your host environment.
  • If you modify the model directory or need to point to a specific BitNet model, adjust the server configuration or environment variables as documented in the repository.
  • Typical environment variables you might encounter include PORT, MODEL_PATH, and paths to any CLI tools required by llama.cpp/llama-cli. If they are not required, you can omit them or set placeholders until you configure the actual paths.
  • For VS Code Copilot integration, ensure the MCP endpoint is reachable at http://<host>:8080/mcp and that the server is exposing the appropriate API surface.
  • If you run into port conflicts, change the host port mapping (e.g., -p 8081:8080) and access the UI via http://localhost:8081/docs.
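One way to handle the environment variables mentioned above is to read them once at startup with sensible fallbacks. The variable names PORT and MODEL_PATH come from the notes above; the default values in this sketch are placeholders, not values shipped by the project.

```python
import os


def load_config() -> dict:
    """Read server configuration from the environment.

    PORT and MODEL_PATH are the variable names mentioned in the notes;
    the defaults below are illustrative placeholders only.
    """
    return {
        "port": int(os.environ.get("PORT", "8080")),
        "model_path": os.environ.get("MODEL_PATH", "models/bitnet.gguf"),  # placeholder path
    }
```

When running in Docker, the same variables can be supplied with -e flags, e.g. docker run -e PORT=8080 -e MODEL_PATH=/models/... (substituting your real model path).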
