
facetron

FaceTron is a high-performance face embedding server using ONNX Runtime, supporting dynamic multi-model loading, offline deployment, and scalable environments. It exposes an OpenAPI endpoint with MCP-compatible metadata and integrates with OpenTelemetry for observability.

Installation
Run this command in your terminal to add the MCP server to Claude Code:
claude mcp add --transport stdio 13shivam-facetron python main.py \
  --env LOG_LEVEL="info" \
  --env MODEL_DIR="./facetron/models" \
  --env DISABLE_OTEL="false" \
  --env OTEL_EXPORTER_OTLP_ENDPOINT="http://host.docker.internal:4317"

How to use

FaceTron is a FastAPI-based MCP server that serves face embedding models from ONNX files. It supports dynamic multi-model loading, so multiple ONNX models (e.g., ArcFace, SCRFD, Glint360K) can be loaded and queried through a single API surface. Key endpoints:

  • GET /models — list loaded models
  • POST /infer — compute embeddings for detected faces in an image
  • POST /infer_visualize — return an image annotated with bounding boxes and aligned faces
  • GET /download — retrieve the annotated image
  • GET /openapi.json — fetch the MCP-compatible metadata specification

The server also includes built-in OpenTelemetry tracing (configurable via environment variables) and exposes MCP metadata in its OpenAPI spec for agent integration. To get started, run the server, then use the tester script or curl commands to perform inference and visualize results against the loaded models.
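As a quick illustration of hitting the inference endpoint from Python, the sketch below builds (but does not send) a POST request to /infer using only the standard library. The base URL, port, and JSON payload shape are assumptions for illustration; consult GET /openapi.json on your running deployment for the actual request contract.

```python
import json
import urllib.request

# Assumed local deployment address; adjust to your server.
BASE_URL = "http://localhost:8000"

def build_infer_request(image_path: str, model: str) -> urllib.request.Request:
    """Construct a POST /infer request; payload fields are hypothetical."""
    payload = json.dumps({"model": model, "image_path": image_path}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/infer",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_infer_request("face.jpg", "arcface")
print(req.full_url)  # http://localhost:8000/infer
```

To actually perform inference, pass the request to `urllib.request.urlopen(req)` with the server running.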

How to install

Prerequisites:

  • Python 3.9 or later
  • pip (Python package manager)
  • Optional: Docker and docker-compose for containerized runs

Install and run (local development):

  1. Clone the repository and navigate into it:
     git clone https://github.com/13shivam/facetron.git
     cd facetron
  2. Install dependencies:
     pip install -r requirements.txt
  3. Ensure ONNX models are present:
     Place your ONNX models under the models/ directory. Each model should expose a wrapper interface compatible with get_embedding(np.ndarray) -> np.ndarray.
  4. Run the server:
     python main.py
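The wrapper interface mentioned in step 3 can be sketched as below. This is a stand-in to show the get_embedding contract only; the class name and internals are illustrative, and a real wrapper would run an onnxruntime InferenceSession instead of the dummy math here.

```python
import numpy as np

class DummyModelWrapper:
    """Illustrative stand-in for an ONNX-backed model wrapper."""

    def __init__(self, dim: int = 512):
        self.dim = dim  # embedding dimensionality (512 is typical for ArcFace)

    def get_embedding(self, face: np.ndarray) -> np.ndarray:
        # A real wrapper would preprocess `face`, run the ONNX session,
        # and L2-normalize the output. Here we just produce a fixed-size vector.
        flat = face.astype(np.float32).ravel()
        vec = np.resize(flat, self.dim)
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec

wrapper = DummyModelWrapper()
emb = wrapper.get_embedding(np.zeros((112, 112, 3)))
print(emb.shape)  # (512,)
```

Any object placed in the registry only needs to satisfy this one-method contract, which is what lets heterogeneous models (ArcFace, SCRFD, Glint360K) sit behind one API surface.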

Optional containerized run (docker-compose):

# Ensure .env is configured as needed, then start
docker-compose up -d
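A .env file for the compose run might look like the following. These keys come from the variables documented in the notes below; the values shown are the documented defaults, not a verified sample from the repository.

```ini
# Hypothetical .env sketch for docker-compose
LOG_LEVEL=info
MODEL_DIR=./facetron/models
DISABLE_OTEL=false
OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4317
```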

Once the server is running, access the interactive API docs and test endpoints via FastAPI's built-in Swagger UI at the /docs path.

Additional notes


  • Environment variables: MODEL_DIR, LOG_LEVEL, DISABLE_OTEL, and OTEL_EXPORTER_OTLP_ENDPOINT control model loading, logging, and telemetry. Set DISABLE_OTEL=true to disable tracing.
  • If you replace or mount a custom models directory, ensure models implement a wrapper exposing get_embedding(np.ndarray) -> np.ndarray so the registry can call into them.
  • To load multiple models, place them in the models/ directory and access them via the /models API endpoint.
  • If using Docker, ensure the volume mapping includes your models directory or rebuild the image with updated models.
  • OpenTelemetry integration exports traces to OTLP endpoints (Jaeger-compatible). Verify your OTEL_EXPORTER_OTLP_ENDPOINT and firewall settings.
  • The API supports downloading annotated images via /download and returns visualized output (bounding boxes and aligned faces) from /infer_visualize for quick validation.
  • When upgrading Python dependencies, re-run pip install -r requirements.txt to refresh packages.
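The environment-variable handling described above could be read in-process along these lines. This is a hedged sketch of the configuration pattern, not FaceTron's literal code; the defaults mirror the values documented in this page.

```python
import os

# Assumed defaults, taken from the variables documented above.
config = {
    "model_dir": os.environ.get("MODEL_DIR", "./facetron/models"),
    "log_level": os.environ.get("LOG_LEVEL", "info"),
    # DISABLE_OTEL=true disables tracing; any other value leaves it on.
    "disable_otel": os.environ.get("DISABLE_OTEL", "false").lower() == "true",
    "otel_endpoint": os.environ.get(
        "OTEL_EXPORTER_OTLP_ENDPOINT", "http://host.docker.internal:4317"
    ),
}
print(config["model_dir"])
```

Parsing DISABLE_OTEL into a real boolean up front avoids string comparisons scattered through the tracing setup.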
