data-philter
Sovereign AI for Natural Language to SQL (NL2SQL) analytics. A local-first, privacy-focused conversational interface for Apache Druid and ClickHouse using the Model Context Protocol (MCP). Run LLMs locally via Ollama to query enterprise data securely without it ever leaving your infrastructure.
```shell
claude mcp add --transport stdio iunera-data-philter -- docker run -i \
  --env DRUID_SSL_ENABLED="true|false" \
  --env IUNERA_MODEL_TYPE="ollama-m (or your preferred model type)" \
  --env DRUID_AUTH_PASSWORD="password" \
  --env DRUID_AUTH_USERNAME="username" \
  --env SPRING_AI_OPENAI_API_KEY="your-openai-api-key" \
  --env DRUID_MCP_READONLY_ENABLED="true|false" \
  iunera/data-philter:latest
```
How to use
Data Philter is a local-first MCP-based interface that translates natural language questions into SQL or Druid JSON queries, enabling you to query Apache Druid and ClickHouse without sending data to external services. It runs inside your environment (via Docker) and uses the Model Context Protocol to orchestrate the reasoning, safety, and database execution layers. You can interact with it through its web UI and rely on the MCP translation layer, which issues read-only queries by default, ensuring data privacy. Use cases include ad-hoc analytics, time-series exploration, and offline or VPC-bound querying using plain English.
To use it effectively, deploy the container according to the installation guide, connect it to your Druid or ClickHouse clusters, and configure your preferred AI model source (local Ollama models or external providers via OpenAI). Once up, you can ask questions like: “Show me the top 5 revenue sources from last quarter,” or “What were the hourly request counts for the last 24 hours?” The MCP layer will translate these intents into the appropriate DB queries and return interpreted results with explanations of how the data was fetched, preserving a read-only safety posture by default.
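As an illustration, a question like "Show me the top 5 revenue sources from last quarter" might be translated into Druid SQL roughly along these lines. The datasource, column names, and time bounds below are hypothetical, invented for this sketch; the actual query Data Philter generates depends on your schema:

```sql
-- Hypothetical translation; "sales", "source", and "revenue" are illustrative only.
SELECT "source", SUM("revenue") AS total_revenue
FROM "sales"
WHERE "__time" >= TIMESTAMP '2024-04-01' AND "__time" < TIMESTAMP '2024-07-01'
GROUP BY "source"
ORDER BY total_revenue DESC
LIMIT 5
```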
How to install
Prerequisites
- Docker and Docker Compose installed on your host
- Access to a Druid or ClickHouse cluster (or both)
- Optional: Ollama for local models, or an API key for OpenAI if you plan to use external models
Automatic (Docker) Installation
- Ensure Docker is running and you have internet access
- Pull and run the Data Philter container (example using the latest tag):

```shell
docker run -d --name data-philter \
  -p 4000:4000 \
  -e IUNERA_MODEL_TYPE=ollama-m \
  -e DRUID_SSL_ENABLED=true \
  -e DRUID_MCP_READONLY_ENABLED=true \
  -e SPRING_AI_OPENAI_API_KEY=your-api-key \
  -e DRUID_AUTH_USERNAME=youruser \
  -e DRUID_AUTH_PASSWORD=yourpass \
  iunera/data-philter:latest
```

- Access the web interface at http://localhost:4000
Manual (Docker Compose) Installation
- Create a docker-compose.yml with a service for the Data Philter image and environment variables, for example:

```yaml
version: '3.8'
services:
  dataphilter:
    image: iunera/data-philter:latest
    container_name: data-philter
    ports:
      - '4000:4000'
    environment:
      - IUNERA_MODEL_TYPE=ollama-m
      - DRUID_SSL_ENABLED=true
      - DRUID_MCP_READONLY_ENABLED=true
      - SPRING_AI_OPENAI_API_KEY=your-api-key
      - DRUID_AUTH_USERNAME=youruser
      - DRUID_AUTH_PASSWORD=yourpass
```

- Start the services:

```shell
docker-compose up -d
```

- Open http://localhost:4000 to access the UI and begin querying.
Additional notes
Tips and common considerations:
- By default, the Data Philter setup emphasizes read-only query execution to protect data. If you need write capabilities, review and modify the MCP policies in your environment files and ensure the target database permissions reflect your requirements.
- When using local Ollama models, ensure sufficient RAM and GPU/CPU resources are allocated to avoid slow reasoning or timeouts.
- Maintain separate environment files (env templates) per deployment (e.g., druid.env, clickhouse.env) to simplify credentials management and security.
- If the container fails to start, re-run the setup script or re-create environment files as suggested in the installation notes. Check container logs for MCP handshake or model loading errors.
- For Kubernetes deployments, refer to the k8s/README.md in the repo for manifests and best practices around volumes, secrets, and ingress.
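Following the per-deployment env-file tip above, a minimal sketch of what a druid.env might look like. The variable names are taken from the run examples in this document; the values are placeholders you must replace, and the file name itself is just a convention:

```shell
# Sketch: write a druid.env template for one deployment.
# Values below are placeholders, not working credentials.
cat > druid.env <<'EOF'
IUNERA_MODEL_TYPE=ollama-m
DRUID_SSL_ENABLED=true
DRUID_MCP_READONLY_ENABLED=true
DRUID_AUTH_USERNAME=youruser
DRUID_AUTH_PASSWORD=yourpass
EOF

# Sanity check: count the variables defined in the file.
grep -c '=' druid.env
```

The whole file can then be passed to the container with `docker run --env-file druid.env ...` instead of repeating individual `-e` flags, which keeps credentials out of shell history and compose files.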
Related MCP Servers
netdata
The fastest path to AI-powered full stack observability, even for lean teams.
microsandbox
Open-source, self-hosted sandboxes for AI agents
core
AI agent microservice
mcpcan
MCPCAN is a centralized management platform for MCP services. It deploys each MCP service in its own container, supports container monitoring and MCP service token verification to address security risks, and enables rapid deployment of MCP services over the SSE, STDIO, and Streamable HTTP access protocols.
k8s
K8s-mcp-server is a Model Context Protocol (MCP) server that enables AI assistants like Claude to securely execute Kubernetes commands. It provides a bridge between language models and essential Kubernetes CLI tools including kubectl, helm, istioctl, and argocd, allowing AI systems to assist with cluster management, troubleshooting, and deployments.
nosia
Self-hosted AI RAG + MCP Platform