mcp-data-platform
A semantic data platform MCP server that composes multiple data tools with bidirectional cross-injection: tool responses automatically include critical context from other services.
claude mcp add --transport stdio txn2-mcp-data-platform \
  --env MCP_PORT="<port the MCP server listens on>" \
  --env DATAHUB_ENDPOINT="<DataHub endpoint URL>" \
  -- docker run -i txn2/mcp-data-platform:latest
How to use
mcp-data-platform serves as the orchestration layer that enriches AI-driven data queries with semantic context from a DataHub-backed semantic layer. It connects to Trino for SQL querying and to DataHub for metadata such as ownership, deprecation status, and quality scores, then surfaces this information in a single, enriched response. The system is designed to let AI assistants describe data with meaning and governance context, rather than just raw schema.
To use the platform, deploy the MCP server and connect your data tools to the included ecosystem components (DataHub, Trino, and optionally S3 for object storage). Once running, you can issue data-related queries to your AI assistant and receive responses annotated with context like table ownership, data quality, deprecation notices, and column meanings. The platform supports cross-injection across services so that results from Trino, DataHub, and S3 are combined into a unified, governance-aware answer. Features such as workflow gating, role-based access, and comprehensive auditing help ensure secure, compliant, and explainable data access for AI workloads.
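To make the cross-injection idea concrete, here is a minimal Python sketch of how a Trino-style query result might be annotated with DataHub-style metadata. All structures and field names here are illustrative assumptions, not the platform's actual API or response format:

```python
# Hypothetical sketch of cross-injection: annotate a Trino query result
# with semantic metadata (ownership, deprecation, quality) from DataHub.
# The dict shapes and keys are illustrative, not the real platform schema.

def enrich_result(trino_result: dict, datahub_metadata: dict) -> dict:
    """Attach governance context for the queried table to the result."""
    table = trino_result["table"]
    meta = datahub_metadata.get(table, {})
    return {
        **trino_result,
        "context": {
            "owner": meta.get("owner", "unknown"),
            "deprecated": meta.get("deprecated", False),
            "quality_score": meta.get("quality_score"),
        },
    }

# Example: a query result for a table that DataHub knows about.
result = enrich_result(
    {"table": "sales.orders", "rows": [[1, "widget"]]},
    {"sales.orders": {"owner": "data-team",
                      "deprecated": False,
                      "quality_score": 0.97}},
)
```

The key design point this sketches is that the enrichment is additive: the original query result passes through unchanged, with governance context attached alongside it.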
How to install
Prerequisites:
- Docker installed and running
- Access to a DataHub instance for semantic context
- Optional: Trino and S3 integrations configured
Installation steps:
1. Pull or build the MCP Data Platform Docker image:
   docker pull txn2/mcp-data-platform:latest
2. Run the MCP server container (example):
   docker run -d --name mcp-data-platform \
     -p 8080:8080 \
     -e MCP_PORT=8080 \
     -e DATAHUB_ENDPOINT="https://datahub.example.com" \
     txn2/mcp-data-platform:latest
3. Verify the server is running by checking logs or visiting the health endpoint (if exposed).
4. Configure your client tooling to point at the running MCP server endpoint, and ensure the Trino/DataHub/S3 integrations are accessible as described in the platform docs.
Additional notes
Tips and caveats:
- Ensure DataHub is accessible and contains the semantic metadata for your datasets to maximize context enrichment.
- If you use TLS or a reverse proxy in front of the MCP server, configure appropriate certificates and endpoint URLs.
- Monitor auditing and access logs to ensure compliance and traceability of AI-driven data access.
- The system supports role-based access control; adjust your identity provider and wildcard rules to grant appropriate permissions.
- If you encounter issues with dependencies, consult the Knowledge Capture and Admin APIs documentation linked in the project docs for debugging data flows and changesets.
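The role-based access and wildcard rules mentioned above might be expressed along these lines. This is a purely hypothetical sketch of such a policy; the platform's actual configuration schema is defined in its own docs and may differ entirely:

```yaml
# Hypothetical RBAC policy sketch (illustrative only, not the real schema).
roles:
  - name: analyst
    allow:
      - "trino:query:sales.*"   # wildcard rule: query any table in the sales schema
  - name: admin
    allow:
      - "*"                     # full access
```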
Related MCP Servers
sandbox
A Model Context Protocol (MCP) server that enables LLMs to run ANY code safely in isolated Docker containers.
ophis
Transform any Cobra CLI into an MCP server
github-brain
An experimental GitHub MCP server with local database.
mcp-tts
MCP Server for Text to Speech
tasker
An MCP server for Android's Tasker automation app.
kai
An MCP Server for Kubernetes