mcp-datahub
A DataHub MCP Server and composable Go library for building custom MCP servers that integrate DataHub metadata capabilities. Part of the txn2 MCP toolkit ecosystem.
claude mcp add --transport stdio txn2-mcp-datahub docker run -i txn2/mcp-datahub \
--env DATAHUB_URL="https://datahub.example.com" \
--env DATAHUB_TOKEN="your_token" \
--env DATAHUB_CONNECTION_NAME="default" \
--env DATAHUB_ADDITIONAL_SERVERS='{"staging":{"url":"https://staging.datahub.example.com/api/graphql","token":"staging-token"}}'
How to use
mcp-datahub exposes a composable MCP server that connects AI assistants to DataHub metadata catalogs. It enables search across datasets, exploration of schemas, tracing of data lineage, and access to glossary terms and domains through a suite of built-in MCP tools. You can run the server standalone or combine it with other MCP components to build a unified data-platform experience. Once running, connect your MCP-compatible clients (like Claude Desktop or other MCP clients) and use the available tools to query DataHub-backed metadata, retrieve schema details, inspect lineage graphs, and discover glossary terms.
The server provides a Go-based library with a toolkit that you can embed in your own MCP server, or you can enable multi-server setups to connect to multiple DataHub instances. Tools include search, lineage, schema operations, glossary access, domains, and additional enterprise features via optional middleware and extensions. You can customize tool descriptions and annotations to suit your deployment and enable bidirectional context with query providers to enrich responses with execution context from query engines such as Trino.
How to install
Prerequisites:
- Go 1.19+ installed on your system
- Git installed
- Access to install Go binaries (GOPATH or Go modules workspace)
Installation steps (standalone Go server):
- Install the MCP DataHub server binary:
  - Install the latest version with Go: go install github.com/txn2/mcp-datahub/cmd/mcp-datahub@latest
- Verify the binary is in your PATH:
  - On Unix-like systems: which mcp-datahub
  - On Windows: where mcp-datahub
- Run the server (example):
  - For a standalone DataHub deployment, start the binary directly: mcp-datahub
- If you prefer Docker (for containerized deployment):
  - Pull and run the image: docker run -i txn2/mcp-datahub
- Configure environment variables as needed (see details below):
  - DATAHUB_URL: URL of your DataHub instance
  - DATAHUB_TOKEN: API token for authentication
  - DATAHUB_CONNECTION_NAME: Optional logical name for this connection
  - DATAHUB_ADDITIONAL_SERVERS: Optional JSON mapping of additional DataHub connections
Notes:
- If you plan to use multiple DataHub instances or separate environments (prod, staging), use DATAHUB_ADDITIONAL_SERVERS to register them and then select the appropriate connection when invoking tools.
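The shape of DATAHUB_ADDITIONAL_SERVERS can be checked before starting the server. The following is a minimal sketch, assuming each entry carries only the `url` and `token` fields shown in the example value at the top of this page; the type `serverConfig` and the function `parseAdditionalServers` are hypothetical names for illustration and are not the library's own config loader.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
)

// serverConfig mirrors the per-connection fields seen in the example
// DATAHUB_ADDITIONAL_SERVERS value. Other fields may exist (assumption).
type serverConfig struct {
	URL   string `json:"url"`
	Token string `json:"token"`
}

// parseAdditionalServers decodes the JSON object into a name -> config map.
// An empty string is treated as "no additional servers".
func parseAdditionalServers(raw string) (map[string]serverConfig, error) {
	servers := map[string]serverConfig{}
	if raw == "" {
		return servers, nil
	}
	if err := json.Unmarshal([]byte(raw), &servers); err != nil {
		return nil, fmt.Errorf("invalid DATAHUB_ADDITIONAL_SERVERS: %w", err)
	}
	for name, s := range servers {
		if s.URL == "" {
			return nil, fmt.Errorf("connection %q has no url", name)
		}
	}
	return servers, nil
}

func main() {
	servers, err := parseAdditionalServers(os.Getenv("DATAHUB_ADDITIONAL_SERVERS"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print connection names in a stable order; tokens are never printed.
	names := make([]string, 0, len(servers))
	for name := range servers {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, servers[name].URL)
	}
}
```

Running this with the variable exported in your shell is a quick way to catch quoting mistakes in the JSON before they surface as a server startup error.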
Additional notes
Tips and common considerations:
- Environment variables DATAHUB_URL and DATAHUB_TOKEN are required for connecting to DataHub. Ensure tokens have appropriate read/write permissions as needed by your usage.
- DATAHUB_CONNECTION_NAME helps distinguish multiple DataHub connections in multi-server setups.
- You can register additional DataHub connections via DATAHUB_ADDITIONAL_SERVERS (JSON object). Use datahub_list_connections or equivalent tooling to discover available connections.
- When embedding the library into your own MCP server, you can customize tool descriptions, annotations, and enable middleware for access control and auditing.
- If running behind a proxy or enterprise network, ensure network paths to DataHub are reachable and that TLS/SSL certificates are trusted by the host.
- For multi-server compositions (e.g., DataHub + Trino), use the provided integration guidance to register tools from multiple sources and enable query-context enrichment.
- Review the library docs for details on extension middleware, config loading, and advanced tool customization.
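As a companion to the tips above, here is a small preflight sketch that checks the two required variables before launching the server. It is an assumption-laden illustration: `checkEnv` is a hypothetical helper, and the rule that DATAHUB_URL must be an absolute http(s) URL is a guess at a sensible check, not documented behavior of mcp-datahub.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
)

// checkEnv validates DATAHUB_URL and DATAHUB_TOKEN using the supplied
// lookup function (os.Getenv in normal use, a stub when testing).
func checkEnv(getenv func(string) string) error {
	raw := getenv("DATAHUB_URL")
	if raw == "" {
		return errors.New("DATAHUB_URL is not set")
	}
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("DATAHUB_URL is not a valid URL: %w", err)
	}
	// Assumed rule: require an absolute http or https URL with a host.
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return fmt.Errorf("DATAHUB_URL must be an absolute http(s) URL, got %q", raw)
	}
	if getenv("DATAHUB_TOKEN") == "" {
		return errors.New("DATAHUB_TOKEN is not set")
	}
	return nil
}

func main() {
	if err := checkEnv(os.Getenv); err != nil {
		fmt.Fprintln(os.Stderr, "preflight failed:", err)
		os.Exit(1)
	}
	fmt.Println("DataHub environment looks complete")
}
```

Passing the lookup function as a parameter keeps the check testable without mutating the process environment. It does not verify network reachability or certificate trust, which still need to be confirmed on the host.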
Related MCP Servers
mcp-language
mcp-language-server gives MCP-enabled clients access to semantic tools such as get definition, references, rename, and diagnostics.
kodit
👩‍💻 MCP server to index external repositories
mcp-chain-of-draft
Chain of Draft Server is a powerful AI-driven tool that helps developers make better decisions through systematic, iterative refinement of thoughts and designs. It integrates seamlessly with popular AI agents and provides a structured approach to reasoning, API design, architecture decisions, code reviews, and implementation planning.
mcp
Teamwork.com MCP server
ai-create
ai-create-mcp is a Go-based tool that converts OpenAPI Specification (OAS) files into a Model Context Protocol (MCP) program.
mcp-stockfish
🐟 MCP server connecting AI systems to Stockfish chess engine