mcp-datahub
The official Model Context Protocol (MCP) server for DataHub (https://datahub.com)
claude mcp add --transport stdio acryldata-mcp-server-datahub node src/server.js \
  --env TOOLS_IS_USER_ENABLED="false" \
  --env SEMANTIC_SEARCH_ENABLED="false" \
  --env TOOLS_IS_MUTATION_ENABLED="false" \
  --env TOOL_RESPONSE_TOKEN_LIMIT="80000" \
  --env ENTITY_SCHEMA_TOKEN_BUDGET="16000" \
  --env SAVE_DOCUMENT_PARENT_TITLE="Shared" \
  --env SAVE_DOCUMENT_TOOL_ENABLED="true" \
  --env SAVE_DOCUMENT_ORGANIZE_BY_USER="false" \
  --env SAVE_DOCUMENT_RESTRICT_UPDATES="true" \
  --env DATAHUB_MCP_DISABLE_DEFAULT_VIEW="false" \
  --env DISABLE_NEWER_GMS_FIELD_DETECTION="false" \
  --env DATAHUB_MCP_DOCUMENT_TOOLS_DISABLED="false"
How to use
This MCP server implements DataHub-specific capabilities for AI agents to discover, understand, and reason about your data ecosystem. It exposes a suite of tools that allow agents to search across datasets, lineage, dataset queries, and metadata, as well as manage tags, owners, glossary terms, and domains. Agents can perform structured searches using the /q syntax with boolean logic and field filters, inspect upstream and downstream lineage at table or column granularity, retrieve representative SQL queries, and explore metadata for one or more entities by URN. Mutation tools enable updating metadata such as tags, owners, terms, domains, and descriptions, while user and document tools provide contextual information about the authenticated actor and stored knowledge articles. Use these tools to build data-aware agents that can locate trustworthy data, understand data lineage, and generate SQL examples in the context of your DataHub catalog.
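Because several tools take entities by URN, it can help to see how a DataHub URN is put together. The following is a minimal sketch, assuming the standard DataHub dataset and schema-field URN layouts; the helper names `datasetUrn` and `schemaFieldUrn`, and the example table name, are hypothetical and not part of the MCP server.

```javascript
// Hypothetical helpers for illustration only; they are not part of the MCP server.

// Build a dataset URN of the form:
//   urn:li:dataset:(urn:li:dataPlatform:<platform>,<name>,<env>)
function datasetUrn(platform, name, env = "PROD") {
  if (!platform || !name) {
    throw new Error("platform and name are required");
  }
  return `urn:li:dataset:(urn:li:dataPlatform:${platform},${name},${env})`;
}

// Build a schema-field (column) URN that wraps a dataset URN:
//   urn:li:schemaField:(<datasetUrn>,<fieldPath>)
function schemaFieldUrn(dataset, fieldPath) {
  if (!dataset || !fieldPath) {
    throw new Error("dataset URN and fieldPath are required");
  }
  return `urn:li:schemaField:(${dataset},${fieldPath})`;
}

// Example values are made up; substitute names from your own catalog.
const orders = datasetUrn("snowflake", "analytics.public.orders");
const orderId = schemaFieldUrn(orders, "order_id");

console.log(orders);
console.log(orderId);
```

An agent would typically obtain URNs like these from a search result and then pass them to the metadata or lineage tools, rather than constructing them by hand.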
How to install
Prerequisites:
- Node.js (LTS) and npm installed on your system
- Git to clone the repository (optional but common)
Basic installation steps:
- Clone the MCP server repository or download the release package:
    git clone https://github.com/acryl-data/mcp-datahub.git
    cd mcp-datahub
- Install dependencies:
    npm install
- Configure environment (optional, defaults may work for development):
  - Create a .env file or export environment variables as needed
  - Ensure required environment flags are set (see mcp_config.env in the README for examples)
- Start the MCP server:
    npm run build 2>/dev/null || true
    node src/server.js
- Verify the server is running by probing the configured endpoint (typically http://localhost:PORT) or by checking the startup logs.
Note: If you prefer Docker, ensure your docker-compose or docker run command maps the same environment variables and exposes the appropriate port used by the MCP server.
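The configuration step mentions a .env file. As a rough illustration of what such a file holds, here is a minimal sketch of reading simple `KEY=value` lines into an object. The `parseDotEnv` function is hypothetical, handles only plain and quoted values, and is not how the server itself necessarily loads configuration; the variable names come from the example command at the top of this page.

```javascript
// Hypothetical .env reader for illustration; it covers only simple KEY=value lines.
function parseDotEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    // Skip lines without a key before the "=" sign.
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching single or double quotes.
    const first = value[0];
    const last = value[value.length - 1];
    if (value.length >= 2 && (first === '"' || first === "'") && last === first) {
      value = value.slice(1, -1);
    }
    vars[key] = value;
  }
  return vars;
}

// Sample content; the values mirror the example command, not documented defaults.
const sample = [
  "# Feature flags",
  'TOOLS_IS_MUTATION_ENABLED="false"',
  "SAVE_DOCUMENT_PARENT_TITLE='Shared'",
  "",
  "TOOL_RESPONSE_TOKEN_LIMIT=80000",
].join("\n");

const parsed = parseDotEnv(sample);
console.log(parsed);
```

All values stay strings at this stage, which matches how environment variables behave in a shell or a Docker `--env` mapping.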
Additional notes
Environment variables: The DataHub MCP server ships with several feature flags to control mutation tools, user tools, document tools, and semantic search. Adjust TOOLS_IS_MUTATION_ENABLED, TOOLS_IS_USER_ENABLED, and SEMANTIC_SEARCH_ENABLED to fit your security and capability requirements. The SAVE_DOCUMENT_* variables control how and where saved documents are organized. If you enable mutation tools in production, ensure proper governance and access controls.
For large catalogs, monitor TOOL_RESPONSE_TOKEN_LIMIT and ENTITY_SCHEMA_TOKEN_BUDGET to balance tool output length with performance.
If you experience issues with data discovery or lineage, confirm that DataHub integration endpoints are reachable and that DataHub catalog permissions allow the requested operations. For troubleshooting, consult the DataHub MCP server docs linked in the README.
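Since every flag arrives as a string, a mistyped value such as "yes" can silently change behavior. The sketch below shows one way a wrapper script might validate the flags before launching the server. The `readSettings` function and its helpers are hypothetical, and the fallback values are copied from the example command on this page rather than from documented defaults.

```javascript
// Hypothetical validation helpers; not the server's own configuration code.

// Accept only the literal strings "true" and "false" (case-insensitive).
function parseBool(name, raw, fallback) {
  if (raw === undefined) return fallback;
  const value = String(raw).trim().toLowerCase();
  if (value === "true") return true;
  if (value === "false") return false;
  throw new Error(`${name} must be "true" or "false", got "${raw}"`);
}

// Accept only positive whole numbers.
function parsePositiveInt(name, raw, fallback) {
  if (raw === undefined) return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return n;
}

// Fallbacks below are assumptions taken from the example command.
function readSettings(env) {
  return {
    mutationEnabled: parseBool("TOOLS_IS_MUTATION_ENABLED", env.TOOLS_IS_MUTATION_ENABLED, false),
    userToolsEnabled: parseBool("TOOLS_IS_USER_ENABLED", env.TOOLS_IS_USER_ENABLED, false),
    semanticSearchEnabled: parseBool("SEMANTIC_SEARCH_ENABLED", env.SEMANTIC_SEARCH_ENABLED, false),
    responseTokenLimit: parsePositiveInt("TOOL_RESPONSE_TOKEN_LIMIT", env.TOOL_RESPONSE_TOKEN_LIMIT, 80000),
    schemaTokenBudget: parsePositiveInt("ENTITY_SCHEMA_TOKEN_BUDGET", env.ENTITY_SCHEMA_TOKEN_BUDGET, 16000),
  };
}

// Example: enable mutation tools and lower the response limit.
const settings = readSettings({
  TOOLS_IS_MUTATION_ENABLED: "true",
  TOOL_RESPONSE_TOKEN_LIMIT: "40000",
});
console.log(settings);
```

In a real launcher you would pass `process.env` instead of a literal object, and fail fast on the thrown error so that a misconfigured production deployment never starts with mutation tools in an unintended state.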
Related MCP Servers
mcp-vegalite
MCP server from isaacwasserman/mcp-vegalite-server
github-chat
A Model Context Protocol (MCP) server for analyzing and querying GitHub repositories using the GitHub Chat API.
nautex
MCP server for guiding Coding Agents via end-to-end requirements to implementation plan pipeline
pagerduty
PagerDuty's official local MCP (Model Context Protocol) server which provides tools to interact with your PagerDuty account directly from your MCP-enabled client.
futu-stock
MCP server for futuniuniu stock
mcp-boilerplate
Boilerplate using one of the 'better' ways to build MCP Servers. Written using FastMCP