mcp-server-datahub

By acryldata

The official MCP server for DataHub

Overview

What is mcp-server-datahub?

mcp-server-datahub is a Model Context Protocol (MCP) server for DataHub. It lets AI agents query DataHub for metadata and context about your data ecosystem.

How to use mcp-server-datahub?

To use mcp-server-datahub, install uv, locate your DataHub authentication details (or create a ~/.datahubenv file with datahub init), and configure your MCP client to launch the server with uvx.

Key features of mcp-server-datahub

  • Search across all entity types, with arbitrary filters.
  • Metadata lookup for any entity.
  • Lineage traversal, both upstream and downstream.
  • Listing of the SQL queries associated with a dataset.

Use cases of mcp-server-datahub

  1. Letting an AI agent search DataHub for entities relevant to a question.
  2. Tracing upstream and downstream lineage for a dataset.
  3. Reviewing the SQL queries associated with a dataset.

FAQ about mcp-server-datahub

  • What programming language is mcp-server-datahub built with?

mcp-server-datahub is built using Python.

  • How do I run the server?

MCP clients launch the server with the command uvx mcp-server-datahub, with DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN set in its environment. For running the server in development mode, see DEVELOPING.md.

  • Is there any documentation available?

Yes, you can find the documentation on the project's GitHub page.

Content

mcp-server-datahub

A Model Context Protocol server implementation for DataHub. This enables AI agents to query DataHub for metadata and context about your data ecosystem.

Supports both DataHub Core and DataHub Cloud.

Features

  • Searching across all entity types and using arbitrary filters
  • Fetching metadata for any entity
  • Traversing the lineage graph, both upstream and downstream
  • Listing SQL queries associated with a dataset
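
To make "upstream" and "downstream" concrete, here is a minimal sketch of a lineage traversal over a toy graph. It does not call DataHub or this server: the dataset names, the DOWNSTREAM edge map, and the invert and traverse helpers are all hypothetical and exist only to illustrate the idea.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds into the datasets in its list.
# These names are made up for illustration; they are not DataHub URNs.
DOWNSTREAM = {
    "raw.orders": ["staging.orders"],
    "raw.customers": ["staging.customers"],
    "staging.orders": ["analytics.revenue"],
    "staging.customers": ["analytics.revenue"],
    "analytics.revenue": [],
}


def invert(edges):
    """Build the upstream view (consumer -> producers) from downstream edges."""
    upstream = {node: [] for node in edges}
    for producer, consumers in edges.items():
        for consumer in consumers:
            upstream.setdefault(consumer, []).append(producer)
    return upstream


def traverse(edges, start, max_hops=None):
    """Breadth-first walk from start; returns {node: hops}, excluding start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if max_hops is not None and seen[node] >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in edges.get(node, []):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    del seen[start]
    return seen


UPSTREAM = invert(DOWNSTREAM)
print(traverse(DOWNSTREAM, "raw.orders"))                   # everything fed by raw.orders
print(traverse(UPSTREAM, "analytics.revenue", max_hops=1))  # direct inputs only
```

Walking the same graph in the two directions answers two different questions: "what breaks if this dataset changes?" (downstream) and "where does this data come from?" (upstream).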

Demo

Check out the demo video, done in collaboration with the team at Block.

Usage

  1. Install uv

    # On macOS and Linux.
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Locate your authentication details

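    For authentication, you'll need the following:

      • Your DataHub URL, passed to the server as DATAHUB_GMS_URL
      • Your DataHub token, passed to the server as DATAHUB_GMS_TOKEN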

    Alternative: Using ~/.datahubenv for authentication

    You can also use a ~/.datahubenv file to configure your authentication. The easiest way to create this file is to run datahub init and follow the prompts.

    uvx --from acryl-datahub datahub init
    
  3. Configure your MCP client. See below - this will vary depending on your agent.
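
Before configuring a client, it can help to confirm that one of the two authentication options from step 2 is actually in place. The following is a minimal sketch of such a check. The function names are hypothetical, and checking the environment variables before ~/.datahubenv is an assumption of this sketch, not a statement about the server's own precedence rules. The file's contents are not parsed, only its presence is checked.

```python
import os
from pathlib import Path

# Variable names taken from the client configuration examples in this README.
REQUIRED_VARS = ("DATAHUB_GMS_URL", "DATAHUB_GMS_TOKEN")


def missing_vars(env):
    """List the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def find_auth_source(env, home):
    """Return "env", "datahubenv", or None depending on what is configured.

    The order of the checks is an assumption made for this sketch.
    """
    if not missing_vars(env):
        return "env"
    if (Path(home) / ".datahubenv").is_file():
        return "datahubenv"
    return None


source = find_auth_source(os.environ, Path.home())
if source is None:
    print("No authentication found. Missing:", ", ".join(missing_vars(os.environ)))
else:
    print("Authentication source:", source)
```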

Claude Desktop

Run which uvx to find the full path to the uvx command (for example, /Users/hsheth/.local/bin/uvx).

In your claude_desktop_config.json file, add the following:

{
  "mcpServers": {
    "datahub": {
      "command": "<full-path-to-uvx>",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}
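
Because JSON does not allow comments and the command must be an absolute path here, it can be convenient to generate the snippet rather than type it. The sketch below uses Python's shutil.which, which performs the same lookup as which uvx, and prints the block to paste into claude_desktop_config.json. The helper names build_server_entry and render_config are hypothetical, and the URL and token remain placeholders for you to fill in.

```python
import json
import shutil


def build_server_entry(command, datahub_url, datahub_token):
    """Build the "datahub" server entry in the shape shown above."""
    return {
        "command": command,
        "args": ["mcp-server-datahub"],
        "env": {
            "DATAHUB_GMS_URL": datahub_url,
            "DATAHUB_GMS_TOKEN": datahub_token,
        },
    }


def render_config(entry):
    """Wrap the entry in the top-level "mcpServers" object and serialize it."""
    return json.dumps({"mcpServers": {"datahub": entry}}, indent=2)


# Fall back to the placeholder when uvx is not on this process's PATH.
uvx_path = shutil.which("uvx") or "<full-path-to-uvx>"
print(render_config(build_server_entry(uvx_path, "<your-datahub-url>", "<your-datahub-token>")))
```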

Cursor

In .cursor/mcp.json, add the following:

{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>"
      }
    }
  }
}
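
If .cursor/mcp.json already lists other servers, the datahub entry has to be added alongside them rather than replacing the file. The following is a minimal sketch of that merge, assuming the file follows the "mcpServers" layout shown above. The helpers merge_server and update_config_file and the "other-server" entry are hypothetical.

```python
import copy
import json
from pathlib import Path

# Same entry as in the Cursor example above; the values are placeholders.
DATAHUB_ENTRY = {
    "command": "uvx",
    "args": ["mcp-server-datahub"],
    "env": {
        "DATAHUB_GMS_URL": "<your-datahub-url>",
        "DATAHUB_GMS_TOKEN": "<your-datahub-token>",
    },
}


def merge_server(config, name, entry):
    """Return a copy of config with entry stored under mcpServers[name]."""
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})[name] = copy.deepcopy(entry)
    return merged


def update_config_file(path, name, entry):
    """Read path if it exists, merge the entry, and write the result back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_server(config, name, entry), indent=2) + "\n")


# In-memory demonstration with a made-up existing server; no file is touched.
existing = {"mcpServers": {"other-server": {"command": "example", "args": []}}}
print(json.dumps(merge_server(existing, "datahub", DATAHUB_ENTRY), indent=2))
```

To apply it to a real project, call update_config_file(".cursor/mcp.json", "datahub", DATAHUB_ENTRY) from the project root.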

Other MCP Clients

command: uvx
args:
  - mcp-server-datahub
env:
  DATAHUB_GMS_URL: <your-datahub-url>
  DATAHUB_GMS_TOKEN: <your-datahub-token>
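
For a client without a configuration file format, the three fields above translate into a process launch: run the command with its args and add the two variables to the child's environment. The sketch below shows this with Python's subprocess module. It assumes the client talks to the server over the child's stdin and stdout, which is how command-based MCP servers are typically run. The helper names are hypothetical, and start_server is defined but never called here.

```python
import os
import subprocess


def build_launch(datahub_url, datahub_token, base_env=None):
    """Return (argv, env) matching the command/args/env fields above."""
    argv = ["uvx", "mcp-server-datahub"]
    # Start from the parent's environment so PATH and similar are preserved.
    env = dict(os.environ if base_env is None else base_env)
    env["DATAHUB_GMS_URL"] = datahub_url
    env["DATAHUB_GMS_TOKEN"] = datahub_token
    return argv, env


def start_server(datahub_url, datahub_token):
    """Spawn the server with pipes attached (assumes uvx is on PATH)."""
    argv, env = build_launch(datahub_url, datahub_token)
    return subprocess.Popen(
        argv, env=env, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )


argv, env = build_launch("<your-datahub-url>", "<your-datahub-token>")
print("would run:", " ".join(argv))
```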

Troubleshooting

spawn uvx ENOENT

The full stack trace might look like this:

2025-04-08T19:58:16.593Z [datahub] [error] spawn uvx ENOENT {"stack":"Error: spawn uvx ENOENT\n    at ChildProcess._handle.onexit (node:internal/child_process:285:19)\n    at onErrorNT (node:internal/child_process:483:16)\n    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)"}

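Solution: Replace uvx in the "command" field of your client configuration with the full path printed by which uvx. ENOENT means the client could not find an executable named uvx on its own PATH, which can differ from the PATH in your shell.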
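
The lookup that fails here can be reproduced outside the client. The sketch below is a minimal illustration using Python's shutil.which, which searches a PATH-style string for an executable. The helpers resolve_command and explain are hypothetical, and the directory passed in the last call is deliberately one that does not exist.

```python
import shutil


def resolve_command(command, search_path=None):
    """Return the path found for command, or None if it is not found.

    search_path uses the same format as the PATH variable; None means
    "use this process's PATH".
    """
    return shutil.which(command, path=search_path)


def explain(command, search_path=None):
    """Describe what a client using search_path would see for command."""
    resolved = resolve_command(command, search_path)
    if resolved is None:
        return f"{command!r} not found: spawning it would fail with ENOENT"
    return f'found: set "command" to "{resolved}"'


# With your shell's PATH, uvx is usually found once uv is installed.
print(explain("uvx"))
# With a PATH that lacks uv's install directory, the same lookup fails.
print(explain("uvx", search_path="/nonexistent-dir-for-sketch"))
```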

Developing

See DEVELOPING.md.
