Datahub

By acryldata GitHub

Overview

What is Datahub?

Datahub (mcp-server-datahub) is a Model Context Protocol server for DataHub. It lets AI agents query DataHub for metadata and context about your data ecosystem.

How to use Datahub?

To use Datahub, authenticate either by running datahub init, which configures a global ~/.datahubenv file, or by setting the appropriate environment variables for your DataHub instance.

Key features of Datahub

  • Search across all entity types with arbitrary filters.
  • Fetch metadata for any entity.
  • Traverse the lineage graph, both upstream and downstream.
  • List SQL queries associated with a dataset.

Use cases of Datahub

  1. Enabling AI agents to understand data context and metadata.
  2. Facilitating data lineage tracking for compliance and auditing.
  3. Assisting data scientists in querying and retrieving relevant data efficiently.

Datahub FAQ

  • What is the purpose of Datahub?

Datahub exposes the metadata held in your DataHub instance to AI agents, so that they can query and understand your data ecosystem.

  • How do I authenticate with Datahub?

You can authenticate using the datahub init command or by setting environment variables for your DataHub instance.

  • Is Datahub compatible with both OSS and Cloud versions?

Yes! Datahub supports both DataHub OSS and DataHub Cloud.

Content

mcp-server-datahub

A Model Context Protocol server implementation for DataHub. This enables AI agents to query DataHub for metadata and context about your data ecosystem.

Supports both DataHub OSS and DataHub Cloud.

Features

  • Searching across all entity types and using arbitrary filters
  • Fetching metadata for any entity
  • Traversing the lineage graph, both upstream and downstream
  • Listing SQL queries associated with a dataset

Usage

For authentication, you can either use datahub init to configure a global ~/.datahubenv file, or you can set the appropriate environment variables:

uvx --from acryl-datahub datahub init   # follow the prompts

# Alternatively, use these environment variables:
export DATAHUB_GMS_URL=https://name.acryl.io/gms
export DATAHUB_GMS_TOKEN=<your-token>
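
Before starting an MCP client, it can help to confirm which of the two authentication options is actually in place. The following is a minimal sketch, not part of mcp-server-datahub: the helper name datahub_auth_source is hypothetical, and it assumes that both environment variables must be set and that they take precedence over ~/.datahubenv. It only checks that the settings exist, not that they are valid.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def datahub_auth_source(
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Return "env", "file", or None (hypothetical helper, not a DataHub API)."""
    # Assumption: both variables from the export lines above must be non-empty.
    if env.get("DATAHUB_GMS_URL") and env.get("DATAHUB_GMS_TOKEN"):
        return "env"
    # Otherwise look for the global file that `datahub init` configures.
    home_dir = home if home is not None else Path.home()
    if (home_dir / ".datahubenv").is_file():
        return "file"
    return None


if __name__ == "__main__":
    source = datahub_auth_source()
    print(f"DataHub credentials found via: {source or 'nothing configured'}")
```

If the script reports that nothing is configured, run datahub init or export the two variables before launching the client.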

Claude Desktop

In your claude_desktop_config.json file, add the following:

{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"]
    }
  }
}

Cursor

In .cursor/mcp.json, add the following:

{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {}
    }
  }
}
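
If the client config file already lists other MCP servers, the datahub entry has to be merged into the existing mcpServers object rather than pasted over it. The sketch below shows one way to do that merge with the Python standard library. The function names add_datahub_server and update_config_file are hypothetical, and the sketch assumes the file is plain JSON with a top-level mcpServers key, as in the Claude Desktop and Cursor snippets above.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict

# The server entry shown in the snippets above.
DATAHUB_ENTRY: Dict[str, Any] = {"command": "uvx", "args": ["mcp-server-datahub"]}


def add_datahub_server(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config with the datahub entry added (hypothetical helper)."""
    merged = copy.deepcopy(config)
    # Create mcpServers if it is missing; keep any servers already listed.
    servers = merged.setdefault("mcpServers", {})
    servers["datahub"] = copy.deepcopy(DATAHUB_ENTRY)
    return merged


def update_config_file(path: Path) -> None:
    """Read path if it exists, add the datahub entry, and write the file back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_datahub_server(existing), indent=2) + "\n")
```

For example, update_config_file(Path(".cursor/mcp.json")) would add the entry to a Cursor project config while leaving other servers in that file untouched. An existing datahub entry is overwritten.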

Other MCP Clients

command: uvx
args:
  - mcp-server-datahub

Developing

See DEVELOPING.md.
