
Data Dictionary MCP
A Model Context Protocol (MCP) server that coordinates AI agents to transform database tables into Wikipedia-style data dictionaries.
What is Data Dictionary MCP?
Data Dictionary MCP is a Model Context Protocol (MCP) server that automates the transformation of database tables into comprehensive, Wikipedia-style data dictionaries using AI agents.
How to use Data Dictionary MCP?
To use Data Dictionary MCP, clone the repository, set up a Python virtual environment, install the dependencies, and run the application to start processing your database files.
Key features of Data Dictionary MCP
- Multi-format support for JSON, CSV, and Plain Text files.
- AI-powered analysis for generating field descriptions and identifying relationships.
- Integration with the Model Context Protocol for coordinating AI agents.
- Schema extraction from various formats into a unified representation.
- Output in a familiar, Wikipedia-style format.
Use cases of Data Dictionary MCP
- Automating the creation of data dictionaries for large databases.
- Enhancing data documentation for better accessibility and understanding.
- Supporting data governance and compliance initiatives by providing clear data definitions.
FAQ about Data Dictionary MCP
- What formats does Data Dictionary MCP support?
Currently, it supports JSON, CSV, and Plain Text, with plans for more formats in the future.
- Is Data Dictionary MCP open source?
Yes! The project is open source and available under the MIT License.
- How can I contribute to the project?
Contributions are welcome! You can submit a Pull Request on GitHub.
Overview
The Data Dictionary MCP project automates the conversion of various database formats into comprehensive, human-readable data dictionaries using AI-powered analysis and description. It leverages the Model Context Protocol (MCP) to coordinate AI agents for analyzing, describing, and verifying database structures.
Features
- Multi-Format Support: Process JSON, CSV, and Plain Text files (with more formats planned)
- AI-Powered Analysis: Generate field descriptions and identify relationships
- MCP Integration: Coordinate AI agents using the Model Context Protocol
- Schema Extraction: Extract database schemas from various formats into a unified representation
- Wikipedia-Style Output: Present data dictionaries in a familiar, accessible format
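To make the "unified representation" idea concrete, here is a minimal sketch of schema extraction from CSV using only the Python standard library. The names `FieldSchema`, `TableSchema`, `infer_type`, and `extract_csv_schema` are hypothetical and are not taken from the project's source; the type inference rule (integer, then float, then string) is likewise an assumption.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldSchema:
    # Hypothetical unified field record: a column name plus an inferred type.
    name: str
    inferred_type: str  # "integer", "float", or "string"


@dataclass
class TableSchema:
    # Hypothetical unified table record shared by all format analyzers.
    name: str
    fields: List[FieldSchema] = field(default_factory=list)


def infer_type(values):
    """Return the narrowest type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    for type_name, caster in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                caster(v)
            return type_name
        except ValueError:
            continue
    return "string"


def extract_csv_schema(csv_text, table_name):
    """Read CSV text with a header row and infer one type per column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    schema = TableSchema(name=table_name)
    for column in columns:
        # Missing trailing cells come back as None, so normalize to "".
        values = [row.get(column) or "" for row in rows]
        schema.fields.append(FieldSchema(column, infer_type(values)))
    return schema


if __name__ == "__main__":
    sample = "id,price,label\n1,9.5,apple\n2,3,pear\n"
    for f in extract_csv_schema(sample, "products").fields:
        print(f.name, f.inferred_type)
```

A JSON or plain-text analyzer could return the same `TableSchema` shape, which is what would let later stages treat all input formats alike.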
Project Status
This project is in active development. See the Project Roadmap for details.
Getting Started
Prerequisites
- Python 3.9+
- Git
- pip or poetry for dependency management
Installation
- Clone the repository:
    git clone https://github.com/jonahkeegan/data-dictionary-mcp.git
    cd data-dictionary-mcp
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Run the application:
    python src/main.py
Project Structure
data-dictionary-mcp/
├── docs/ # Documentation
├── src/ # Source code
│ ├── mcp/ # MCP server components
│ ├── analyzers/ # Format analyzers
│ ├── agents/ # Agent coordination
│ └── dictionary/ # Dictionary generation
├── tests/ # Test suite
├── memory-bank/ # Cline memory bank
├── .gitignore
├── .clinerules # Cline rules
├── README.md
└── requirements.txt
Project Roadmap
Milestone 1: MCP Server Foundation and Format Analyzers
- Implement MCP server with basic tool definitions
- Develop format analyzers for JSON, CSV, and Plain Text
- Create schema extraction system
- Implement unit tests for core components
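Before a format analyzer can run, something has to decide which analyzer applies. The sketch below shows one possible heuristic for telling JSON, CSV, and plain text apart. The function name `detect_format` and the detection rules are assumptions made for illustration, not the project's actual logic.

```python
import json


def detect_format(text):
    """Guess whether text is JSON, CSV, or plain text (heuristic only)."""
    stripped = text.strip()

    # JSON: must parse and have an object or array at the top level.
    try:
        parsed = json.loads(stripped)
        if isinstance(parsed, (dict, list)):
            return "json"
    except json.JSONDecodeError:
        pass

    # CSV: at least two non-blank lines that all contain the same,
    # non-zero number of commas. Quoted fields containing commas would
    # defeat this simple check.
    lines = [line for line in stripped.splitlines() if line.strip()]
    if len(lines) >= 2:
        comma_counts = {line.count(",") for line in lines}
        if len(comma_counts) == 1 and comma_counts.pop() >= 1:
            return "csv"

    # Anything else is treated as plain text.
    return "text"
```

A heuristic like this is easy to unit test, which fits the milestone's last item, but a real implementation would probably also consult file extensions or user-supplied hints.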
Milestone 2: AI Agent Coordination and Field Description
- Implement agent coordination system
- Develop field description generation
- Create task distribution and result aggregation
- Add integration tests
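Task distribution and result aggregation can be sketched with the standard library's thread pool: one task per field, with results collected back into a single mapping. Here `describe_field` is a hypothetical placeholder that stands in for a call to an AI agent, and `describe_fields` is likewise an invented name; neither reflects the project's real coordination code.

```python
from concurrent.futures import ThreadPoolExecutor


def describe_field(field_name):
    """Placeholder for an AI agent call; returns a canned description."""
    return f"Description pending for '{field_name}'."


def describe_fields(field_names, max_workers=4):
    """Distribute one task per field, then aggregate results by field name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        descriptions = list(pool.map(describe_field, field_names))
    return dict(zip(field_names, descriptions))


if __name__ == "__main__":
    for name, text in describe_fields(["id", "email"]).items():
        print(name, "->", text)
```

Threads suit this shape of work if each agent call mostly waits on network I/O; an async design would be an equally reasonable choice.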
Milestone 3: Content Verification and Publishing
- Implement content validation
- Develop Wikipedia-style formatting
- Create export capabilities
- Add end-to-end tests
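As one reading of "Wikipedia-style formatting", the sketch below renders field metadata as a MediaWiki table. Whether the project targets MediaWiki markup at all is an assumption, and `render_wikitable` is a hypothetical helper name.

```python
def render_wikitable(table_name, fields):
    """Render (name, type, description) tuples as MediaWiki table markup."""
    lines = [
        '{| class="wikitable"',
        f"|+ {table_name}",
        "! Field !! Type !! Description",
    ]
    for name, type_name, description in fields:
        lines.append("|-")
        # Pipe characters inside values are not escaped in this sketch.
        lines.append(f"| {name} || {type_name} || {description}")
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_wikitable("users", [("id", "integer", "Primary key")]))
```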
Milestone 4: User Interface and Deployment
- Develop web interface
- Implement search capabilities
- Add user feedback system
- Create deployment infrastructure
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is open source and available under the MIT License.