What is Biomart MCP?
Biomart MCP is a server that interfaces with Biomart, utilizing the Model Context Protocol (MCP) to standardize how applications provide context to large language models (LLMs).
How to use Biomart MCP?
To use Biomart MCP, clone the repository from GitHub, install the necessary dependencies, and run the server using the provided commands. In Cursor's agent mode, models from other providers can use the server as well.
Key features of Biomart MCP?
- Mart and Dataset Discovery: List available marts and datasets in the Biomart database.
- Attribute and Filter Exploration: View available attributes and filters for specific datasets.
- Data Retrieval: Query Biomart for biological data using specific attributes and filters.
- ID Translation: Convert between different biological identifiers.
Use cases of Biomart MCP?
- Researchers can retrieve biological data for analysis.
- Developers can integrate Biomart data into applications.
- Educators can use it to demonstrate biological data retrieval techniques.
FAQ from Biomart MCP?
- What is the Model Context Protocol (MCP)?
MCP is an open protocol that standardizes how applications provide context to LLMs.
- Can I contribute to Biomart MCP?
Yes! Contributions are welcome through pull requests on GitHub.
- What programming language is Biomart MCP written in?
Biomart MCP is written in Python.
Biomart MCP
An MCP server to interface with Biomart
Model Context Protocol (MCP) is an open protocol, developed by Anthropic, that standardizes how applications provide context to LLMs. Here we use the MCP python-sdk to create an MCP server that interfaces with Biomart via the pybiomart package.
There is a short demo video showing the MCP server in action on Claude Desktop.
Installation
Clone the repository
git clone https://github.com/jzinno/biomart-mcp.git
cd biomart-mcp
Claude Desktop
uv run --with "mcp[cli]" mcp install --with pybiomart biomart-mcp.py
Cursor
Via Cursor's agent mode, other models can take advantage of MCP servers as well, such as those from OpenAI or DeepSeek. Click the Cursor settings cogwheel, navigate to MCP, and either add the MCP server to the global config or add it to a project scope by adding .cursor/mcp.json to the project.
Example .cursor/mcp.json:
{
  "mcpServers": {
    "Biomart": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "mcp[cli]",
        "--with",
        "pybiomart",
        "mcp",
        "run",
        "/your/path/to/biomart-mcp.py"
      ]
    }
  }
}
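The configuration above can also be generated rather than typed by hand. The following is a minimal sketch, not part of the repository: the helper names build_cursor_config and write_cursor_config are hypothetical, and the only thing taken from the README is the command and argument list. It resolves the script path to an absolute one and keeps any other servers already listed in the file.

```python
import json
from pathlib import Path


def build_cursor_config(server_script: str) -> dict:
    """Return the mcp.json structure from the README for a given script path."""
    return {
        "mcpServers": {
            "Biomart": {
                "command": "uv",
                "args": [
                    "run",
                    "--with", "mcp[cli]",
                    "--with", "pybiomart",
                    "mcp", "run",
                    # Cursor needs a path it can find from any working directory.
                    str(Path(server_script).expanduser().resolve()),
                ],
            }
        }
    }


def write_cursor_config(project_dir: str, server_script: str) -> Path:
    """Write <project_dir>/.cursor/mcp.json, preserving other configured servers."""
    config_path = Path(project_dir) / ".cursor" / "mcp.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)

    # Start from the existing file, if any, so other entries are not lost.
    existing = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = existing.get("mcpServers", {})
    servers.update(build_cursor_config(server_script)["mcpServers"])
    existing["mcpServers"] = servers

    config_path.write_text(json.dumps(existing, indent=2) + "\n")
    return config_path
```

Calling write_cursor_config(".", "biomart-mcp.py") from the project root would create or update the project-scoped config.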
Development
# Create a virtual environment
uv venv
# MacOS/Linux
source .venv/bin/activate
# Windows
.venv\Scripts\activate
uv sync  # or: uv add "mcp[cli]" pybiomart
# Run the server in dev mode
mcp dev biomart-mcp.py
Features
Biomart-MCP provides several tools to interact with Biomart databases:
- Mart and Dataset Discovery: List available marts and datasets to explore the Biomart database structure
- Attribute and Filter Exploration: View common or all available attributes and filters for specific datasets
- Data Retrieval: Query Biomart with specific attributes and filters to get biological data
- ID Translation: Convert between different biological identifiers (e.g., gene symbols to Ensembl IDs)
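To make the data retrieval and ID translation features concrete, here is a rough sketch of what such a lookup can look like when calling pybiomart directly. It is not the server's actual implementation. The function names are hypothetical, and the dataset name hsapiens_gene_ensembl, the host, and the attribute and filter names (ensembl_gene_id, external_gene_name) are assumptions based on the Ensembl human gene dataset; a dataset's list_attributes() and list_filters() methods show what is really available.

```python
from typing import Dict, List


def build_symbol_query(symbols: List[str]) -> Dict[str, object]:
    """Build keyword arguments for pybiomart's Dataset.query().

    Attribute and filter names are assumptions for the Ensembl human
    gene dataset and may differ for other marts or datasets.
    """
    return {
        "attributes": ["ensembl_gene_id", "external_gene_name"],
        # De-duplicate and sort so equal requests produce equal queries.
        "filters": {"external_gene_name": sorted(set(symbols))},
        # Name result columns after the attributes, not their display names.
        "use_attr_names": True,
    }


def symbols_to_ensembl_csv(
    symbols: List[str],
    dataset_name: str = "hsapiens_gene_ensembl",  # assumed dataset name
    host: str = "http://www.ensembl.org",         # assumed Biomart host
) -> str:
    """Query Biomart over the network and return the result as compact CSV."""
    # Imported lazily so the query builder works without pybiomart installed.
    from pybiomart import Dataset

    dataset = Dataset(name=dataset_name, host=host)
    df = dataset.query(**build_symbol_query(symbols))
    # Same CSV serialization the contributing notes describe.
    return df.to_csv(index=False).replace("\r", "")
```

With pybiomart installed and network access, symbols_to_ensembl_csv(["NOTCH1"]) would return a small CSV table mapping the symbol to its Ensembl gene ID.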
Contributing
Pull requests are welcome! Some small notes on development:
- We are only using @mcp.tool() here by design; this is to maximize compatibility with clients that support MCP, as seen in the docs.
- We are using @lru_cache to cache results of functions that are computationally expensive or make external API calls.
- We need to be mindful not to blow up the context window of the model. For example, you'll see df.to_csv(index=False).replace("\r", "") in many places. This CSV-style return is much more token-efficient than something like df.to_string(), where the majority of the tokens are whitespace. Also be mindful that pulling all genes from a chromosome, or a similarly large request, will be too large for the context window.
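The caching and serialization notes above can be illustrated with the standard library alone. This is a sketch under stated assumptions, not code from the repository: fetch_rows is a hypothetical stand-in for a slow Biomart request, the two rows are illustrative sample data, the csv module stands in for DataFrame.to_csv, and character count is used as a rough proxy for token count. One practical detail it shows is that lru_cache requires hashable arguments, so attribute lists are passed as tuples.

```python
import csv
import io
from functools import lru_cache
from typing import Tuple

CALLS = {"fetch": 0}  # counts how often the "expensive" function really runs

Rows = Tuple[Tuple[str, ...], ...]


@lru_cache(maxsize=128)
def fetch_rows(dataset: str, attributes: Tuple[str, ...]) -> Rows:
    """Hypothetical stand-in for a slow Biomart request.

    lru_cache hashes the arguments, so attributes must be a tuple, not a list.
    """
    CALLS["fetch"] += 1
    sample = {  # illustrative rows in place of a real network response
        "ensembl_gene_id": ["ENSG00000148400", "ENSG00000139618"],
        "external_gene_name": ["NOTCH1", "BRCA2"],
    }
    columns = [sample[name] for name in attributes]
    # Header row first, then one tuple per record.
    return (tuple(attributes),) + tuple(zip(*columns))


def to_csv_text(rows: Rows) -> str:
    """Compact comma-separated output with plain newline line endings."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def to_padded_text(rows: Rows) -> str:
    """Fixed-width layout, similar in spirit to a to_string() style dump."""
    widths = [max(len(cell) for cell in column) + 2 for column in zip(*rows)]
    lines = (
        "".join(cell.rjust(width) for cell, width in zip(row, widths))
        for row in rows
    )
    return "\n".join(lines) + "\n"
```

Calling fetch_rows twice with the same arguments runs the body once, and to_csv_text produces a shorter string than to_padded_text for the same rows because it carries no alignment whitespace.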
Potential Future Features
There are of course many more features that could be added, some maybe beyond the scope of the name biomart-mcp. Here are some ideas:
- Add web scraping for resource sites with bs4. For example, once we have the Ensembl gene ID for NOTCH1, in some cases it may be useful to grab the collated "Comments and Description Text from UniProtKB" section from its page on UCSC.
- $...$