
Deep Research MCP Server 🚀
🤖 Deep Research Gemini
An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and Gemini large language models. Available as a Model Context Protocol (MCP) tool for seamless integration with AI agents.
The goal of this repo is to provide the simplest implementation of a deep research agent, i.e. an agent that can refine its research direction over time and dive deep into a topic. A secondary goal is to keep the repo under 500 lines of code so it is easy to understand and build on top of.
🔄 Research Agent + PostgreSQL Integration
The research agent works seamlessly with PostgreSQL to create an efficient research system:
- Knowledge Persistence: Each research finding and URL is stored in PostgreSQL, creating a growing knowledge base
- Smart Caching: Previously processed URLs are tracked to avoid duplicate processing
- Learning Context: The agent can reference past findings to guide new research directions
- Query Optimization: Similar research queries can leverage existing database knowledge
- Efficient Retrieval: Fast access to historical research data through indexed PostgreSQL queries
This integration enables the agent to build upon previous research while maintaining a lightweight codebase.
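As a rough illustration, the persistence layer can be pictured like the sketch below. It uses the `pg` client directly; the `research_data` columns shown (`query`, `learning`, `url`, `created_at`) are assumptions for illustration, not the repo's actual schema.

```typescript
// Illustrative sketch of the persistence layer using the "pg" client.
// Table and column names are assumptions, not the repo's actual schema.
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Smart caching: skip URLs that were already processed in earlier sessions.
export async function isUrlVisited(url: string): Promise<boolean> {
  const { rows } = await pool.query(
    "SELECT 1 FROM research_data WHERE url = $1 LIMIT 1",
    [url],
  );
  return rows.length > 0;
}

// Learning context: pull past findings related to a query to seed new research.
export async function getPastLearnings(query: string): Promise<string[]> {
  const { rows } = await pool.query(
    "SELECT learning FROM research_data WHERE query ILIKE $1 ORDER BY created_at DESC LIMIT 20",
    [`%${query}%`],
  );
  return rows.map((r) => r.learning);
}

// Knowledge persistence: store each new finding with its source URL.
export async function saveLearning(query: string, learning: string, url: string): Promise<void> {
  await pool.query(
    "INSERT INTO research_data (query, learning, url) VALUES ($1, $2, $3)",
    [query, learning, url],
  );
}
```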
✨ How It Works
```mermaid
flowchart TB
    subgraph Input
        Q[User Query]
        B[Breadth Parameter]
        D[Depth Parameter]
    end

    DR[Deep Research] -->
    SQ[SERP Queries] -->
    PR[Process Results]

    subgraph Results[Results]
        direction TB
        NL((Learnings))
        ND((Directions))
    end

    PR --> NL
    PR --> ND

    DP{depth > 0?}
    RD["Next Direction:
    - Prior Goals
    - New Questions
    - Learnings"]
    MR[Markdown Report]
    DB[PostgreSQL Database]

    %% Main Flow
    Q & B & D --> DR

    %% Results to Decision
    NL & ND --> DP

    %% Circular Flow
    DP -->|Yes| RD
    RD -->|New Context| DR

    %% Final Output
    DP -->|No| MR
    DR --> DB
    DB --> NL

    %% Styling
    classDef input fill:#7bed9f,stroke:#2ed573,color:black
    classDef process fill:#70a1ff,stroke:#1e90ff,color:black
    classDef recursive fill:#ffa502,stroke:#ff7f50,color:black
    classDef output fill:#ff4757,stroke:#ff6b81,color:black
    classDef results fill:#a8e6cf,stroke:#3b7a57,color:black

    class Q,B,D input
    class DR,SQ,PR process
    class DP,RD recursive
    class MR output
    class NL,ND results
    class DB output
```
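In code, the flow above reduces to a small recursive function. The sketch below is illustrative only: `generateSerpQueries` and `processSerpQuery` are hypothetical stand-ins for the repo's Gemini and Firecrawl modules, not its actual API.

```typescript
interface SerpResult {
  learnings: string[];
  followUpQuestions: string[];
  urls: string[];
}

// Hypothetical stand-in for the Gemini-powered query generator; a real
// implementation would feed prior learnings to the LLM to target the queries.
async function generateSerpQueries(
  query: string,
  breadth: number,
  learnings: string[],
): Promise<string[]> {
  return Array.from({ length: breadth }, (_, i) => `${query} (angle ${i + 1})`);
}

// Hypothetical stand-in for the Firecrawl search + content-extraction step.
async function processSerpQuery(q: string): Promise<SerpResult> {
  return { learnings: [`finding for: ${q}`], followUpQuestions: [], urls: [] };
}

export async function deepResearch(
  query: string,
  breadth: number,
  depth: number,
  learnings: string[] = [],
): Promise<{ learnings: string[]; visitedUrls: string[] }> {
  // Breadth: fan the query out into several targeted SERP queries.
  const serpQueries = await generateSerpQueries(query, breadth, learnings);

  // Concurrent processing: handle every query of this level in parallel.
  const results = await Promise.all(serpQueries.map(processSerpQuery));

  const newLearnings = results.flatMap((r) => r.learnings);
  const directions = results.flatMap((r) => r.followUpQuestions);
  const allLearnings = [...learnings, ...newLearnings];

  // Depth: recurse into the next research direction until depth runs out.
  if (depth > 0 && directions.length > 0) {
    return deepResearch(directions[0], breadth, depth - 1, allLearnings);
  }

  // depth == 0 (or no directions left): caller turns learnings into a report.
  return { learnings: allLearnings, visitedUrls: results.flatMap((r) => r.urls) };
}
```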
🌟 Features
- MCP Integration: Available as a Model Context Protocol tool for seamless integration with AI agents
- Iterative Research: Performs deep research by iteratively generating search queries, processing results, and diving deeper based on findings
- Intelligent Query Generation: Uses Gemini LLMs to generate targeted search queries based on research goals and previous findings
- Depth & Breadth Control: Configurable parameters to control how wide (breadth) and deep (depth) the research goes
- Smart Follow-up: Generates follow-up questions to better understand research needs
- Comprehensive Reports: Produces detailed markdown reports with findings and sources
- Concurrent Processing: Handles multiple searches and result processing in parallel for efficiency
- Persistent Knowledge with PostgreSQL: 🐘 Leverages a PostgreSQL database for storing research data, ensuring data persistence across sessions and enabling efficient retrieval of past findings. This allows the agent to build upon previous knowledge and avoid redundant research.
⚙️ Requirements
- Node.js environment (v22.x recommended)
- API keys for:
  - Firecrawl API 🕸️ (for web search and content extraction)
  - Gemini API 🧠 (for Gemini 2.0 models)
- PostgreSQL database 🐘 (running locally or remotely)
🛠️ Setup
Node.js
- Clone the repository
- Install dependencies:
  ```bash
  npm install
  ```
- Set up environment variables in a `.env.local` file:
  ```bash
  GEMINI_API_KEY="your_gemini_key"
  FIRECRAWL_KEY="your_firecrawl_key"
  # Optional: If you want to use your self-hosted Firecrawl
  # FIRECRAWL_BASE_URL="http://localhost:3002"
  DATABASE_URL="postgresql://username:password@localhost:5432/db" # 🐘 PostgreSQL connection string
  ```
- Build the project:
  ```bash
  npm run build
  ```
🐘 PostgreSQL Database for Persistent Research
This project uses PostgreSQL to store research data, providing local storage for learnings and visited URLs. This allows the agent to:
- Recall previous research: Avoid re-running the same queries and re-processing the same content.
- Build upon existing knowledge: Use past learnings to guide future research directions.
- Maintain a consistent knowledge base: Ensure that the agent's knowledge is consistent across sessions.
- Ensure you have a PostgreSQL database running locally or remotely.
- Set the `DATABASE_URL` environment variable in your `.env.local` file to the connection string for your PostgreSQL database, following the format `postgresql://user:password@host:port/database`.
- The database will be created automatically, and the `research_data` table will be created if it doesn't exist.
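In practice the bootstrap step amounts to a single `CREATE TABLE IF NOT EXISTS`. A minimal sketch, assuming illustrative column names (the real definition lives in `src/db.ts`):

```typescript
// Illustrative schema bootstrap -- the actual table definition is in src/db.ts.
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

export async function ensureSchema(): Promise<void> {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS research_data (
      id SERIAL PRIMARY KEY,
      query TEXT NOT NULL,
      learning TEXT NOT NULL,
      url TEXT,
      created_at TIMESTAMPTZ DEFAULT now()
    )
  `);
}
```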
Testing the Database Connection
To test the database connection, run the following command:
```bash
node src/db.ts
```
This will attempt to connect to the database and print a success or failure message to the console.
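The repo's `src/db.ts` may differ, but a minimal connection check of this kind can be as small as:

```typescript
// Minimal connection check -- illustrative; the repo's src/db.ts may differ.
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

pool
  .query("SELECT 1")
  .then(() => console.log("Database connection successful"))
  .catch((err) => console.error("Database connection failed:", err))
  .finally(() => pool.end());
```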
🚀 Usage
As an MCP Tool
The deep research functionality is available as an MCP tool that can be used by AI agents. To start the MCP server:
```bash
node --env-file .env.local dist/mcp-server.js
```
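If your MCP client is configured through JSON (Claude Desktop, for example, uses an `mcpServers` map), registering the server could look roughly like this, with paths adjusted to your checkout; the server name `deep-research` is illustrative:

```json
{
  "mcpServers": {
    "deep-research": {
      "command": "node",
      "args": ["--env-file", ".env.local", "dist/mcp-server.js"]
    }
  }
}
```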
The tool provides the following parameters:
- `query` (string): The research query to investigate
- `depth` (number, 1-5): How deep to go in the research tree
- `breadth` (number, 1-5): How broad to make each research level
- `existingLearnings` (string[], optional): Array of existing research findings to build upon
Example tool usage in an agent:
```typescript
const result = await mcp.invoke("deep-research", {
  query: "What are the latest developments in quantum computing?",
  depth: 3,
  breadth: 3
});
```
The tool returns:
- A detailed markdown report of the findings
- List of sources used in the research
- Metadata about learnings and visited URLs
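The exact payload shape is defined by the server implementation; as a rough mental model, the result can be thought of as follows (field names are illustrative assumptions, not a guaranteed schema):

```typescript
// Illustrative result shape -- field names are assumptions.
interface DeepResearchResult {
  report: string;        // detailed markdown report of the findings
  sources: string[];     // source URLs used in the research
  metadata: {
    learnings: string[];   // individual findings accumulated during research
    visitedUrls: string[]; // all URLs processed along the way
  };
}
```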
Standalone Usage
For standalone usage without MCP, you can use the CLI interface:
```bash
npm run start "Your research query here"
```
To test the MCP server with the inspector:
```bash
npx @modelcontextprotocol/inspector node --env-file .env.local dist/mcp-server.js
```
🔗 Technologies Used
- Node.js 🟢
- TypeScript 🟦
- Gemini API 🧠
- Firecrawl API 🕸️
- PostgreSQL 🐘
- Model Context Protocol 🧩
📜 License
MIT License - feel free to use and modify as needed.