AI Vision MCP Server

By MCP-Mirror GitHub

Mirror of

Overview

What is AI Vision MCP Server?

AI Vision MCP Server is a Model Context Protocol (MCP) server that provides AI-powered visual analysis capabilities for Claude and other MCP-compatible AI assistants.

How do I use AI Vision MCP Server?

Clone the repository, install dependencies with npm install, build with npm run build, and start the server with npm start. Then add the server to your MCP configuration so that your assistant can call its tools.

Key features of AI Vision MCP Server

Capture screenshots of any website by providing a URL.
Analyze UI elements, layouts, and content in screenshots.
Read and modify files with line-specific precision.
Generate comprehensive UI/UX analysis reports.
Maintain context across multiple analysis steps during debugging sessions.

Use cases of AI Vision MCP Server

Taking screenshots for UI testing.
Analyzing website layouts for design improvements.
Generating reports for UI/UX evaluations.

FAQ for AI Vision MCP Server

What are the requirements to run the server?

The server requires Node.js 14+, Playwright for browser automation, and a Gemini API key for AI vision analysis.

Is there a license for this project?

Yes, the project is licensed under MIT.

How can I contribute to the project?

You can contribute by submitting issues or pull requests on the GitHub repository.

Content

AI Vision MCP Server

A Model Context Protocol (MCP) server that provides AI-powered visual analysis capabilities for Claude and other MCP-compatible AI assistants.

Features

Screenshot URL: Capture screenshots of any website by providing a URL
Visual Analysis: Analyze UI elements, layouts, and content in screenshots
File Operations: Read and modify files with line-specific precision
Report Generation: Create comprehensive UI/UX analysis reports
Debugging Session: Maintain context across multiple analysis steps

Installation

```
# Clone the repository
git clone https://github.com/samihalawa/mcp-server-ai-vision.git
cd mcp-server-ai-vision

# Install dependencies
npm install

# Build the server
npm run build
```

Usage

Starting the Server

```
npm start
```

Configuration

Add the server to your MCP configuration:

```
{
  "servers": {
    "ai-vision": {
      "command": "/path/to/node",
      "args": ["/path/to/mcp-server-ai-vision/build/index.js"],
      "enabled": true,
      "port": 3005,
      "environment": {
        "NODE_PATH": "/path/to/node_modules",
        "PATH": "/usr/local/bin:/usr/bin:/bin",
        "GEMINI_API_KEY": "your-gemini-api-key"
      }
    }
  }
}
```
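
A configuration entry like the one above is easy to leave half-edited, so a quick sanity check can help before starting a client. The following is a minimal sketch, assuming the entry shape shown above; `McpServerEntry`, `findConfigProblems`, and the sample paths are hypothetical names chosen for illustration and are not part of the server.

```typescript
// Shape of one server entry, mirroring the JSON configuration above.
interface McpServerEntry {
  command: string;
  args: string[];
  enabled: boolean;
  port: number;
  environment: { [key: string]: string };
}

// Hypothetical helper: returns human-readable problems (empty array if none).
function findConfigProblems(entry: McpServerEntry): string[] {
  const problems: string[] = [];
  const apiKey = entry.environment["GEMINI_API_KEY"];
  if (!apiKey || apiKey === "your-gemini-api-key") {
    problems.push("GEMINI_API_KEY is missing or still the placeholder value");
  }
  if (entry.args.length === 0) {
    problems.push("args should point at build/index.js");
  }
  if (entry.command.indexOf("/path/to/") === 0) {
    problems.push("command still contains the /path/to/ placeholder");
  }
  return problems;
}

// Example entry with made-up paths and the API key left as the placeholder.
const sampleEntry: McpServerEntry = {
  command: "/usr/local/bin/node",
  args: ["/opt/mcp-server-ai-vision/build/index.js"],
  enabled: true,
  port: 3005,
  environment: { GEMINI_API_KEY: "your-gemini-api-key" },
};

console.log(findConfigProblems(sampleEntry));
```

For the sample entry, the only reported problem is the placeholder API key.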

Available Tools

screenshot_url

Take a screenshot of a URL using a web browser.

Parameters:

url (string, required): URL to capture a screenshot of (e.g., http://localhost:4999, https://google.com)
fullPage (boolean, optional): Whether to capture full page or just viewport. Default: false
waitForSelector (string, optional): CSS selector to wait for before taking screenshot
waitTime (number, optional): Time to wait in milliseconds before taking screenshot. Default: 1000
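
The parameter list above can be expressed as a typed argument object with the documented defaults applied. This is only a sketch: `ScreenshotUrlArgs` and `withScreenshotDefaults` are hypothetical names, and the http/https check is an assumption based on the example URLs, not confirmed server behavior.

```typescript
// Arguments accepted by screenshot_url, as documented above.
interface ScreenshotUrlArgs {
  url: string;
  fullPage?: boolean;
  waitForSelector?: string;
  waitTime?: number;
}

// Hypothetical helper: applies the documented defaults (fullPage: false,
// waitTime: 1000) and rejects obviously invalid input.
function withScreenshotDefaults(args: ScreenshotUrlArgs): ScreenshotUrlArgs {
  // Assumption: only http:// and https:// URLs are meaningful here.
  if (!/^https?:\/\//.test(args.url)) {
    throw new Error("url must start with http:// or https://");
  }
  const waitTime = args.waitTime === undefined ? 1000 : args.waitTime;
  if (waitTime < 0) {
    throw new Error("waitTime must be non-negative");
  }
  const result: ScreenshotUrlArgs = {
    url: args.url,
    fullPage: args.fullPage === undefined ? false : args.fullPage,
    waitTime: waitTime,
  };
  // Only carry the selector through when the caller supplied one.
  if (args.waitForSelector !== undefined) {
    result.waitForSelector = args.waitForSelector;
  }
  return result;
}

console.log(withScreenshotDefaults({ url: "http://localhost:4999" }));
```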

analyze_screen

Analyze a screenshot with AI vision.

Parameters: None (uses the most recent screenshot)

read_file

Read content from a file between specified line numbers.

Parameters:

path (string): Path to the file
startLine (number): Starting line number (1-indexed)
endLine (number): Ending line number (1-indexed)
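
To make the 1-indexed line range concrete, here is a minimal sketch of the slicing logic on an in-memory string. It assumes that `endLine` is inclusive, which the description above does not state explicitly; `sliceLines` is a hypothetical helper, not the server's implementation.

```typescript
// Returns lines startLine..endLine of `text`, treating both bounds as
// 1-indexed and inclusive (an assumption about read_file's semantics).
function sliceLines(text: string, startLine: number, endLine: number): string {
  if (startLine < 1 || endLine < startLine) {
    throw new Error("expected 1 <= startLine <= endLine");
  }
  const lines = text.split("\n");
  // Array.prototype.slice is 0-indexed with an exclusive end, and it
  // silently clamps an end index that runs past the last line.
  return lines.slice(startLine - 1, endLine).join("\n");
}

const sampleText = "alpha\nbeta\ngamma\ndelta";
console.log(sliceLines(sampleText, 2, 3)); // prints "beta" and "gamma" on separate lines
```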

modify_file

Modify content in a file between specified line numbers.

Parameters:

path (string): Path to the file
startLine (number): Starting line number to replace (1-indexed)
endLine (number): Ending line number to replace (1-indexed)
content (string): New content to replace the specified lines
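
The replacement semantics can be sketched the same way on an in-memory string. As above, this assumes both bounds are 1-indexed and inclusive, and that `content` may span several lines; `replaceLines` is a hypothetical helper for illustration only.

```typescript
// Replaces lines startLine..endLine (1-indexed, inclusive by assumption)
// with `content`, which may itself contain multiple lines.
function replaceLines(
  text: string,
  startLine: number,
  endLine: number,
  content: string
): string {
  const lines = text.split("\n");
  if (startLine < 1 || endLine < startLine || endLine > lines.length) {
    throw new Error("line range is outside the file");
  }
  const before = lines.slice(0, startLine - 1); // lines preceding the range
  const after = lines.slice(endLine);           // lines following the range
  return before.concat(content.split("\n"), after).join("\n");
}

console.log(replaceLines("a\nb\nc\nd", 2, 3, "X")); // prints "a", "X", "d" on separate lines
```

Because the replacement can be shorter or longer than the range it replaces, line numbers after the edited range shift, so any later edit should be computed against the updated file.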

generate_report

Generate a comprehensive UI/UX analysis report.

Parameters:

testUrl (string): URL of the application being tested
appName (string, optional): Name of the application being analyzed
date (string, optional): Date of the analysis (YYYY-MM-DD)
observations (object): Observations structured as components, data state, interactions, etc.
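
As an illustration of the arguments above, the sketch below builds a `generate_report` argument object. The exact schema of `observations` is not specified here, so the key names (`components`, `dataState`, `interactions`) and the use of string arrays are assumptions, and the observation texts are made-up sample data. `GenerateReportArgs`, `formatReportDate`, and `isReportDate` are hypothetical names.

```typescript
// Argument shape for generate_report; the observations layout is an assumption.
interface GenerateReportArgs {
  testUrl: string;
  appName?: string;
  date?: string; // YYYY-MM-DD
  observations: { [section: string]: string[] };
}

// Formats a Date as YYYY-MM-DD using its UTC calendar day.
function formatReportDate(d: Date): string {
  return d.toISOString().slice(0, 10);
}

// Checks the YYYY-MM-DD shape only; it does not validate the calendar date.
function isReportDate(value: string): boolean {
  return /^\d{4}-\d{2}-\d{2}$/.test(value);
}

const reportArgs: GenerateReportArgs = {
  testUrl: "https://example.com",
  appName: "Example App",
  date: formatReportDate(new Date(Date.UTC(2024, 0, 15))),
  observations: {
    components: ["Header renders logo and navigation links"],
    dataState: ["Product list shows placeholder rows while loading"],
    interactions: ["Search button has no visible focus state"],
  },
};

console.log(JSON.stringify(reportArgs, null, 2));
```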

Example Workflow

Take a screenshot of a website:

```
screenshot_url(url: "https://example.com")
```

Analyze the screenshot:

```
analyze_screen()
```

Generate a report based on the analysis:

```
generate_report(testUrl: "https://example.com", observations: {...})
```
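
On the wire, an MCP client issues each of these steps as a JSON-RPC 2.0 request with the method `tools/call`, passing the tool name and its arguments in `params`. The sketch below only builds those request objects and does not send them; `ToolCallRequest` and `buildToolCall` are hypothetical helper names, and the empty `observations` object is a placeholder to be filled from the analysis output.

```typescript
// JSON-RPC 2.0 request shape used by MCP for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { [key: string]: unknown } };
}

let nextRequestId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function buildToolCall(
  toolName: string,
  toolArgs: { [key: string]: unknown }
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: toolArgs },
  };
}

// The three workflow steps, in the order described above.
const workflowRequests: ToolCallRequest[] = [
  buildToolCall("screenshot_url", { url: "https://example.com" }),
  buildToolCall("analyze_screen", {}),
  // observations is left empty here; fill it in from the analysis result.
  buildToolCall("generate_report", { testUrl: "https://example.com", observations: {} }),
];

workflowRequests.forEach(function (request) {
  console.log(JSON.stringify(request));
});
```

Order matters: `analyze_screen` takes no parameters and works on the most recent screenshot, so it has to follow a `screenshot_url` call.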

Requirements

Node.js 14+
Playwright for browser automation
Gemini API key for AI vision analysis

License

MIT
