
MCP Browser Automation Server
MCP server for browser automation with screenshot and console logging capabilities
What is MCP Browser Automation Server?
MCP Browser Automation Server is a Python server that exposes browser automation through a REST API. It uses Playwright to drive browsers, takes screenshots, and streams console logs over a WebSocket.
How to use MCP Browser Automation Server?
Clone the repository, set up a virtual environment, install the dependencies and the Playwright browsers, and start the server. You can then call the API to create sessions and perform browser actions.
Key features of MCP Browser Automation Server
- Create and manage browser sessions
- Navigate to specified URLs
- Capture full-page or element-specific screenshots
- Interact with web elements (click, fill forms)
- Real-time monitoring of console logs via WebSocket
- Close browser sessions when done
Use cases of MCP Browser Automation Server
- Automated testing of web applications
- Web scraping for data collection
- Monitoring website performance and logs
- Taking screenshots for documentation or reporting
FAQ about MCP Browser Automation Server
- What programming language is used?
The server is built using Python.
- Is it necessary to install Playwright?
Yes, Playwright is required to run the browser automation tasks.
- Can I run multiple sessions simultaneously?
Yes, you can create multiple sessions to run tasks in parallel.
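One way to use several sessions at once is to give each worker thread its own session. The sketch below assumes the endpoints listed under API Endpoints and the default address http://localhost:8000. The helper names (`http_post`, `visit`, `visit_all`) are hypothetical and not part of the project.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumed default address of the server


def http_post(url):
    """Send a POST with an empty body and return the raw response bytes."""
    req = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def visit(url, post=http_post):
    """Open one session, navigate to `url`, close the session, return its id."""
    session_id = json.loads(post(f"{BASE_URL}/session/create"))["session_id"]
    try:
        query = urlencode({"url": url})  # percent-encode the target URL
        post(f"{BASE_URL}/session/{session_id}/navigate?{query}")
    finally:
        # Close the session even if navigation fails
        post(f"{BASE_URL}/session/{session_id}/close")
    return session_id


def visit_all(urls, post=http_post, max_workers=4):
    """Run `visit` for every URL, one session per worker thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda u: visit(u, post), urls))
```

With the server running, `visit_all(["https://example.com", "https://example.org"])` would open two sessions in parallel and return their ids. The `post` parameter exists only so the HTTP call can be swapped out, for example in tests.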
MCP Browser Automation Server
A simple but powerful browser automation server that allows you to control browsers, take screenshots, and monitor console logs through a REST API.
Features
- Create browser sessions
- Navigate to URLs
- Take screenshots (full page or specific elements)
- Click elements
- Fill form inputs
- Monitor console logs in real-time through WebSocket
- Close sessions
Installation
- Clone this repository:
git clone https://github.com/weir1/mcp-browser-automation.git
cd mcp-browser-automation
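- Create a virtual environment and activate it (the activation command shown is for Windows; on macOS or Linux, run source venv/bin/activate instead):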
python -m venv venv
.\venv\Scripts\Activate
- Install dependencies:
pip install -r requirements.txt
- Install Playwright browsers:
playwright install
Usage
- Start the server:
python server.py
The server will start on http://localhost:8000
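When a script launches the server itself, it may help to wait until the port accepts connections before sending requests. This is a minimal sketch using only the standard library. `wait_for_port` is a hypothetical helper, and the host and port defaults are taken from the address above.

```python
import socket
import time


def wait_for_port(host="localhost", port=8000, timeout=10.0, interval=0.2):
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A successful connection only shows that the port is open. It does not prove that the API is ready to serve requests, so treat it as a coarse readiness check.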
API Endpoints
Create a new session
POST /session/create
Response: { "session_id": "..." }
Navigate to a URL
POST /session/{session_id}/navigate?url=https://example.com
Take a screenshot
POST /session/{session_id}/screenshot?name=screenshot1&selector=.my-element
If selector is omitted, the server takes a full-page screenshot.
Click an element
POST /session/{session_id}/click?selector=.my-button
Fill an input
POST /session/{session_id}/fill?selector=input[name="username"]&value=myuser
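Selectors such as input[name="username"] contain characters (brackets, quotes, equals signs) that are not safe in a raw query string, so it is safer to percent-encode them when building the request URL. The sketch below shows one way to do that with the standard library. `build_url` is a hypothetical helper, and the base address is the default from the Usage section.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # assumed default address of the server


def build_url(session_id: str, action: str, **params: Optional[str]) -> str:
    """Return the endpoint URL for `action` with percent-encoded query parameters."""
    path = f"{BASE_URL}/session/{quote(session_id, safe='')}/{action}"
    # Skip parameters that were not supplied (value None), e.g. an omitted selector
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{path}?{query}" if query else path
```

For example, `build_url("abc", "fill", selector='input[name="username"]', value="myuser")` produces a URL whose query string is `selector=input%5Bname%3D%22username%22%5D&value=myuser`.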
Monitor console logs
WebSocket /session/{session_id}/console
Close a session
POST /session/{session_id}/close
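Because every session should be closed when it is no longer needed, a client can wrap the create and close calls in a context manager so the close request is sent even when an error occurs in between. This is a sketch under the assumption that the endpoints behave as listed above. `http_post` and `browser_session` are hypothetical names, not part of the project.

```python
import json
import urllib.request
from contextlib import contextmanager

BASE_URL = "http://localhost:8000"  # assumed default address of the server


def http_post(url):
    """Send a POST with an empty body and return the raw response bytes."""
    req = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.read()


@contextmanager
def browser_session(post=http_post, base_url=BASE_URL):
    """Create a session, yield its id, and always close it afterwards."""
    session_id = json.loads(post(f"{base_url}/session/create"))["session_id"]
    try:
        yield session_id
    finally:
        # Runs on normal exit and on exceptions alike
        post(f"{base_url}/session/{session_id}/close")
```

Inside `with browser_session() as session_id:` the id can be used for navigate, screenshot, click, and fill requests. The session is closed when the block exits.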
Example Usage with Python
import asyncio
import json

import requests
import websockets

BASE_URL = "http://localhost:8000"

# Create a session
response = requests.post(f"{BASE_URL}/session/create")
response.raise_for_status()
session_id = response.json()["session_id"]

# Navigate to a URL (params= lets requests URL-encode the query string)
requests.post(f"{BASE_URL}/session/{session_id}/navigate", params={"url": "https://example.com"})

# Take a full-page screenshot and save the response body to disk
response = requests.post(f"{BASE_URL}/session/{session_id}/screenshot", params={"name": "example"})
with open("screenshot.png", "wb") as f:
    f.write(response.content)

# Monitor console logs; runs until interrupted (Ctrl+C) or the connection closes
async def monitor_console():
    async with websockets.connect(f"ws://localhost:8000/session/{session_id}/console") as ws:
        while True:
            message = await ws.recv()
            print(json.loads(message))

asyncio.run(monitor_console())
License
MIT
