Computer Control MCP

By AB498 GitHub

MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.

Tags: automation, computer-use

Overview

What is Computer Control MCP?

Computer Control MCP is a server that provides computer control capabilities, allowing users to control mouse movements, keyboard inputs, and perform OCR (Optical Character Recognition) tasks using libraries like PyAutoGUI and RapidOCR. It operates with zero external dependencies and is similar to the 'computer-use' project by Anthropic.

How to use Computer Control MCP?

To use Computer Control MCP, configure your MCP client to launch it with `uvx computer-control-mcp@latest`, or install it globally with `pip install computer-control-mcp`. After a pip installation, start the server with `computer-control-mcp`.

Key features of Computer Control MCP

  • Control mouse movements and clicks
  • Type text at the current cursor position
  • Take screenshots with optional OCR text extraction
  • List and activate windows
  • Perform drag and drop operations

Use cases of Computer Control MCP

  1. Automating repetitive tasks on a computer.
  2. Extracting text from images or screenshots for data processing.
  3. Creating automated testing scripts for software applications.

FAQ

  • Is Computer Control MCP cross-platform?

It has only been tested on Windows, but it should work on other platforms as well.

  • What libraries does it use?

It uses PyAutoGUI for mouse and keyboard control, RapidOCR for text extraction, and ONNXRuntime for model inference.

  • Is there a license for this project?

Yes, it is licensed under the MIT License.

Content

  • Only tested on Windows. Should work on other platforms.

Quick Usage (MCP Setup Using uvx)

{
  "mcpServers": {
    "computer-control-mcp": {
      "command": "uvx",
      "args": ["computer-control-mcp@latest"]
    }
  }
}

OR install globally with pip:

pip install computer-control-mcp

Then run the server with:

computer-control-mcp

Run this instead of `uvx computer-control-mcp`. If you prefer uvx and want the latest version, run `uv cache clean` to clear the cache and then run `uvx` again.
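For scripts that start the server themselves, the choice between the two launch methods above could be sketched roughly as follows. The helper name `server_command` is hypothetical and not part of the package; it only picks an argument list and does not start anything.

```python
import shutil
from typing import Callable, List, Optional


def server_command(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the argv that would start the server.

    Prefers the pip-installed `computer-control-mcp` executable when it is
    found on PATH, otherwise falls back to `uvx computer-control-mcp@latest`.
    """
    if which("computer-control-mcp"):
        return ["computer-control-mcp"]
    return ["uvx", "computer-control-mcp@latest"]


# Print the command instead of running it; pass the list to
# subprocess.Popen(...) to actually launch the server.
print(" ".join(server_command()))
```

The lookup function is injectable only so the behaviour can be checked without touching the real PATH.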

Features

  • Control mouse movements and clicks
  • Type text at the current cursor position
  • Take screenshots of the entire screen or specific windows with optional saving to downloads directory
  • Extract text from screenshots using OCR (Optical Character Recognition)
  • List and activate windows
  • Press keyboard keys
  • Drag and drop operations
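A common way to combine the OCR and mouse features is to find a piece of text on screen and click its center. The sketch below illustrates only the arithmetic. It assumes each OCR hit is a (corner points, text, confidence) triple and that OCR ran on an image scaled by some percentage; that layout, the helper names `box_center` and `find_text`, and the sample data are all assumptions made for illustration, and the server may already report coordinates in screen space.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
# Assumed shape of one OCR hit: (corner points, text, confidence).
OcrHit = Tuple[Sequence[Point], str, float]


def box_center(box: Sequence[Point], scale_percent: int = 100) -> Tuple[int, int]:
    """Center of a text box, mapped from a scaled image back to screen pixels."""
    factor = 100.0 / scale_percent
    cx = sum(p[0] for p in box) / len(box)
    cy = sum(p[1] for p in box) / len(box)
    return round(cx * factor), round(cy * factor)


def find_text(
    hits: List[OcrHit], needle: str, scale_percent: int = 100
) -> Optional[Tuple[int, int]]:
    """Return the click point of the first hit whose text contains `needle`."""
    for box, text, _confidence in hits:
        if needle.lower() in text.lower():
            return box_center(box, scale_percent)
    return None


# Hypothetical OCR output for an image captured at 50% scale.
sample_hits: List[OcrHit] = [
    ([(10, 10), (60, 10), (60, 30), (10, 30)], "File", 0.98),
    ([(100, 200), (180, 200), (180, 230), (100, 230)], "Submit", 0.95),
]

print(find_text(sample_hits, "submit", scale_percent=50))
```

The resulting point is what a client would then pass to a click tool.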

Available Tools

Mouse Control

  • click_screen(x: int, y: int): Click at specified screen coordinates
  • move_mouse(x: int, y: int): Move mouse cursor to specified coordinates
  • drag_mouse(from_x: int, from_y: int, to_x: int, to_y: int, duration: float = 0.5): Drag mouse from one position to another
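MCP clients invoke tools such as these through JSON-RPC 2.0 `tools/call` requests. As a rough sketch, the payloads for `click_screen` and `drag_mouse` could be built as shown below. The helper `tool_call` and the coordinate values are illustrative assumptions; a real client library would normally construct these messages and perform the MCP initialization handshake before sending them.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # simple increasing JSON-RPC request ids


def tool_call(name: str, **arguments: Any) -> Dict[str, Any]:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names follow the signatures listed above; values are examples.
click = tool_call("click_screen", x=640, y=360)
drag = tool_call(
    "drag_mouse", from_x=100, from_y=100, to_x=400, to_y=300, duration=0.5
)

print(json.dumps(click, indent=2))
```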

Keyboard Control

  • type_text(text: str): Type the specified text at current cursor position
  • press_key(key: str): Press a specified keyboard key
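Because `type_text` handles text and `press_key` handles individual keys, entering several lines of input usually means alternating between the two. The sketch below plans such a sequence without sending anything. The helper `typing_plan` is hypothetical, and the key name "enter" is an assumption (it is a PyAutoGUI key name; the names this server accepts are not listed here).

```python
from typing import Dict, Iterable, List, Tuple

ToolCall = Tuple[str, Dict[str, str]]  # (tool name, arguments)


def typing_plan(lines: Iterable[str], confirm_key: str = "enter") -> List[ToolCall]:
    """Turn lines of text into an ordered list of type_text/press_key calls."""
    plan: List[ToolCall] = []
    for line in lines:
        if line:  # nothing to type for an empty line, but still press the key
            plan.append(("type_text", {"text": line}))
        plan.append(("press_key", {"key": confirm_key}))
    return plan


plan = typing_plan(["hello", "", "world"])
for tool, arguments in plan:
    print(tool, arguments)
```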

Screen and Window Management

  • take_screenshot(title_pattern: str = None, use_regex: bool = False, threshold: int = 60, with_ocr_text_and_coords: bool = False, scale_percent_for_ocr: int = 100, save_to_downloads: bool = False): Capture screen or window with optional OCR
  • get_screen_size(): Get current screen resolution
  • list_windows(): List all open windows
  • activate_window(title_pattern: str, use_regex: bool = False, threshold: int = 60): Bring specified window to foreground
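The `title_pattern`, `use_regex`, and `threshold` parameters suggest that windows are selected either by regular expression or by a similarity score from 0 to 100. How the server actually scores titles is not documented here, so the sketch below is only a stand-in: it uses `difflib.SequenceMatcher` from the standard library for the fuzzy case and a case-insensitive regex search for the regex case. The function `matching_titles` and the window titles are hypothetical.

```python
import re
from difflib import SequenceMatcher
from typing import List


def matching_titles(
    titles: List[str],
    title_pattern: str,
    use_regex: bool = False,
    threshold: int = 60,
) -> List[str]:
    """Return titles matching the pattern; fuzzy matches are best-first."""
    if use_regex:
        # Assumption: regex matching is a case-insensitive search.
        regex = re.compile(title_pattern, re.IGNORECASE)
        return [t for t in titles if regex.search(t)]

    def score(title: str) -> float:
        # Similarity on a 0-100 scale, ignoring case.
        matcher = SequenceMatcher(None, title_pattern.lower(), title.lower())
        return 100 * matcher.ratio()

    scored = sorted(((score(t), t) for t in titles), key=lambda p: p[0], reverse=True)
    return [title for s, title in scored if s >= threshold]


open_windows = ["Untitled - Notepad", "Calculator", "Settings"]
print(matching_titles(open_windows, "calculator"))
print(matching_titles(open_windows, r"note\w+", use_regex=True))
```

A lower threshold admits looser matches, and a higher one requires the pattern to be closer to the full title.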

Development

Setting up the Development Environment

# Clone the repository
git clone https://github.com/AB498/computer-control-mcp.git
cd computer-control-mcp

# Install in development mode
pip install -e .

Running Tests

python -m pytest

API Reference

See the API Reference for detailed information about the available functions and classes.

License

MIT
