AgentKit Browser Automation

By tmahesh GitHub

AgentKit for the Playwright MCP server

Overview

What is AgentKit Browser Automation?

AgentKit Browser Automation is a sophisticated framework designed for intelligent web navigation and task execution using a multi-agent system.

How to use AgentKit Browser Automation?

To use this project, clone the repository, install the necessary dependencies, set up your environment variables, and run the Playwright MCP server along with the Inngest CLI.

Key features of AgentKit Browser Automation

Intelligent task planning that breaks down complex tasks into manageable steps.
State management to track browser state and action results.
Robust error handling and recovery mechanisms.
Comprehensive event logging and monitoring.
Extensible action registry for custom behaviors.
Built-in validation for task completion.
Memory management to maintain context and history of actions.

Use cases of AgentKit Browser Automation

Automating repetitive web tasks such as form submissions.
Testing web applications by simulating user interactions.
Scraping data from websites efficiently.
Validating web content and functionality.

FAQ for AgentKit Browser Automation

What are the prerequisites for using this project?

You need Node.js (v14 or higher), npm or yarn, and an OpenAI API key for GPT models.

Is there a community for support?

Yes! You can contribute to the project on GitHub and engage with other users and developers there.

Can I customize the agents?

Absolutely! The framework is designed to be extensible, allowing you to create custom behaviors.

Content

A sophisticated browser automation framework built with AgentKit, featuring a multi-agent system for intelligent web navigation and task execution.

Overview

This project implements a multi-agent system for browser automation, where different agents work together to:

Plan and break down tasks
Navigate web pages
Execute browser actions
Validate results

Architecture (TODO)

The system consists of four specialized agents:

Planning Agent
- Breaks down tasks into actionable steps
- Creates detailed execution plans
- Determines task completion criteria
Navigator Agent
- Determines the next actions to take
- Manages state transitions
- Handles action execution
- Provides detailed logging and feedback
Browser Agent
- Executes browser automation actions
- Interacts with web elements
- Handles page navigation
- Manages browser state
Validation Agent
- Validates task completion
- Verifies results
- Handles error cases
- Provides feedback on success/failure

Features

Intelligent Task Planning: Breaks down complex tasks into manageable steps
State Management: Tracks browser state and action results
Error Handling: Robust error handling and recovery mechanisms
Event System: Comprehensive event logging and monitoring
Flexible Action System: Extensible action registry for custom behaviors
Validation Framework: Built-in validation for task completion
Memory Management: Maintains context and history of actions
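
To make the "extensible action registry" and "history of actions" ideas concrete, here is a minimal sketch of what such a registry could look like. It is an assumption about the general pattern, not the project's implementation: `ActionRegistry`, `ActionHandler`, and `HistoryEntry` are hypothetical names.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the project's API.
type ActionHandler = (params: Record<string, string>) => Promise<string>;
type HistoryEntry = { name: string; ok: boolean; detail: string };

class ActionRegistry {
  private handlers = new Map<string, ActionHandler>();
  private history: HistoryEntry[] = [];

  // Register a custom action under a unique name.
  register(name: string, handler: ActionHandler): void {
    if (this.handlers.has(name)) {
      throw new Error(`Action already registered: ${name}`);
    }
    this.handlers.set(name, handler);
  }

  // Run an action by name; failures are recorded in history instead of thrown.
  async run(name: string, params: Record<string, string> = {}): Promise<boolean> {
    const handler = this.handlers.get(name);
    if (!handler) {
      this.history.push({ name, ok: false, detail: "unknown action" });
      return false;
    }
    try {
      const detail = await handler(params);
      this.history.push({ name, ok: true, detail });
      return true;
    } catch (err) {
      const detail = err instanceof Error ? err.message : String(err);
      this.history.push({ name, ok: false, detail });
      return false;
    }
  }

  // Read-only view of every action attempted so far, in order.
  getHistory(): ReadonlyArray<HistoryEntry> {
    return this.history;
  }
}
```

A custom behavior would then be added with something like `registry.register("echo", async (p) => ...)`, and the recorded history gives later steps context about what has already been tried.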

Getting Started

Prerequisites

Node.js (v14 or higher)
npm or yarn
OpenAI API key (for GPT models)

Installation

Clone the repository:

git clone https://github.com/tmahesh/playwright-agent.git
cd playwright-agent

Install dependencies:

npm install

Set up environment variables:

cp .env.sample .env
# Edit .env with your OpenAI API key and other configurations
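
A quick check that the required variables are set can catch a missing key before the agents start. The sketch below assumes the key is named `OPENAI_API_KEY`; confirm the actual variable names against `.env.sample`. The helper `missingEnvVars` is hypothetical and not part of the repository.

```typescript
// Hypothetical helper; the variable name OPENAI_API_KEY is an assumption,
// so check .env.sample for the names the project actually reads.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  // A variable counts as missing when it is unset or contains only whitespace.
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Example use at startup (Node.js), after the .env file has been loaded:
//   const missing = missingEnvVars(process.env, ["OPENAI_API_KEY"]);
//   if (missing.length > 0) {
//     throw new Error(`Missing environment variables: ${missing.join(", ")}`);
//   }
```

The function takes the environment as a parameter instead of reading `process.env` directly, which keeps it easy to test with a plain object.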

Run each of the following commands in a separate terminal (the Playwright MCP server, index.ts, and the Inngest CLI):

npx @playwright/mcp@latest --port 8931

npx tsx index.ts

npx inngest-cli@latest dev --no-discovery -u http://localhost:3000/api/inngest -v

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

Acknowledgments
