Wayback Machine MCP Server

By Cyreslab-AI


Overview

What is Wayback Machine MCP Server?

Wayback Machine MCP Server is a Model Context Protocol (MCP) server that provides access to the Internet Archive's Wayback Machine, allowing users to retrieve archived versions of web pages and check available snapshots of URLs.

How to use Wayback Machine MCP Server?

To use the Wayback Machine MCP Server, clone the repository, install dependencies, build the project, and configure it in your MCP settings file. You can then use the provided tools to get snapshots or retrieve archived pages.

Key features of Wayback Machine MCP Server

- Retrieve a list of available snapshots for a specific URL.
- Access archived web pages from the Internet Archive.
- Flexible snapshot retrieval parameters, including date ranges, a result limit, and URL match types (exact, prefix, host, domain).

Use cases of Wayback Machine MCP Server

- Checking the historical versions of a website.
- Researching changes in web content over time.
- Accessing content that may no longer be available on the live web.

FAQ about Wayback Machine MCP Server

Can I access any webpage from the Wayback Machine?

You can access any page that the Wayback Machine has archived. Pages with no snapshot cannot be retrieved through the server.

Is there a limit to the number of snapshots I can retrieve?

The get_snapshots tool accepts an optional limit parameter. When it is omitted, the server returns at most 100 snapshots.

How do I get the original content without the Wayback Machine banner?

Set the original parameter to true when calling get_archived_page.

Content

Wayback Machine MCP Server

This is a Model Context Protocol (MCP) server that provides access to the Internet Archive's Wayback Machine. It allows you to retrieve archived versions of web pages and check available snapshots of URLs.

Features

Tools

get_snapshots
- Get a list of available snapshots for a URL from the Wayback Machine
- Parameters:
  - url (required): URL to check for snapshots
  - from (optional): Start date in YYYYMMDD format
  - to (optional): End date in YYYYMMDD format
  - limit (optional): Maximum number of snapshots to return (default: 100)
  - match_type (optional): Type of URL matching to use (default: exact)
    - Options: 'exact', 'prefix', 'host', 'domain'
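
The parameters above line up closely with the query parameters of the CDX Server API listed under API Details. The following is a minimal sketch of how such a request URL could be assembled. The `SnapshotArgs` interface and `buildCdxUrl` function are hypothetical names, and the mapping of `match_type` onto the CDX `matchType` parameter is an assumption rather than a description of this server's actual code.

```typescript
// Hypothetical argument shape mirroring the get_snapshots parameters.
interface SnapshotArgs {
  url: string;
  from?: string; // YYYYMMDD
  to?: string; // YYYYMMDD
  limit?: number;
  match_type?: "exact" | "prefix" | "host" | "domain";
}

// Build a CDX Server API query URL from the tool arguments.
// Assumption: tool parameters map one-to-one onto CDX query parameters.
function buildCdxUrl(args: SnapshotArgs): string {
  const params = new URLSearchParams({ url: args.url, output: "json" });
  if (args.from) params.set("from", args.from);
  if (args.to) params.set("to", args.to);
  // Apply the documented defaults when a value is not supplied.
  params.set("limit", String(args.limit ?? 100));
  params.set("matchType", args.match_type ?? "exact");
  return `https://web.archive.org/cdx/search/cdx?${params.toString()}`;
}
```

For example, `buildCdxUrl({ url: "example.com", from: "20200101", to: "20201231", limit: 10 })` yields `https://web.archive.org/cdx/search/cdx?url=example.com&output=json&from=20200101&to=20201231&limit=10&matchType=exact`.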

get_archived_page
- Retrieve the content of an archived webpage from the Wayback Machine
- Parameters:
  - url (required): URL of the page to retrieve
  - timestamp (required): Timestamp in YYYYMMDDHHMMSS format
  - original (optional): Whether to return the original content without the Wayback Machine banner (default: false)
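
As a rough illustration of what these parameters translate to, the sketch below builds a Wayback Machine replay URL. The Wayback Machine serves unmodified content (no banner, no rewritten links) when `id_` is appended to the timestamp. Whether this server implements `original` that way is an assumption, and `buildArchivedPageUrl` is a hypothetical helper name.

```typescript
// Build the replay URL for an archived page.
// Assumption: original=true corresponds to the Wayback Machine "id_" modifier.
function buildArchivedPageUrl(
  url: string,
  timestamp: string,
  original: boolean = false
): string {
  // Reject anything that is not a full 14-digit YYYYMMDDHHMMSS timestamp.
  if (!/^\d{14}$/.test(timestamp)) {
    throw new Error("timestamp must be 14 digits (YYYYMMDDHHMMSS)");
  }
  const modifier = original ? "id_" : "";
  return `https://web.archive.org/web/${timestamp}${modifier}/${url}`;
}
```

With `original` set to true, `buildArchivedPageUrl("example.com", "20200101120000", true)` gives `https://web.archive.org/web/20200101120000id_/example.com`.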

Resource Templates

wayback://{url}/{timestamp}
- Access archived web pages from the Internet Archive Wayback Machine
- Parameters:
  - url: The webpage URL to retrieve
  - timestamp: The specific archive timestamp (YYYYMMDDHHMMSS format)
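
Because the archived URL may itself contain slashes, a parser for this template has to decide where the URL ends and the timestamp begins. The sketch below treats the final path segment as the timestamp. It is only one plausible reading of the template, and `parseWaybackUri` and `WaybackRef` are hypothetical names, not part of the server's documented interface.

```typescript
interface WaybackRef {
  url: string;
  timestamp: string;
}

// Split a wayback://{url}/{timestamp} URI into its two parts.
// Assumption: the timestamp is always the last slash-separated segment.
function parseWaybackUri(uri: string): WaybackRef {
  const prefix = "wayback://";
  if (!uri.startsWith(prefix)) {
    throw new Error("not a wayback:// URI");
  }
  const rest = uri.slice(prefix.length);
  const cut = rest.lastIndexOf("/");
  if (cut <= 0) {
    throw new Error("expected wayback://{url}/{timestamp}");
  }
  const url = rest.slice(0, cut);
  const timestamp = rest.slice(cut + 1);
  if (!/^\d{14}$/.test(timestamp)) {
    throw new Error("timestamp must be 14 digits (YYYYMMDDHHMMSS)");
  }
  return { url, timestamp };
}
```

Under this reading, `wayback://example.com/docs/page/20200101120000` resolves to the URL `example.com/docs/page` with the timestamp `20200101120000`.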

Installation

1. Clone this repository
2. Install dependencies: npm install
3. Build the project: npm run build
4. Add the server to your MCP settings file:

{
  "mcpServers": {
    "wayback-machine": {
      "command": "node",
      "args": ["/path/to/wayback-server/build/index.js"],
      "env": {},
      "disabled": false,
      "autoApprove": []
    }
  }
}

Usage Examples

Get Snapshots

use_mcp_tool(
  server_name="wayback-machine",
  tool_name="get_snapshots",
  arguments={
    "url": "example.com",
    "from": "20200101",
    "to": "20201231",
    "limit": 10
  }
)

Get Archived Page

use_mcp_tool(
  server_name="wayback-machine",
  tool_name="get_archived_page",
  arguments={
    "url": "example.com",
    "timestamp": "20200101120000",
    "original": true
  }
)
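
The timestamp argument is a 14-digit string, and Wayback Machine timestamps are expressed in UTC. If the date is held in a JavaScript `Date`, a small helper along these lines could produce the expected format. `toWaybackTimestamp` is a hypothetical name used for illustration only.

```typescript
// Convert a Date to a Wayback-style YYYYMMDDHHMMSS string in UTC.
function toWaybackTimestamp(date: Date): string {
  // toISOString() returns e.g. "2020-01-01T12:00:00.000Z" (always UTC).
  // Dropping non-digits and keeping the first 14 characters leaves
  // year, month, day, hour, minute, and second.
  return date.toISOString().replace(/\D/g, "").slice(0, 14);
}
```

For instance, `toWaybackTimestamp(new Date(Date.UTC(2020, 0, 1, 12, 0, 0)))` returns `"20200101120000"`. Taking only the first eight characters of the result gives the YYYYMMDD form used by the from and to parameters of get_snapshots.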

Access Resource

access_mcp_resource(
  server_name="wayback-machine",
  uri="wayback://example.com/20200101120000"
)

API Details

This server uses the following Wayback Machine APIs:

- Availability API: https://archive.org/wayback/available?url={url}
- CDX Server API: https://web.archive.org/cdx/search/cdx?url={url}&output=json
- Wayback Machine Memento API: https://web.archive.org/web/{timestamp}/{url}
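
With `output=json`, the CDX Server API returns an array of arrays in which the first row holds the field names and each later row describes one capture. The sketch below shows one way to turn that payload into keyed records. `parseCdxRows` is a hypothetical helper, and how this server actually processes the response is not documented here.

```typescript
// Convert a CDX JSON payload (header row followed by data rows)
// into an array of records keyed by field name.
function parseCdxRows(rows: string[][]): Array<Record<string, string>> {
  if (rows.length === 0) {
    return []; // The API returns an empty array when nothing matches.
  }
  const [header, ...data] = rows;
  return data.map((row) => {
    const record: Record<string, string> = {};
    header.forEach((name, i) => {
      // Fall back to an empty string if a row is shorter than the header.
      record[name] = row[i] ?? "";
    });
    return record;
  });
}
```

Each resulting record exposes fields such as `timestamp` and `original`, which can be passed on to get_archived_page as its timestamp and url arguments.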

License

ISC
