What is Wayback Machine MCP Server?
Wayback Machine MCP Server is a Model Context Protocol (MCP) server that provides access to the Internet Archive's Wayback Machine, allowing users to retrieve archived versions of web pages and check available snapshots of URLs.
How to use Wayback Machine MCP Server?
To use the Wayback Machine MCP Server, clone the repository, install dependencies, build the project, and configure it in your MCP settings file. You can then use the provided tools to get snapshots or retrieve archived pages.
Key features of Wayback Machine MCP Server?
- Retrieve a list of available snapshots for a specific URL.
- Access archived web pages from the Internet Archive.
- Flexible parameters for snapshot retrieval, including date ranges and matching types.
Use cases of Wayback Machine MCP Server?
- Checking the historical versions of a website.
- Researching changes in web content over time.
- Accessing content that may no longer be available on the live web.
FAQ from Wayback Machine MCP Server?
- Can I access any webpage from the Wayback Machine?
Yes, as long as the page has been archived, you can access it through the server.
- Is there a limit to the number of snapshots I can retrieve?
You can specify a limit when retrieving snapshots, with a default of 100.
- How do I get the original content without the Wayback Machine banner?
You can set the 'original' parameter to true when retrieving an archived page.
Wayback Machine MCP Server
This is a Model Context Protocol (MCP) server that provides access to the Internet Archive's Wayback Machine. It allows you to retrieve archived versions of web pages and check available snapshots of URLs.
Features
Tools
- get_snapshots
  - Get a list of available snapshots for a URL from the Wayback Machine
  - Parameters:
    - url (required): URL to check for snapshots
    - from (optional): Start date in YYYYMMDD format
    - to (optional): End date in YYYYMMDD format
    - limit (optional): Maximum number of snapshots to return (default: 100)
    - match_type (optional): Type of URL matching to use (default: exact)
      - Options: 'exact', 'prefix', 'host', 'domain'
- get_archived_page
  - Retrieve the content of an archived webpage from the Wayback Machine
  - Parameters:
    - url (required): URL of the page to retrieve
    - timestamp (required): Timestamp in YYYYMMDDHHMMSS format
    - original (optional): Whether to get the original content without the Wayback Machine banner (default: false)
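To make the get_snapshots parameters concrete, here is a minimal sketch of how they could be turned into a CDX Server query. The `SnapshotArgs` interface and `buildCdxQuery` function are hypothetical names, and the mapping of `match_type` onto the CDX `matchType` query parameter is an assumption about how this server forwards it, not something the README states.

```typescript
// Hypothetical argument shape mirroring the get_snapshots parameters.
interface SnapshotArgs {
  url: string;
  from?: string; // YYYYMMDD
  to?: string; // YYYYMMDD
  limit?: number;
  match_type?: "exact" | "prefix" | "host" | "domain";
}

const CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx";

// Reject date bounds that are not eight digits (YYYYMMDD).
function checkDate(name: string, value: string | undefined): void {
  if (value !== undefined && !/^\d{8}$/.test(value)) {
    throw new Error(name + " must be in YYYYMMDD format");
  }
}

// Build the CDX query URL, applying the documented defaults
// (limit: 100, match_type: exact). Assumption: match_type is
// forwarded as the CDX `matchType` parameter.
function buildCdxQuery(args: SnapshotArgs): string {
  checkDate("from", args.from);
  checkDate("to", args.to);

  const params: Array<[string, string]> = [
    ["url", args.url],
    ["output", "json"],
    ["limit", String(args.limit ?? 100)],
    ["matchType", args.match_type ?? "exact"],
  ];
  if (args.from !== undefined) params.push(["from", args.from]);
  if (args.to !== undefined) params.push(["to", args.to]);

  const query = params
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return CDX_ENDPOINT + "?" + query;
}
```

Validating the date strings before building the URL keeps malformed input from reaching the archive at all, which makes errors easier to report back to the MCP client.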
Resource Templates
- wayback://{url}/{timestamp}
  - Access archived web pages from the Internet Archive Wayback Machine
  - Parameters:
    - url: The webpage URL to retrieve
    - timestamp: The specific archive timestamp (YYYYMMDDHHMMSS format)
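As an illustration of the resource template, the sketch below splits a wayback:// URI into its two parts. `parseWaybackUri` is a hypothetical helper, not the server's actual code. It assumes the timestamp is always the final path segment, so a page URL that itself contains slashes stays intact, and that the URL part is not percent-encoded.

```typescript
interface WaybackResource {
  url: string;
  timestamp: string;
}

// Split wayback://{url}/{timestamp} at the LAST slash, so that
// page URLs containing their own path segments are preserved.
function parseWaybackUri(uri: string): WaybackResource {
  const scheme = "wayback://";
  if (uri.indexOf(scheme) !== 0) {
    throw new Error("expected a wayback:// URI");
  }
  const rest = uri.slice(scheme.length);
  const lastSlash = rest.lastIndexOf("/");
  if (lastSlash <= 0) {
    throw new Error("expected wayback://{url}/{timestamp}");
  }
  const url = rest.slice(0, lastSlash);
  const timestamp = rest.slice(lastSlash + 1);
  // The template documents a 14-digit YYYYMMDDHHMMSS timestamp.
  if (!/^\d{14}$/.test(timestamp)) {
    throw new Error("timestamp must be in YYYYMMDDHHMMSS format");
  }
  return { url: url, timestamp: timestamp };
}
```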
Installation
- Clone this repository
- Install dependencies:
npm install
- Build the project:
npm run build
- Add the server to your MCP settings file:
{
  "mcpServers": {
    "wayback-machine": {
      "command": "node",
      "args": ["/path/to/wayback-server/build/index.js"],
      "env": {},
      "disabled": false,
      "autoApprove": []
    }
  }
}
Usage Examples
Get Snapshots
use_mcp_tool(
  server_name="wayback-machine",
  tool_name="get_snapshots",
  arguments={
    "url": "example.com",
    "from": "20200101",
    "to": "20201231",
    "limit": 10
  }
)
Get Archived Page
use_mcp_tool(
  server_name="wayback-machine",
  tool_name="get_archived_page",
  arguments={
    "url": "example.com",
    "timestamp": "20200101120000",
    "original": true
  }
)
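For readers curious what the original flag could translate to, the following is a sketch under stated assumptions rather than the server's confirmed behavior. The Wayback Machine serves unmodified captures when `id_` is appended to the timestamp in the replay URL. Whether this server uses that modifier is an assumption, and `buildArchivedPageUrl` is a hypothetical helper.

```typescript
// Build a Wayback replay URL for a capture. When `original` is true,
// append the `id_` modifier so the archive returns the capture without
// the banner or rewritten links. Assumption: the server maps the
// `original` parameter onto this modifier.
function buildArchivedPageUrl(
  url: string,
  timestamp: string,
  original: boolean = false
): string {
  if (!/^\d{14}$/.test(timestamp)) {
    throw new Error("timestamp must be in YYYYMMDDHHMMSS format");
  }
  const modifier = original ? "id_" : "";
  return "https://web.archive.org/web/" + timestamp + modifier + "/" + url;
}
```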
Access Resource
access_mcp_resource(
  server_name="wayback-machine",
  uri="wayback://example.com/20200101120000"
)
API Details
This server uses the following Wayback Machine APIs:
- Availability API:
https://archive.org/wayback/available?url={url}
- CDX Server API:
https://web.archive.org/cdx/search/cdx?url={url}&output=json
- Wayback Machine Memento API:
https://web.archive.org/web/{timestamp}/{url}
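With output=json, the CDX Server API returns an array of arrays whose first row holds the field names. The sketch below converts that shape into a list of keyed records. `parseCdxRows` and `CdxRecord` are hypothetical names, and the sample rows are made up for illustration, not real archive data.

```typescript
// One snapshot, keyed by the field names from the CDX header row.
type CdxRecord = { [field: string]: string };

// Convert CDX JSON output (header row followed by data rows)
// into one object per snapshot. Missing cells become "".
function parseCdxRows(rows: string[][]): CdxRecord[] {
  const [header, ...data] = rows;
  if (header === undefined) {
    return []; // empty response: no snapshots
  }
  return data.map((row) => {
    const record: CdxRecord = {};
    header.forEach((name, index) => {
      record[name] = row[index] ?? "";
    });
    return record;
  });
}

// Illustrative input only: these values are invented.
const sampleRows: string[][] = [
  ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
  ["com,example)/", "20200101120000", "http://example.com/", "text/html", "200", "EXAMPLEDIGEST", "1234"],
];
const snapshots = parseCdxRows(sampleRows);
```

Keying each row by the header rather than by fixed positions keeps the parsing correct if a query requests a different set of fields.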
License
ISC