Wayback Machine Checker

This actor checks whether URLs are archived in the Internet Archive Wayback Machine. It retrieves snapshot counts, oldest and newest archive dates, and direct links to archived versions. It uses both the Availability API and the CDX API for comprehensive results.

Developer: Stas Persiianenko (maintained by Community)

Check Internet Archive Wayback Machine availability and snapshot history for any list of URLs.

What does Wayback Machine Checker do?

This actor checks whether URLs are archived in the Internet Archive Wayback Machine. It retrieves snapshot counts, oldest and newest archive dates, and direct links to archived versions. It uses both the Availability API and the CDX API, giving you a full picture of each URL's archive history. You can process entire lists of domains or pages at once to quickly assess web presence history across your portfolio.

Use cases

  • Domain investors -- check website history and archive age before purchasing a domain to assess its legitimacy
  • Content recovery specialists -- find archived versions of deleted or lost web pages and retrieve their content
  • Historians and researchers -- study how websites evolved over time with timestamped snapshots spanning decades
  • SEO professionals -- find broken pages with archived content for link reclamation and redirect opportunities
  • Journalists -- verify past claims by accessing archived versions of news articles, press releases, and public statements
  • Legal teams -- document web page history for intellectual property disputes or compliance investigations
  • Digital archivists -- audit which pages from a collection are preserved in the Wayback Machine and which are missing

Why use Wayback Machine Checker?

  • Batch processing -- check hundreds of URLs against the Wayback Machine in a single run instead of searching manually
  • Dual API approach -- uses both the Availability API and CDX API for more complete and reliable results than either alone
  • Structured output -- returns snapshot counts, dates, and direct archive URLs in clean JSON ready for analysis
  • Age calculation -- automatically computes how many years a URL has been archived, useful for domain valuation
  • Direct snapshot links -- provides clickable URLs to the oldest archived version so you can view historical content immediately
  • API and schedule ready -- automate archive checks via the Apify API or scheduled runs for ongoing monitoring
  • Pay-per-event pricing -- pay $0.002 per URL checked plus a one-time $0.035 start fee per run, with no subscription
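
To make the dual API approach above more concrete, here is a minimal Python sketch of how the two public Internet Archive endpoints could be addressed and their responses merged. It is an illustration only, not the actor's actual implementation: the helper names (availability_url, cdx_url, summarize) and the timestamp field names are hypothetical, and the sample responses are hand-written to mirror the response shapes as assumed here rather than captured from live calls.

```python
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def availability_url(url: str) -> str:
    """Build an Availability API request URL (reports the closest snapshot)."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': url})}"


def cdx_url(url: str, limit: int = 10000) -> str:
    """Build a CDX API request URL that lists capture timestamps as JSON."""
    params = {"url": url, "output": "json", "fl": "timestamp", "limit": limit}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def summarize(availability: dict, cdx_rows: list) -> dict:
    """Merge both responses into one record (timestamp field names are illustrative)."""
    closest = availability.get("archived_snapshots", {}).get("closest")
    # With output=json the first CDX row is a header row, so skip it.
    timestamps = sorted(row[0] for row in cdx_rows[1:])
    return {
        "isAvailable": bool(closest and closest.get("available")) or bool(timestamps),
        "snapshotCount": len(timestamps),
        "oldestTimestamp": timestamps[0] if timestamps else None,
        "newestTimestamp": timestamps[-1] if timestamps else None,
    }


# Hand-written sample responses (assumed shapes, not live data).
sample_availability = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20240101000000",
            "url": "http://web.archive.org/web/20240101000000/https://example.com",
        }
    }
}
sample_cdx = [["timestamp"], ["20240101000000"], ["20020120142510"]]

print(availability_url("https://example.com"))
print(cdx_url("https://example.com"))
print(summarize(sample_availability, sample_cdx))
```

Consulting both sources means a URL still counts as archived if either endpoint reports a capture, which is one plausible reading of "more complete and reliable results than either alone".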

Input parameters

Parameter | Type | Required | Default | Description
--- | --- | --- | --- | ---
urls | array | Yes | -- | List of URLs to check on the Wayback Machine

You can provide any publicly accessible URL, including deep subpages, not just root domains. Each URL is checked independently against the Internet Archive APIs.

Input example

{
    "urls": [
        "https://www.google.com",
        "https://www.wikipedia.org",
        "https://example.com"
    ]
}

Output example

Each result includes the URL, availability status, snapshot count, date range, a direct link to the oldest snapshot, and the computed archive age in years.

{
    "url": "https://www.google.com",
    "isAvailable": true,
    "snapshotCount": 10000,
    "oldestSnapshot": "1998-12-02",
    "newestSnapshot": "2026-02-28",
    "oldestSnapshotUrl": "https://web.archive.org/web/19981202230410/https://www.google.com",
    "firstArchiveYear": 1998,
    "archiveAgeYears": 27.2,
    "error": null,
    "checkedAt": "2026-03-01T12:00:00.000Z"
}
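
The snapshot URL in the example embeds a 14-digit capture timestamp (YYYYMMDDhhmmss) between /web/ and the original URL. If you need the exact capture time rather than just the date, a small sketch like the following can recover it. The helper name snapshot_datetime is hypothetical, and the timestamp is assumed to be in UTC.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Matches https://web.archive.org/web/<14-digit timestamp>/<original URL>
_SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def snapshot_datetime(snapshot_url: str) -> Optional[datetime]:
    """Return the capture time embedded in a snapshot URL, or None if it does not match."""
    match = _SNAPSHOT_RE.match(snapshot_url)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)  # assumption: timestamps are UTC


captured = snapshot_datetime(
    "https://web.archive.org/web/19981202230410/https://www.google.com"
)
print(captured.isoformat())  # 1998-12-02T23:04:10+00:00
```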

How much does it cost?

Wayback Machine Checker uses Apify's pay-per-event pricing model. You are only charged for what you actually use -- no monthly fees, no subscriptions.

Event | Price | Description
--- | --- | ---
Start | $0.035 | One-time per run
URL checked | $0.002 | Per URL checked

Cost examples:

  • Checking 10 URLs: $0.035 + (10 x $0.002) = $0.055
  • Checking 50 URLs: $0.035 + (50 x $0.002) = $0.135
  • Checking 100 URLs: $0.035 + (100 x $0.002) = $0.235
  • Checking 1,000 URLs: $0.035 + (1,000 x $0.002) = $2.035
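
The same arithmetic can be scripted to budget a run in advance. This is a minimal sketch using the two prices from the table above; the function name estimate_cost is hypothetical, and Decimal is used only to avoid floating-point rounding noise in the totals.

```python
from decimal import Decimal

START_FEE = Decimal("0.035")      # one-time per run
PRICE_PER_URL = Decimal("0.002")  # per URL checked


def estimate_cost(url_count: int) -> Decimal:
    """Return the estimated cost in USD of a single run that checks url_count URLs."""
    if url_count < 0:
        raise ValueError("url_count must be non-negative")
    return START_FEE + PRICE_PER_URL * url_count


for n in (10, 50, 100, 1000):
    print(f"{n} URLs: ${estimate_cost(n)}")
```

Because the start fee is charged once per run, checking 1,000 URLs in one run is cheaper than splitting them across many small runs.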

Using the Apify API

You can call Wayback Machine Checker programmatically from any language using the Apify API. The actor slug is automation-lab/wayback-machine-checker. Below are ready-to-use examples for the two most common languages.

Node.js

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token
const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// Start the actor and wait for the run to finish
const run = await client.actor('automation-lab/wayback-machine-checker').call({
    urls: ['https://www.google.com', 'https://www.wikipedia.org'],
});

// Read the results from the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient('YOUR_TOKEN')

# Start the actor and wait for the run to finish
run = client.actor('automation-lab/wayback-machine-checker').call(run_input={
    'urls': ['https://www.google.com', 'https://www.wikipedia.org'],
})

# Read the results from the run's default dataset
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

Integrations

Wayback Machine Checker integrates with the major automation and data platforms through the Apify ecosystem:

  • Make (formerly Integromat) -- trigger archive checks automatically when new domains appear in your pipeline or CRM.
  • Zapier -- create Zaps that check Wayback Machine availability whenever a new URL is added to a list.
  • Google Sheets -- send results to a spreadsheet for tracking domain archive history over time.
  • Slack -- alert your team when a domain has no archive history or when a previously archived page disappears.
  • Webhooks -- post-process results in your own backend for domain valuation or research workflows.
  • n8n -- orchestrate runs from n8n workflows or any platform that supports HTTP requests and the Apify REST API.

Tips and best practices

  • Use full URLs -- include the protocol (https://) for the most accurate results from the Wayback Machine APIs.
  • Check before buying domains -- a long archive history with legitimate content is a positive signal for domain valuation and SEO potential.
  • Combine with other actors -- pair with Website Uptime Checker to see if a site is live now and how long it has been archived.
  • Schedule periodic checks -- set up a weekly schedule to monitor whether important pages continue to be archived over time.
  • Use snapshot URLs directly -- the oldestSnapshotUrl field gives you a direct link you can open in a browser to view the archived page.
  • Batch domains for due diligence -- when evaluating multiple domains for acquisition, run them all in a single batch to compare archive histories side by side.
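
Following the "use full URLs" and batching tips above, you may want to clean a raw list of domains before submitting it. Here is a minimal sketch, assuming bare domains should default to https://; the helper name build_input is hypothetical and the actor itself may normalize input differently.

```python
def build_input(raw_urls):
    """Return an actor input dict with protocol-qualified, de-duplicated URLs."""
    seen = set()
    urls = []
    for raw in raw_urls:
        url = raw.strip()
        if not url:
            continue  # skip blank entries
        if not url.lower().startswith(("http://", "https://")):
            url = "https://" + url  # assumption: default to https
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return {"urls": urls}


print(build_input(["example.com", " https://example.com ", "", "http://a.org"]))
```

De-duplicating before the run also avoids paying the per-URL fee twice for the same address.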

FAQ

What if a URL has never been archived? The actor returns isAvailable: false with snapshotCount: 0 and null values for the snapshot date fields. The error field remains null because the check itself succeeded.
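
Because a never-archived URL and a failed check look different in the output, results can be sorted into three groups after a run. The sketch below relies only on the url, isAvailable, and error fields from the output example; the function name group_results and the sample items are made up for illustration.

```python
def group_results(items):
    """Split dataset items into archived, never-archived, and failed URLs."""
    groups = {"archived": [], "never_archived": [], "failed": []}
    for item in items:
        if item.get("error"):
            groups["failed"].append(item["url"])          # the check itself failed
        elif item.get("isAvailable"):
            groups["archived"].append(item["url"])        # at least one snapshot
        else:
            groups["never_archived"].append(item["url"])  # check succeeded, no snapshots
    return groups


# Hand-written sample items (not real results).
sample_items = [
    {"url": "https://www.google.com", "isAvailable": True, "snapshotCount": 10000, "error": None},
    {"url": "https://never-archived.example", "isAvailable": False, "snapshotCount": 0, "error": None},
    {"url": "https://broken.example", "isAvailable": False, "snapshotCount": 0, "error": "timeout"},
]
print(group_results(sample_items))
```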

Does the actor create new Wayback Machine snapshots? No. It only queries existing snapshots through the Internet Archive APIs. It does not trigger new crawls or submit URLs for archiving. To request a new snapshot, use the Wayback Machine's Save Page Now feature directly.

Are there rate limits on the Wayback Machine API? The Internet Archive may throttle requests if too many are sent in a short period. The actor handles this gracefully with automatic retries and pacing to stay within acceptable limits.
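
The actor's retry logic is not documented here, but if you query the Internet Archive from your own code, a common pattern is retrying with exponential backoff. The following is a generic sketch of that pattern, not the actor's implementation; call_with_retries and its parameters are hypothetical names.

```python
import time


def call_with_retries(fn, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on an exception wait base_delay * 2**n seconds and retry.

    Makes at most `attempts` calls and re-raises the last exception.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demo: a function that fails twice, then succeeds. Delays are recorded
# instead of actually sleeping so the example runs instantly.
calls = {"count": 0}
delays = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("throttled")
    return "ok"


print(call_with_retries(flaky, sleep=delays.append), delays)
```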

Can I export results to CSV? Yes. Apify datasets support export in JSON, CSV, Excel, XML, and other formats. After the run completes, download results from the Apify Console or use the API to export in your preferred format.
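
If you already have the items in memory (for example from the Python snippet in "Using the Apify API"), you can also write the CSV yourself with the standard library. This is a minimal sketch; items_to_csv is a hypothetical helper, and the default column list is taken from the output example.

```python
import csv
import io

DEFAULT_FIELDS = (
    "url", "isAvailable", "snapshotCount",
    "oldestSnapshot", "newestSnapshot", "archiveAgeYears",
)


def items_to_csv(items, fields=DEFAULT_FIELDS):
    """Serialize result items to CSV text, keeping only the listed fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(fields), extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(items)  # None values are written as empty cells
    return buffer.getvalue()


# Hand-written sample item for a never-archived URL.
sample = [{
    "url": "https://example.com", "isAvailable": False, "snapshotCount": 0,
    "oldestSnapshot": None, "newestSnapshot": None, "archiveAgeYears": None,
    "error": None,
}]
print(items_to_csv(sample))
```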

Can I check subpages or just root domains? You can check any full URL, including deep subpages. The Wayback Machine archives individual pages, so https://example.com/blog/post-1 and https://example.com are tracked separately with their own snapshot histories.

What does the archiveAgeYears field represent? It is the number of years between the oldest snapshot date and the current date, expressed as a decimal. For example, 27.2 means the URL has been archived for a little over 27 years. The fractional part is a fraction of a year, so 0.2 is roughly two and a half months, not exactly 2 months.
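
The value in the output example can be reproduced with a short calculation. The exact formula the actor uses is not documented, so this sketch assumes elapsed days divided by 365.25 and rounded to one decimal; the function name archive_age_years is hypothetical.

```python
from datetime import date


def archive_age_years(oldest_snapshot: str, checked_on: str) -> float:
    """Years between two ISO dates (YYYY-MM-DD), rounded to one decimal place."""
    oldest = date.fromisoformat(oldest_snapshot)
    checked = date.fromisoformat(checked_on)
    # Assumption: an average year length of 365.25 days.
    return round((checked - oldest).days / 365.25, 1)


# Dates from the output example: oldest snapshot and the day of the check.
print(archive_age_years("1998-12-02", "2026-03-01"))  # 27.2
```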