Wayback Machine Domain Snapshot Extractor
Created by Stas Persiianenko
Export archived snapshots for a domain from the Wayback Machine CDX API with timestamps, status codes, MIME types, digests, and replay URLs.
Wayback Machine CDX Bulk Extractor
automation-lab/wayback-machine-cdx-extractor
Input
URL or domain (required): example.com
Match type: domain
Max snapshots: 1000
From date (YYYYMMDD): 20200101
To date (YYYYMMDD): 20251231
Filter by status codes
Exclude status codes
Filter by MIME types
Page size: 10000
Collapse duplicates: urlkey
Include Wayback Machine URL: true
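To make the input options more concrete, here is a minimal sketch of how they might translate into a query against the public Wayback Machine CDX endpoint. This is not the Actor's own code: the function name `build_cdx_url` and its argument names are hypothetical, and the mapping from each input to a CDX parameter (`matchType`, `from`, `to`, `limit`, `filter`, `collapse`) is an assumption based on the public CDX server API. Page size is not modeled here.

```python
import re
from urllib.parse import urlencode

# Public CDX search endpoint of the Wayback Machine.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def _alternation(values):
    # CDX filters are regular expressions, so several values are joined
    # into one alternation group; special characters are escaped.
    return "(" + "|".join(re.escape(str(v)) for v in values) + ")"


def build_cdx_url(url, match_type="domain", limit=1000,
                  from_date=None, to_date=None,
                  include_status=(), exclude_status=(),
                  mime_types=(), collapse="urlkey"):
    """Build a CDX query URL (hypothetical helper, assumed mapping)."""
    params = [
        ("url", url),
        ("matchType", match_type),
        ("output", "json"),
        ("limit", str(limit)),
    ]
    if from_date:
        params.append(("from", from_date))
    if to_date:
        params.append(("to", to_date))
    if include_status:
        params.append(("filter", "statuscode:" + _alternation(include_status)))
    if exclude_status:
        # A leading "!" negates a CDX filter.
        params.append(("filter", "!statuscode:" + _alternation(exclude_status)))
    if mime_types:
        params.append(("filter", "mimetype:" + _alternation(mime_types)))
    if collapse:
        params.append(("collapse", collapse))
    return CDX_ENDPOINT + "?" + urlencode(params)


query_url = build_cdx_url(
    "example.com",
    from_date="20200101",
    to_date="20251231",
    include_status=[200, 301],
    exclude_status=[404],
    mime_types=["text/html"],
)
print(query_url)
```

The sketch only builds the URL and makes no network request, so it can be inspected before anything is sent to the archive.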
Output fields
Original URL
Timestamp
Status Code
MIME Type
Content Digest
Size (bytes)
URL Key
Wayback URL
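As an illustration of how these fields relate to raw CDX data, the sketch below converts a CDX JSON response (a header row followed by value rows) into records that carry the eight fields listed above. The output key names, the helper `rows_to_records`, and the sample row (including its digest) are made up for the example; the Actor's actual field keys may differ. The replay URL follows the usual `https://web.archive.org/web/<timestamp>/<original>` pattern.

```python
def _to_int(value):
    # CDX may report "-" for a missing status code or length.
    text = str(value)
    return int(text) if text.isdigit() else None


def rows_to_records(rows):
    """Turn CDX JSON rows into dicts (hypothetical key names)."""
    if not rows:
        return []
    header, *data = rows  # the first row holds the column names
    records = []
    for row in data:
        raw = dict(zip(header, row))
        records.append({
            "original_url": raw.get("original"),
            "timestamp": raw.get("timestamp"),
            "status_code": _to_int(raw.get("statuscode")),
            "mime_type": raw.get("mimetype"),
            "content_digest": raw.get("digest"),
            "size_bytes": _to_int(raw.get("length")),
            "url_key": raw.get("urlkey"),
            "wayback_url": "https://web.archive.org/web/{}/{}".format(
                raw.get("timestamp"), raw.get("original")),
        })
    return records


# Made-up sample in the shape of a CDX JSON response; the digest is a placeholder.
sample_rows = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,example)/", "20200115120000", "http://example.com/", "text/html",
     "200", "PLACEHOLDERDIGEST", "1234"],
    ["com,example)/old", "20210301080000", "http://example.com/old", "text/html",
     "-", "PLACEHOLDERDIGEST", "-"],
]
records = rows_to_records(sample_rows)
```

Returning `None` for non-numeric status codes and sizes keeps those fields numeric in the exported dataset instead of mixing numbers with placeholder strings.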
01 Sign up on Apify
Create your Apify account to access the Wayback Machine CDX Bulk Extractor.
02 Start the run
The Actor starts automatically, using the input you provide.
03 Receive the output
Monitor progress in real time. You will be notified as soon as your dataset is complete and ready for review.
04 Integrate into your workflow
The final output is delivered in JSON, CSV, or Excel format, ready to plug into your workflow.
