๐๏ธ Google Cache Viewer โ Wayback + Archive Alternative
Pricing
Pay per event
๐๏ธ Google Cache Viewer โ Wayback + Archive Alternative
Replaces Google's cached-page view (killed Feb 2024). Queries Wayback Machine + archive.today, returns latest snapshot URL, timestamp, and extracted text content.
Pricing
Pay per event
Rating
0.0
(0)
Developer
Stephan Corbeil
Actor stats
0
Bookmarked
5
Total users
3
Monthly active users
16 days ago
Last modified
Categories
Share
Google killed their cache view in February 2024. The cache: search operator, the "Cached" link in search results, and the webcache.googleusercontent.com subdomain โ all gone. For two decades, Google Cache was how the web retrieved temporarily-dead pages, saw what a site looked like a week ago, and debugged deployment issues.
This actor replaces it with a drop-in lookup against Internet Archive Wayback Machine + archive.today, returning the freshest available snapshot for any URL.
What it does
For every URL you provide, the actor:
- Queries Wayback Machine's public "closest snapshot" API
- Queries archive.today's
/newest/endpoint (follows redirect chain) - Returns the freshest available snapshot with URL, ISO timestamp, and source
- Optionally fetches the snapshot HTML and extracts title + 8K char text content
- Emits a stable content hash for change detection
Example
import requestsr = requests.post("https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=" + APIFY_TOKEN,json={"urls": ["https://example.com/blog/post-now-deleted","https://techcrunch.com/2023/01/01/some-article"],"fetchContent": True})for item in r.json():if item["found"]:print(f"{item['url']}")print(f" Archived: {item['latest_timestamp']} via {item['source']}")print(f" Title: {item['content_title']}")print(f" Preview: {item['content_text'][:200]}...")else:print(f"{item['url']} โ NOT ARCHIVED")
Sample output:
https://example.com/blog/post-now-deletedArchived: 2023-11-04T08:22:17Z via waybackTitle: How We Scaled to 10M UsersPreview: When we hit 10 million monthly users last fall, we learned...
cURL
curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=$APIFY_TOKEN" \-H "Content-Type: application/json" \-d '{"urls":["https://example.com/"],"fetchContent":true}'
Why this replaces Google Cache
| Google Cache (dead) | This actor | |
|---|---|---|
| Status | Shut down Feb 2024 | Active |
| Access | cache:URL operator / webcache.googleusercontent.com | HTTPS API |
| Freshness | Last Google crawl | Last Wayback/archive.today snapshot (minutes to months) |
| Bulk mode | Manual, one URL at a time | 200 URLs/run |
| Text extraction | โ (raw HTML) | โ (8K char cleaned text) |
| Machine-readable | โ | โ (JSON) |
| Cost | Free | $0.003 per URL |
Common use cases
- Dead-link recovery โ find the last archived version of a page that 404'd
- SEO audits โ see what a competitor's page used to say before they rewrote it
- Journalism / OSINT โ pull the text of pages that were deleted after publication
- Legal / compliance โ document what a contract/terms page said on a given date
- Content monitoring โ track if an important page changed (via content_hash)
- Affiliate link repair โ bulk lookup of product pages that were removed
Pricing
- $0.005 per run (startup)
- $0.003 per URL looked up (includes content extraction when requested)
100 URLs with content extraction = $0.305. Cheaper than Screaming Frog's archive plugin and no subscription.
FAQ
Q: Does archive.today always have the page? A: Not always. Wayback is broader; archive.today often has freshness Wayback doesn't. The actor queries both and returns the fresher of the two.
Q: What if neither has it?
A: Returns found: false. Can't conjure pages that were never archived.
Q: Does this trigger a new archive capture? A: No โ read-only. To create a fresh capture, use Wayback's Save Page Now endpoint separately (your request, not ours).
Q: Rate limits? A: Wayback rate-limits shared usage at about 1 request/second per IP. This actor paces accordingly โ expect ~1 URL/second.
Q: How old can snapshots be? A: Wayback has archives dating to 1996. For any URL with a public history, you'll likely find something.
Related tools
Try it
๐๏ธ Google Cache Viewer on Apify
New to Apify? Get free platform credits.
๐ป Code Example โ Python
from apify_client import ApifyClientclient = ApifyClient("YOUR_APIFY_TOKEN")run = client.actor("nexgendata/google-cache-viewer").call(run_input={# Fill in the input shape from the actor's input_schema})for item in client.dataset(run["defaultDatasetId"]).iterate_items():print(item)
๐ Code Example โ cURL
curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=YOUR_TOKEN" \-H "Content-Type: application/json" \-d '{ /* input schema */ }'
โ FAQ
Q: How do I get started? Sign up at apify.com, grab your API token from Settings โ Integrations, and run the actor via the Apify console, API, Python SDK, or any integration (Zapier, Make.com, n8n).
Q: What's the typical cost per run? See the pricing section below. Most runs finish under $0.10 for typical batches.
Q: Is this actor maintained? Yes. NexGenData maintains 165+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get responses within 24 hours.
Q: Can I use the output commercially? Yes โ you own the output data. Check the target site's Terms of Service for any usage restrictions on the scraped content itself.
Q: How do I handle rate limits? Apify manages concurrency and retries automatically. For very large batches (10K+ items), run multiple smaller jobs in parallel instead of one mega-job for better reliability.
๐ฐ Pricing
Pay-per-event pricing โ you only pay for what you actually extract.
- Actor Start: $0.0001
- result: $0.0050
๐ Related NexGenData Actors
๐ Apify Affiliate Program
New to Apify? Sign up with our referral link โ you get free platform credits on signup, and you help fund the maintenance of this actor fleet.
๐ More From NexGenData
Explore the full catalog, tutorials, Gumroad data packs, and newsletter at thenextgennexus.com โ the brand home for everything we ship.
- ๐ Tutorials & how-to guides
- ๐๏ธ Full actor catalog with usage examples
- ๐ฆ Gumroad data packs (one-time purchases)
- ๐ฌ Newsletter โ monthly drops of new actors and revenue experiments
Built and maintained by NexGenData โ 165+ actors covering scraping, enrichment, MCP servers, and automation. ๐ Home: thenextgennexus.com
