🗂️ Google Cache Viewer - Wayback + Archive Alternative

Replaces Google's cached-page view (killed Feb 2024). Queries Wayback Machine + archive.today, returns latest snapshot URL, timestamp, and extracted text content.

Developer

Stephan Corbeil

Maintained by Community

Google killed their cache view in February 2024. The cache: search operator, the "Cached" link in search results, and the webcache.googleusercontent.com subdomain are all gone. For two decades, Google Cache was how people retrieved temporarily dead pages, saw what a site looked like a week ago, and debugged deployment issues.

This actor replaces it with a drop-in lookup against the Internet Archive's Wayback Machine and archive.today, returning the freshest available snapshot for any URL.

What it does

For every URL you provide, the actor:

  1. Queries Wayback Machine's public "closest snapshot" API
  2. Queries archive.today's /newest/ endpoint (follows redirect chain)
  3. Returns the freshest available snapshot with URL, ISO timestamp, and source
  4. Optionally fetches the snapshot HTML and extracts title + 8K char text content
  5. Emits a stable content hash for change detection
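
Steps 1 and 3 can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: it assumes the public Wayback availability endpoint (https://archive.org/wayback/available) and its JSON response shape, and the helper names (wayback_ts_to_iso, parse_wayback_closest, query_wayback, pick_freshest) and the candidate dict layout are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone

WAYBACK_API = "https://archive.org/wayback/available"


def wayback_ts_to_iso(ts):
    """Convert a 14-digit Wayback timestamp (YYYYMMDDhhmmss) to ISO 8601 UTC."""
    parsed = datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


def parse_wayback_closest(payload):
    """Extract the closest snapshot from an availability response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return {
        "source": "wayback",
        "snapshot_url": closest["url"],
        "timestamp": wayback_ts_to_iso(closest["timestamp"]),
    }


def query_wayback(url, timeout=30):
    """Ask Wayback for its closest snapshot of url (performs a network call)."""
    query = urllib.parse.urlencode({"url": url})
    with urllib.request.urlopen(f"{WAYBACK_API}?{query}", timeout=timeout) as resp:
        return parse_wayback_closest(json.load(resp))


def pick_freshest(candidates):
    """Return the candidate with the latest timestamp, ignoring misses (None)."""
    hits = [c for c in candidates if c is not None]
    # ISO 8601 UTC strings in one fixed format sort chronologically.
    return max(hits, key=lambda c: c["timestamp"]) if hits else None
```

A candidate from archive.today would need to be normalized into the same dict shape before being passed to pick_freshest; how the actor derives that timestamp is not covered here.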

Example

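import os

import requests

# Read the API token from the environment rather than hard-coding it.
APIFY_TOKEN = os.environ["APIFY_TOKEN"]

r = requests.post(
    "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items",
    params={"token": APIFY_TOKEN},
    json={
        "urls": [
            "https://example.com/blog/post-now-deleted",
            "https://techcrunch.com/2023/01/01/some-article",
        ],
        "fetchContent": True,
    },
    timeout=300,  # synchronous runs block until the actor finishes
)
r.raise_for_status()

for item in r.json():
    if item["found"]:
        print(item["url"])
        print(f" Archived: {item['latest_timestamp']} via {item['source']}")
        print(f" Title: {item['content_title']}")
        # Guard against a missing or empty text field before slicing.
        print(f" Preview: {(item.get('content_text') or '')[:200]}...")
    else:
        print(f"{item['url']} - NOT ARCHIVED")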

Sample output:

https://example.com/blog/post-now-deleted
 Archived: 2023-11-04T08:22:17Z via wayback
 Title: How We Scaled to 10M Users
 Preview: When we hit 10 million monthly users last fall, we learned...

cURL

curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls":["https://example.com/"],"fetchContent":true}'

Why this replaces Google Cache

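|                  | Google Cache (dead)                                 | This actor                                              |
|------------------|-----------------------------------------------------|---------------------------------------------------------|
| Status           | Shut down Feb 2024                                  | Active                                                  |
| Access           | cache:URL operator / webcache.googleusercontent.com | HTTPS API                                               |
| Freshness        | Last Google crawl                                   | Last Wayback/archive.today snapshot (minutes to months) |
| Bulk mode        | Manual, one URL at a time                           | 200 URLs/run                                            |
| Text extraction  | ❌ (raw HTML)                                       | ✅ (8K char cleaned text)                               |
| Machine-readable | ❌                                                  | ✅ (JSON)                                               |
| Cost             | Free                                                | $0.003 per URL                                          |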
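
Given the 200 URLs/run figure in the table, a longer list would presumably need to be split across several runs. A minimal sketch of that batching follows; chunk_urls is a hypothetical helper, and each batch would be sent as the urls field of a separate run.

```python
def chunk_urls(urls, batch_size=200):
    """Yield successive batches of at most batch_size URLs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


urls = [f"https://example.com/page/{i}" for i in range(450)]
batches = list(chunk_urls(urls))
print([len(b) for b in batches])  # [200, 200, 50]
```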

Common use cases

  • Dead-link recovery: find the last archived version of a page that 404'd
  • SEO audits: see what a competitor's page used to say before they rewrote it
  • Journalism / OSINT: pull the text of pages that were deleted after publication
  • Legal / compliance: document what a contract/terms page said on a given date
  • Content monitoring: track if an important page changed (via content_hash)
  • Affiliate link repair: bulk lookup of product pages that were removed
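
For the content-monitoring case, one approach is to store each URL's content_hash from a previous run and compare it with the next run's results. The sketch below assumes result items carry the url, found, and content_hash fields; detect_changes and the previous mapping are hypothetical. The actor's own hashing scheme is not documented here, so local_hash (SHA-256 over whitespace-normalized text) is only an assumed stand-in for cases where you hash text yourself.

```python
import hashlib


def local_hash(text):
    """Hash whitespace-normalized text (an assumed scheme, not the actor's)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(previous, items):
    """Return URLs whose content_hash differs from the previously stored one.

    previous maps url -> hash from an earlier run; items are result dicts
    with "url", "found", and "content_hash" fields.
    """
    changed = []
    for item in items:
        if not item.get("found"):
            continue  # nothing archived, so nothing to compare
        old = previous.get(item["url"])
        if old is not None and old != item.get("content_hash"):
            changed.append(item["url"])
    return changed
```

URLs seen for the first time are not reported as changed, since there is no earlier hash to compare against.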

Pricing

  • $0.005 per run (startup)
  • $0.003 per URL looked up (includes content extraction when requested)

100 URLs with content extraction = $0.005 + 100 × $0.003 = $0.305. Cheaper than Screaming Frog's archive plugin and no subscription.
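
The arithmetic above generalizes to any batch size. The sketch below assumes the two listed fees are the only charges; estimate_cost is a hypothetical helper, and Decimal is used to avoid floating-point rounding on small amounts.

```python
from decimal import Decimal

RUN_FEE = Decimal("0.005")      # per-run startup charge, from the list above
PER_URL_FEE = Decimal("0.003")  # per URL looked up


def estimate_cost(url_count, runs=1):
    """Estimated charge in USD for url_count lookups spread over runs runs."""
    return RUN_FEE * runs + PER_URL_FEE * url_count


print(estimate_cost(100))          # 0.305
print(estimate_cost(450, runs=3))  # 1.365
```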

FAQ

Q: Does archive.today always have the page? A: Not always. Wayback has broader coverage; archive.today often has more recent captures than Wayback. The actor queries both and returns the fresher of the two.

Q: What if neither has it? A: Returns found: false. Can't conjure pages that were never archived.
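
When processing results, it can help to separate hits from misses up front. A small sketch, assuming each result item has the url and found fields shown in the example; split_results is a hypothetical helper.

```python
def split_results(items):
    """Partition result items into (archived, missing) by the found flag."""
    archived = [item for item in items if item.get("found")]
    missing = [item["url"] for item in items if not item.get("found")]
    return archived, missing


results = [
    {"url": "https://example.com/a", "found": True},
    {"url": "https://example.com/b", "found": False},
]
archived, missing = split_results(results)
print(missing)  # ['https://example.com/b']
```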

Q: Does this trigger a new archive capture? A: No, the actor is read-only. To create a fresh capture, use Wayback's Save Page Now endpoint separately (your request, not ours).

Q: Rate limits? A: Wayback rate-limits shared usage at about 1 request/second per IP. This actor paces accordingly, so expect about 1 URL/second.
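
If you query Wayback directly from your own code, similar pacing can be approximated with a small helper like the one below. This is a hypothetical sketch, not the actor's implementation; the Pacer class and its injectable clock and sleep parameters are illustrative names.

```python
import time


class Pacer:
    """Enforce a minimum interval between calls (about 1 request/second)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until min_interval has elapsed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling pacer.wait() before each request keeps consecutive requests at least min_interval seconds apart; at the default of 1.0, a batch of 200 URLs would take a little over three minutes.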

Q: How old can snapshots be? A: Wayback has archives dating to 1996. For any URL with a public history, you'll likely find something.

Try it

🗂️ Google Cache Viewer on Apify

New to Apify? Get free platform credits.