🗂️ Google Cache Viewer - Wayback + Archive Alternative

Replaces Google's cached-page view (killed Feb 2024). Queries Wayback Machine + archive.today, returns latest snapshot URL, timestamp, and extracted text content.

Pricing: Pay per event
Rating: 0.0 (0)
Developer: Stephan Corbeil (maintained by Community)
Actor stats: 0 bookmarked, 5 total users, 3 monthly active users, last modified 16 days ago

Google began shutting down its cache view in February 2024. The cache: search operator, the "Cached" link in search results, and the webcache.googleusercontent.com subdomain are all gone. For two decades, Google Cache was how people retrieved temporarily dead pages, saw what a site looked like a week ago, and debugged deployment issues.

This actor replaces it with a drop-in lookup against Internet Archive Wayback Machine + archive.today, returning the freshest available snapshot for any URL.

What it does

For every URL you provide, the actor:

  1. Queries Wayback Machine's public "closest snapshot" API
  2. Queries archive.today's /newest/ endpoint (follows redirect chain)
  3. Returns the freshest available snapshot with URL, ISO timestamp, and source
  4. Optionally fetches the snapshot HTML and extracts title + 8K char text content
  5. Emits a stable content hash for change detection
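
The lookup-and-compare steps above can be sketched roughly as follows. This is an illustration, not the actor's own code: it uses the Internet Archive's public availability endpoint (archive.org/wayback/available), leaves out the archive.today lookup, and the helper names and the `snapshot_url` field are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone
from typing import Optional

WAYBACK_API = "https://archive.org/wayback/available"


def parse_wayback(payload: dict) -> Optional[dict]:
    """Turn an availability-API response into a result dict, or None on a miss."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Wayback timestamps are 14-digit UTC strings: YYYYMMDDhhmmss
    ts = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return {
        "snapshot_url": closest["url"],  # hypothetical field name
        "latest_timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "source": "wayback",
    }


def pick_freshest(candidates: list) -> Optional[dict]:
    """Return the candidate with the latest timestamp, ignoring misses (None)."""
    hits = [c for c in candidates if c is not None]
    # Equal-length ISO 8601 UTC strings sort chronologically
    return max(hits, key=lambda c: c["latest_timestamp"]) if hits else None


def lookup_wayback(url: str) -> Optional[dict]:
    """Network call: with no timestamp parameter, Wayback returns its most recent capture."""
    query = urllib.parse.urlencode({"url": url})
    with urllib.request.urlopen(f"{WAYBACK_API}?{query}", timeout=30) as resp:
        return parse_wayback(json.load(resp))
```

A result from a second archive, shaped the same way, could be passed to `pick_freshest` alongside the Wayback result to keep whichever capture is newer.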

Example

import os

import requests

# Read the API token from the environment rather than hard-coding it
APIFY_TOKEN = os.environ["APIFY_TOKEN"]

r = requests.post(
    "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=" + APIFY_TOKEN,
    json={
        "urls": [
            "https://example.com/blog/post-now-deleted",
            "https://techcrunch.com/2023/01/01/some-article"
        ],
        "fetchContent": True
    }
)
r.raise_for_status()  # fail early on HTTP errors

for item in r.json():
    if item["found"]:
        print(f"{item['url']}")
        print(f"  Archived: {item['latest_timestamp']} via {item['source']}")
        print(f"  Title: {item['content_title']}")
        print(f"  Preview: {item['content_text'][:200]}...")
    else:
        print(f"{item['url']}: NOT ARCHIVED")

Sample output:

https://example.com/blog/post-now-deleted
  Archived: 2023-11-04T08:22:17Z via wayback
  Title: How We Scaled to 10M Users
  Preview: When we hit 10 million monthly users last fall, we learned...

cURL

curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"urls":["https://example.com/"],"fetchContent":true}'

Why this replaces Google Cache

| Feature | Google Cache (dead) | This actor |
| --- | --- | --- |
| Status | Shut down Feb 2024 | Active |
| Access | cache:URL operator / webcache.googleusercontent.com | HTTPS API |
| Freshness | Last Google crawl | Last Wayback/archive.today snapshot (minutes to months) |
| Bulk mode | Manual, one URL at a time | 200 URLs/run |
| Text extraction | ❌ (raw HTML) | ✅ (8K char cleaned text) |
| Machine-readable | ❌ | ✅ (JSON) |
| Cost | Free | $0.003 per URL |
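
For lists longer than the 200 URLs/run figure in the comparison, one option is to split the input client-side and start one run per batch. A minimal sketch, assuming that limit is a hard cap; the helper name `batch_urls` is hypothetical.

```python
from typing import Iterator, List

MAX_URLS_PER_RUN = 200  # figure quoted in the comparison above


def batch_urls(urls: List[str], size: int = MAX_URLS_PER_RUN) -> Iterator[List[str]]:
    """Drop duplicate URLs (keeping first-seen order) and yield lists of at most `size`."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    for start in range(0, len(unique), size):
        yield unique[start:start + size]
```

Each yielded list can be sent as the `urls` field of a separate run. Dropping duplicates first avoids paying the per-URL fee twice for the same page.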

Common use cases

  • Dead-link recovery: find the last archived version of a page that 404'd
  • SEO audits: see what a competitor's page used to say before they rewrote it
  • Journalism / OSINT: pull the text of pages that were deleted after publication
  • Legal / compliance: document what a contract/terms page said on a given date
  • Content monitoring: track if an important page changed (via content_hash)
  • Affiliate link repair: bulk lookup of product pages that were removed
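
For the content-monitoring case, change detection could be done by comparing `content_hash` values between two runs over the same URLs. The sketch below assumes each dataset item carries `url`, `found`, and `content_hash` fields as used elsewhere on this page; the helper names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def hash_index(items: Iterable[dict]) -> Dict[str, Optional[str]]:
    """Map each archived URL to its content hash, skipping items that were not found."""
    return {item["url"]: item.get("content_hash") for item in items if item.get("found")}


def changed_urls(previous_items: Iterable[dict], current_items: Iterable[dict]) -> List[str]:
    """Return URLs present in both runs whose content hash differs."""
    before = hash_index(previous_items)
    after = hash_index(current_items)
    return sorted(url for url, digest in after.items() if url in before and before[url] != digest)
```

URLs that appear in only one of the two runs are ignored here, since there is nothing to compare them against.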

Pricing

  • $0.005 per run (startup)
  • $0.003 per URL looked up (includes content extraction when requested)

100 URLs with content extraction = $0.305. Cheaper than Screaming Frog's archive plugin and no subscription.
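
The arithmetic behind that figure, as a small sketch using the two rates listed in this section. The function name is hypothetical, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.005")  # per run, from the list above
PER_URL_FEE = Decimal("0.003")    # per URL looked up, from the list above


def estimate_cost(url_count: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for `url_count` lookups spread over `runs` runs."""
    if url_count < 0 or runs < 1:
        raise ValueError("url_count must be >= 0 and runs must be >= 1")
    return RUN_START_FEE * runs + PER_URL_FEE * url_count


print(estimate_cost(100))  # 0.305
```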

FAQ

Q: Does archive.today always have the page? A: Not always. Wayback has broader coverage; archive.today often has fresher captures than Wayback. The actor queries both and returns the fresher of the two.

Q: What if neither has it? A: Returns found: false. Can't conjure pages that were never archived.

Q: Does this trigger a new archive capture? A: No, it is read-only. To create a fresh capture, use Wayback's Save Page Now endpoint separately (your request, not ours).

Q: Rate limits? A: Wayback rate-limits shared usage at about 1 request/second per IP. This actor paces accordingly โ€” expect ~1 URL/second.
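
If you query Wayback directly instead of through the actor, similar pacing could look like the sketch below. It is an illustration of roughly one request per second, not the actor's implementation; the `paced` helper and its injectable `clock` and `sleep` parameters are hypothetical and exist mainly to make the logic testable.

```python
import time
from typing import Callable, Iterable, Iterator


def paced(
    items: Iterable[str],
    min_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield items no faster than one per `min_interval` seconds."""
    last = None
    for item in items:
        if last is not None:
            # Wait out whatever remains of the interval since the previous item
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item
```

Wrapping a URL list as `for url in paced(urls): ...` keeps the loop at about one lookup per second, so 200 URLs would take a little over three minutes.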

Q: How old can snapshots be? A: Wayback has archives dating to 1996. For any URL with a public history, you'll likely find something.

Try it

🗂️ Google Cache Viewer on Apify

New to Apify? Get free platform credits.

💻 Code Example: Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")
run = client.actor("nexgendata/google-cache-viewer").call(run_input={
    # Same input fields as in the Example section above
    "urls": ["https://example.com/"],
    "fetchContent": True,
})
# Each dataset item is one looked-up URL
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

🌐 Code Example: cURL

curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls":["https://example.com/"],"fetchContent":true}'

❓ FAQ

Q: How do I get started? Sign up at apify.com, grab your API token from Settings → Integrations, and run the actor via the Apify console, API, Python SDK, or any integration (Zapier, Make.com, n8n).

Q: What's the typical cost per run? See the pricing section below. Most runs finish under $0.10 for typical batches.

Q: Is this actor maintained? Yes. NexGenData maintains 165+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get responses within 24 hours.

Q: Can I use the output commercially? Yes, you own the output data. Check the target site's Terms of Service for any usage restrictions on the scraped content itself.

Q: How do I handle rate limits? Apify manages concurrency and retries automatically. For very large batches (10K+ items), run multiple smaller jobs in parallel instead of one mega-job for better reliability.

💰 Pricing

Pay-per-event pricing: you only pay for what you actually extract.

  • Actor Start: $0.0001
  • result: $0.0050

🚀 Apify Affiliate Program

New to Apify? Sign up with our referral link: you get free platform credits on signup, and you help fund the maintenance of this actor fleet.

📚 More From NexGenData

Explore the full catalog, tutorials, Gumroad data packs, and newsletter at thenextgennexus.com, the brand home for everything we ship.

  • 📖 Tutorials & how-to guides
  • 🗂️ Full actor catalog with usage examples
  • 📦 Gumroad data packs (one-time purchases)
  • 📬 Newsletter: monthly drops of new actors and revenue experiments

Built and maintained by NexGenData: 165+ actors covering scraping, enrichment, MCP servers, and automation. 🏠 Home: thenextgennexus.com