Google Cache Viewer: Wayback + Archive Alternative
Replaces Google's cached-page view (killed Feb 2024). Queries Wayback Machine + archive.today, returns latest snapshot URL, timestamp, and extracted text content.
Developer
NexGenData
Google killed its cache view in February 2024. The cache: search operator, the "Cached" link in search results, and the webcache.googleusercontent.com subdomain are all gone. For two decades, Google Cache was how people retrieved temporarily dead pages, saw what a site looked like a week ago, and debugged deployment issues.
This actor replaces it with a drop-in lookup against Internet Archive Wayback Machine + archive.today, returning the freshest available snapshot for any URL.
What it does
For every URL you provide, the actor:
- Queries Wayback Machine's public "closest snapshot" API
- Queries archive.today's /newest/ endpoint (follows the redirect chain)
- Returns the freshest available snapshot with URL, ISO timestamp, and source
- Optionally fetches the snapshot HTML and extracts title + 8K char text content
- Emits a stable content hash for change detection
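The lookup-and-compare steps above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code. It assumes the response shape of the public Wayback availability API (an `archived_snapshots.closest` object with a 14-digit timestamp). The helper names `parse_wayback` and `pick_freshest` and the `snapshot_url` field are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_wayback(payload: dict) -> Optional[dict]:
    """Extract the closest snapshot from a Wayback availability response."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    # Wayback timestamps are 14 digits: YYYYMMDDhhmmss, in UTC.
    ts = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    ts = ts.replace(tzinfo=timezone.utc)
    return {
        "source": "wayback",
        "snapshot_url": closest["url"],  # hypothetical field name
        "latest_timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def pick_freshest(candidates: list) -> Optional[dict]:
    """Return the candidate with the newest timestamp, or None if none exist."""
    found = [c for c in candidates if c is not None]
    # ISO 8601 UTC strings in one fixed format sort chronologically.
    return max(found, key=lambda c: c["latest_timestamp"], default=None)


# Hand-written sample payload in the assumed response shape.
sample = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20231104082217/https://example.com/",
            "timestamp": "20231104082217",
            "status": "200",
        }
    }
}
print(parse_wayback(sample)["latest_timestamp"])
```

Comparing ISO strings directly only works if every source is normalized to the same UTC format first, which is why the timestamp is converted before the comparison.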
Example
```python
import os

import requests

# Read the API token from the environment (same variable as the cURL example).
APIFY_TOKEN = os.environ["APIFY_TOKEN"]

r = requests.post(
    "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=" + APIFY_TOKEN,
    json={
        "urls": [
            "https://example.com/blog/post-now-deleted",
            "https://techcrunch.com/2023/01/01/some-article",
        ],
        "fetchContent": True,
    },
)

# One result item per input URL.
for item in r.json():
    if item["found"]:
        print(f"{item['url']}")
        print(f"  Archived: {item['latest_timestamp']} via {item['source']}")
        print(f"  Title: {item['content_title']}")
        print(f"  Preview: {item['content_text'][:200]}...")
    else:
        print(f"{item['url']} - NOT ARCHIVED")
```
Sample output:
```
https://example.com/blog/post-now-deleted
  Archived: 2023-11-04T08:22:17Z via wayback
  Title: How We Scaled to 10M Users
  Preview: When we hit 10 million monthly users last fall, we learned...
```
cURL
```bash
curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls":["https://example.com/"],"fetchContent":true}'
```
Why this replaces Google Cache
| | Google Cache (dead) | This actor |
|---|---|---|
| Status | Shut down Feb 2024 | Active |
| Access | cache:URL operator / webcache.googleusercontent.com | HTTPS API |
| Freshness | Last Google crawl | Last Wayback/archive.today snapshot (minutes to months) |
| Bulk mode | Manual, one URL at a time | 200 URLs/run |
| Text extraction | No (raw HTML) | Yes (8K char cleaned text) |
| Machine-readable | No | Yes (JSON) |
| Cost | Free | $0.003 per URL |
Common use cases
- Dead-link recovery: find the last archived version of a page that now returns a 404
- SEO audits: see what a competitor's page used to say before they rewrote it
- Journalism / OSINT: pull the text of pages that were deleted after publication
- Legal / compliance: document what a contract/terms page said on a given date
- Content monitoring: track whether an important page changed (via content_hash)
- Affiliate link repair: bulk lookup of product pages that were removed
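For the content-monitoring case, one possible approach is to compare the `content_hash` field across two runs. The sketch below assumes each result item carries `url` and `content_hash` keys, as the examples in this README suggest. The function name `changed_urls` and the sample hash values are hypothetical.

```python
def changed_urls(previous: list, current: list) -> list:
    """Return URLs whose content_hash differs between two runs."""
    old = {item["url"]: item.get("content_hash") for item in previous}
    changed = []
    for item in current:
        before = old.get(item["url"])
        after = item.get("content_hash")
        # Only report URLs that had a hash in both runs and now differ.
        if before is not None and after is not None and before != after:
            changed.append(item["url"])
    return changed


# Made-up items standing in for two saved dataset exports.
last_week = [
    {"url": "https://example.com/terms", "content_hash": "aaa111"},
    {"url": "https://example.com/pricing", "content_hash": "bbb222"},
]
today = [
    {"url": "https://example.com/terms", "content_hash": "aaa111"},
    {"url": "https://example.com/pricing", "content_hash": "ccc333"},
]
print(changed_urls(last_week, today))
```

A changed hash means the latest archived snapshot differs from the one seen in the earlier run. It says nothing about edits to the live page that have not been archived yet.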
Pricing
- $0.005 per run (startup)
- $0.003 per URL looked up (includes content extraction when requested)
100 URLs with content extraction = $0.305. Cheaper than Screaming Frog's archive plugin and no subscription.
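The arithmetic behind that figure can be written as a small estimator. This is a sketch that assumes the two fees listed above ($0.005 per run, $0.003 per URL) are the only charges. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

RUN_START_FEE = Decimal("0.005")  # per run, from the list above
PER_URL_FEE = Decimal("0.003")    # per URL looked up, from the list above


def estimate_cost(url_count: int) -> Decimal:
    """Estimated cost in USD for a single run over url_count URLs."""
    if url_count < 0:
        raise ValueError("url_count must be non-negative")
    return RUN_START_FEE + PER_URL_FEE * url_count


print(estimate_cost(100))
```

Decimal is used instead of float so that small per-URL amounts add up without rounding drift.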
FAQ
Q: Does archive.today always have the page?
A: Not always. Wayback has broader coverage; archive.today often has fresher captures than Wayback. The actor queries both and returns the fresher of the two.
Q: What if neither has it?
A: Returns found: false. It can't conjure pages that were never archived.
Q: Does this trigger a new archive capture?
A: No, it is read-only. To create a fresh capture, use Wayback's Save Page Now endpoint separately (your request, not ours).
Q: Rate limits?
A: Wayback rate-limits shared usage at about 1 request/second per IP. This actor paces accordingly, so expect ~1 URL/second.
Q: How old can snapshots be?
A: Wayback has archives dating to 1996. For any URL with a public history, you'll likely find something.
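Given the pacing described in the rate-limit answer, it can help to plan batches before starting a run. The sketch below assumes the 200 URLs/run figure from the comparison table and roughly one URL per second. Both constants are approximations taken from this README, and the helper names are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List

MAX_URLS_PER_RUN = 200  # assumed from the comparison table
SECONDS_PER_URL = 1.0   # assumed from the rate-limit answer


def batches(urls: Iterable[str], size: int = MAX_URLS_PER_RUN) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` URLs."""
    it = iter(urls)
    while chunk := list(islice(it, size)):
        yield chunk


def estimated_seconds(url_count: int) -> float:
    """Rough wall-clock estimate for looking up url_count URLs."""
    return url_count * SECONDS_PER_URL


urls = [f"https://example.com/page-{i}" for i in range(450)]
for n, chunk in enumerate(batches(urls), start=1):
    print(f"run {n}: {len(chunk)} URLs, about {estimated_seconds(len(chunk)):.0f}s")
```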
Try it
Google Cache Viewer on Apify
New to Apify? Get free platform credits.
Code Example: Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Input shape taken from the example earlier in this README.
run = client.actor("nexgendata/google-cache-viewer").call(
    run_input={
        "urls": ["https://example.com/"],
        "fetchContent": True,
    }
)

# Print every result item from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```
Code Example: cURL
```bash
curl -X POST "https://api.apify.com/v2/acts/nexgendata~google-cache-viewer/run-sync-get-dataset-items?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls":["https://example.com/"],"fetchContent":true}'
```
FAQ
Q: How do I get started? Sign up at apify.com, grab your API token from Settings > Integrations, and run the actor via the Apify console, API, Python SDK, or any integration (Zapier, Make.com, n8n).
Q: What's the typical cost per run? See the pricing section below. Most runs finish under $0.10 for typical batches.
Q: Is this actor maintained? Yes. NexGenData maintains 165+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get responses within 24 hours.
Q: Can I use the output commercially? Yes, you own the output data. Check the target site's Terms of Service for any usage restrictions on the scraped content itself.
Q: How do I handle rate limits? Apify manages concurrency and retries automatically. For very large batches (10K+ items), run multiple smaller jobs in parallel instead of one mega-job for better reliability.
Pricing
Pay-per-event pricing: you only pay for what you actually extract.
- Actor Start: $0.0001
- result: $0.0050
Apify Affiliate Program
New to Apify? Sign up with our referral link: you get free platform credits on signup, and you help fund the maintenance of this actor fleet.
More From NexGenData
Explore the full catalog, tutorials, Gumroad data packs, and newsletter at thenextgennexus.com, the brand home for everything we ship.
- Tutorials & how-to guides
- Full actor catalog with usage examples
- Gumroad data packs (one-time purchases)
- Newsletter: monthly drops of new actors and revenue experiments
Built and maintained by NexGenData: 165+ actors covering scraping, enrichment, MCP servers, and automation. Home: thenextgennexus.com
Why Google Cache Viewer Beats the Wayback Machine, Archive.today, Bing Cache & Cachedview.com
| Feature | NexGenData Google Cache Viewer | Internet Archive Wayback | Archive.today | Bing Cache | Cachedview.com |
|---|---|---|---|---|---|
| Cost | $0.002 per URL, pay-per-event | Free (rate-limited, slow) | Free (rate-limited) | Removed by Microsoft | Web-only (no API) |
| Bulk input | Thousands per run | One per request | One per request | RIP | One per page |
| Google cache fallback | Yes: webcache.googleusercontent.com while it lasted | No | No | RIP | Was the whole product |
| Wayback fallback | Yes: closest-to-target-date snapshot | Yes (it IS Wayback) | Partial | RIP | No |
| Archive.today fallback | Yes | No | Yes (it IS them) | RIP | No |
| Structured output | Yes: JSON with source + retrieved text + timestamp | HTML only | HTML only | RIP | HTML only |
| Schedule + webhook | Native | None | None | RIP | None |
| Monthly minimum | None | None | Donations | RIP | None |
| Auth | Apify token | None | None | RIP | None |
Google deprecated webcache.googleusercontent.com in 2024, and Bing Cache has been removed as well. This actor stitches together every remaining cache source (Wayback Machine, Archive.today, archive.org search) into one bulk pipeline that returns the closest snapshot to a target date, plus extracted text, plus the source URL of the archived copy, so SEO teams, journalists, and OSINT researchers can stop manually pasting URLs into half a dozen archive sites.
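To illustrate what "closest snapshot to a target date" means on the Wayback side, the sketch below builds a query for the public Wayback availability API, which accepts an optional `timestamp` parameter. It shows the underlying API only. Whether the actor exposes a target-date input, and under what name, is not stated here, and the function name `availability_query` is hypothetical.

```python
from datetime import date
from urllib.parse import urlencode

WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def availability_query(url: str, target: date) -> str:
    """Build a Wayback availability URL asking for the snapshot nearest `target`."""
    # The API takes timestamps as digits, here truncated to YYYYMMDD.
    params = {"url": url, "timestamp": target.strftime("%Y%m%d")}
    return f"{WAYBACK_AVAILABILITY}?{urlencode(params)}"


print(availability_query("https://example.com/terms", date(2023, 1, 15)))
```

Fetching the resulting URL returns JSON describing the nearest capture, if one exists. That is the kind of lookup useful for the legal and compliance use case, where the question is what a page said on a given date.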
Related NexGenData Infrastructure Actors
| Use case | Actor |
|---|---|
| Google CSE search API replacement | google-cse-replacement |
| goo.gl short URL resolver via Wayback | goo-gl-resolver |
| Alexa Rank replacement (site traffic) | alexa-rank-replacement |
| Wappalyzer / BuiltWith tech-stack detector | wappalyzer-replacement |
| Lighthouse + Core Web Vitals auditor | page-speed-analyzer |
| WCAG 2.2 accessibility auditor | wcag-accessibility-auditor |
| Bulk DNS A / MX / NS / TXT / CAA records | dns-records-lookup |
| WHOIS / RDAP replacement (any TLD) | whois-replacement |
| Bulk IP-to-country / city / ISP / ASN | ip-geolocation-replacement |
| Web scraping MCP for AI agents | web-scraping-mcp-server |
Browse the full NexGenData catalog of 260+ actors at https://apify.com/nexgendata?fpr=2ayu9b