Website Screenshot — Full Pages, Any Resolution, PNG, No Limits
Pricing: pay per usage
20 runs. Website screenshots as PNG/JPEG in 2 min — full-page, desktop + mobile, custom viewport, bulk URL input. Backed by a 951-run Trustpilot flagship and a 31-actor portfolio. For competitor visual tracking + UX research. spinov001@gmail.com · blog.spinov.online · t.me/scraping_ai
Developer: Alex · Last modified: 10 days ago
Website Screenshot Scraper — Playwright PNG/JPEG Capture, Custom Viewport
Capture batches of webpage screenshots — full-page or viewport-only, PNG or JPEG, custom width/height — to an Apify key-value store. Zero local browser install, zero Playwright boilerplate.
Headless Chromium (via Playwright) loads the URL with domcontentloaded waiting strategy, optionally waits for a CSS selector, captures the screenshot, stores it in the run's key-value store, and pushes one dataset record per URL with the signed image URL plus capture metadata.
What you actually get (verified against src/main.js)
Output schema — one record per URL
```json
{
  "url": "https://stripe.com",
  "title": "Stripe | Financial Infrastructure for the Internet",
  "screenshotKey": "screenshot_stripe_com_1714398900000",
  "screenshotUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/<screenshotKey>",
  "format": "png",
  "width": 1280,
  "height": 720,
  "fullPage": false,
  "fileSize": 286410,
  "scrapedAt": "2026-04-29T12:00:00.000Z"
}
```
10 fields per success record. On error, the actor pushes `{ url, error: "<reason>", scrapedAt }` so your downstream pipeline can selectively retry the failures.
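Since success and error records land in the same dataset, a retry pass can split them on the presence of the `error` field. A minimal sketch, assuming the field names shown above; the `items` list here is a stand-in for `client.dataset(...).iterate_items()`:

```python
def partition_results(items):
    """Split dataset records into successes and retryable failure URLs."""
    successes = [r for r in items if "error" not in r]
    failures = [r["url"] for r in items if "error" in r]
    return successes, failures

# Feed the failed URLs straight back into a follow-up run's "urls" input.
items = [
    {"url": "https://stripe.com", "screenshotKey": "screenshot_stripe_com_1714398900000", "fileSize": 286410},
    {"url": "https://expired.example", "error": "net::ERR_NAME_NOT_RESOLVED", "scrapedAt": "2026-04-29T12:00:00.000Z"},
]
successes, retry_urls = partition_results(items)
```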
screenshotUrl points to the file inside the Apify run's default key-value store. The store is retained per Apify's plan defaults (typically 14 days on free tier; longer on paid). For permanent retention, post-process the run via Apify webhook → S3 / R2 / your own object store.
Input (full schema, all 8 fields exposed in UI)
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| `urls` | array | `[]` | required, ≥1 | List of URLs. Plain hostnames (`stripe.com`) get `https://` prepended automatically. |
| `fullPage` | boolean | `false` | — | `true` for entire scroll height; `false` for viewport-only. |
| `width` | integer | `1280` | 320–3840 | Browser viewport width in pixels. |
| `height` | integer | `720` | 240–2160 | Browser viewport height in pixels. |
| `format` | string | `"png"` | `"png"` \| `"jpeg"` | Output format. |
| `quality` | integer | `80` | 1–100 | JPEG quality. Ignored when `format="png"`. |
| `waitForSelector` | string | `""` | CSS selector | Optional CSS selector — actor waits up to 10 s for it before capturing. Empty = skip. |
| `waitTime` | integer | `2000` | 0–30000 ms | Extra delay after load before capture. Note: a hardcoded 2000 ms settle ALSO runs before this — total minimum settle = 2000 + `waitTime` (default total = 4000 ms). |
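The hostname normalization documented for `urls` can be sketched as follows. The actor's exact implementation isn't shown here, so this is a guess that mirrors the documented behavior (prepend `https://` when no scheme is present):

```python
def normalize_url(raw: str) -> str:
    """Prepend https:// to plain hostnames; leave full URLs untouched."""
    raw = raw.strip()
    if not raw.startswith(("http://", "https://")):
        return "https://" + raw
    return raw

# normalize_url("stripe.com")         -> "https://stripe.com"
# normalize_url("https://stripe.com") -> unchanged
```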
How it works
1. Launch headless Chromium (Playwright `chromium.launch({ headless: true })`).
2. Open a new browser context with the requested `width × height` viewport.
3. For each URL:
   - `page.goto(url, { waitUntil: 'domcontentloaded', timeout: 45000 })`.
   - `page.waitForLoadState('load', { timeout: 15000 })` — best-effort, swallows timeout.
   - `page.waitForTimeout(2000)` — settles late-paint elements.
   - If `waitForSelector` is set: `page.waitForSelector(selector, { timeout: 10000 })`, swallows timeout.
   - Additional `page.waitForTimeout(waitTime)` if `waitTime > 0`.
   - `page.screenshot({ fullPage, type: format, quality? })`.
   - Save the buffer to the KV store as `screenshot_<domain>_<epochMs>`.
   - Push one dataset record.
Why not `networkidle`? Pages with persistent SSE / WebSockets / live analytics (Stripe, Linear, Vercel) never reach `networkidle` — Playwright would time out on them. The actor explicitly uses `domcontentloaded` + a soft `load` wait + a fixed `waitTime` to handle late-paint reliably.
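The `screenshot_<domain>_<epochMs>` key pattern (visible in the example output's `screenshot_stripe_com_1714398900000`) suggests non-alphanumeric characters in the hostname become underscores. A hedged sketch of that naming; the `sanitized` rule is my own reconstruction, not the actor's verified code:

```python
from urllib.parse import urlparse

def kv_key(url: str, epoch_ms: int) -> str:
    """Build a timestamped key-value-store key for a captured URL."""
    domain = urlparse(url).netloc or url
    # Assumed rule: replace every non-alphanumeric character with "_".
    sanitized = "".join(c if c.isalnum() else "_" for c in domain)
    return f"screenshot_{sanitized}_{epoch_ms}"

# kv_key("https://stripe.com", 1714398900000) -> "screenshot_stripe_com_1714398900000"
```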
Honest limitations (read before bulk runs)
- Total minimum settle = `2000 ms` + `waitTime`. There's a hardcoded `page.waitForTimeout(2000)` AFTER `domcontentloaded` and BEFORE the configurable `waitTime`. Default total is 2000 + 2000 = 4000 ms of fixed delay per URL on top of network time. For very fast pages this is overkill; for very slow SPA shells it may still be too short — adjust `waitTime`.
- Single browser context, sequential URL processing. The actor opens ONE Chromium context and processes URLs in a `for` loop. 100 URLs × ~6 s wall-clock each ≈ 10 min. No parallelism.
- One outer try/catch wraps the browser launch only — per-URL errors are caught individually. If a single URL fails (timeout, DNS, navigation error), the actor pushes `{ url, error, scrapedAt }` and CONTINUES to the next URL. However, if the browser itself crashes mid-batch, the whole run aborts (no auto-relaunch).
- Cloudflare Turnstile / hCaptcha / anti-bot walls block the actor. Standard headless-Chromium fingerprint — no stealth plugins. Cloudflare will challenge or block; expect either an error record or a screenshot of the challenge page.
- No login / cookie injection. Fresh browser context per run. Pages behind auth render their pre-login state. Login-walled captures = custom build.
- No element-crop, no auto-scroll for lazy-load images, no banner-dismissal heuristics. Full-page captures of cookie-banner-heavy sites will show the banner overlay. Custom build can dismiss common banners (Cookiebot, OneTrust, Quantcast).
- No proxy. Direct browser launch on Apify worker IP. Geo-restricted pages render with worker's region (typically US/EU).
- Screenshot retention is per-Apify-plan default — typically 14 days on free tier, longer on paid. For permanent retention, copy via webhook to your own S3 / R2 / Backblaze.
- Filenames are timestamp-keyed (`screenshot_<domain>_<epochMs>`) — repeated captures of the same URL produce DIFFERENT keys (no overwrite). Useful for archival, but it means the key-value store grows unbounded — manage retention yourself.
- `title` is the page's `document.title` AT capture time — for SPAs the title may still be the shell's default if hydration hasn't completed within the 4 s settle window.
- The `waitForSelector` timeout (10 s) is silent — if the selector never appears, the actor proceeds with the current page state (caught with `.catch(() => {})`).
- `urls = []` is silently accepted — the actor exits without pushing any records (only browser-launch logs).
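Because keys are timestamped and never overwritten, the store grows on every run. A hedged pruning sketch: it parses the trailing epoch milliseconds out of the key pattern above and only decides *which* keys to drop; wiring it to `apify_client`'s key-value-store listing and record deletion is left to you:

```python
import time

def stale_keys(keys, max_age_days, now_ms=None):
    """Return screenshot keys whose trailing epoch-ms timestamp is older than max_age_days."""
    now_ms = now_ms or int(time.time() * 1000)
    cutoff = now_ms - max_age_days * 86_400_000  # ms per day
    stale = []
    for key in keys:
        try:
            epoch_ms = int(key.rsplit("_", 1)[-1])
        except ValueError:
            continue  # not a screenshot_<domain>_<epochMs> key, leave it alone
        if epoch_ms < cutoff:
            stale.append(key)
    return stale
```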
Who buys this actor
- Visual-regression QA engineers running nightly screenshot diffs against staging + prod to catch CSS regressions before users do.
- Competitive-intel teams archiving weekly snapshots of competitor landing pages (pricing, feature lists, hero copy) for deal-review decks.
- Content archival / journalism preserving webpage state for takedown resilience (source of truth when a page later changes or 404s).
- Link-preview / OG-image fallback services generating thumbnail cards for social feeds when the upstream page lacks proper `og:image` tags.
- Brand / trademark monitoring capturing how your logo or copy is displayed on partner, affiliate, and unauthorized resale sites.
- MCP / LLM-agent tools giving an agent the ability to "see" a webpage when DOM-only context isn't enough.
Python example — visual-regression diff
Capture the same set of paths twice (staging + prod) and flag any byte-size delta >5%:
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

pages = ["/", "/pricing", "/docs", "/blog", "/login"]

def capture(base_url: str) -> dict[str, dict]:
    run = client.actor("knotless_cadence/website-screenshot-scraper").call(
        run_input={
            "urls": [base_url + p for p in pages],
            "fullPage": True,
            "width": 1440,
            "height": 900,
            "format": "png",
        }
    )
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    return {i["url"].replace(base_url, ""): i for i in items}

staging = capture("https://staging.example.com")
prod = capture("https://www.example.com")

for path in pages:
    s = staging.get(path, {}).get("fileSize", 0)
    p = prod.get(path, {}).get("fileSize", 0)
    if p and abs(s - p) / p > 0.05:
        print(f"⚠ {path} diff {((s - p) / p) * 100:+.1f}% staging={s}B prod={p}B")
        print(f"  {staging[path]['screenshotUrl']}")
        print(f"  {prod[path]['screenshotUrl']}")
```
MCP / LLM-agent integration
```python
# `client` is the ApifyClient instance from the previous example.
tools = [
    {
        "name": "capture_webpage",
        "description": "Take a screenshot of a webpage and return the image URL.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string"},
                "fullPage": {"type": "boolean", "default": False},
            },
            "required": ["url"],
        },
    }
]

def capture_webpage(url: str, full_page: bool = False) -> str:
    run = client.actor("knotless_cadence/website-screenshot-scraper").call(
        run_input={"urls": [url], "fullPage": full_page, "format": "png"},
    )
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())[0]["screenshotUrl"]
```
Pair with Claude Vision / GPT-4o for accessibility audits, brand-compliance checks, or end-to-end QA that tests "looks right" not just "DOM matches".
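To close the loop with an LLM client, a thin dispatcher can route the model's tool-use request to a registered handler. A sketch with a stubbed capture function so it runs without network; in a real agent loop the stub would be the `capture_webpage` function above:

```python
def dispatch_tool_call(tool_call: dict, handlers: dict) -> str:
    """Route an LLM tool-use request {name, input} to its registered handler."""
    handler = handlers[tool_call["name"]]
    return handler(**tool_call["input"])

# Stub standing in for the real actor-backed capture (no network needed here).
def fake_capture(url: str, fullPage: bool = False) -> str:
    return f"https://api.apify.com/v2/key-value-stores/STORE/records/screenshot_{url}"

result = dispatch_tool_call(
    {"name": "capture_webpage", "input": {"url": "example.com", "fullPage": True}},
    {"capture_webpage": fake_capture},
)
```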
Common questions
Q: Can I capture a specific element instead of the whole page? A: Not in this actor. Workaround: capture full page, crop locally with Pillow / sharp using the element's bounding box from a companion DOM query. Available as a custom build (see Custom scraping below).
Q: How do I get screenshots at multiple breakpoints (320, 768, 1440 px) in one run?
A: Call the actor 3 times with different `width` values. Native multi-viewport input is on the roadmap but not implemented yet.
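The three-breakpoint workaround can be sketched as one run input per viewport. The heights below are my own illustrative picks, not values from the actor; the actual `.call()` is commented out so the snippet stays offline:

```python
# (width, height) pairs — heights are illustrative assumptions.
BREAKPOINTS = [(320, 568), (768, 1024), (1440, 900)]

def breakpoint_inputs(urls: list) -> list:
    """Build one run_input dict per viewport breakpoint."""
    return [
        {"urls": urls, "fullPage": True, "width": w, "height": h, "format": "png"}
        for w, h in BREAKPOINTS
    ]

inputs = breakpoint_inputs(["https://example.com/pricing"])
# for run_input in inputs:
#     client.actor("knotless_cadence/website-screenshot-scraper").call(run_input=run_input)
```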
Q: What about pages behind a login wall? A: Not supported in v1.0 — the actor uses a fresh browser context per run with no cookie / session injection. Custom build with cookie / session-token injection available on request.
Q: Does this bypass Cloudflare or captchas? A: No. Standard headless Chromium fingerprint. Aggressive bot-protection (Cloudflare Turnstile, hCaptcha) will block the actor.
Q: Can I schedule this nightly? A: Yes — Apify has native cron scheduling. Set the actor to run daily, pipe the output dataset to your webhook / Slack / S3 sync.
Q: How long do screenshots stay accessible? A: Per Apify plan defaults — typically 14 days on free tier, longer on paid. For permanent retention, copy PNGs to your own S3 / R2 / Backblaze bucket via Apify webhook or a post-run script.
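A post-run archival script can walk the dataset and mirror each `screenshotUrl` into date-partitioned object keys. The path scheme below is my own convention, and the download/upload calls are commented out so the helper stays pure:

```python
from datetime import datetime

def archive_path(record: dict, bucket: str = "my-screenshots") -> str:
    """Date-partitioned object key for a dataset record, e.g. my-screenshots/2026/04/29/<key>.png"""
    day = datetime.fromisoformat(record["scrapedAt"].replace("Z", "+00:00"))
    return f"{bucket}/{day:%Y/%m/%d}/{record['screenshotKey']}.{record['format']}"

record = {
    "scrapedAt": "2026-04-29T12:00:00.000Z",
    "screenshotKey": "screenshot_stripe_com_1714398900000",
    "format": "png",
}
# import urllib.request
# data = urllib.request.urlopen(record["screenshotUrl"]).read()  # then PUT to S3 / R2 / Backblaze
```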
Visual / monitoring toolkit (companion actors)
| Tool | Purpose |
|---|---|
| Website Screenshot Scraper (this) | Capture any page visually |
| Website Uptime Checker | Monitor availability / latency |
| Broken Links Checker | Find 404s on your site |
| PageSpeed Insights Scraper | Lighthouse / Core Web Vitals |
| HTTP Headers Checker | Security-headers audit |
| Webpage Text Extractor | Clean article text from HTML |
| URL Expander | Resolve shortlink chains |
All 31 published actors free to inspect on Apify Store.
Custom scraping — pilot tiers
Need element-crop, multi-viewport, login-walled captures, or a different schema (visual-diff metric, OCR'd text-overlay, automatic banner dismissal)? Three tiers:
- Pilot — $97 · 1 actor, basic config, 7-day support. Good entry point — useful for a single visual-regression pipeline or a one-off competitor archival sweep.
- Standard — $297 · custom actor + Slack/email alerts on results, 30-day support. Most QA-automation and competitive-intel projects fit here.
- Premium — $797 · custom actor + dashboard + 90-day support + 1 modification round. For ongoing pipelines (daily multi-breakpoint capture, brand-monitoring rollups).
Email: spinov001@gmail.com — drop the URL list and the schema you need; quote within 48h.
Proof of work: 31 published Apify scrapers (78 total in portfolio) — Trustpilot 949 runs, Reddit 80+, Google News 43, Glassdoor 37, Email Extractor 36+. Recently delivered a paid 3-article series for a client in the proxy industry ($150).
More tips: t.me/scraping_ai · blog.spinov.online
Disclaimer
Designed for QA, archival, and competitive-research use. Respect target-site Terms of Service, applicable data-protection law (GDPR, CCPA), and capture publicly accessible pages only. Not affiliated with any of the example domains shown.
Honest disclosure: 10 output fields per success record (url, title, screenshotKey, screenshotUrl, format, width, height, fullPage, fileSize, scrapedAt). All 8 input fields now exposed in INPUT_SCHEMA (UI form). Total minimum settle = 2000 ms hardcoded + waitTime (default total = 4000 ms). Sequential processing — no parallelism. Per-URL errors push an error record and continue; browser-crash aborts run. No element-crop, no cookie / session injection, no auto-scroll for lazy-load, no Cloudflare / captcha bypass, no proxy. Wait strategy is domcontentloaded + soft load + fixed waitTime — networkidle is intentionally avoided because it hangs on SSE / WebSocket sites.