Website Screenshot — Full Pages, Any Resolution, PNG, No Limits

Website screenshots as PNG/JPEG in 2 min — full-page, desktop + mobile, custom viewport, bulk URL input. Backed by a 951-run Trustpilot flagship and a 31-actor portfolio. For competitor visual tracking + UX research. spinov001@gmail.com · blog.spinov.online · t.me/scraping_ai


Website Screenshot Scraper — Playwright PNG/JPEG Capture, Custom Viewport

Capture batches of webpage screenshots — full-page or viewport-only, PNG or JPEG, custom width/height — to an Apify key-value store. Zero local browser install, zero Playwright boilerplate.

Headless Chromium (via Playwright) loads the URL with the domcontentloaded waiting strategy, optionally waits for a CSS selector, captures the screenshot, stores it in the run's key-value store, and pushes one dataset record per URL with the signed image URL plus capture metadata.
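
Quick start with the Apify Python client — a minimal call using all defaults (replace YOUR_APIFY_TOKEN with your own token):

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Single URL, default 1280×720 viewport-only PNG.
run = client.actor("knotless_cadence/website-screenshot-scraper").call(
    run_input={"urls": ["stripe.com"]}  # bare hostname; https:// gets prepended
)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item.get("screenshotUrl", item.get("error")))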


What you actually get (verified against src/main.js)

Output schema — one record per URL

{
  "url": "https://stripe.com",
  "title": "Stripe | Financial Infrastructure for the Internet",
  "screenshotKey": "screenshot_stripe_com_1714398900000",
  "screenshotUrl": "https://api.apify.com/v2/key-value-stores/<storeId>/records/<screenshotKey>",
  "format": "png",
  "width": 1280,
  "height": 720,
  "fullPage": false,
  "fileSize": 286410,
  "scrapedAt": "2024-04-29T12:00:00.000Z"
}

10 fields per success record. On error, the actor pushes { url, error: "<reason>", scrapedAt } so your downstream pipeline can retry the failures selectively.
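
A sketch of that selective retry, assuming the client and run objects from the quick-start above:

# Re-run only the URLs that produced error records.
items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
failed = [i["url"] for i in items if "error" in i]
if failed:
    retry_run = client.actor("knotless_cadence/website-screenshot-scraper").call(
        run_input={"urls": failed}
    )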

screenshotUrl points to the file inside the Apify run's default key-value store. The store is retained per Apify's plan defaults (typically 14 days on free tier; longer on paid). For permanent retention, post-process the run via Apify webhook → S3 / R2 / your own object store.
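
A minimal sketch of that copy step, assuming boto3 credentials are configured; the bucket name my-screenshot-archive is illustrative:

import boto3
import requests

s3 = boto3.client("s3")
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    if "screenshotUrl" not in item:
        continue  # skip error records
    resp = requests.get(item["screenshotUrl"], timeout=60)
    resp.raise_for_status()
    s3.put_object(
        Bucket="my-screenshot-archive",
        Key=item["screenshotKey"] + "." + item["format"],  # "png" or "jpeg"
        Body=resp.content,
        ContentType="image/" + item["format"],
    )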


Input (full schema, all 8 fields exposed in UI)

| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| urls | array | [] | required, ≥1 | List of URLs. Plain hostnames (stripe.com) get https:// prepended automatically. |
| fullPage | boolean | false | | true for entire scroll height; false for viewport-only. |
| width | integer | 1280 | 320–3840 | Browser viewport width in pixels. |
| height | integer | 720 | 240–2160 | Browser viewport height in pixels. |
| format | string | "png" | "png" \| "jpeg" | Output format. |
| quality | integer | 80 | 1–100 | JPEG quality. Ignored when format="png". |
| waitForSelector | string | "" | CSS selector | Optional CSS selector — actor waits up to 10 s for it before capturing. Empty = skip. |
| waitTime | integer | 2000 | 0–30000 ms | Extra delay after load before capture. Note: a hardcoded 2000 ms settle ALSO runs before this — total minimum settle = 2000 + waitTime (default total = 4000 ms). |
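
An input exercising all 8 fields (values illustrative):

{
  "urls": ["stripe.com", "https://linear.app"],
  "fullPage": true,
  "width": 1440,
  "height": 900,
  "format": "jpeg",
  "quality": 85,
  "waitForSelector": ".hero",
  "waitTime": 3000
}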

How it works

  1. Launch headless Chromium (Playwright chromium.launch({ headless: true })).
  2. New browser context with the requested width × height viewport.
  3. For each URL:
    • page.goto(url, { waitUntil: 'domcontentloaded', timeout: 45000 }).
    • page.waitForLoadState('load', { timeout: 15000 }) — best-effort, swallows timeout.
    • page.waitForTimeout(2000) — settles late-paint elements.
    • If waitForSelector: page.waitForSelector(selector, { timeout: 10000 }), swallows timeout.
    • Additional page.waitForTimeout(waitTime) if waitTime > 0.
    • page.screenshot({ fullPage, type: format, quality? }).
    • Save buffer to KV store as screenshot_<domain>_<epochMs>.
    • Push one dataset record.

Why not networkidle? Pages with persistent SSE / WebSockets / live analytics (Stripe, Linear, Vercel) never reach networkidle — Playwright would time out on them. The actor explicitly uses domcontentloaded + a soft load wait + a fixed waitTime to handle late-paint reliably.
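
The actor itself is Node.js Playwright (src/main.js); for reference, the same wait sequence sketched in Playwright for Python:

from playwright.sync_api import sync_playwright

with sync_playwright() as pw:
    browser = pw.chromium.launch(headless=True)
    page = browser.new_context(viewport={"width": 1280, "height": 720}).new_page()
    page.goto("https://stripe.com", wait_until="domcontentloaded", timeout=45000)
    try:
        page.wait_for_load_state("load", timeout=15000)  # best-effort soft wait
    except Exception:
        pass  # swallow the timeout, proceed with current page state
    page.wait_for_timeout(2000)  # hardcoded settle for late-paint elements
    page.wait_for_timeout(2000)  # configurable waitTime (default 2000 ms)
    png_bytes = page.screenshot(full_page=False, type="png")
    browser.close()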


Honest limitations (read before bulk runs)

  • Total minimum settle = 2000 ms + waitTime. There's a hardcoded page.waitForTimeout(2000) AFTER domcontentloaded and BEFORE the configurable waitTime. Default total is 2000 + 2000 = 4000 ms of fixed delay per URL on top of network time. For very fast pages this is overkill; for very slow SPA shells it may still be too short — adjust waitTime.
  • Single browser context, sequential URL processing. The actor opens ONE Chromium context and processes URLs in a for loop. 100 URLs × ~6 s wall-clock each ≈ 10 min. No parallelism.
  • One outer try/catch wraps browser launch only — per-URL errors are caught. If a single URL fails (timeout, DNS, navigation error), the actor pushes { url, error, scrapedAt } and CONTINUES to the next URL. However, if the browser itself crashes mid-batch, the whole run aborts (no auto-relaunch).
  • Cloudflare Turnstile / hCaptcha / anti-bot walls block the actor. Standard headless Chromium fingerprint — no stealth plugins. Cloudflare will challenge or block; expect either an error record or a screenshot of the challenge page.
  • No login / cookie injection. Fresh browser context per run. Pages behind auth render their pre-login state. Login-walled captures = custom build.
  • No element-crop, no auto-scroll for lazy-load images, no banner-dismissal heuristics. Full-page captures of cookie-banner-heavy sites will show the banner overlay. Custom build can dismiss common banners (Cookiebot, OneTrust, Quantcast).
  • No proxy. Direct browser launch on the Apify worker IP. Geo-restricted pages render as seen from the worker's region (typically US/EU).
  • Screenshot retention is per-Apify-plan default — typically 14 days on free tier, longer on paid. For permanent retention, copy via webhook to your own S3 / R2 / Backblaze.
  • Filename is timestamp-keyed screenshot_<domain>_<epochMs> — repeated captures of the same URL produce DIFFERENT keys (no overwrite). Useful for archival; means key-value store grows unbounded — manage retention yourself.
  • title is page document.title AT capture time — for SPAs the title may still be the shell's default if hydration hasn't completed within the 4 s settle window.
  • waitForSelector timeout (10 s) is silent — if the selector never appears, the actor proceeds with the current page state (caught .catch(() => {})).
  • urls = [] is silently accepted — actor exits without pushing any records (only browser launch logs).

Who buys this actor

  • Visual-regression QA engineers running nightly screenshot diffs against staging + prod to catch CSS regressions before users do.
  • Competitive-intel teams archiving weekly snapshots of competitor landing pages (pricing, feature lists, hero copy) for deal-review decks.
  • Content archival / journalism preserving webpage state for takedown resilience (source of truth when a page later changes or 404s).
  • Link-preview / OG-image fallback services generating thumbnail cards for social feeds when the upstream page lacks proper og:image tags.
  • Brand / trademark monitoring capturing how your logo or copy is displayed on partner, affiliate, and unauthorized resale sites.
  • MCP / LLM-agent tools giving an agent the ability to "see" a webpage when DOM-only context isn't enough.

Python example — visual-regression diff

Capture the same set of paths twice (staging + prod) and flag any byte-size delta >5%:

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")
pages = ["/", "/pricing", "/docs", "/blog", "/login"]

def capture(base_url: str) -> dict[str, dict]:
    run = client.actor("knotless_cadence/website-screenshot-scraper").call(run_input={
        "urls": [base_url + p for p in pages],
        "fullPage": True,
        "width": 1440,
        "height": 900,
        "format": "png",
    })
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    # Key each record by its path so staging and prod line up.
    return {i["url"].replace(base_url, ""): i for i in items}

staging = capture("https://staging.example.com")
prod = capture("https://www.example.com")

for path in pages:
    s = staging.get(path, {}).get("fileSize", 0)
    p = prod.get(path, {}).get("fileSize", 0)
    if s and p and abs(s - p) / p > 0.05:  # require both captures to exist
        print(f"⚠ {path} diff {((s - p) / p) * 100:+.1f}% staging={s}B prod={p}B")
        print(f"  {staging[path]['screenshotUrl']}")
        print(f"  {prod[path]['screenshotUrl']}")

MCP / LLM-agent integration

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Tool definition the agent sees (Anthropic-style input_schema).
tools = [{
    "name": "capture_webpage",
    "description": "Take a screenshot of a webpage and return the image URL.",
    "input_schema": {
        "type": "object",
        "properties": {
            "url": {"type": "string"},
            "fullPage": {"type": "boolean", "default": False},
        },
        "required": ["url"],
    },
}]

# Handler the agent's tool call dispatches to.
def capture_webpage(url: str, full_page: bool = False) -> str:
    run = client.actor("knotless_cadence/website-screenshot-scraper").call(run_input={
        "urls": [url], "fullPage": full_page, "format": "png",
    })
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())[0]["screenshotUrl"]

Pair with Claude Vision / GPT-4o for accessibility audits, brand-compliance checks, or end-to-end QA that tests "looks right" not just "DOM matches".


Common questions

Q: Can I capture a specific element instead of the whole page? A: Not in this actor. Workaround: capture full page, crop locally with Pillow / sharp using the element's bounding box from a companion DOM query. Available as a custom build (see Custom scraping below).
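
A sketch of that workaround with Pillow, assuming item is a success record from the dataset and the bounding box came from a companion DOM query (coordinates illustrative):

from io import BytesIO

import requests
from PIL import Image

box = {"x": 120, "y": 640, "width": 800, "height": 420}  # from your DOM query
img = Image.open(BytesIO(requests.get(item["screenshotUrl"], timeout=60).content))
img.crop((box["x"], box["y"],
          box["x"] + box["width"], box["y"] + box["height"])).save("element.png")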

Q: How do I get screenshots at multiple breakpoints (320, 768, 1440 px) in one run? A: Call the actor 3 times with different width. Native multi-viewport input is on the roadmap but not implemented yet.
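
Until then, a simple loop covers it, assuming the client object from the earlier examples:

# One run per breakpoint; heights are illustrative companions to the widths.
for width, height in [(320, 568), (768, 1024), (1440, 900)]:
    client.actor("knotless_cadence/website-screenshot-scraper").call(
        run_input={"urls": ["stripe.com"], "width": width, "height": height}
    )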

Q: What about pages behind a login wall? A: Not supported in v1.0 — the actor uses a fresh browser context per run with no cookie / session injection. Custom build with cookie / session-token injection available on request.

Q: Does this bypass Cloudflare or captchas? A: No. Standard headless Chromium fingerprint. Aggressive bot-protection (Cloudflare Turnstile, hCaptcha) will block the actor.

Q: Can I schedule this nightly? A: Yes — Apify has native cron scheduling. Set the actor to run daily, pipe the output dataset to your webhook / Slack / S3 sync.

Q: How long do screenshots stay accessible? A: Per Apify plan defaults — typically 14 days on free tier, longer on paid. For permanent retention, copy PNGs to your own S3 / R2 / Backblaze bucket via Apify webhook or a post-run script.


Visual / monitoring toolkit (companion actors)

| Tool | Purpose |
|---|---|
| Website Screenshot Scraper (this) | Capture any page visually |
| Website Uptime Checker | Monitor availability / latency |
| Broken Links Checker | Find 404s on your site |
| PageSpeed Insights Scraper | Lighthouse / Core Web Vitals |
| HTTP Headers Checker | Security-headers audit |
| Webpage Text Extractor | Clean article text from HTML |
| URL Expander | Resolve shortlink chains |

All 31 published actors free to inspect on Apify Store.


Custom scraping — pilot tiers

Need element-crop, multi-viewport, login-walled captures, or a different schema (visual-diff metric, OCR'd text-overlay, automatic banner dismissal)? Three tiers:

  • Pilot — $97 · 1 actor, basic config, 7-day support. Good entry point — useful for a single visual-regression pipeline or a one-off competitor archival sweep.
  • Standard — $297 · custom actor + Slack/email alerts on results, 30-day support. Most QA-automation and competitive-intel projects fit here.
  • Premium — $797 · custom actor + dashboard + 90-day support + 1 modification round. For ongoing pipelines (daily multi-breakpoint capture, brand-monitoring rollups).

Email: spinov001@gmail.com — drop the URL list and the schema you need; quote within 48h.

Proof of work: 31 published Apify scrapers (78 total in portfolio) — Trustpilot 949 runs, Reddit 80+, Google News 43, Glassdoor 37, Email Extractor 36+. Recently delivered a paid 3-article series for a client in the proxy industry ($150).

More tips: t.me/scraping_ai · blog.spinov.online


Disclaimer

Designed for QA, archival, and competitive-research use. Respect target-site Terms of Service, applicable data-protection law (GDPR, CCPA), and capture publicly accessible pages only. Not affiliated with any of the example domains shown.

Honest disclosure: 10 output fields per success record (url, title, screenshotKey, screenshotUrl, format, width, height, fullPage, fileSize, scrapedAt). All 8 input fields exposed in INPUT_SCHEMA (UI form). Total minimum settle = 2000 ms hardcoded + waitTime (default total = 4000 ms). Sequential processing — no parallelism. Per-URL errors push an error record and continue; a browser crash aborts the run. No element-crop, no cookie / session injection, no auto-scroll for lazy-load, no Cloudflare / captcha bypass, no proxy. Wait strategy is domcontentloaded + soft load + fixed waitTime; networkidle is intentionally avoided because it hangs on SSE / WebSocket sites.