Scrape any advertiser's Facebook & Instagram ads from Meta's Ad Library: creatives, copy, CTA, run dates. $3.50 per 1,000 ads — no Meta API approval needed.
All notable changes to this actor are documented here. Format loosely follows Keep a Changelog ; the actor follows semantic versioning over the output contract.
[Unreleased]
Changed
Static resources blocked during page loads: images, fonts, and video (.jpg/.png/.webp/.woff/.mp4/…) are aborted before they download — ad data comes from SSR HTML, so creative pixels never reach the parser. Cuts proxy bandwidth on every tier; matters most on residential ($8/GB), where an all-escalated run previously transferred ~8 MB/query. CSS and JS are intentionally still loaded (no-style page loads are a bot-detection signal; the SSR payload arrives in script tags). Output contract unchanged.
[0.5] — 2026-06-04
Added
Automatic residential proxy escalation: queries that exhaust datacenter retries with a blocked failure are retried once through Apify's residential proxy group (geo-matched to the run's region) after the datacenter phase completes. Rescued queries count as succeeded; queries blocked on both tiers stay blocked. Residential bandwidth is spent exclusively on queries datacenter already failed. Every escalation is visible: WARN logs, warnings[] entries naming the per-tier outcome, and new run-summary counters queries_escalated / queries_rescued (additive fields).
Changed
queries_blocked, block_rate, and the all-blocked run-failure check (from 0.4) are now evaluated after the residential phase — final-tier accounting. A fully-rescued run is an honest success (exit 0, block_rate: 0.0).
A circuit-breaker abort no longer prevents the rescue attempt: queries already captured as blocked still get their residential retry; the unprocessed remainder of the queue stays unprocessed and the run still exits non-zero.
[0.4] — 2026-06-04
Fixed
HTTP-status blocks now categorized blocked: Crawlee's middleware converts 403/429 responses into SessionError before the request handler runs; these were falling through to the parse category, so block_rate in the run summary read 0.0 even when every query was blocked (observed in run cthSbYR5cqtkaZ6Z3). SessionError (and any session-named exception) now counts as blocked, making block_rate and the per-query (blocked) warnings trustworthy.
Changed
All-blocked runs now fail: when every query in a run fails terminally and at least one failure is categorized blocked, the run exits non-zero with aborted: true and aborted_reason naming systemic blocking — even below the 5-consecutive circuit-breaker threshold. Previously such runs exited 0 ("Succeeded") with zero ads. Note: this lowers the actor's public success-rate stat for runs that were never actually succeeding; an honest failure beats a successful-looking zero-ad run. No output fields added or removed.
[0.3] — 2026-06-04
Changed
Forgiving input validation: when the selected search_mode's required field is empty but exactly one other lookup field is populated (e.g. search_mode: "keyword" with only queries filled in), the run now auto-corrects search_mode to match the populated field instead of failing. Every auto-correction emits a WARN log and an entry in the run summary warnings[] naming the original mode, the corrected mode, and the driving field. Inputs with zero populated lookup fields, or with two or more populated while the mode's own field is empty (ambiguous intent), still fail validation — the ambiguous-case error now lists the populated fields and the modes they imply. Existing valid inputs behave exactly as before.
[0.2] — 2026-05-16
Added
match_type input field ("broad" | "exact", default "broad"). "broad" maps to Meta's search_type=keyword_unordered (current behavior, no change for existing runs); "exact" maps to search_type=keyword_exact_phrase. Applies to keyword, page_url, and batch modes; ignored for page_id mode.
Changed
max_session_rotations raised from the Crawlee default of 10 to 30, giving each query up to 30 fresh datacenter IPs from BUYPROXIES94952 before failing. Helps transient 403 blocks clear without escalating to residential proxy.
Fixed
Output schema templates use links.apiDefault* instead of the non-existent links.publicDefault*, so the Output tab now renders dataset and run-summary links.
[1.0.0] — Unreleased
Initial public release.
Added
Four lookup modes (page_id, page_url, keyword, batch) with per-mode input validation.
page_url resolution via keyword + page_profile_uri handle-match filter — never silently returns a near-match.
Per-item output schema (28 fields) including lifecycle dates (start_date, end_date) extracted from Meta's SSR.
Opt-in pagination (follow_pagination) with hard cap (max_ads_per_advertiser, 1–500).
Pydantic schema validation with non-silent _validation_errors envelope.
Per-run summary written to KV store as OUTPUT.json (block rate, validation error rate, per-advertiser totals + cursors, warnings).