Hybrid Ad Intelligence Scraper
Pricing
from $0.20 / 1,000 results
Scrape competitor ads from Google Ads Transparency & Meta Ads Library, optionally enriched with Poweradspy metrics. Features LLM self-healing fallback, landing page tech detection (Shopify, WooCommerce), global deduplication, and automated inspiration reports.
Developer
Solutions Smart
7 days ago
Last modified
A resilient, hybrid ad scraper that marries free public ad libraries (Google Ads Transparency, Meta Ads Library) with premium Poweradspy intelligence. It features self-healing extraction using an LLM-powered Stagehand fallback, HTTP-first cost-control for enrichment, and sophisticated deduplication + reporting.
High-Level Features
- Hybrid sources
  - google_ads_transparency (Google Ads Transparency Center, experimental selectors + optional LLM fallback).
  - meta_ads_library (official Meta Ads Library Graph API, requires metaAccessToken).
  - poweradspy (premium data + engagement metrics; can also generate mock data for testing).
- Modes
  - Free Sources Only (mode: "free"): Use public libraries only.
  - Auto / Premium (mode: "auto"): If Poweradspy credentials are present, enrich / collect via Poweradspy as well.
  - Poweradspy Only (mode: "poweradspy"): Skip free sources and use only Poweradspy.
- Resilience first
  - Default extraction uses fast Playwright selectors.
  - When selectors fail or return 0, a Stagehand LLM fallback can take over (if OPENAI_API_KEY / ANTHROPIC_API_KEY is configured).
- Enrichment & cost control
  - HTTP-first landing-page platform detection (Shopify, WooCommerce, etc.) with concurrency + RPM limits.
  - Global deduplication across all sources with customizable keys.
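As an illustration of deduplication with customizable keys, here is a minimal sketch. The function name `dedupe_ads`, the default key tuple, and the case/whitespace normalization are assumptions made for this example; the Actor's real implementation and its input field for keys are not documented here.

```python
from typing import Dict, Iterable, List, Sequence


def dedupe_ads(
    ads: Iterable[Dict],
    keys: Sequence[str] = ("advertiserName", "adCopy", "landingPageUrl"),
) -> List[Dict]:
    """Keep the first ad seen for each fingerprint built from `keys`.

    Hypothetical helper: the default keys and the normalization below
    are illustrative assumptions, not the Actor's documented behavior.
    """
    seen = set()
    unique: List[Dict] = []
    for ad in ads:
        # Normalize each key value so trivial case/whitespace differences
        # between sources do not defeat deduplication.
        fingerprint = tuple(str(ad.get(k) or "").strip().lower() for k in keys)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        unique.append(ad)
    return unique
```

Because the first occurrence wins, the order in which sources are merged decides which copy of a duplicated ad is kept.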
Common Input Examples
1. Free Mode (Google only, no credentials)
{
  "mode": "free",
  "freeSources": ["google_ads_transparency"],
  "searchKeywords": "coffee",
  "maxResultsPerSource": 50,
  "googleAdvertisersToScan": 2,
  "googleDetailPagesToVisit": 3,
  "stagehandEnabled": true,
  "extractionEngine": "stagehand_fallback",
  "enableLandingPlatformDetection": true
}
Notes:
- Without OPENAI_API_KEY, the Google selectors are best-effort only; if they return 0 and Stagehand is disabled or has no key, no Google ads will be scraped. See Poweradspy Mock Data (example 4) below for how sample output is still produced in that case.
- For Google Ads Transparency, googleAdvertisersToScan: 2 and googleDetailPagesToVisit: 3 are strong default settings for balancing result quality, speed, and cost.
2. Free Mode with Meta Ads Library
{
  "mode": "free",
  "freeSources": ["google_ads_transparency", "meta_ads_library"],
  "keywords": ["coffee", "espresso"],
  "maxResultsPerSource": 25,
  "googleAdvertisersToScan": 2,
  "googleDetailPagesToVisit": 3,
  "metaAccessToken": "YOUR_META_DEVELOPER_TOKEN"
}
Meta notes:
- metaAccessToken is a standard Facebook/Meta Ads Library token with ads_read permission.
- If meta_ads_library is selected but metaAccessToken is missing, the Actor logs a warning and skips Meta (the run does not fail).
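The skip-on-missing-token behavior can be pictured with the sketch below. The function name `runnable_free_sources` and the logger name are hypothetical; only the input field names (`freeSources`, `metaAccessToken`) and the warn-and-skip behavior come from this README.

```python
import logging
from typing import Dict, List

log = logging.getLogger("ad_scraper")  # hypothetical logger name


def runnable_free_sources(actor_input: Dict) -> List[str]:
    """Return the free sources that can actually run for this input.

    Illustrative only: Meta is dropped with a warning, not an error,
    when no access token is supplied.
    """
    sources = list(actor_input.get("freeSources", []))
    if "meta_ads_library" in sources and not actor_input.get("metaAccessToken"):
        log.warning("meta_ads_library selected but metaAccessToken is missing; skipping Meta")
        sources.remove("meta_ads_library")
    return sources
```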
3. Auto / Premium Mode (Poweradspy integration)
{
  "mode": "auto",
  "freeSources": ["google_ads_transparency"],
  "keywords": ["coffee"],
  "poweradspyEmail": "your_email@example.com",
  "poweradspyPassword": "your_password",
  "enablePoweradspyEnrichment": true,
  "maxResultsPerSource": 50,
  "googleAdvertisersToScan": 2,
  "googleDetailPagesToVisit": 3,
  "maxTotalResults": 100
}
In auto mode, free sources run first; if credentials are present, Poweradspy is used to enrich/extend the results.
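To make the three modes concrete, the sketch below shows one way the run order could be derived from the input. `plan_sources` is a hypothetical helper written for this example; it assumes that "credentials present" means both `poweradspyEmail` and `poweradspyPassword` are non-empty.

```python
from typing import List, Optional


def plan_sources(
    mode: str,
    free_sources: List[str],
    poweradspy_email: Optional[str] = None,
    poweradspy_password: Optional[str] = None,
) -> List[str]:
    """Return the ordered list of sources to run (illustrative sketch)."""
    # Assumption: both fields must be set for Poweradspy to be usable.
    has_credentials = bool(poweradspy_email and poweradspy_password)
    if mode == "free":
        return list(free_sources)
    if mode == "auto":
        # Free sources run first; Poweradspy is appended only with credentials.
        return list(free_sources) + (["poweradspy"] if has_credentials else [])
    if mode == "poweradspy":
        return ["poweradspy"]
    raise ValueError(f"unknown mode: {mode!r}")
```

Under these assumptions, an auto-mode run without credentials degrades to the same plan as free mode.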
4. Poweradspy Mock Data (fallback output)
You can generate realistic sample ads without a Poweradspy account via the mock engine:
{
  "mode": "free",
  "freeSources": ["google_ads_transparency"],
  "searchKeywords": "coffee",
  "fallbackToMockAdsWhenEmpty": true
}
Behavior:
- If free sources (Google/Meta) return 0 ads and Poweradspy was not run, the Actor logs a warning and calls the Poweradspy scraper with useMockData: true to produce ~10 sample ads.
- To disable this and get truly empty output, set:
{"fallbackToMockAdsWhenEmpty": false}
You can also call Poweradspy directly with useMockData: true in mode: "poweradspy" or mode: "auto" when you want sample output.
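The fallback rule described under Behavior can be summarized in a few lines. This is a sketch under assumptions: `apply_mock_fallback` and the injected `run_poweradspy_mock` callable are hypothetical names, and treating `fallbackToMockAdsWhenEmpty` as enabled when the field is absent is a guess based on the "to disable this" wording.

```python
import logging
from typing import Callable, Dict, List

log = logging.getLogger("ad_scraper")  # hypothetical logger name


def apply_mock_fallback(
    ads: List[Dict],
    poweradspy_ran: bool,
    actor_input: Dict,
    run_poweradspy_mock: Callable[[], List[Dict]],
) -> List[Dict]:
    """Return `ads`, or mock ads when the free sources produced nothing."""
    # Assumption: the flag defaults to True when omitted from the input.
    enabled = actor_input.get("fallbackToMockAdsWhenEmpty", True)
    if ads or poweradspy_ran or not enabled:
        return ads
    log.warning("free sources returned 0 ads; falling back to Poweradspy mock data")
    # Stand-in for calling the Poweradspy scraper with useMockData: true.
    return run_poweradspy_mock()
```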
Stagehand (LLM) Config
If stagehandEnabled is true and extractionEngine is set to "stagehand_fallback":
- Set OPENAI_API_KEY in the Actor’s Settings → Environment variables to use the default gpt-4o model, or
- Set ANTHROPIC_API_KEY and "stagehandModelProvider": { "provider": "anthropic" } to use Claude.
When selectors return 0 items or throw, Stagehand opens the Google Ads Transparency page and uses the LLM to:
- Search for your query,
- Scroll / find ad cards in the current DOM,
- Extract fields into a structured AdItem list.
If no API key is present, the fallback logs a warning and yields an empty array (the run continues; see Mock Fallback above).
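The selector-first, LLM-second control flow can be sketched independently of Playwright or Stagehand. In the example below, `selector_extract` and `llm_extract` are hypothetical callables standing in for the two engines, and `extract_with_fallback` is an illustrative name; no real Stagehand API is called.

```python
import logging
import os
from typing import Callable, Dict, List, Mapping

log = logging.getLogger("ad_scraper")  # hypothetical logger name


def extract_with_fallback(
    selector_extract: Callable[[], List[Dict]],
    llm_extract: Callable[[], List[Dict]],
    stagehand_enabled: bool,
    env: Mapping[str, str] = os.environ,
) -> List[Dict]:
    """Try fast selectors first; use the LLM path only when they yield nothing."""
    try:
        ads = selector_extract()
    except Exception as exc:  # a selector that throws counts as a failure
        log.warning("selector extraction failed: %s", exc)
        ads = []
    if ads:
        return ads
    if not stagehand_enabled:
        return []
    if not (env.get("OPENAI_API_KEY") or env.get("ANTHROPIC_API_KEY")):
        log.warning("no LLM API key configured; fallback yields an empty list")
        return []
    return llm_extract()
```

Passing `env` explicitly keeps the function easy to exercise without touching real environment variables.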
Output Schema & Reports
- Dataset rows follow the schema in .actor/dataset_schema.json (key fields): adId, platform, title, adCopy, creativeUrl, landingPageUrl, advertiserName, firstSeen, lastSeen, likes, shares, comments, landingPagePlatform, cta, placement, country, scrapedAt.
- HTML/Markdown report (inspiration-report) groups ads by landingPagePlatform and annotates each with the source (google_ads_transparency, meta_ads_library, poweradspy) and Poweradspy metrics when available.
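For readers who want to post-process the dataset themselves, the grouping described for the report could look roughly like this. It is a sketch, not the Actor's report generator: `build_markdown_report` is a hypothetical name, the Markdown layout is invented for illustration, and it assumes the `platform` field carries the source name.

```python
from collections import defaultdict
from typing import Dict, List


def build_markdown_report(ads: List[Dict]) -> str:
    """Group dataset rows by landingPagePlatform and render a Markdown outline."""
    groups = defaultdict(list)
    for ad in ads:
        groups[ad.get("landingPagePlatform") or "unknown"].append(ad)

    lines = ["# Inspiration report"]
    for platform in sorted(groups):
        lines.append(f"## {platform} ({len(groups[platform])})")
        for ad in groups[platform]:
            title = ad.get("title") or "(untitled)"
            # Assumption: `platform` holds the source identifier.
            line = f"- {title} [source: {ad.get('platform', 'unknown')}]"
            # Engagement metrics are only present for Poweradspy-backed rows.
            metrics = [
                f"{name}={ad[name]}"
                for name in ("likes", "shares", "comments")
                if ad.get(name) is not None
            ]
            if metrics:
                line += " (" + ", ".join(metrics) + ")"
            lines.append(line)
    return "\n".join(lines)
```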
Cost Control & Enrichments
- Landing platform detection is done HTTP-first (no full browser), with configurable:
  - enrichmentMaxConcurrency (parallel HTTP calls),
  - enrichmentMaxRequestsPerMinute (RPM),
  - enrichmentHttpTimeoutMs (per-request timeout).
- For Google Ads Transparency, the main cost controls are:
  - maxResultsPerSource
  - googleAdvertisersToScan
  - googleDetailPagesToVisit
- Google Ads Transparency runs without a proxy by default to avoid tunnel errors; Meta and Poweradspy honor the proxyConfiguration input (e.g. Apify Residential proxy).
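The three enrichment limits interact as sketched below: a semaphore caps parallel requests, a shared start-time gate spaces requests to respect the RPM budget, and each fetch has its own timeout. Everything here is an assumption for illustration: `enrich_all`, `detect_platform`, and the injected `fetch` coroutine are hypothetical, and the two HTML markers are common heuristics rather than the Actor's actual detection rules.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, List, Optional


def detect_platform(html: str) -> Optional[str]:
    """Heuristic platform guess from raw HTML (illustrative markers only)."""
    lowered = html.lower()
    if "cdn.shopify.com" in lowered:
        return "Shopify"
    if "wp-content/plugins/woocommerce" in lowered:
        return "WooCommerce"
    return None


async def enrich_all(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],  # hypothetical plain-HTTP fetcher
    max_concurrency: int = 5,                # cf. enrichmentMaxConcurrency
    max_requests_per_minute: int = 60,       # cf. enrichmentMaxRequestsPerMinute
    timeout_s: float = 10.0,                 # cf. enrichmentHttpTimeoutMs / 1000
) -> Dict[str, Optional[str]]:
    semaphore = asyncio.Semaphore(max_concurrency)
    start_lock = asyncio.Lock()
    min_interval = 60.0 / max_requests_per_minute
    next_start = time.monotonic()

    async def one(url: str) -> Optional[str]:
        nonlocal next_start
        async with semaphore:
            # Space out request starts so the RPM budget is not exceeded.
            async with start_lock:
                delay = next_start - time.monotonic()
                if delay > 0:
                    await asyncio.sleep(delay)
                next_start = time.monotonic() + min_interval
            try:
                html = await asyncio.wait_for(fetch(url), timeout=timeout_s)
            except Exception:
                return None  # timeouts and fetch errors leave the ad unenriched
            return detect_platform(html)

    results = await asyncio.gather(*(one(u) for u in urls))
    return dict(zip(urls, results))
```

A failed or slow landing page simply yields no platform in this sketch, so enrichment problems never abort the run.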
Compliance
Use the Actor responsibly and in compliance with privacy laws, target-site policies, and third-party API terms, including GDPR and CCPA where applicable.
Support
If this Actor helps your workflow, please leave a 5-star rating on the Actor page.
