Google Ads Transparency Scraper

Pricing

Pay per event

Scrape ad creatives from Google Ads Transparency Center by advertiser domain or advertiser ID. Pay-per-event pricing — $1.20 / 1K ads.

Developer

DevilScrapes

Maintained by Community


🎯 What this scrapes

The Google Ads Transparency Center is Google's public registry of every ad campaign running on Search, YouTube, Display, Shopping, Maps, and Play. This Actor talks to its internal SearchService/SearchCreatives RPC directly — no Chrome, no Playwright, no Selenium — so it's fast, stable, and cheap.

Per creative you get:

| Field | Type | Notes |
|---|---|---|
| advertiser_id | string | Google's stable advertiser identifier (e.g. AR0123…) |
| advertiser_name | string | Public-facing brand name |
| creative_id | string | Stable per-creative ID (e.g. CR0123…) |
| creative_url | string | Deep link into Transparency Center |
| landing_domain | string | Click-through domain |
| format_type | integer | Numeric format code (1=text, 2=image, 3=video; inferred) |
| first_shown_ts | integer | Unix seconds, first observed impression |
| last_shown_ts | integer | Unix seconds, last observed impression |
| impressions | integer | Google-reported impression count |
| preview_image_url | string or null | Static thumbnail (image creatives) |
| preview_content_js_url | string or null | JS bundle URL (video/rich creatives) |
| region | string | Locale label you passed (display only) |
| scraped_at | string | ISO-8601 UTC timestamp |

💡 Use cases

  • Competitor ad-spend tracking — pull every Nike ad once a week and diff the creative set to see launches.
  • Trademark enforcement — monitor advertisers running ads against your brand keyword. Combine with your own takedown workflow.
  • Affiliate-fraud detection — flag advertisers whose landing domain doesn't match the advertiser name.
  • Political-ad monitoring — track which advertisers are active in an election cycle.
  • Brand-safety audits — for agencies, prove the ads currently live for a client.
  • Market research — observe how saturated a vertical (e.g. crypto, sports betting, supplements) is with ads.
  • AI / RAG ingestion — feed creative metadata + image URLs into a vector store for image-grounded analysis.
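
The weekly diff from the first use case can be sketched as follows. This is a minimal illustration, not part of the Actor: `diff_creatives` and the sample rows are hypothetical, and it assumes two exports whose rows carry the `creative_id` field.

```python
from typing import Iterable


def diff_creatives(previous: Iterable[dict], current: Iterable[dict]) -> dict:
    """Compare two exports by creative_id (hypothetical helper)."""
    prev_ids = {row["creative_id"] for row in previous}
    curr_ids = {row["creative_id"] for row in current}
    return {
        "launched": sorted(curr_ids - prev_ids),   # only in the newer export
        "dropped": sorted(prev_ids - curr_ids),    # only in the older export
        "unchanged": sorted(prev_ids & curr_ids),  # present in both
    }


# Hypothetical sample rows; real rows carry every output field.
last_week = [{"creative_id": "CR1"}, {"creative_id": "CR2"}]
this_week = [{"creative_id": "CR2"}, {"creative_id": "CR3"}]
print(diff_creatives(last_week, this_week))
```

A creative listed under "dropped" is only absent from the newer export. If either run hit a page or result cap, absence does not prove the ad stopped running.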

⚙️ How to use it

  1. Click "Try for free" at the top of the page.
  2. Paste one or more brand domains into the Brand domains field (e.g. nike.com, adidas.com). One per line. Each domain spawns its own scrape.
  3. (Optional) Drop in advertiser IDs if you already know them — they look like AR0123456789 and live in the Transparency Center URL when you click into an advertiser.
  4. (Optional) Set a date window to narrow ad activity. Defaults to the last 365 days.
  5. Run. Each ad is one row in the dataset; export to JSON, CSV, or Excel from the Storage tab.

A new Apify account's $5 of free credit covers roughly 4 000 ads at our pricing.

📥 Input

The schema lives in .actor/input_schema.json. The fields:

| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| searchDomains | array of string | one of | ["nike.com"] | Brand landing domains, one per line |
| advertiserIds | array of string | one of | [] | Google advertiser IDs (AR…) |
| region | enum string | no | anywhere | Display only: Google's RPC does not filter by region (see Limitations) |
| dateFrom | string (YYYY-MM-DD) | no | 365 days back | Lower bound of ad-activity window |
| dateTo | string (YYYY-MM-DD) | no | today (UTC) | Upper bound |
| maxResults | integer | no | 1000 | Total dataset items across all targets. 0 = unlimited |
| maxPages | integer | no | 25 | RPC budget per target (40 ads × 25 pages = 1 000 / target) |
| proxyConfiguration | proxy config | no | Apify Proxy enabled | Sticky session recommended |

At least one of searchDomains or advertiserIds must contain at least one entry.

Example input

{
  "searchDomains": ["nike.com", "adidas.com"],
  "advertiserIds": ["AR03012025048987521025"],
  "region": "US",
  "dateFrom": "2025-11-15",
  "dateTo": "2026-05-15",
  "maxResults": 5000,
  "maxPages": 25,
  "proxyConfiguration": { "useApifyProxy": true }
}
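
The input rules can be checked before launching a run. The sketch below is illustrative only: `check_input` is a hypothetical helper, not the Actor's own validation, and it assumes that every domain and every advertiser ID counts as one target with its own maxPages budget of 40 ads per page.

```python
from datetime import date

ADS_PER_PAGE = 40  # page size quoted in the maxPages note


def check_input(run_input: dict) -> int:
    """Validate a run input and return the most ads the run could yield."""
    domains = run_input.get("searchDomains") or []
    advertiser_ids = run_input.get("advertiserIds") or []
    if not domains and not advertiser_ids:
        raise ValueError("Provide at least one searchDomains or advertiserIds entry")

    # date.fromisoformat raises ValueError on strings it cannot parse.
    date_from = run_input.get("dateFrom")
    date_to = run_input.get("dateTo")
    parsed_from = date.fromisoformat(date_from) if date_from else None
    parsed_to = date.fromisoformat(date_to) if date_to else None
    if parsed_from and parsed_to and parsed_from > parsed_to:
        raise ValueError("dateFrom must not be later than dateTo")

    max_pages = run_input.get("maxPages", 25)
    max_results = run_input.get("maxResults", 1000)
    targets = len(domains) + len(advertiser_ids)
    page_ceiling = targets * max_pages * ADS_PER_PAGE
    # maxResults = 0 means unlimited, so only the page budget applies.
    return page_ceiling if max_results == 0 else min(page_ceiling, max_results)


example = {
    "searchDomains": ["nike.com", "adidas.com"],
    "advertiserIds": ["AR03012025048987521025"],
    "dateFrom": "2025-11-15",
    "dateTo": "2026-05-15",
    "maxResults": 5000,
    "maxPages": 25,
}
print(check_input(example))
```

Under these assumptions the example input tops out at 3 targets × 25 pages × 40 ads = 3 000 ads, so its maxResults of 5000 is never reached. Deduplication may lower the real count further.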

📤 Output

Every row is one creative. Example:

{
  "advertiser_id": "AR18378488041124659201",
  "advertiser_name": "Nike Retail BV",
  "creative_id": "CR15771942603307614209",
  "creative_url": "https://adstransparency.google.com/advertiser/AR18378488041124659201/creative/CR15771942603307614209?region=anywhere",
  "landing_domain": "nike.com",
  "format_type": 1,
  "first_shown_ts": 1761145807,
  "last_shown_ts": 1778871417,
  "impressions": 205,
  "preview_image_url": "https://tpc.googlesyndication.com/archive/simgad/12774179880874022668",
  "preview_content_js_url": null,
  "region": "anywhere",
  "scraped_at": "2026-05-15T19:17:59+00:00"
}
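
Because the two `*_ts` fields are Unix seconds, a little post-processing makes rows easier to read. A minimal sketch using only the standard library; `shown_window` and `preview_kind` are hypothetical helper names, and the row is trimmed to the fields they read.

```python
from datetime import datetime, timezone


def shown_window(row: dict) -> tuple:
    """Return (first shown, last shown, whole days between them) in UTC."""
    first = datetime.fromtimestamp(row["first_shown_ts"], tz=timezone.utc)
    last = datetime.fromtimestamp(row["last_shown_ts"], tz=timezone.utc)
    return first, last, (last - first).days


def preview_kind(row: dict) -> str:
    """Say which preview URL, if any, the row carries."""
    if row.get("preview_image_url"):
        return "image"
    if row.get("preview_content_js_url"):
        return "script"
    return "none"


row = {
    "first_shown_ts": 1761145807,
    "last_shown_ts": 1778871417,
    "preview_image_url": "https://tpc.googlesyndication.com/archive/simgad/12774179880874022668",
    "preview_content_js_url": None,
}
first, last, days = shown_window(row)
print(first.isoformat(), last.isoformat(), days, preview_kind(row))
```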

Export options once the run finishes:

  • JSON — full payload, ideal for AI/RAG pipelines
  • CSV / Excel: for analyst spreadsheets; sort by impressions to surface the most-shown ads
  • JSONL — line-delimited, easy to stream into a warehouse
  • API — fetch programmatically via GET /v2/datasets/{id}/items; webhook on ACTOR.RUN.SUCCEEDED for live pipelines
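
For the API route, the dataset can be paged with the standard library alone. This is a sketch under assumptions: it uses Apify's public API base URL and the `format`, `limit`, and `offset` query parameters. The helper names, the page size, and the dataset ID and token you pass in are placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com"


def dataset_items_url(dataset_id: str, limit: int = 1000, offset: int = 0) -> str:
    """Build the GET /v2/datasets/{id}/items URL for one page of results."""
    query = urlencode({"format": "json", "limit": limit, "offset": offset})
    return f"{API_BASE}/v2/datasets/{dataset_id}/items?{query}"


def fetch_items(dataset_id: str, token: str, page_size: int = 1000):
    """Yield every row of a dataset, one page at a time (network call)."""
    offset = 0
    while True:
        request = Request(
            dataset_items_url(dataset_id, limit=page_size, offset=offset),
            headers={"Authorization": f"Bearer {token}"},
        )
        with urlopen(request) as response:
            page = json.load(response)
        if not page:  # an empty page means the dataset is exhausted
            return
        yield from page
        offset += len(page)


print(dataset_items_url("YOUR_DATASET_ID", limit=2, offset=4))
```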

💰 Pricing

Pay-per-event. You pay a small fee per run plus a charge for each ad actually written to the dataset, not for what you ask for:

| Event | Price | When charged |
|---|---|---|
| actor-start | $0.005 | Once per run (covers warm-up + cookie handshake) |
| ad-result | $0.0012 | Per ad creative written to the dataset |

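Examples (each assumes a single run, so one actor-start charge; totals rounded to the cent):

| Pull | Cost |
|---|---|
| 100 ads | $0.13 |
| 1 000 ads | $1.21 |
| 10 000 ads | $12.01 |
| 100 000 ads (monthly competitor sweep) | $120.01 |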
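
The two event prices combine linearly, so a run's cost can be estimated ahead of time. A minimal sketch: `estimate_cost` is a hypothetical helper, the prices are the ones in the event table, and `runs` is how many times the Actor is started.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.005")  # charged once per run
AD_RESULT = Decimal("0.0012")   # charged per ad written to the dataset


def estimate_cost(ads: int, runs: int = 1) -> Decimal:
    """Unrounded USD cost of writing `ads` rows spread over `runs` runs."""
    return ACTOR_START * runs + AD_RESULT * ads


for ads in (100, 1_000, 10_000, 100_000):
    print(f"{ads:>7} ads: ${estimate_cost(ads)}")
```

Splitting a sweep across several runs adds one actor-start charge per run; the per-ad charge stays the same.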

Compare to: building this in-house is ~2 engineer-weeks plus the ongoing cost of maintaining a proxy pool and the TLS-fingerprint replay loop. We've already done it.

🔥 Features

  • Direct-RPC, no browser — calls Google's internal SearchCreatives endpoint via curl-cffi with a Firefox TLS+H2 fingerprint. ~1 second per page (40 ads).
  • 🌍 Batch input — scrape multiple domains and advertiser IDs in a single run, deduplicated automatically.
  • 🛡️ Apify Proxy aware — sticky-session URLs threaded through curl-cffi for cookie continuity across pagination.
  • 📦 Golden-file tested — every parser change runs against 4 captured creative shapes (still image, rich video, minimal, malformed) before shipping.
  • 🔬 Live wire validation — opt-in smoke tests catch RPC contract drift before users do.
  • 🪶 Open output schema — flat, predictable JSON. Drops straight into Pandas, BigQuery, or your vector store.

💡 Tips for best results

  • Search by domain when you can — domains return every ad pointing at that landing page, including ones bought by resellers or affiliates. Advertiser IDs are narrower.
  • Pull bigger date windows in one run — Google's response is paginated, not date-segmented. Asking for 365 days is the same cost per ad as asking for 30 days.
  • Big brands hit the cap fast: Nike's library claims ~300 000 ads. The default maxPages=25 caps at 1 000 ads per domain, and the default maxResults=1000 caps the whole run. Raise both for full-history pulls.
  • Use region as a tag — it doesn't filter the RPC, but if you're running parallel campaigns per market, set region: "US" on the US sweep so your downstream tables can group by it.
  • Run weekly, not hourly — Google updates the Transparency Center daily at most. Scheduling more often just wastes credit.

🚧 Limitations

  • Region is metadata, not a filter. Google's SearchCreatives RPC ignores the geo target — we proved this empirically (see scripts/recon/FINDINGS.md). The Transparency Center's browser UI shows a region selector, but the server returns the same creative set regardless. We expose region purely so you can tag exports by intended locale.
  • No region-only browsing. You must give the scraper at least one entry in searchDomains or advertiserIds. There is no "all ads in country X" mode on the public RPC. If Google adds one we'll ship it.
  • Video / rich creatives return a content.js URL, not an MP4. Rendering the actual video frame requires executing Google's JS bundle — out of scope for v1.
  • Date range is enforced by Google, not us. They retain ~12 months of history. Asking for older dates just clips to that window.

🔌 Integrations

  • Schedule — Apify Console → Schedules tab → run weekly for monitoring.
  • Webhooks — register ACTOR.RUN.SUCCEEDED to fire your downstream pipeline as soon as the dataset is final.
  • API: POST /v2/acts/DevilScrapes~google-ads-transparency/run-sync-get-dataset-items returns the full result set in one synchronous call (good for ≤ a few thousand ads).
  • Make / Zapier — every Apify Actor surfaces as a node out of the box.
  • n8n — use the Apify community node.
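
The synchronous API call from the list above can be sketched with the standard library. Treat it as an illustration: it assumes Apify's public API base URL and bearer-token authentication. `build_run_request` and `run_and_fetch` are hypothetical names, and the 300-second timeout is an arbitrary choice.

```python
import json
from urllib.request import Request, urlopen

ACTOR_ID = "DevilScrapes~google-ads-transparency"  # as written in the API bullet
API_BASE = "https://api.apify.com"


def build_run_request(run_input: dict, token: str) -> Request:
    """Prepare the POST that starts a run and waits for its dataset items."""
    return Request(
        f"{API_BASE}/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_and_fetch(run_input: dict, token: str, timeout: float = 300.0) -> list:
    """Send the request and return the dataset rows (network call)."""
    with urlopen(build_run_request(run_input, token), timeout=timeout) as response:
        return json.load(response)


request = build_run_request({"searchDomains": ["nike.com"], "maxPages": 5}, "YOUR_TOKEN")
print(request.get_method(), request.full_url)
```

Larger pulls are better started asynchronously and read from the dataset afterwards, since a synchronous call holds the connection open for the whole run.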

❓ FAQ

Is this legal?

Yes. The Google Ads Transparency Center is a public registry Google operates under EU DSA + US regulatory pressure. We scrape only what the public UI exposes, at a polite cadence (~1 req/s per session), and we do not bypass any authentication. We also don't collect personal data — only advertiser-level metadata.

Why is the region selector marked "display only"?

Because we empirically confirmed the RPC ignores it. Other scrapers on the Store claim region filtering; we tested every plausible RPC body shape and none of them returned a region-narrowed result set. We'd rather under-promise than ship broken filtering. If Google adds a region filter, we'll wire it in.

Why isn't there a search-by-keyword mode?

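Because Google's RPC doesn't expose one. You search by advertiser ID or by landing domain. For brand monitoring, give us the domain (e.g. nike.com) and the scraper returns every ad pointing at that domain, including those bought by resellers or affiliates. Ads that bid on your brand name but land on a different domain will not appear under your domain; query that advertiser's domain or ID instead.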

Can I scrape political ads specifically?

Not yet — political ads live in a separate Google library with its own endpoints. Open an issue on the Apify Store listing if you want this; we'll prioritize based on demand.

How do I export to Google Sheets / a database?

Three options:

  1. Console → Storage → Export for one-off CSV downloads.
  2. Webhook the dataset URL to a small Make / Zapier flow that appends to Sheets.
  3. Apify integration nodes in Airbyte, n8n, or your warehouse loader.

Some preview URLs are null. Why?

Rich / video / animated creatives expose only a content.js URL — Google's preview is rendered by JavaScript. Static images give you a direct preview_image_url. Plan accordingly: if you need actual video frames, post-process the content.js URL with a headless browser.

The number of returned ads is less than Google's reported total. Why?

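Google paginates and stops responding past some internal limit (we've observed caps of ~1 000 ads per query). Separately, the default maxPages = 25 is intentionally conservative and stops each target at 1 000 ads, and maxResults defaults to 1 000 across all targets. Raise both for very large advertisers (Nike, Coca-Cola); coverage beyond Google's own limit is not guaranteed.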

💬 Support & feedback

Spotted a bug, missing field, or want a new feature? Open an issue on the Apify Store listing — we read every one.

Built by Devil Scrapes — Apify Actors with attitude. PPE, transparent pricing, no junk fields.