Google Ads Transparency Scraper
Scrape ad creatives from the Google Ads Transparency Center by advertiser domain or advertiser ID — creative, format, regions, first/last shown, landing URL — export to JSON or CSV. A Google Ads Transparency API alternative and data exporter. You pay only for ads that land.
🎯 What this scrapes
The Google Ads Transparency Center is Google's public registry of every ad campaign running on Search, YouTube, Display, Shopping, Maps, and Play. This Google Ads Transparency Scraper talks directly to Google's internal SearchService/SearchCreatives RPC — so it pulls fast, stable structured data without the overhead of a full browser session.
Google publishes no official API for this data. We reverse-engineered the RPC, replay it with a real-browser TLS fingerprint, and absorb all the reliability work so you get clean rows.
Per creative you get:
| Field | Type | Notes |
|---|---|---|
| `advertiser_id` | string | Google's stable advertiser identifier (e.g. AR0123…) |
| `advertiser_name` | string | Public-facing brand name |
| `creative_id` | string | Stable per-creative ID (e.g. CR0123…) |
| `creative_url` | string | Deep link into Transparency Center |
| `landing_domain` | string | Click-through domain |
| `format_type` | integer | Numeric format code (1=text, 2=image, 3=video; inferred) |
| `first_shown_ts` | integer | Unix seconds, first observed impression |
| `last_shown_ts` | integer | Unix seconds, last observed impression |
| `impressions` | integer | Google-reported impression count |
| `preview_image_url` | string \| null | Static thumbnail (image creatives) |
| `preview_content_js_url` | string \| null | JS bundle URL (video/rich creatives) |
| `region` | string | Locale label you passed (display only) |
| `scraped_at` | string | ISO-8601 UTC timestamp |
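As a rough illustration of consuming these fields downstream, the sketch below maps one row onto a typed record and converts the Unix-second timestamps to UTC datetimes. The `Creative` class and `parse_row` function are hypothetical helpers written for this example, not something the Actor ships.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Creative:
    """Typed view of one dataset row (hypothetical helper, not shipped with the Actor)."""
    advertiser_id: str
    advertiser_name: str
    creative_id: str
    landing_domain: str
    format_type: int
    first_shown: datetime
    last_shown: datetime
    impressions: int
    preview_image_url: Optional[str]
    preview_content_js_url: Optional[str]

    @property
    def days_active(self) -> int:
        """Whole days between the first and last observed impression."""
        return (self.last_shown - self.first_shown).days


def parse_row(row: dict) -> Creative:
    """Convert one raw row into a Creative, turning Unix seconds into UTC datetimes."""
    def to_utc(key: str) -> datetime:
        return datetime.fromtimestamp(row[key], tz=timezone.utc)

    return Creative(
        advertiser_id=row["advertiser_id"],
        advertiser_name=row["advertiser_name"],
        creative_id=row["creative_id"],
        landing_domain=row["landing_domain"],
        format_type=row["format_type"],
        first_shown=to_utc("first_shown_ts"),
        last_shown=to_utc("last_shown_ts"),
        impressions=row["impressions"],
        # Either preview field may be null, so read them defensively.
        preview_image_url=row.get("preview_image_url"),
        preview_content_js_url=row.get("preview_content_js_url"),
    )
```

Keeping the timestamps timezone-aware avoids mixing them up with local-time values when rows are later compared against `scraped_at`.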
🔥 Features
- Direct-RPC delivery: calls Google's internal `SearchCreatives` endpoint; roughly 40 ads per page, no browser overhead.
- Batch input: scrape multiple domains and advertiser IDs in a single run, deduplicated automatically.
- Proxy-aware pagination: sticky-session proxies thread through every RPC call for cookie continuity across hundreds of pages.
- Golden-file tested: every parser change runs against 4 captured creative shapes (still image, rich video, minimal, malformed) before shipping.
- Live wire validation: opt-in smoke tests catch RPC contract drift before users do.
- Flat, stable output schema: predictable JSON drops straight into Pandas, BigQuery, or a vector store.
- Pay-per-result: the `actor-start` event covers warm-up; you're only charged for ad rows that actually land in the dataset.
💡 Use cases
- Competitor ad-spend tracking — pull every Nike ad once a week and diff the creative set to see what launched.
- Trademark enforcement — monitor advertisers running ads against your brand keyword; combine with your own takedown workflow.
- Affiliate-fraud detection — flag advertisers whose landing domain doesn't match the advertiser name.
- Political-ad monitoring — track which advertisers are active in an election cycle.
- Brand-safety audits — for agencies, prove the ads currently live for a client before the QBR.
- Market research — observe how saturated a vertical (crypto, sports betting, supplements) is with active creatives.
- AI / RAG ingestion — feed creative metadata and image URLs into a vector store for image-grounded competitive analysis.
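The weekly "diff the creative set" workflow from the first use case can be sketched in a few lines. This is a minimal example that assumes you have two exported datasets loaded as lists of rows; `diff_creatives` is a hypothetical helper, and it relies only on the `creative_id` field from the output schema.

```python
from typing import Iterable, List, Tuple


def diff_creatives(
    previous: Iterable[dict], current: Iterable[dict]
) -> Tuple[List[str], List[str]]:
    """Compare two exports and return (launched, retired) creative IDs.

    "Retired" only means the ID was absent from the newer export; it can also
    be missing because a pagination cap truncated that run.
    """
    prev_ids = {row["creative_id"] for row in previous}
    curr_ids = {row["creative_id"] for row in current}
    launched = sorted(curr_ids - prev_ids)
    retired = sorted(prev_ids - curr_ids)
    return launched, retired


# Illustrative IDs only, not real creatives.
last_week = [{"creative_id": "CR1"}, {"creative_id": "CR2"}]
this_week = [{"creative_id": "CR2"}, {"creative_id": "CR3"}]
launched, retired = diff_creatives(last_week, this_week)
print("launched:", launched, "retired:", retired)
```

Sets make the comparison independent of row order, which matters because nothing in the listing promises a stable ordering between runs.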
⚙️ How to use it
- Click "Try for free" at the top of the page.
- Paste one or more brand domains into the `Brand domains` field (e.g. `nike.com`, `adidas.com`), one per line. Each domain spawns its own scrape.
- (Optional) Drop in advertiser IDs if you already know them. They look like `AR0123456789` and appear in the Transparency Center URL when you click into an advertiser.
- (Optional) Set a date window to narrow ad activity. Defaults to the last 365 days.
- Run. Each ad is one row in the dataset; export to JSON, CSV, or Excel from the Storage tab.
The first run on a new account uses $5 of free Apify credit — that's roughly 4 000 ads at our pricing.
📥 Input
The schema lives in .actor/input_schema.json. The fields:
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `searchDomains` | array of string | one of | `["nike.com"]` | Brand landing domains, one per line |
| `advertiserIds` | array of string | one of | `[]` | Google advertiser IDs (AR…) |
| `region` | enum string | no | `anywhere` | Display-only; Google's RPC does not filter by region (see Limitations) |
| `dateFrom` | string (YYYY-MM-DD) | no | 365 days back | Lower bound of ad-activity window |
| `dateTo` | string (YYYY-MM-DD) | no | today (UTC) | Upper bound |
| `maxResults` | integer | no | 1000 | Total dataset items across all targets. 0 = unlimited |
| `maxPages` | integer | no | 25 | RPC budget per target (40 ads × 25 pages = 1 000 / target) |
| `proxyConfiguration` | proxy config | no | Apify Proxy enabled | Sticky session recommended |
At least one of searchDomains or advertiserIds must contain at least one entry.
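If you build the input programmatically, a small pre-flight check can catch mistakes before a run is started. The sketch below mirrors the documented rules (at least one target, `YYYY-MM-DD` dates, non-negative limits); `validate_input` and `per_target_cap` are hypothetical client-side helpers, and the 40-ads-per-page figure is the approximate value quoted in this listing, not a guarantee.

```python
from datetime import datetime
from typing import List, Optional

ADS_PER_PAGE = 40        # approximate page size quoted in this listing
DEFAULT_MAX_PAGES = 25   # documented default for maxPages


def _parse_day(value) -> Optional[datetime]:
    """Return a datetime for a YYYY-MM-DD string, or None if it does not parse."""
    try:
        return datetime.strptime(value, "%Y-%m-%d")
    except (TypeError, ValueError):
        return None


def validate_input(cfg: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the input looks fine."""
    errors: List[str] = []
    if not cfg.get("searchDomains") and not cfg.get("advertiserIds"):
        errors.append("searchDomains or advertiserIds must contain at least one entry")
    parsed = {}
    for key in ("dateFrom", "dateTo"):
        if key in cfg:
            parsed[key] = _parse_day(cfg[key])
            if parsed[key] is None:
                errors.append(f"{key} must be YYYY-MM-DD")
    if parsed.get("dateFrom") and parsed.get("dateTo") and parsed["dateFrom"] > parsed["dateTo"]:
        errors.append("dateFrom must not be after dateTo")
    for key in ("maxResults", "maxPages"):
        value = cfg.get(key, 0)
        if not isinstance(value, int) or value < 0:
            errors.append(f"{key} must be a non-negative integer")
    return errors


def per_target_cap(cfg: dict) -> int:
    """Approximate upper bound on ads fetched per target: maxPages x ~40 ads."""
    return cfg.get("maxPages", DEFAULT_MAX_PAGES) * ADS_PER_PAGE
```

With the defaults, `per_target_cap({})` gives 1 000, which matches the per-target figure in the table above.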
Example input
```json
{
  "searchDomains": ["nike.com", "adidas.com"],
  "advertiserIds": ["AR03012025048987521025"],
  "region": "US",
  "dateFrom": "2025-11-15",
  "dateTo": "2026-05-15",
  "maxResults": 5000,
  "maxPages": 25,
  "proxyConfiguration": { "useApifyProxy": true }
}
```
📤 Output
Every row is one creative. Example:
```json
{
  "advertiser_id": "AR18378488041124659201",
  "advertiser_name": "Nike Retail BV",
  "creative_id": "CR15771942603307614209",
  "creative_url": "https://adstransparency.google.com/advertiser/AR18378488041124659201/creative/CR15771942603307614209?region=anywhere",
  "landing_domain": "nike.com",
  "format_type": 1,
  "first_shown_ts": 1761145807,
  "last_shown_ts": 1778871417,
  "impressions": 205,
  "preview_image_url": "https://tpc.googlesyndication.com/archive/simgad/12774179880874022668",
  "preview_content_js_url": null,
  "region": "anywhere",
  "scraped_at": "2026-05-15T19:17:59+00:00"
}
```
Export options once the run finishes:
- JSON: full payload, ideal for AI/RAG pipelines
- CSV / Excel: for analyst spreadsheets; sort by `impressions` to find big-spender ads
- JSONL: line-delimited, easy to stream into a warehouse
- API: fetch programmatically via `GET /v2/datasets/{id}/items`; webhook on `ACTOR.RUN.SUCCEEDED` for live pipelines
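For the API route, a standard-library sketch of the `GET /v2/datasets/{id}/items` call might look like the following. It assumes the public Apify API host `https://api.apify.com`, bearer-token authentication, and the `format`, `offset`, and `limit` query parameters; the helper names are made up for this example, so check the Apify API reference before relying on the details.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_BASE = "https://api.apify.com"  # assumed public Apify API host


def dataset_items_url(dataset_id: str, *, fmt: str = "json",
                      offset: int = 0, limit: Optional[int] = None) -> str:
    """Build the dataset-items URL for one page of results."""
    params = {"format": fmt, "offset": offset}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/v2/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id: str, token: str, **page) -> List[dict]:
    """Download one page of dataset rows as parsed JSON (performs a network call)."""
    request = Request(
        dataset_items_url(dataset_id, **page),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `fetch_items("<dataset id>", "<api token>", limit=1000)` would return the first 1 000 rows; paging with `offset` keeps memory use bounded on large exports.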
💰 Pricing
Pay-per-event. You pay for what you get, nothing for what you ask for:
| Event | Price | When charged |
|---|---|---|
| `actor-start` | $0.005 | Once per run (covers warm-up + cookie handshake) |
| `ad-result` | $0.0012 | Per ad creative written to the dataset |
Examples:
| Pull | Cost |
|---|---|
| 100 ads | $0.13 |
| 1 000 ads | $1.21 |
| 10 000 ads | $12.01 |
| 100 000 ads (monthly competitor sweep) | $120.01 in one run; $120.05 if split across 10 runs |
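The arithmetic behind these figures is one `actor-start` charge per run plus one `ad-result` charge per row. A minimal estimator, using the two prices from the event table and `Decimal` to avoid floating-point drift, could look like this; the function names are illustrative only.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.005")   # charged once per run
AD_RESULT = Decimal("0.0012")    # charged per ad written to the dataset


def estimate_cost(ads: int, runs: int = 1) -> Decimal:
    """Total charge in USD for `ads` rows spread over `runs` runs (before rounding)."""
    return ACTOR_START * runs + AD_RESULT * ads


def ads_for_budget(budget: Decimal, runs: int = 1) -> int:
    """Largest number of ads that fits in `budget` after paying the start fee(s)."""
    remaining = budget - ACTOR_START * runs
    return max(int(remaining // AD_RESULT), 0)


for ads in (100, 1_000, 10_000, 100_000):
    print(f"{ads:>7} ads: ${estimate_cost(ads)}")
```

Before rounding to cents, 100 ads come to $0.125 and 1 000 ads to $1.205. A single run of 100 000 ads comes to $120.005, while the same volume split over ten runs comes to $120.05. By the same arithmetic, $5 of credit covers 4 162 ads in one run.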
Compare to: building this in-house is roughly two engineer-weeks plus the ongoing cost of maintaining a proxy pool and the TLS-fingerprint replay loop. We have already done it.
🚧 Limitations
- Region is metadata, not a filter. Google's `SearchCreatives` RPC ignores the geo target; we confirmed this empirically (see `scripts/recon/FINDINGS.md`). The Transparency Center browser UI shows a region selector, but the server returns the same creative set regardless. We expose `region` so you can tag exports by intended locale, nothing more.
- No region-only browsing. You must supply at least one entry in `searchDomains` or `advertiserIds`. There is no "all ads in country X" mode on the public RPC. If Google adds one we'll wire it in.
- Video / rich creatives return a `content.js` URL, not an MP4. Rendering the actual video frame requires executing Google's JS bundle, which is out of scope for v1.
- Date range is enforced by Google, not us. They retain roughly 12 months of history. Requesting older dates clips to that window.
- Large advertisers hit pagination caps. Google's infrastructure stops responding past roughly 1 000 ads per query. Nike's library claims ~300 000 ads; the default `maxPages=25` is intentionally conservative. Raise it for full-history pulls knowing you may hit the server-side ceiling.
❓ FAQ
Is this legal?
Yes. The Google Ads Transparency Center is a public registry Google operates under EU DSA and US regulatory pressure. We scrape only what the public UI exposes at a polite cadence, and we do not bypass authentication. We also do not collect personal data — only advertiser-level metadata.
Does Google have an official API for Ads Transparency data?
No. As of 2026, Google publishes no official API for the Transparency Center. We reverse-engineered the internal SearchCreatives RPC, replay it with a real-browser fingerprint, and keep the implementation current as the endpoint evolves. The "google ads transparency api" you may have searched for is exactly what this Actor provides.
Why is the region selector marked "display only"?
Because we empirically confirmed the RPC ignores it. Other scrapers on the Store claim region filtering; we tested every plausible RPC body shape and none returned a region-narrowed result set. We would rather under-promise than ship broken filtering. If Google adds a server-side region filter, we will wire it in immediately.
Why isn't there a search-by-keyword mode?
Google's RPC does not expose one. You search by advertiser. For brand-keyword monitoring, give us the domain (e.g. nike.com) and the scraper returns every ad pointing at that domain — including those bought by competitors bidding on your name.
Can I scrape political ads specifically?
Not yet — political ads live in a separate Google library with its own endpoints. Open an issue on the Apify Store listing if you want this; we will prioritize based on demand.
How do I export to Google Sheets or a database?
Three options:
- Console → Storage → Export for one-off CSV downloads.
- Webhook the dataset URL to a Make / Zapier flow that appends to Sheets.
- Apify integration nodes in Airbyte, n8n, or your warehouse loader.
Some preview URLs are null. Why?
Rich, video, and animated creatives expose only a content.js URL — Google renders the preview via JavaScript. Static image creatives give you a direct preview_image_url. If you need actual video frames, post-process the content.js URL with a headless browser downstream.
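To decide which rows need that downstream headless-browser step, you can bucket rows by which preview field is populated. This is a small sketch based only on the two nullable fields in the output schema; `preview_kind` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable


def preview_kind(row: dict) -> str:
    """Classify a row by the preview field it carries."""
    if row.get("preview_image_url"):
        return "image"        # direct thumbnail URL, usable as-is
    if row.get("preview_content_js_url"):
        return "content_js"   # needs JavaScript rendering downstream
    return "none"             # neither preview field is populated


def summarize_previews(rows: Iterable[dict]) -> Counter:
    """Count rows per preview kind, e.g. to size a rendering job."""
    return Counter(preview_kind(row) for row in rows)
```

Checking the image field first means a row that happens to carry both URLs is treated as directly usable.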
The number of returned ads is less than Google's reported total. Why?
Google paginates and stops responding past an internal limit we have observed at roughly 1 000 ads per query. For very large advertisers, raise maxPages beyond the default of 25 if you need fuller coverage.
How do I scrape Google Ads Transparency Center data automatically on a schedule?
Go to Apify Console → Schedules, attach this Actor, and set your cron. Weekly is the right cadence — Google updates the Transparency Center daily at most. More frequent polling wastes credit without new signal.
What integrations does this Actor support?
- Schedule: Apify Console → Schedules tab → run weekly for monitoring.
- Webhooks: register `ACTOR.RUN.SUCCEEDED` to fire your downstream pipeline as soon as the dataset is final.
- API: `POST /v2/acts/DevilScrapes~google-ads-transparency/run-sync-get-dataset-items` returns the full result set in one synchronous call (good for up to a few thousand ads).
- Make / Zapier: every Apify Actor surfaces as a node out of the box.
- n8n: use the Apify community node; a workflow template is available on n8n.io/workflows.
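The synchronous API call listed above can be sketched with the standard library alone. The example assumes the Apify API host `https://api.apify.com` and bearer-token authentication; `build_run_request` and `run_sync` are illustrative names, and the request body is simply the Actor input described in the Input section.

```python
import json
from typing import List
from urllib.request import Request, urlopen

ACTOR_ID = "DevilScrapes~google-ads-transparency"  # Actor path from this listing
API_BASE = "https://api.apify.com"                 # assumed public Apify API host


def build_run_request(token: str, actor_input: dict) -> Request:
    """Prepare the POST request without sending it."""
    return Request(
        f"{API_BASE}/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items",
        data=json.dumps(actor_input).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300) -> List[dict]:
    """Start a run and block until the dataset rows come back (performs a network call)."""
    with urlopen(build_run_request(token, actor_input), timeout=timeout) as response:
        return json.load(response)
```

Because the call blocks until the run finishes, keep `maxResults` modest here and switch to an asynchronous run plus a webhook for larger pulls.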
💬 Your feedback
Spotted a bug, missing field, or want a new feature? Open an issue on the Apify Store listing — we read every one.
Built by Devil Scrapes — Apify Actors with attitude. PPE, transparent pricing, no junk fields.