google-ads-transparency-scraper

Under maintenance

Pricing

Pay per usage

Developed by

Traffic Architect

Maintained by Community

Google Ads Transparency Scraper is an Apify Actor that extracts competitor ad data from the Google Ads Transparency Center. It features batch processing, FULL/LITE modes, date and region filtering, and proxy support, and it automates competitive intelligence gathering for marketers and analysts. Results are written to Apify datasets.

Total users: 4
Monthly users: 4
Runs succeeded: >99%
Last modified: 4 days ago

Google Ads Transparency Scraper (Python)

🔎 What is this Actor?

This Apify Actor is a Python-based tool designed to scrape data from the Google Ads Transparency Center. It allows you to extract information about advertisers and their ads based on a list of input keywords and flexible date range options.

📊 What Google ads data can I extract?

The Actor supports two run modes:

  • FULL Mode: Extracts detailed information for each ad creative.
  • LITE Mode: Extracts a summary for each matched advertiser, providing counts of their ad creatives by format.

📖 How to use

⬇️ Input

The Actor expects a JSON input with the following fields:

  • keywords (array of strings, required): A list of search terms (e.g., advertiser names, domains) to look for. The Actor processes every keyword in the list.
  • runMode (string, optional, default: "FULL"):
    • "FULL": Fetches detailed ad information for each creative.
    • "LITE": Fetches only counts of ad creatives per format for each matched advertiser.
  • dateRangePreset (string, optional, default: "ANYTIME"): Controls the date range for fetching ads. Options:
    • "ANYTIME": No date filtering.
    • "LAST_7_DAYS": Ads shown in the last 7 days.
    • "LAST_30_DAYS": Ads shown in the last 30 days.
    • "CUSTOM_RANGE": Specify a custom date range using customStartDate and customEndDate.
  • customStartDate (string, optional, format: YYYY-MM-DD): The start date for a custom range (e.g., "2023-01-01"). Used only if dateRangePreset is "CUSTOM_RANGE".
  • customEndDate (string, optional, format: YYYY-MM-DD): The end date for a custom range (e.g., "2023-01-31"). Used only if dateRangePreset is "CUSTOM_RANGE".
  • count (integer, optional, default: 10):
    • In FULL mode: The maximum number of ad creatives to retrieve for each matched advertiser within the specified date range.
    • In LITE mode: Serves only as a guideline for how many creative summaries to fetch for counting. The Actor uses the totalAdCount from the advertiser suggestion when it is available (that figure usually covers "anytime", not the requested date range); otherwise it fetches a larger batch (e.g., 2000 summaries) to count within the specified date range.
  • region (string, optional, default: "anywhere"): The region code to filter ads by (e.g., "US", "GB"). Use "anywhere" for no specific region.
  • proxyConfig (object, optional): Standard Apify proxy configuration.
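The field rules above can be checked on the client side before a run is started. The following is a minimal sketch, not part of the Actor: the function name build_input and its parameter names are hypothetical, and requiring both dates for "CUSTOM_RANGE" is an assumption, since the Actor's own validation is not described here.

```python
from datetime import date

RUN_MODES = {"FULL", "LITE"}
DATE_PRESETS = {"ANYTIME", "LAST_7_DAYS", "LAST_30_DAYS", "CUSTOM_RANGE"}


def build_input(keywords, run_mode="FULL", date_range_preset="ANYTIME",
                custom_start=None, custom_end=None, count=10, region="anywhere"):
    """Build an input object using the documented field names and defaults."""
    if isinstance(keywords, str) or not keywords:
        raise ValueError("keywords must be a non-empty list of strings")
    if not all(isinstance(k, str) and k.strip() for k in keywords):
        raise ValueError("every keyword must be a non-empty string")
    if run_mode not in RUN_MODES:
        raise ValueError(f"runMode must be one of {sorted(RUN_MODES)}")
    if date_range_preset not in DATE_PRESETS:
        raise ValueError(f"dateRangePreset must be one of {sorted(DATE_PRESETS)}")

    run_input = {
        "keywords": list(keywords),
        "runMode": run_mode,
        "dateRangePreset": date_range_preset,
        "count": count,
        "region": region,
    }
    if date_range_preset == "CUSTOM_RANGE":
        # Assumption: both dates are needed for a custom range.
        if custom_start is None or custom_end is None:
            raise ValueError("CUSTOM_RANGE needs customStartDate and customEndDate")
        start = date.fromisoformat(custom_start)  # raises ValueError if malformed
        end = date.fromisoformat(custom_end)
        if start > end:
            raise ValueError("customStartDate must not be after customEndDate")
        run_input["customStartDate"] = start.isoformat()
        run_input["customEndDate"] = end.isoformat()
    return run_input
```

The custom dates are only added to the input when the preset is "CUSTOM_RANGE", which mirrors the note that they are ignored otherwise.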

Example Input (FULL Mode with Custom Date Range):

{
  "keywords": ["Google LLC"],
  "runMode": "FULL",
  "dateRangePreset": "CUSTOM_RANGE",
  "customStartDate": "2024-01-01",
  "customEndDate": "2024-01-31",
  "count": 5,
  "region": "US"
}

Example Input (LITE Mode, Last 7 Days):

{
  "keywords": ["Niantic, Inc."],
  "runMode": "LITE",
  "dateRangePreset": "LAST_7_DAYS",
  "region": "anywhere"
}
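An input like the ones above can also be submitted from Python. The sketch below assumes the apify-client package is installed and that an API token is available in the APIFY_TOKEN environment variable. The Actor ID is a placeholder, not taken from this page, and run_scraper and main are hypothetical names.

```python
import os


def run_scraper(client, actor_id, run_input):
    """Start the Actor, wait for the run to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main():
    # Imported here so run_scraper can be reused or tested without the package.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    # Placeholder ID: replace it with the ID shown on the Actor's page.
    actor_id = "<username>/google-ads-transparency-scraper"
    run_input = {
        "keywords": ["Niantic, Inc."],
        "runMode": "LITE",
        "dateRangePreset": "LAST_7_DAYS",
        "region": "anywhere",
    }
    for item in run_scraper(client, actor_id, run_input):
        print(item.get("advertiserName"), item.get("textCreativeCount"))
```

Calling main() starts a run and prints one line per advertiser summary. Passing the client into run_scraper keeps the network dependency in one place.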

⬆️ Output

The extracted data is stored in the default Apify dataset. The structure of items in the dataset depends on the selected runMode.

Output Structure for runMode: "FULL"

Each item is a JSON object representing a detailed ad creative with the following fields:

  • originalKeyword (string): The input keyword that led to this ad being scraped.
  • advertiserId (string): The unique ID of the advertiser.
  • advertiserName (string): The name of the advertiser.
  • creativeId (string): The unique ID of the ad creative.
  • format (string): The format of the ad. Possible values: "TEXT", "IMAGE", "VIDEO", "UNKNOWN".
  • previewUrl (string | null): URL to the ad preview or its landing page, if available.
  • regionStats (array of objects): A list detailing the ad's presence in different regions (relevant to the query's region and date settings). Each object contains:
    • regionCode (string | null): Two-letter country code (e.g., "US", "GB") or null if not specific.
    • regionName (string | null): Full name of the region (e.g., "United States") or null.
    • firstShown (string | null): Date the ad was first shown in this context (YYYY-MM-DD format), if available.
    • lastShown (string | null): Date the ad was last shown in this context (YYYY-MM-DD format), if available.
    • impressions (null): Placeholder for future use; currently not populated.
    • surfaceServingStats (array): Placeholder for future use; currently an empty list.
  • variations (array of objects): A list of different versions or assets associated with the ad. Each variation object may contain:
    • description (string | null): Ad body text, typically for TEXT ads.
    • headline (string | null): Ad headline, typically for TEXT ads.
    • cta (string | null): Call To Action text (e.g., "Learn More", "Shop Now").
    • videoUrl (string | null): URL of the video asset, for VIDEO ads.
    • imageUrl (string | null): URL of the image asset, for IMAGE ads.
    • clickUrl (string | null): The destination URL when the ad variation is clicked.
  • error (string | null): An error message if fetching details for this specific ad failed. Null if successful.

Note: If no variations are found or applicable for an ad, the variations array will contain a single default empty variation object to maintain structure.
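Because each FULL-mode item nests regionStats and variations, it can be convenient to flatten an item into one row per variation before exporting it to a spreadsheet. The following is a minimal sketch based only on the field names listed above. The function name flatten_creative and the sample values are illustrative, not real output.

```python
def flatten_creative(item):
    """Return one flat row per variation of a FULL-mode dataset item."""
    stats = item.get("regionStats") or []
    first_dates = [s["firstShown"] for s in stats if s.get("firstShown")]
    last_dates = [s["lastShown"] for s in stats if s.get("lastShown")]
    base = {
        "originalKeyword": item.get("originalKeyword"),
        "advertiserName": item.get("advertiserName"),
        "creativeId": item.get("creativeId"),
        "format": item.get("format"),
        # YYYY-MM-DD strings sort chronologically, so min/max work on them.
        "firstShown": min(first_dates) if first_dates else None,
        "lastShown": max(last_dates) if last_dates else None,
        "error": item.get("error"),
    }
    rows = []
    # Fall back to one empty variation if the array is missing or empty.
    for variation in item.get("variations") or [{}]:
        rows.append({
            **base,
            "headline": variation.get("headline"),
            "description": variation.get("description"),
            "clickUrl": variation.get("clickUrl"),
        })
    return rows


# Illustrative item with made-up values.
sample = {
    "originalKeyword": "example",
    "advertiserName": "Example Advertiser",
    "creativeId": "CR-1",
    "format": "TEXT",
    "regionStats": [
        {"regionCode": "US", "firstShown": "2024-01-05", "lastShown": "2024-01-20"},
        {"regionCode": "GB", "firstShown": "2024-01-02", "lastShown": "2024-01-18"},
    ],
    "variations": [
        {"headline": "A", "description": None, "clickUrl": None},
        {"headline": "B", "description": None, "clickUrl": None},
    ],
    "error": None,
}
rows = flatten_creative(sample)
```

Each row repeats the creative-level fields, so the rows can be written out directly with csv.DictWriter.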

Output Structure for runMode: "LITE"

Each item is a JSON object summarizing ad counts for a matched advertiser, filtered by the specified date range and region:

  • originalKeyword (string): The input keyword that led to this advertiser summary.
  • advertiserId (string): The unique ID of the advertiser.
  • advertiserName (string): The name of the advertiser.
  • textCreativeCount (integer): Number of TEXT ad creatives found for this advertiser within the specified parameters.
  • imageCreativeCount (integer): Number of IMAGE ad creatives.
  • videoCreativeCount (integer): Number of VIDEO ad creatives.
  • unknownFormatCount (integer): Number of creatives with an undetermined format.
  • regionSearched (string): The region parameter used for this count (e.g., "US", "anywhere").
  • totalCreativesListedInSuggestion (integer): The estimated total ad count for this advertiser as initially reported by the search suggestion (this is typically an "anytime" count and may not reflect the specified date range).
  • totalCreativesCountedFromSearch (integer): The actual number of creative summaries processed from the SearchCreatives API call to derive the format counts for the specified date range and region.
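The per-format counts in a LITE item can be turned into shares of an advertiser's creative mix. The following is a minimal sketch using the count fields listed above. The function name format_shares is hypothetical, and treating missing counts as zero is an assumption.

```python
COUNT_FIELDS = {
    "TEXT": "textCreativeCount",
    "IMAGE": "imageCreativeCount",
    "VIDEO": "videoCreativeCount",
    "UNKNOWN": "unknownFormatCount",
}


def format_shares(summary):
    """Return each format's share (0.0 to 1.0) of the counted creatives."""
    # Assumption: a missing or null count is treated as zero.
    counts = {fmt: summary.get(field) or 0 for fmt, field in COUNT_FIELDS.items()}
    total = sum(counts.values())
    if total == 0:
        return {fmt: 0.0 for fmt in counts}
    return {fmt: n / total for fmt, n in counts.items()}
```

The shares are computed from the four format counts only, so they describe the requested date range and region rather than the "anytime" figure in totalCreativesListedInSuggestion.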

⚙️ Setup and Running

This Actor is designed to run as a Docker container on the Apify platform.

  1. Build the Actor: apify build (from within the Actor's directory). Alternatively, build it manually with Docker: docker build -t google-ads-scraper-actor .

  2. Run the Actor: apify run (this will use the INPUT.json file in the Actor's .actor directory if present, or you can specify input via CLI or Apify Console)

  3. Push to Apify Platform: apify push

❓ Frequently Asked Questions (FAQs)

Is it legal to scrape Google Ads data?

Scraping publicly available data is generally permissible, but you should always be mindful of the website's terms of service, robots.txt, and relevant data privacy regulations (such as GDPR and CCPA). Make sure your scraping is ethical and does not overload the target servers. Seek legal advice if you are unsure.

💬 Your feedback

If you have any feedback or feature requests, please let us know!