Google Ads Transparency Scraper

Pricing

$20.00/month + usage

Scrape Google's Ad Transparency Center to monitor competitor ads and extract ad creatives with OCR. Supports Search, YouTube, Shopping, Maps, and Play platforms with configurable time periods and platform filtering. Perfect for competitive intelligence and brand monitoring.

Rating

5.0 (1)

Developer

Shanks

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 5 days ago

This Apify actor scrapes Google's Ad Transparency Center to check whether domains are running ads, extracts their ad creatives, and runs OCR on image creatives to recover the ad text.

Features

  • Checks if domains are running Google Ads
  • Extracts ad creatives (images and videos)
  • Performs OCR on image ads to extract text
  • Handles YouTube video thumbnails and links
  • Provides detailed statistics and progress tracking
  • Configurable concurrency for faster processing
  • Robust error handling and retries

Input

The actor accepts the following input parameters:

{
  "domains": [
    "example.com",
    "example.org"
  ],
  "maxConcurrency": 1, // Optional, default: 1, max: 10
  "timePeriod": "Last 30 days", // Optional, default: "Last 30 days"
  "startDate": "2018-06-08", // Required only if timePeriod is "Custom" (format: YYYY-MM-DD)
  "endDate": "2018-06-15", // Required only if timePeriod is "Custom" (format: YYYY-MM-DD)
  "platform": "All" // Optional, default: "All" - filter by Google platform
}

  • domains: Array of domains to check for ads (required)
  • maxConcurrency: Maximum number of domains to process concurrently (optional, default: 1, max: 10)
  • timePeriod: Time period to check ads for (optional). Options:
    • "Today" - Check ads from today
    • "Yesterday" - Check ads from yesterday
    • "Last 7 days" - Check ads from the last 7 days
    • "Last 30 days" - Check ads from the last 30 days (default)
    • "Custom" - Use custom date range (requires startDate and endDate)
  • startDate: Start date for custom date range (format: YYYY-MM-DD). Required if timePeriod is "Custom"
  • endDate: End date for custom date range (format: YYYY-MM-DD). Required if timePeriod is "Custom"
  • platform: Google platform to filter ads by (optional). Options:
    • "All" - Check ads across all platforms (default)
    • "SEARCH" - Google Search ads only
    • "YOUTUBE" - YouTube ads only
    • "SHOPPING" - Google Shopping ads only
    • "MAPS" - Google Maps ads only
    • "PLAY" - Google Play ads only
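
The rules above can be checked before a run is started. The following is a minimal sketch of such a check, not the actor's own validation code: the function name `validate_input` and the error messages are hypothetical, and only the field names, defaults, and allowed values are taken from the list above.

```python
from datetime import datetime

# Allowed values copied from the parameter list above.
TIME_PERIODS = {"Today", "Yesterday", "Last 7 days", "Last 30 days", "Custom"}
PLATFORMS = {"All", "SEARCH", "YOUTUBE", "SHOPPING", "MAPS", "PLAY"}


def validate_input(run_input: dict) -> dict:
    """Return a copy of run_input with documented defaults applied.

    Hypothetical helper: raises ValueError when a documented rule is broken.
    """
    domains = run_input.get("domains")
    if not isinstance(domains, list) or not domains:
        raise ValueError("'domains' must be a non-empty list")

    cleaned = {
        "domains": list(domains),
        "maxConcurrency": run_input.get("maxConcurrency", 1),
        "timePeriod": run_input.get("timePeriod", "Last 30 days"),
        "platform": run_input.get("platform", "All"),
    }

    concurrency = cleaned["maxConcurrency"]
    if isinstance(concurrency, bool) or not isinstance(concurrency, int):
        raise ValueError("'maxConcurrency' must be an integer")
    if not 1 <= concurrency <= 10:
        raise ValueError("'maxConcurrency' must be between 1 and 10")
    if cleaned["timePeriod"] not in TIME_PERIODS:
        raise ValueError(f"unknown timePeriod: {cleaned['timePeriod']!r}")
    if cleaned["platform"] not in PLATFORMS:
        raise ValueError(f"unknown platform: {cleaned['platform']!r}")

    # startDate and endDate only matter for a custom range.
    if cleaned["timePeriod"] == "Custom":
        parsed = {}
        for key in ("startDate", "endDate"):
            value = run_input.get(key)
            if not isinstance(value, str):
                raise ValueError(f"'{key}' is required when timePeriod is 'Custom'")
            try:
                parsed[key] = datetime.strptime(value, "%Y-%m-%d").date()
            except ValueError:
                raise ValueError(f"'{key}' must use the format YYYY-MM-DD") from None
            cleaned[key] = value
        if parsed["startDate"] > parsed["endDate"]:
            raise ValueError("'startDate' must not be after 'endDate'")

    return cleaned
```

Rejecting a start date later than the end date is an extra assumption of this sketch; the parameter list does not say how the actor treats that case.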

Output

The actor saves results to its default dataset. Each item contains:

{
  "domain": "example.com",
  "ads_running": true,
  "creatives": [
    {
      "type": "image",
      "url": "https://..."
    },
    {
      "type": "video",
      "url": "https://..."
    }
  ],
  "ad_texts": [
    "Extracted text from image 1",
    "Extracted text from image 2"
  ],
  "time_period": "Last 30 days",
  "start_date": null,
  "end_date": null,
  "platform": "All",
  "error": null, // Error message if scraping failed
  "timestamp": "2024-03-21T12:34:56.789Z"
}
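
Once the items have been downloaded, a short script can turn them into a per-run overview. The sketch below assumes the items are already loaded as Python dicts shaped like the example above; the function name `summarize` and the layout of the summary are illustrative choices, not part of the actor.

```python
from collections import Counter


def summarize(items):
    """Group domains by outcome and count creatives by type (illustrative helper)."""
    summary = {
        "with_ads": [],
        "without_ads": [],
        "failed": [],
        "creative_types": Counter(),
    }
    for item in items:
        # A non-null "error" means scraping failed for this domain.
        if item.get("error"):
            summary["failed"].append(item["domain"])
            continue
        key = "with_ads" if item.get("ads_running") else "without_ads"
        summary[key].append(item["domain"])
        for creative in item.get("creatives") or []:
            summary["creative_types"][creative.get("type", "unknown")] += 1
    return summary
```

Failed domains are kept in their own bucket so that a scraping error is not mistaken for "no ads running".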

Usage

  1. Create a new task for the actor
  2. Provide input:
    {
      "domains": ["example.com"],
      "maxConcurrency": 1,
      "timePeriod": "Last 7 days",
      "platform": "YOUTUBE"
    }
  3. Run the task
  4. Get results from the dataset
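
The same steps can also be driven from Python with the `apify-client` package instead of a task in the Apify Console. This is a sketch under assumptions: the actor ID `<username>/<actor-name>` is a placeholder to replace with the ID shown on the actor's store page, and reading the API token from an `APIFY_TOKEN` environment variable is this example's own convention. Check the client calls against the `apify-client` version you have installed.

```python
def run_scraper(client, actor_id, run_input):
    """Start the actor, wait for the run to finish, and return its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return any run details")
    # Results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main():
    import os

    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env variable
    items = run_scraper(
        client,
        "<username>/<actor-name>",  # placeholder, not the real actor ID
        {
            "domains": ["example.com"],
            "maxConcurrency": 1,
            "timePeriod": "Last 7 days",
            "platform": "YOUTUBE",
        },
    )
    for item in items:
        print(item["domain"], item["ads_running"], len(item["creatives"]))
```

Call `main()` with the token set in the environment to run it. Keeping the client as a parameter of `run_scraper` makes the function easy to exercise with a stub client in tests.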

Performance and Limits

  • Memory: 4096 MB
  • Timeout: 4 hours
  • Concurrency: 1-10 domains in parallel
  • Rate limiting: 2-second delay between requests
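
To illustrate how a concurrency cap and a fixed delay can work together, here is a minimal asyncio sketch. It is not the actor's implementation: `process_domains` and `worker` are hypothetical names, and applying the delay after each domain while the slot is still held is an assumption about where the pause goes.

```python
import asyncio


async def process_domains(domains, worker, max_concurrency=1, delay_seconds=2.0):
    """Run `await worker(domain)` for every domain, at most max_concurrency at once."""
    if not 1 <= max_concurrency <= 10:
        raise ValueError("max_concurrency must be between 1 and 10")
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(domain):
        async with semaphore:
            result = await worker(domain)
            # Hold the slot during the pause so the next request is spaced out.
            await asyncio.sleep(delay_seconds)
            return result

    # gather returns results in the same order as the input domains.
    return await asyncio.gather(*(guarded(d) for d in domains))
```

With `max_concurrency=1` the domains are processed strictly one after another, which matches the documented default.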

Dependencies

  • Python 3.9
  • Chrome browser
  • Tesseract OCR
  • Key Python packages:
    • selenium
    • pytesseract
    • aiohttp
    • Pillow
    • apify-client

Error Handling

The actor implements robust error handling:

  • Automatic retries for transient errors
  • Graceful degradation for OCR failures
  • Detailed error reporting in output
  • Progress tracking and statistics
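
The first three points can be sketched as follows. This is an illustration of the general pattern rather than the actor's code: `TransientError`, `with_retries`, and `check_domain` are hypothetical names, the retry count and backoff schedule are assumptions, and `scrape` and `ocr` stand in for the real Selenium and Tesseract steps.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, dropped connections)."""


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on TransientError wait with exponential backoff and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: let the caller report the error
            sleep(base_delay * 2 ** (attempt - 1))


def check_domain(domain, scrape, ocr):
    """Build one output item; `scrape` and `ocr` are injected callables."""
    item = {"domain": domain, "ads_running": False,
            "creatives": [], "ad_texts": [], "error": None}
    try:
        item["creatives"] = with_retries(lambda: scrape(domain))
    except Exception as exc:
        # Detailed error reporting: the message goes into the "error" field.
        item["error"] = str(exc)
        return item
    item["ads_running"] = bool(item["creatives"])
    for creative in item["creatives"]:
        if creative.get("type") != "image":
            continue
        try:
            item["ad_texts"].append(ocr(creative["url"]))
        except Exception:
            # Graceful degradation: keep the creative, skip its text.
            pass
    return item
```

A failed OCR call therefore costs only the text of one image, while a failed scrape still produces an item whose `error` field explains what went wrong.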

Development

  1. Install dependencies:

    $ pip install -r requirements.txt
  2. Install system dependencies:

    $ apt-get install tesseract-ocr
  3. Run locally:

    $ python main.py

License

Apache 2.0