Google Ads Transparency Scraper
Pricing
$20.00/month + usage
Scrape Google's Ad Transparency Center to monitor competitor ads and extract ad creatives with OCR. Supports Search, YouTube, Shopping, Maps, and Play platforms with configurable time periods and platform filtering. Perfect for competitive intelligence and brand monitoring.
Developer
Shanks
This Apify actor scrapes Google's Ad Transparency Center to check whether domains are running ads. It extracts the ad creatives it finds and runs OCR on image ads to recover their text.
Features
- Checks if domains are running Google Ads
- Extracts ad creatives (images and videos)
- Performs OCR on image ads to extract text
- Handles YouTube video thumbnails and links
- Provides detailed statistics and progress tracking
- Configurable concurrency for faster processing
- Robust error handling and retries
Input
The actor accepts the following input parameters:
{
  "domains": ["example.com", "example.org"],
  "maxConcurrency": 1, // Optional, default: 1, max: 10
  "timePeriod": "Last 30 days", // Optional, default: "Last 30 days"
  "startDate": "2018-06-08", // Optional, required if timePeriod is "Custom" (format: YYYY-MM-DD)
  "endDate": "2018-06-15", // Optional, required if timePeriod is "Custom" (format: YYYY-MM-DD)
  "platform": "All" // Optional, default: "All" - filter by Google platform
}
- domains: Array of domains to check for ads (required)
- maxConcurrency: Maximum number of domains to process concurrently (optional)
- timePeriod: Time period to check ads for (optional). Options:
  - "Today" - Check ads from today
  - "Yesterday" - Check ads from yesterday
  - "Last 7 days" - Check ads from the last 7 days
  - "Last 30 days" - Check ads from the last 30 days (default)
  - "Custom" - Use custom date range (requires startDate and endDate)
- startDate: Start date for custom date range (format: YYYY-MM-DD). Required if timePeriod is "Custom"
- endDate: End date for custom date range (format: YYYY-MM-DD). Required if timePeriod is "Custom"
- platform: Google platform to filter ads by (optional). Options:
  - "All" - Check ads across all platforms (default)
  - "SEARCH" - Google Search ads only
  - "YOUTUBE" - YouTube ads only
  - "SHOPPING" - Google Shopping ads only
  - "MAPS" - Google Maps ads only
  - "PLAY" - Google Play ads only
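The rules above can be checked before a run is started. The following is a minimal sketch of such a check, not the actor's own validation code: the function name `validate_input` is hypothetical, and rejecting a startDate later than endDate is an added assumption that the parameter list does not state.

```python
from datetime import date

TIME_PERIODS = {"Today", "Yesterday", "Last 7 days", "Last 30 days", "Custom"}
PLATFORMS = {"All", "SEARCH", "YOUTUBE", "SHOPPING", "MAPS", "PLAY"}


def validate_input(raw: dict) -> dict:
    """Return the input with documented defaults applied, or raise ValueError."""
    domains = raw.get("domains")
    if not isinstance(domains, list) or not domains:
        raise ValueError("domains must be a non-empty list")

    cfg = {
        "domains": domains,
        "maxConcurrency": raw.get("maxConcurrency", 1),
        "timePeriod": raw.get("timePeriod", "Last 30 days"),
        "startDate": raw.get("startDate"),
        "endDate": raw.get("endDate"),
        "platform": raw.get("platform", "All"),
    }

    concurrency = cfg["maxConcurrency"]
    if not isinstance(concurrency, int) or not 1 <= concurrency <= 10:
        raise ValueError("maxConcurrency must be an integer from 1 to 10")
    if cfg["timePeriod"] not in TIME_PERIODS:
        raise ValueError(f"unknown timePeriod: {cfg['timePeriod']!r}")
    if cfg["platform"] not in PLATFORMS:
        raise ValueError(f"unknown platform: {cfg['platform']!r}")

    if cfg["timePeriod"] == "Custom":
        parsed = {}
        for key in ("startDate", "endDate"):
            value = cfg[key]
            if not isinstance(value, str) or not value:
                raise ValueError(f"{key} is required when timePeriod is 'Custom'")
            # fromisoformat raises ValueError if the string is not a valid date.
            parsed[key] = date.fromisoformat(value)
        # Assumption (not stated in the parameter list): the range must be ordered.
        if parsed["startDate"] > parsed["endDate"]:
            raise ValueError("startDate must not be after endDate")
    return cfg
```

Calling `validate_input({"domains": ["example.com"]})` fills in the documented defaults; a "Custom" period without both dates raises `ValueError`.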
Output
The actor saves results to its default dataset. Each item contains:
{
  "domain": "example.com",
  "ads_running": true,
  "creatives": [
    {"type": "image", "url": "https://..."},
    {"type": "video", "url": "https://..."}
  ],
  "ad_texts": ["Extracted text from image 1", "Extracted text from image 2"],
  "time_period": "Last 30 days",
  "start_date": null,
  "end_date": null,
  "platform": "All",
  "error": null, // Error message if scraping failed
  "timestamp": "2024-03-21T12:34:56.789Z"
}
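Once items with this shape have been downloaded, they can be grouped by outcome. The sketch below assumes only the fields shown in the example item; the helper name `summarize` and the layout of its return value are illustrative choices, not part of the actor.

```python
from collections import Counter


def summarize(items):
    """Group dataset items into running / not running / failed domains."""
    summary = {"running": [], "not_running": [], "failed": {}}
    creative_types = Counter()
    for item in items:
        if item.get("error"):
            # A non-null "error" means scraping failed for this domain.
            summary["failed"][item["domain"]] = item["error"]
        elif item.get("ads_running"):
            summary["running"].append(item["domain"])
            creative_types.update(c["type"] for c in item.get("creatives", []))
        else:
            summary["not_running"].append(item["domain"])
    summary["creative_types"] = dict(creative_types)
    return summary
```

Checking `error` before `ads_running` keeps failed domains from being counted as "not running", which matters when the result feeds a competitor report.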
Usage
- Create a new task for the actor
- Provide input:
  {
    "domains": ["example.com"],
    "maxConcurrency": 1,
    "timePeriod": "Last 7 days",
    "platform": "YOUTUBE"
  }
- Run the task
- Get results from the dataset
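The same steps can also be scripted with the apify-client package. The sketch below calls the actor directly rather than through a saved task, and it rests on several assumptions: the actor ID is left as a placeholder because this page does not state it, the API token is read from an `APIFY_TOKEN` environment variable, and the run is assumed to come back as a dict with a `defaultDatasetId` key. The helper name `run_scraper` is hypothetical.

```python
import os


def run_scraper(client, actor_id, run_input):
    """Start the actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return any run details")
    # Assumption: the run is a dict exposing the ID of its default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__" and os.environ.get("APIFY_TOKEN"):
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_scraper(
        client,
        "<username>/<actor-name>",  # placeholder: substitute the real actor ID
        {
            "domains": ["example.com"],
            "maxConcurrency": 1,
            "timePeriod": "Last 7 days",
            "platform": "YOUTUBE",
        },
    )
    for item in items:
        print(item["domain"], item["ads_running"], item["error"])
```

Passing the client in as an argument keeps `run_scraper` easy to exercise with a stub object, without network access.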
Performance and Limits
- Memory: 4096 MB
- Timeout: 4 hours
- Concurrency: 1-10 domains in parallel
- Rate limiting: 2 second delay between requests
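One way to combine a concurrency cap with a fixed delay is a semaphore around each domain's work. This is a sketch of the general pattern, not the actor's implementation: `process_all` and `worker` are hypothetical names, and holding the slot for the delay after each domain is only one possible reading of "delay between requests".

```python
import asyncio


async def process_all(domains, worker, max_concurrency=1, delay_seconds=2.0):
    """Run worker(domain) for every domain, at most max_concurrency at a time."""
    if not 1 <= max_concurrency <= 10:
        raise ValueError("max_concurrency must be between 1 and 10")
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(domain):
        async with semaphore:
            result = await worker(domain)
            # Keep the slot occupied for the delay so that the next domain
            # in this slot starts no sooner than delay_seconds later.
            await asyncio.sleep(delay_seconds)
            return result

    # gather returns results in the same order as the input domains.
    return await asyncio.gather(*(guarded(d) for d in domains))
```

With `max_concurrency=1` the domains are processed strictly one after another; raising it shortens wall-clock time at the cost of more simultaneous browser sessions and memory.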
Dependencies
- Python 3.9
- Chrome browser
- Tesseract OCR
- Key Python packages:
  - selenium
  - pytesseract
  - aiohttp
  - Pillow
  - apify-client
Error Handling
The actor implements robust error handling:
- Automatic retries for transient errors
- Graceful degradation for OCR failures
- Detailed error reporting in output
- Progress tracking and statistics
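The first three points can be illustrated together. The sketch below is an assumption about how such behaviour might be structured, not the actor's code: `with_retries`, `scrape_domain`, `fetch_creatives`, and `ocr_image` are hypothetical names, and the choice of `ConnectionError` and `TimeoutError` as "transient" errors and the exponential backoff are illustrative.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0,
                 retry_on=(ConnectionError, TimeoutError)):
    """Call func(); retry on transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: let the caller report the error
            time.sleep(base_delay * 2 ** (attempt - 1))


def scrape_domain(domain, fetch_creatives, ocr_image, base_delay=1.0):
    """Build one output item; failures are recorded instead of raised."""
    item = {"domain": domain, "ads_running": False,
            "creatives": [], "ad_texts": [], "error": None}
    try:
        item["creatives"] = with_retries(lambda: fetch_creatives(domain),
                                         base_delay=base_delay)
        item["ads_running"] = bool(item["creatives"])
    except Exception as exc:
        item["error"] = str(exc)  # detailed error reporting in the output
        return item
    for creative in item["creatives"]:
        if creative["type"] != "image":
            continue  # OCR applies to image ads only
        try:
            item["ad_texts"].append(ocr_image(creative["url"]))
        except Exception:
            continue  # graceful degradation: keep the creative, skip its text
    return item
```

A failed OCR call therefore costs one entry in `ad_texts`, while a domain that cannot be fetched at all still produces an item with its `error` field set.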
Development
- Install dependencies:
  $ pip install -r requirements.txt
- Install system dependencies:
  $ apt-get install tesseract-ocr
- Run locally:
  $ python main.py
License
Apache 2.0