Google Ads Scraper

1 day trial then $30.00/month - No credit card required now

silva95gustavo/google-ads-scraper

Extract text, image and video ads from Google Ads, scraped from the ad library provided by Google Ads Transparency Center. Gain access to ad details, ad copy, locations, and more. Dive deeper into the Google Ads Transparency Center for a competitive edge.

Feature request

Open

kdamsgaard opened this issue
20 days ago

Feature request:

First of all, I would like to commend you on building and maintaining such an excellent scraper. However, I do believe these features would make it even more versatile and comprehensive:

Scraping of specific creative URLs: Enable scraping of individual creative URLs. Currently, while most data is retrieved, certain fields like Advertiser Name and Preview URL remain unpopulated. Example: https://adstransparency.google.com/advertiser/AR14861514213498552321/creative/CR05598963089832673281

Support for Feed Image Ads: Ensure proper scraping of product/feed-based ads. Currently, variations lack image links and text content for these ad types.

Logo Scraping: Add support for scraping the logo used in ad creatives.

Support for downloading images: When asset downloading is enabled, regular images should be downloaded to the key-value store as well.

Support for age-restricted ads: I have seen you mention in another thread that you don't plan to support age-restricted ads again since they require login. I urge you to reconsider this. Collecting all ads from the Transparency Center, including the age-restricted ones, is important for a complete and comprehensive collection. I'm sure many users besides myself would appreciate this feature.

Thank you again for your dedication to maintaining and improving this tool!

silva95gustavo

Hi kdamsgaard,

Thank you for taking the time to share such detailed feedback and feature suggestions. I’m glad to hear the scraper has been useful to you!

I’m currently reviewing your points and experimenting with some of the ideas you’ve outlined. I’ll provide an update on the progress very soon.

Your support and input are greatly appreciated!

silva95gustavo

Thank you again for your detailed feedback and feature suggestions! I’ve reviewed each point, and here’s an update on where things stand:

  1. Advertiser Name: This is now resolved. The scraper will always populate the advertiser name going forward.

  2. Preview URL: This is also done! You can now enable this functionality using the new input parameter shouldDownloadPreviews.

  • Description: Defines whether the scraper should download creative previews as an HTML file, ensuring that the output field previewUrl is always populated. Enabling this option may slightly increase costs due to writes to a key-value store.

  3. Support for Feed Image Ads: To investigate further, could you share one or more example URLs? This will help me determine if the requested information can be fetched.

  4. Downloading Images

  • The parameter shouldDownloadAssets currently enables downloading of videos when their original Google URLs have a short expiry time. These videos are stored in Apify Storage to ensure they remain accessible even after the original links expire.
  • Why not images yet? So far, I haven’t encountered images with short expiry times, which is why the scraper currently links directly to the Google servers for images. If you’ve found any failing scenarios where images are not accessible due to expired URLs, I’d greatly appreciate it if you could share the details with me. I’ll promptly address the issue to ensure images behave similarly to videos.
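
For readers who want to try the two parameters described above, here is a minimal Python sketch of a run input that sets both. The parameter names shouldDownloadPreviews and shouldDownloadAssets and the startUrls shape come from this thread; treating both parameters as booleans, and the helper name build_run_input, are assumptions made for illustration.

```python
import json


def build_run_input(start_url, download_previews=True, download_assets=True):
    """Build a run input dict for the scraper.

    Assumption: both flags are plain booleans in the actor input.
    """
    return {
        "startUrls": [{"url": start_url}],
        # Stores an HTML preview so previewUrl is always populated.
        "shouldDownloadPreviews": download_previews,
        # Stores videos whose original Google URLs expire quickly.
        "shouldDownloadAssets": download_assets,
    }


run_input = build_run_input(
    "https://adstransparency.google.com/advertiser/AR14861514213498552321"
    "/creative/CR05598963089832673281"
)
print(json.dumps(run_input, indent=4))
```

The printed JSON can be pasted into the actor's JSON input mode. Both options write to a key-value store, so enabling them may slightly increase costs.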

(continues...)

silva95gustavo

  5. Logo Scraping: Could you share how you’re currently finding logos for ad creatives? From what I’ve seen, the vast majority of ads do not include logos, so it would be very helpful to know of any cases or examples where logos are accessible.

  6. Support for Age-Restricted Ads: Unfortunately, I’ve decided not to implement this functionality. Here’s why:

  • Accessing age-restricted ads requires being logged in, which could violate Google’s Terms of Service.
  • Adding this feature could pose a risk of account termination for users, which I want to avoid.

That said, if you’d like to proceed at your own discretion, you can customize the requests made by the scraper to include your session cookies. Here’s an example of how to pass custom headers in the input:

{
    "startUrls": [
        {
            "headers": {
                "cookie": "<insert_cookies>",
                "x-framework-xsrf-token": "<insert_token>"
            },
            "url": "https://adstransparency.google.com/advertiser/AR08888592736429539329?authuser=0&region=ES&preset-date=Last+7+days"
        }
    ]
}

How to Configure:

  • If you’re running the scraper from the Apify Console (UI), switch from "Manual" to "JSON" input mode.
  • Add your headers under the headers object, as shown above.
  • Replace <insert_cookies> and <insert_token> with your session cookies and any required tokens.

Please keep in mind that using this approach carries the risks mentioned earlier, so proceed with caution.
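
If you assemble this input in a script rather than in the Console, one possible approach is to read the session values from environment variables so they never end up in source control. The sketch below only builds the input object shown above. The variable names GOOGLE_SESSION_COOKIE and GOOGLE_XSRF_TOKEN and the helper build_authenticated_input are hypothetical, and the same risks apply.

```python
import os


def build_authenticated_input(url, cookie, xsrf_token):
    """Return a run input whose start URL carries custom headers."""
    if not cookie or not xsrf_token:
        raise ValueError("Both the cookie and the XSRF token are required.")
    return {
        "startUrls": [
            {
                "headers": {
                    "cookie": cookie,
                    "x-framework-xsrf-token": xsrf_token,
                },
                "url": url,
            }
        ]
    }


START_URL = (
    "https://adstransparency.google.com/advertiser/AR08888592736429539329"
    "?authuser=0&region=ES&preset-date=Last+7+days"
)

# Hypothetical environment variable names; use whatever fits your setup.
cookie = os.environ.get("GOOGLE_SESSION_COOKIE", "")
token = os.environ.get("GOOGLE_XSRF_TOKEN", "")

if cookie and token:
    run_input = build_authenticated_input(START_URL, cookie, token)
    # Print only the header names so the secrets stay out of the logs.
    print("Input ready with headers:", sorted(run_input["startUrls"][0]["headers"]))
else:
    print("Set GOOGLE_SESSION_COOKIE and GOOGLE_XSRF_TOKEN first.")
```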

kdamsgaard

16 days ago

Hi Gustavo,

Thank you for your quick reply and for taking prompt action on these features. Your responsiveness is greatly appreciated.

  1. Advertiser Name: Perfect. It is working in the tests I have run so far.

  2. Preview URL: OK. Can we just get the URL in the response? Is it necessary to use the key-value store and store an HTML file?

  3. Support for Feed Image Ads: Here is an example of a feed ad: https://adstransparency.google.com/advertiser/AR10800287696400941057/creative/CR00436047516799074305?region=DK
     By the way, all 3 variations also have a logo in them.

The scraper captures no images or text from this type of ad. It correctly identifies 3 variations, but the results are all empty:

{
    "advertiserId": "AR10800287696400941057",
    "advertiserName": null,
    "creativeId": "CR00436047516799074305",
    "format": "IMAGE",
    "previewUrl": null,
    "regionStats": [
        {
            "regionCode": "DK",
            "regionName": "Denmark",
            "firstShown": "2024-01-09",
            "lastShown": "2024-06-12",
            "impressions": {
                "lowerBound": 5000,
                "upperBound": 6000
            }
        }
    ],
    "variations": [
        {},
        {},
        {}
    ]
}
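
A quick way to spot records like this one across a whole dataset is to scan for null fields and empty variation objects. The following Python sketch shows one possible check. The field names are taken from the record above; the helper find_incomplete_items is hypothetical and not part of the scraper.

```python
def find_incomplete_items(items):
    """Return (creativeId, problems) pairs for records with missing data."""
    report = []
    for item in items:
        problems = []
        for field in ("advertiserName", "previewUrl"):
            if item.get(field) is None:
                problems.append(f"{field} is null")
        variations = item.get("variations") or []
        # An empty dict means no text or image data was captured.
        empty = sum(1 for variation in variations if not variation)
        if empty:
            problems.append(f"{empty} of {len(variations)} variations are empty")
        if problems:
            report.append((item.get("creativeId"), problems))
    return report


# Trimmed copy of the record shown above.
sample = [
    {
        "advertiserId": "AR10800287696400941057",
        "advertiserName": None,
        "creativeId": "CR00436047516799074305",
        "format": "IMAGE",
        "previewUrl": None,
        "variations": [{}, {}, {}],
    }
]

for creative_id, problems in find_incomplete_items(sample):
    print(creative_id, "->", "; ".join(problems))
```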

(continues...)

kdamsgaard

16 days ago
  4. Downloading Images: I haven't encountered any issues with the expiration of ad images, but for my use case, I need to download these images. That's why I wanted to use the scraper, so I don't risk being blocked. I have found another actor, https://console.apify.com/actors/SEQBnEA5oe2R9Hgdj/input, which can download images from datasets and upload them directly to an S3 bucket, which is perfect for my needs. I will explore that option.

  5. Logo Scraping: Here are some examples of ads with logos:

https://adstransparency.google.com/advertiser/AR12571208540535914497/creative/CR12201727416187486209?region=DK
https://adstransparency.google.com/advertiser/AR12571208540535914497/creative/CR16472022770771820545?region=anywhere
https://adstransparency.google.com/advertiser/AR10800287696400941057/creative/CR16262988378922811393?region=anywhere
https://adstransparency.google.com/advertiser/AR10800287696400941057/creative/CR14761203568673292289?region=DK

For text ads, the img HTML tags appear to have an id of "stsuidi3".

  6. Support for Age-Restricted Ads: I understand the challenges in scraping age-restricted ads and appreciate your caution. Could you provide more details on the risk of account termination? Specifically, will it be my Apify account that is at risk?
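
The creative URLs shared in this thread all follow the same pattern, so their IDs and region can be pulled out with a few lines of Python, which may help when matching URLs against dataset records. This is only a sketch: the assumption that IDs are "AR" or "CR" followed by digits is inferred from the examples here, and parse_transparency_url is a hypothetical helper.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed path shape: /advertiser/AR<digits>[/creative/CR<digits>]
_PATH_RE = re.compile(r"^/advertiser/(AR\d+)(?:/creative/(CR\d+))?/?$")


def parse_transparency_url(url):
    """Split a Transparency Center URL into advertiser, creative and region."""
    parsed = urlparse(url)
    match = _PATH_RE.match(parsed.path)
    if match is None:
        raise ValueError(f"Unrecognized Transparency Center URL: {url}")
    query = parse_qs(parsed.query)
    return {
        "advertiserId": match.group(1),
        "creativeId": match.group(2),  # None for advertiser-level URLs
        "region": query.get("region", [None])[0],
    }


info = parse_transparency_url(
    "https://adstransparency.google.com/advertiser/AR12571208540535914497"
    "/creative/CR12201727416187486209?region=DK"
)
print(info)
```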

silva95gustavo

  2. Preview URL: We create a file in the key-value store because a preview link is not always available from Google; it depends on the ad format and type.

  3. Support for Feed Image Ads: Thanks for pointing this out! I’ve updated the scraper to extract the text, logo, and click URL for this ad type. However, the scraper does not include images for this specific format because they represent various products, and the list of products changes dynamically based on browser cookies. Since the images are not consistent, I’ve chosen not to include them in the output dataset to avoid providing unreliable data.

  5. Logo Scraping: Thank you for sharing these examples! I’ve now implemented logo extraction support for all the cases you provided. Check the logoUri property in the output.

  6. Support for Age-Restricted Ads: Using cookies from a Google account while scraping carries significant risks, including, but not limited to:

I recommend reviewing this article for a better understanding of the legality of web scraping: https://blog.apify.com/is-web-scraping-legal/. Additionally, it’s a good idea to consult a lawyer to discuss your specific use case.
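
For anyone post-processing the dataset, the logoUri property mentioned above can be collected with a small recursive search. The thread does not say where logoUri sits in each record, so this Python sketch looks for the key at any depth. The sample record and the helper collect_logo_uris are hypothetical.

```python
def collect_logo_uris(node):
    """Return every string stored under a "logoUri" key, at any depth."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "logoUri" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_logo_uris(value))
    elif isinstance(node, list):
        for child in node:
            found.extend(collect_logo_uris(child))
    return found


# Hypothetical record; the real placement of logoUri may differ.
sample_item = {
    "creativeId": "CR-EXAMPLE",
    "variations": [
        {"logoUri": "https://example.com/logo-a.png"},
        {},
        {"logoUri": "https://example.com/logo-b.png"},
    ],
}

print(collect_logo_uris(sample_item))
```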
