Google Images Scraper
Pricing
$19.99/month + usage
Extract image results from Google Images using the Google Images Scraper. Collect image URLs, titles, source websites, thumbnails, and search result data automatically. Ideal for research, dataset creation, SEO analysis, and visual content discovery.
Developer
ScrapAPI
Google Images Scraper
Google Images Scraper is a headless, Playwright-powered Google image search scraper that extracts image URLs, thumbnails, titles, source pages, and site origins from Google Images results. It solves the manual effort of collecting visuals and metadata by automating Google Images browsing and streaming records directly to your Apify dataset. Built for marketers, developers, data analysts, and researchers, this Google Images scraper tool scales from one-off lookups to multi-keyword runs and integrates cleanly with the Apify platform and Google Images scraping API workflows.
What data / output can you get?
Below are the exact fields this actor saves to the dataset when you scrape Google Images results. You can export the dataset to JSON, CSV, or Excel from Apify.
| Data type | Description | Example value |
|---|---|---|
| query | Search term that produced the image result | "nature" |
| imageUrl | Direct link to the full image | "https://example.com/photo.jpg" |
| title | Title text associated with the result | "Stunning nature photography" |
| imageWidth | Full image width (pixels, if known) | 1600 |
| imageHeight | Full image height (pixels, if known) | 1200 |
| thumbnailUrl | Google Images thumbnail URL or derived preview | "https://encrypted-tbn0.gstatic.com/images?q=tbn:ABC123" |
| thumbnailWidth | Thumbnail width (pixels) | 300 |
| thumbnailHeight | Thumbnail height (pixels) | 200 |
| contentUrl | Source page URL that hosts the image | "https://example.com/gallery.html" |
| origin | Origin domain (host) of the source page or image | "example.com" |
Note: Width/height values may be 0 when not present on the result. Depending on pushToDatasetRealtime, each record is either streamed to the Output dataset as soon as the image is collected or saved in one batch at the end of the run.
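Because width and height fall back to 0 when a result carries no dimensions, downstream code may want to treat 0 as "unknown" rather than as a real size. The following is a minimal sketch, assuming the dataset has been exported to JSON and loaded as a list of dicts with the field names from the table above. The helper names `has_known_dimensions` and `filter_min_size` are made up for illustration and are not part of the actor.

```python
from typing import List, TypedDict


class ImageRecord(TypedDict):
    """One dataset row, using the field names from the output table."""
    query: str
    imageUrl: str
    title: str
    imageWidth: int
    imageHeight: int
    thumbnailUrl: str
    thumbnailWidth: int
    thumbnailHeight: int
    contentUrl: str
    origin: str


def has_known_dimensions(record: ImageRecord) -> bool:
    """Return True when both full-image dimensions are non-zero."""
    return record["imageWidth"] > 0 and record["imageHeight"] > 0


def filter_min_size(records: List[ImageRecord], min_width: int, min_height: int) -> List[ImageRecord]:
    """Keep records whose dimensions are known and meet the minimum size.

    Records with width or height 0 are dropped, since 0 means "not reported".
    """
    return [
        r for r in records
        if has_known_dimensions(r)
        and r["imageWidth"] >= min_width
        and r["imageHeight"] >= min_height
    ]
```

Whether to drop or keep records with unknown dimensions depends on the use case; this sketch drops them.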
Key features
- ⚡ Real-time dataset streaming: When pushToDatasetRealtime is on, each image record is pushed immediately to your Apify dataset, perfect for live monitoring and pipelines that consume SERP image results as they appear.
- 🧭 Two-layout extraction for better coverage: The actor collects from the standard Google Images layout (tbm=isch) and automatically retries with an alternate layout (udm=2) if needed to maximize recall per query.
- 🔍 Smart pagination + infinite scroll: Combines page-by-page loading with controlled scrolling to merge additional thumbnails and extract more results per Google image search scraper run.
- 🔁 Built-in retries and de-duplication: Per-query retry logic and duplicate filtering ensure clean results with unique image/thumbnail pairs.
- 👻 Headless Chrome/Chromium: Runs a Playwright browser (Chrome channel when available, otherwise bundled Chromium). Headless mode is enabled by default for speed and stability.
- 🔐 Optional Apify Proxy: Configure proxy groups/countries with proxyConfiguration to improve resilience when networks are restrictive or when geo checks appear.
- 🧩 Developer-friendly and automation-ready: Use via the Apify UI or programmatically with the Apify API to power Google Images scraping API workflows in Python, Node.js, or your orchestration stack.
- 🗂️ Flexible exports: Access results from the Apify dataset and download to JSON, CSV, or Excel for analysis, enrichment, or downstream “automated Google image downloader” steps.
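To make the two-layout feature concrete, here is a rough Python sketch of how search URLs for the two layouts could be built and how a fallback might be wired. Only the parameters tbm=isch and udm=2 come from the feature list. The function names, the `fetch` callback, and the "too few results" threshold are assumptions for illustration and do not describe the actor's actual internals.

```python
from typing import Callable, Dict, List
from urllib.parse import urlencode

GOOGLE_SEARCH_URL = "https://www.google.com/search"

# Layout parameters named in the feature list.
LAYOUT_PARAMS = {
    "standard": {"tbm": "isch"},
    "alternate": {"udm": "2"},
}


def build_images_url(query: str, layout: str = "standard") -> str:
    """Build a Google Images search URL for the given layout."""
    params = {"q": query, **LAYOUT_PARAMS[layout]}
    return f"{GOOGLE_SEARCH_URL}?{urlencode(params)}"


def collect_with_fallback(
    query: str,
    fetch: Callable[[str], List[Dict]],
    min_results: int = 1,
) -> List[Dict]:
    """Try the standard layout first, then the alternate one.

    `fetch` is a caller-supplied (hypothetical) function that loads a URL
    and returns extracted records. The `min_results` threshold is an
    assumed trigger for the fallback.
    """
    results = fetch(build_images_url(query, "standard"))
    if len(results) < min_results:
        results = fetch(build_images_url(query, "alternate"))
    return results
```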
How to use Google Images Scraper - step by step
- Sign in to Apify: Create a free Apify account and navigate to the Google Images Scraper actor.
- Add your queries: In the Input, paste one or more search phrases into queries (one per line). Example: “nature”, “product shots”, “logos”.
- Set result limits: Adjust maxImages to control how many unique images you collect per query (1–1000).
- Choose live vs. batch output:
  - pushToDatasetRealtime: true streams each image to the Output dataset as soon as it’s collected.
  - Set it to false if you prefer all rows to upload once the run finishes.
- Configure headless and proxy (optional):
  - headless defaults to true for cloud runs.
  - Use proxyConfiguration to route through Apify Proxy when you encounter rate limits or geography checks.
- Run the scraper: Click Start. The actor opens a Playwright browser, navigates Google Images, and begins collecting image metadata.
- Monitor progress: Watch the run log for live status messages and see rows appear in the Dataset tab (if live push is on).
- Export results: Open the Dataset tab to download your results in JSON, CSV, or Excel, ready for SEO analysis, catalogs, or a Google Images bulk download pipeline downstream.
Pro tip: Trigger this actor via the Apify API to build a repeatable Google Images extractor workflow in your Python or Node.js stack.
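As a sketch of what such a trigger could look like in Python, the snippet below uses only the standard library and the Apify API's synchronous "run and return dataset items" endpoint. The actor ID and token are placeholders you would need to supply (the actor's real ID is not stated here), and the helper names are illustrative.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST request that runs an actor and returns its dataset items.

    `actor_id` uses the API form "username~actor-name" (placeholder here).
    """
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and return the dataset items as parsed JSON."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (not executed here; needs a real token and actor ID):
# items = run_actor("<username>~<actor-name>", "<APIFY_TOKEN>",
#                   {"queries": ["nature"], "maxImages": 25})
```

The synchronous endpoint is convenient for short runs. For long multi-keyword runs, starting the run asynchronously and reading the dataset afterwards is likely a better fit.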
Use cases
| Use case name | Description |
|---|---|
| AI/ML dataset creation | Build large, diverse image metadata corpora by keyword to seed model training or labeling pipelines. |
| Visual SEO analysis | Track SERP image placements and extract titles, origins, and content URLs for trend analysis and optimization. |
| Brand monitoring | Discover where logos and product imagery appear online by collecting origin domains at scale. |
| E-commerce enrichment | Gather product-related visuals and source pages to support cataloging and competitive research. |
| Academic & market research | Collect structured Google Images metadata for studies, reports, and exploratory analysis. |
| API pipeline integration | Call the actor via Apify API in scheduled or event-driven workflows to feed BI tools or storage. |
Why choose Google Images Scraper?
This production-ready Google Images crawler focuses on reliability, scale, and clean metadata output.
- 🎯 Accurate extraction logic that merges embedded data and link-based results for higher coverage.
- 🚀 Scales per query with smart pagination and infinite-scroll merging while avoiding duplicates.
- 🔌 API-ready for Google Images scraping API use, with easy integration into Python or Node.js pipelines.
- 🔐 Optional Apify Proxy support to handle tougher networks and reduce verification interruptions.
- 🧱 Stable infrastructure with headless Playwright and controlled retries for consistent runs.
- 💾 Real-time or batch saves so you can monitor live or push everything at the end — your choice.
- 🧰 Better than brittle extensions: no browser plugins or manual steps, just repeatable automation.
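The duplicate avoidance mentioned above can be pictured as merging batches of results while tracking which image/thumbnail pairs have already been seen. This is a minimal sketch under that assumption. The function name `merge_unique` and the choice of `(imageUrl, thumbnailUrl)` as the key are illustrative, not a description of the actor's code.

```python
from typing import Dict, Iterable, List, Set, Tuple


def merge_unique(batches: Iterable[List[Dict]], max_images: int) -> List[Dict]:
    """Merge result batches, keeping the first record per image/thumbnail pair.

    Stops as soon as `max_images` unique records have been collected.
    """
    seen: Set[Tuple[str, str]] = set()
    merged: List[Dict] = []
    for batch in batches:
        for record in batch:
            key = (record.get("imageUrl", ""), record.get("thumbnailUrl", ""))
            if key in seen:
                continue  # same pair already collected from an earlier page or scroll
            seen.add(key)
            merged.append(record)
            if len(merged) >= max_images:
                return merged
    return merged
```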
Is it legal / ethical to use Google Images Scraper?
Yes — when used responsibly. This actor collects publicly available metadata (e.g., image URLs, titles, source pages, and origins) from Google Images results. It does not require login and does not access private or authenticated content.
Guidelines:
- Only use publicly available data and respect Google’s Terms of Service.
- Comply with data protection laws (e.g., GDPR, CCPA) and your organization’s policies.
- If you intend to download or reuse images, ensure you have permission or rely on fair-use and proper licensing.
- Consult your legal team for edge cases or commercial use.
Input parameters & output format
Example JSON input
{
  "queries": ["nature", "product shots"],
  "maxImages": 25,
  "proxyConfiguration": {"useApifyProxy": false},
  "headless": true,
  "pushToDatasetRealtime": true
}
Input fields
- queries (array, required)
  - Description: List of search phrases. One query per line in the editor.
  - Default: ["nature"]
- maxImages (integer)
  - Description: Cap how many unique images to keep for each keyword (1–1000).
  - Minimum: 1; Maximum: 1000
  - Default: 10
- proxyConfiguration (object)
  - Description: Pick proxy groups/countries in the picker. When enabled, the actor uses your selection for the browsing session.
  - Default: {"useApifyProxy": false}
- headless (boolean)
  - Description: On (default): best for Apify cloud. Off: useful for debugging locally.
  - Default: true
- pushToDatasetRealtime (boolean)
  - Description: ON: each image is pushed to the dataset immediately. OFF: everything uploads once at the end.
  - Default: true
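If you build the input programmatically, it can help to check it against the documented ranges and defaults before starting a run. The sketch below is one way to do that in Python. The function `normalize_input` is hypothetical and only mirrors the field list above, so the platform's own input validation remains the authority.

```python
# Defaults for the optional fields, taken from the field list above.
OPTIONAL_DEFAULTS = {
    "maxImages": 10,
    "proxyConfiguration": {"useApifyProxy": False},
    "headless": True,
    "pushToDatasetRealtime": True,
}


def normalize_input(raw: dict) -> dict:
    """Apply documented defaults and validate queries and maxImages."""
    if not isinstance(raw.get("queries"), list):
        raise ValueError("queries is required and must be a list of search phrases")

    # Drop blank entries and surrounding whitespace.
    queries = [q.strip() for q in raw["queries"] if isinstance(q, str) and q.strip()]
    if not queries:
        raise ValueError("queries must contain at least one non-empty search phrase")

    merged = {**OPTIONAL_DEFAULTS, **raw, "queries": queries}

    max_images = merged["maxImages"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_images, bool) or not isinstance(max_images, int):
        raise ValueError("maxImages must be an integer")
    if not 1 <= max_images <= 1000:
        raise ValueError("maxImages must be between 1 and 1000")

    return merged
```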
Example JSON output
[
  {
    "query": "nature",
    "imageUrl": "https://example.com/images/forest.jpg",
    "title": "Misty forest at sunrise",
    "imageWidth": 1600,
    "imageHeight": 1200,
    "thumbnailUrl": "https://encrypted-tbn0.gstatic.com/images?q=tbn:EXAMPLE",
    "thumbnailWidth": 300,
    "thumbnailHeight": 200,
    "contentUrl": "https://example.com/blog/misty-forest",
    "origin": "example.com"
  }
]
Notes:
- The actor pushes each record to the dataset with the exact keys above.
- Width and height may be 0 if the result does not include dimensions.
- Export your dataset to JSON, CSV, or Excel from the Apify UI or API.
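For exports through the API, the dataset items endpoint accepts a format query parameter. The helper below is a small sketch that builds such a URL for the three formats mentioned in the notes. The dataset ID is a placeholder, the function name is illustrative, and an API token may still be required depending on how the dataset is shared.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"

# Map the format names used in this README to API format values.
EXPORT_FORMATS = {"json": "json", "csv": "csv", "excel": "xlsx"}


def dataset_items_url(dataset_id: str, export_format: str = "json") -> str:
    """Build the download URL for a dataset in the requested format."""
    try:
        api_format = EXPORT_FORMATS[export_format.lower()]
    except KeyError:
        raise ValueError(f"unsupported export format: {export_format!r}") from None
    query = urlencode({"format": api_format})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"
```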
FAQ
Does this tool download images or just collect URLs and metadata?
It collects image URLs, thumbnails, titles, source pages, and site origins. You can then use those URLs with a separate automated Google image downloader if you need the actual files.
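A separate download step could be as simple as the following Python sketch, which derives a file name from each imageUrl and saves the response body. It uses only the standard library, and the function names are illustrative. It also skips concerns a real downloader would need, such as name collisions, content-type checks, and the licensing points covered in the legal section.

```python
import os
import posixpath
import urllib.request
from urllib.parse import unquote, urlparse


def filename_from_url(image_url: str, fallback: str = "image") -> str:
    """Derive a file name from the URL path, ignoring query strings."""
    path = unquote(urlparse(image_url).path)
    name = posixpath.basename(path)
    # Guard against empty names and path-traversal style segments.
    if name in ("", ".", ".."):
        return fallback
    return name


def download_image(image_url: str, dest_dir: str, timeout: float = 30.0) -> str:
    """Download one image into dest_dir and return the saved file path."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, filename_from_url(image_url))
    with urllib.request.urlopen(image_url, timeout=timeout) as response:
        with open(dest, "wb") as handle:
            handle.write(response.read())
    return dest
```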
How many images per query can it collect?
You control this with maxImages. The allowed range is 1–1000 per query. The actor uses pagination and scrolling to approach your target.
Can I stream results live to the Output dataset?
Yes. Set pushToDatasetRealtime to true to push each image immediately. Set it to false to upload all records in one batch at the end.
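The difference between the two modes can be pictured with a small Python sketch. Here `push` stands in for whatever writes rows to the dataset. It is a hypothetical callback, not the actor's real code. The sketch only illustrates the trade-off: many small writes that are visible immediately, versus one write at the end.

```python
from typing import Callable, Dict, Iterable, List


def save_records(
    records: Iterable[Dict],
    push: Callable[[List[Dict]], None],
    realtime: bool,
) -> int:
    """Write records either one at a time or in a single batch.

    Returns the number of records written.
    """
    if realtime:
        count = 0
        for record in records:
            push([record])  # each row becomes visible as soon as it is collected
            count += 1
        return count

    batch = list(records)
    if batch:
        push(batch)  # a single upload once collection has finished
    return len(batch)
```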
Does it support proxies?
Yes. Use proxyConfiguration to enable Apify Proxy. This helps when you encounter rate limits, blocks, or geo checks while you scrape Google Images results.
What fields are included in the output?
Each record includes: query, imageUrl, title, imageWidth, imageHeight, thumbnailUrl, thumbnailWidth, thumbnailHeight, contentUrl, and origin.
Is this a Google Images scraping API?
This is an Apify actor that you can run via the Apify API, making it suitable for use as a Google Images scraping API endpoint in your workflows.
Is it built with Python or Node.js?
The actor is implemented with Python and Playwright under the hood, but you can trigger it from any stack (including Node.js) using the Apify API.
What happens if Google challenges or blocks a request?
The actor detects challenge pages and logs a warning. You can enable Apify Proxy for better resilience and rerun the query. It also includes retry logic per query.
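As an illustration of per-query retry logic, the Python sketch below retries a query when a challenge is detected, logs a warning, and backs off between attempts. The exception class, the `scrape_once` callback, the attempt count, and the exponential delay are all assumptions made for the example. The actor's actual retry policy is not documented here.

```python
import logging
import time
from typing import Callable, Dict, List

logger = logging.getLogger("google_images_retry_sketch")


class ChallengeDetected(Exception):
    """Raised by the (hypothetical) scrape function when a challenge page appears."""


def scrape_with_retries(
    query: str,
    scrape_once: Callable[[str], List[Dict]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Run scrape_once(query), retrying on ChallengeDetected with backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            return scrape_once(query)
        except ChallengeDetected:
            logger.warning("Challenge page for %r (attempt %d/%d)", query, attempt, max_attempts)
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            sleep(base_delay * 2 ** (attempt - 1))  # wait 1s, 2s, 4s, ...
    return []  # not reached; every loop iteration returns or raises
```

Passing `sleep` as a parameter keeps the sketch easy to test without real waiting.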
Final thoughts
Google Images Scraper is built to automate Google image search extraction into clean, structured datasets. With real-time streaming, optional Apify Proxy, and stable Playwright automation, it’s ideal for marketers, developers, analysts, and researchers who need image URLs, thumbnails, and source metadata at scale. Integrate it via the Apify API to power your Google Images extractor workflows in Python or Node.js — and start extracting smarter, faster, and more reliably today.


