# Changelog

All notable changes to this Actor are documented here. Versions follow the MAJOR.MINOR scheme used by Apify builds.
## [1.3] — 2026-05-05
### Drop residential-proxy traffic for non-Google requests
Up to v1.2, every HTTP request — including Nominatim/Photon geocoding and arbitrary business-website fetches during enrichment — went through the Apify residential proxy ($0.0008/MB). Most of those calls don't need it: free public APIs have no anti-bot protection, and small-business websites rarely block datacenter IPs.
Fix: the Actor now creates two HTTP clients (see the sketch below):

- `http` — residential proxy (user-configured), used for the Google search XHR and SSR fetches that genuinely need real-IP routing.
- `http_direct` — no proxy, direct from the Apify worker. Used for Nominatim/Photon geocoding and the website-enrichment fetches.
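A minimal sketch of the split, assuming curl_cffi's `AsyncSession` (the client names `http` and `http_direct` match the changelog; everything else is illustrative):

```python
from curl_cffi.requests import AsyncSession

def make_clients(residential_proxy_url: str) -> tuple[AsyncSession, AsyncSession]:
    # http: routed through the user-configured Apify residential proxy,
    # reserved for the Google search XHR and SSR fetches.
    http = AsyncSession(
        impersonate="chrome120",
        proxies={"http": residential_proxy_url, "https": residential_proxy_url},
    )
    # http_direct: no proxy at all -- Nominatim/Photon geocoding and
    # website-enrichment fetches go straight from the Apify worker.
    http_direct = AsyncSession(impersonate="chrome120")
    return http, http_direct
```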
Net effect on a typical run with `extractContactsFromWebsite: true`: 60-90% of HTTP requests no longer use the residential proxy. Estimated savings: **$0.10-0.15 per 1 000 places in operating cost**.
Edge case: if a business website blocks the Apify worker's datacenter IP (rare), enrichment for that one site is silently skipped (website fetches already use max_retries=1 for fail-fast behavior). Other places are unaffected.
## [1.2] — 2026-05-05
### Lower memory floor (128 MB) — cheaper runs
Memory footprint measured with psutil:

- Light run (5 terms × subdivision × 415 places, no enrichment): peak 80 MB
- Medium run (50 places + full website enrichment, concurrency=8): peak 109 MB
Both are well under 128 MB. The previous minMemoryMbytes: 256 was unnecessarily high — frugal users couldn't pick the cheapest tier. Updated:

- `minMemoryMbytes: 128` (was 256) — lets small runs opt into the cheapest tier
- `maxMemoryMbytes: 4096` (unchanged) — headroom for city-scale jobs
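The corresponding excerpt of `.actor/actor.json`, assuming the standard Apify Actor config layout (all other keys omitted):

```json
{
    "actorSpecification": 1,
    "minMemoryMbytes": 128,
    "maxMemoryMbytes": 4096
}
```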
At 128 MB on Apify:

- Compute cost is ~50% lower vs 256 MB
- Same throughput for runs of ≤ 100 places
- For city-scale jobs (1000+ places), prefer 512 MB to stay safe
## [1.1] — 2026-05-04
### Critical fix: strict geographic match (drop places from wrong country)
A real production run searching `karnataka, India` for `school` / `high school` / `pre university` etc. returned 120 places, of which only 3 were actually in Karnataka. The rest:

- 80 places from Texas, USA (Arlington, Fort Worth, Dallas)
- 20 places from Cantabria, Spain
- 11 from Andhra Pradesh (a neighbouring Indian state)
- 1 from South Korea, 1 from Cambodia, 4 from Tamil Nadu / Maharashtra
Two compounding bugs:

1. Subdivision math broke at low zoom. The previous formula gave children a longitude offset of `360 / 2^z * 0.75`, which at z=6 is 4.2° — almost the full width of a typical state. Children's centers drifted into the Arabian Sea and the Bay of Bengal.
2. No bounding-box check. When Google's search XHR found nothing at the drifted coordinates, it fell back to the residential-proxy IP's country for results. Apify residential exits in Texas / Spain / Korea returned schools in those regions instead of empty results.
Fixes (see the sketch after this list):

- Correct subdivision math: the child longitude offset is now `360 / 2^(z+2)` (a clean quarter of the parent viewport), with latitude scaled by cos(lat) for high-latitude correctness.
- bbox capture + filter: Nominatim and Photon both expose the queried region's bounding box. We now store it in `Viewport.bbox`, propagate it to all subdivided children, and drop any place whose (lat, lng) falls outside it (with 0.5° tolerance for border cases).
- New input `geoStrictMatch: true` (default ON). Set it to `false` to keep the v1.0 wider-area behavior.
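A minimal sketch of both fixes, assuming Nominatim's (south, north, west, east) bounding-box order (helper names are illustrative, not the Actor's actual code):

```python
import math

BBOX_TOLERANCE_DEG = 0.5  # slack for places right on a region border

def child_lng_offset(zoom: int) -> float:
    # v1.1: a clean quarter of the parent viewport's 360 / 2^z width.
    # The old 360 / 2^z * 0.75 was 4.2 degrees at z=6 -- wide enough
    # to push child centers into the sea.
    return 360 / 2 ** (zoom + 2)

def child_lat_offset(zoom: int, lat: float) -> float:
    # Latitude step scaled by cos(lat) for high-latitude correctness.
    return child_lng_offset(zoom) * math.cos(math.radians(lat))

def inside_bbox(lat: float, lng: float,
                bbox: tuple[float, float, float, float]) -> bool:
    south, north, west, east = bbox
    return (south - BBOX_TOLERANCE_DEG <= lat <= north + BBOX_TOLERANCE_DEG
            and west - BBOX_TOLERANCE_DEG <= lng <= east + BBOX_TOLERANCE_DEG)
```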
Verified on the same `karnataka, India` query → only Karnataka places returned; the Texas/Spain/Korea results are dropped.
## [1.0] — 2026-05-03
Production launch. First stable, public-ready release of the HTTP-only
Google Maps scraper. No browser, no Chromium — just curl_cffi with Chrome
TLS impersonation.
### Coverage
- Quad-tree viewport subdivision. When a viewport saturates (≥ 18 of the first 20 results are new), it splits into 4 child viewports at zoom+1 and recurses up to `maxSubdivisionDepth` (default 4 → up to 256 viewports per seed). This is what lets the Actor break Google's hard ~120-results-per-area limit and scrape entire metro areas (a sketch of the recursion follows this list).
- Multi-zoom expansion (`multiZoomDelta`) — search each seed at zoom−N..zoom+N for 30-70% more unique places.
- Multi-language passes (`additionalLanguages`) — re-search the same area with additional `hl=` codes to catch translations and regional categories.
- Geo composite resolver — `countryCode` / `state` / `county` / `city` / `postalCode` are joined into a single Nominatim query when `locationQuery` is empty.
- Direct inputs — `startUrls` (/maps/place/... URLs) and `placeIds` (raw ChIJ… IDs) bypass search entirely.
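A hedged sketch of the saturation-driven recursion (the 18-of-20 rule, depth cap, and quarter-width offsets come from the changelog; the data structures are illustrative):

```python
from dataclasses import dataclass

SATURATION_THRESHOLD = 18  # >= 18 of the first 20 results unseen -> split
SAMPLE_SIZE = 20

@dataclass
class Viewport:
    lat: float
    lng: float
    zoom: int
    depth: int

def maybe_subdivide(vp: Viewport, first_page_ids: list[str],
                    seen_ids: set[str], max_depth: int = 4) -> list[Viewport]:
    """Return 4 child viewports at zoom+1, or [] if not saturated."""
    new = sum(1 for pid in first_page_ids[:SAMPLE_SIZE] if pid not in seen_ids)
    if new < SATURATION_THRESHOLD or vp.depth >= max_depth:
        return []
    step = 360 / 2 ** (vp.zoom + 2)  # quarter of parent width (v1.1 math)
    # (cos(lat) scaling of the latitude step is omitted here for brevity.)
    return [
        Viewport(vp.lat + dy * step, vp.lng + dx * step, vp.zoom + 1, vp.depth + 1)
        for dy in (-1, 1) for dx in (-1, 1)
    ]
```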
### Output (~46 fields per place)
- Place identifiers (placeId, fid, cid, kgmid)
- Structured address (addressParts.{street, city, state, postalCode, neighborhood, countryCode})
- Center + entrance coordinates
- Contacts (phone, phoneUnformatted, website, websiteDisplay)
- Ratings (totalScore, reviewsCount)
- Opening status (openingHoursToday, currentStatus, nextOpensAt, permanentlyClosed, temporarilyClosed)
- Descriptions (subTitle, description, longDescription)
- Categories and owner info (ownerName, ownerId, claimThisBusinessUrl)
- placeTags (LGBTQ+ friendly, women-owned, …)
- Full additionalInfo amenities tree
- imagesCount, thumbnail, menu URL, plusCode, locatedIn, isAdvertisement
- Hotel block (hotelStars, hotelPrice, hotelCheckInDate/hotelCheckOutDate, hotelAmenities)
- Run metadata (scrapedAt, language, rank, searchPageUrl)
### Built-in filters (post-fetch, free)
- `placeMinimumStars` — two / twoAndHalf / … / fourAndHalf.
- `skipClosedPlaces` — drop permanently / temporarily closed places.
- `searchMatching` — all / only_includes / only_exact (title vs. search term).
- `categoryFilterWords` — keep only places with matching categories.
### Optional add-on: website-contacts enrichment
When `extractContactsFromWebsite` is enabled, the Actor visits each place's website and extracts emails (with deobfuscation of "foo (at) bar (dot) com"-style writing — see the sketch below), additionalPhones (from tel: links, normalized to E.164 and deduped against the main phone), and 8 social-media URL fields (facebooks, instagrams, linkedIns, twitters, youtubes, tiktoks, pinterests, whatsapps). A domain-level cache means chain stores share one fetch. An optional /contact page fallback kicks in when the homepage yields no email. Big global chains (McDonald's, Starbucks, Hilton, …) are skipped by default.
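A minimal sketch of the `(at)` / `(dot)` deobfuscation (the pattern is illustrative; the Actor's real one likely covers more variants):

```python
import re

# Matches "foo (at) bar (dot) com" and "foo [at] bar [dot] com" style text.
OBFUSCATED_EMAIL = re.compile(
    r"([A-Za-z0-9._%+-]+)\s*[\[(]\s*at\s*[\])]\s*"
    r"([A-Za-z0-9.-]+)\s*[\[(]\s*dot\s*[\])]\s*([A-Za-z]{2,})",
    re.IGNORECASE,
)

def deobfuscate_emails(text: str) -> list[str]:
    return [f"{user}@{domain}.{tld}"
            for user, domain, tld in OBFUSCATED_EMAIL.findall(text)]

assert deobfuscate_emails("write foo (at) bar (dot) com") == ["foo@bar.com"]
```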
Quality filters tuned against real-world false positives (the phone rule is sketched after this list):

- Reject CMS-glued phone numbers (e.g. 60957293003 — 11 digits without a + and not starting with the NANP 1 is junk).
- Reject the Facebook XML-namespace URL (/2008/fbml) and bare profile.php placeholders; keep only profile.php?id=NNN and vanity URLs.
- Reject the Pinterest conversion-tracking pixel (ct.pinterest.com/v3) and any social handle matching an API-version pattern (v1, v2, …).
- Reject .php / .html / .aspx "vanity URLs" — real social handles never carry file extensions.
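A sketch of the CMS-glued-number rule from the first bullet above (illustrative helper, not the Actor's code):

```python
import re

def is_junk_phone(raw: str) -> bool:
    # CMS artifacts look like "60957293003": 11 digits, no "+", and no
    # NANP leading "1" -- no real dial plan matches that shape.
    digits = re.sub(r"\D", "", raw)
    has_plus = raw.lstrip().startswith("+")
    return len(digits) == 11 and not has_plus and not digits.startswith("1")
```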
Performance: enrichment runs in parallel within one task via
asyncio.gather. Fetches use max_retries=1 (fail-fast) since retrying a
403/timeout from a third-party site rarely helps — better to skip and move
on. Real platform measurement: 20 places + full enrichment in ~21 s on
residential proxy.
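Roughly what the parallel enrichment looks like (hedged sketch; `fetch_contacts` stands in for the Actor's per-site fetch coroutine):

```python
import asyncio

async def enrich_batch(places: list[dict], fetch_contacts) -> list[dict]:
    # One gather per task; return_exceptions=True so a single 403/timeout
    # (max_retries=1, fail-fast) skips that site instead of sinking the batch.
    results = await asyncio.gather(
        *(fetch_contacts(p) for p in places), return_exceptions=True
    )
    return [r for r in results if not isinstance(r, BaseException)]
```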
### Reliability & speed
- Sticky residential-proxy sessions per viewport — all paginated XHRs of one logical search hit the same Apify residential exit IP.
- AsyncSession reuse across the pagination chain — a single TLS handshake per task, HTTP/2 multiplexed.
- Chrome TLS impersonation rotated per session (Chrome 120/123/124/131 profiles).
- EU consent flow bypassed via pre-set CONSENT=YES+cb and SOCS=… cookies on every Google request — no more consent.google.com redirects.
- BlockedError retry-with-fresh-IP — when a sticky session does get challenged, the pipeline mints a brand-new session_id (different proxy exit) and tries once more before giving up (see the sketch after this list).
- Bulletproof geocoding — Nominatim with a 6 s cap, falling back to Photon (komoot), which serves the same OpenStreetMap data on more reliable infrastructure. Results are cached in the KV store under _geocode_cache, so repeat runs are instant.
- Consent / captcha / 429 detection with intelligent backoff; fast-fail on deterministic 4xx (no retry storms).
- Resumable across Apify migrations — state is checkpointed every 30 s and on the PERSIST_STATE event.
- Bounded-concurrency worker pool with dynamic enqueue of subdivided child viewports.
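A hedged sketch of the retry-with-fresh-IP step (`BlockedError` and the session plumbing are the Actor's own; the names here are illustrative):

```python
import uuid

class BlockedError(Exception):
    """Raised when Google challenges the session (consent/captcha/429)."""

async def search_with_fresh_ip_retry(run_search):
    # The sticky session_id pins one residential exit IP for the whole
    # pagination chain of this viewport.
    try:
        return await run_search(session_id=uuid.uuid4().hex[:12])
    except BlockedError:
        # New session_id -> different proxy exit; exactly one more attempt.
        return await run_search(session_id=uuid.uuid4().hex[:12])
```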
### Pay-per-event monetization (3 events)
| Event | When it fires | Suggested price |
| --- | --- | --- |
| `apify-default-dataset-item` | every place pushed (Apify auto-charges) | $0.0010 ($1.00 / 1 000 results) |
| `place-with-emails` | website enrichment yielded ≥ 1 email | $0.0015 ($1.50 / 1 000) |
| `place-with-socials` | website enrichment yielded ≥ 1 social URL | $0.0005 ($0.50 / 1 000) |
`src/billing.py` detects whether PPE is actually active for the current run (via `Configuration.actor_pricing_info`), so local runs and free-tier runs skip charging entirely — no log noise, no overhead.
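A sketch of that gate, assuming the Apify Python SDK's `Actor.charge` and the `Configuration` attribute named above (the exact wiring in `src/billing.py` may differ):

```python
from apify import Actor, Configuration

async def charge_event(event_name: str) -> None:
    # actor_pricing_info is only populated when the run is billed per
    # event; on local and free-tier runs it is absent, so the charge
    # call (and its log noise) is skipped entirely.
    pricing = getattr(Configuration.get_global_configuration(),
                      "actor_pricing_info", None)
    if pricing is not None:
        await Actor.charge(event_name)
```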
### Apify Store metadata

- Input schema with grouped sections and select-list filters.