Google Patents Scraper — Search, Citations, Family Graph
Pricing
from $2.00 / 1,000 basic patent records
Scrape 120M+ patents from USPTO, EPO, WIPO, JPO, CN, KR + 100 offices. Six modes: search, details (claims/citations/family), byAssignee, byInventor, family graph, citationNetwork. Pay-per-event, no API key. Built for prior-art search, IP landscaping, and AI-agent use via Apify MCP.
Developer
Khadin Akbar
Google Patents Scraper — Patent Search, Citations, Family Graph, Assignee/Inventor Portfolios
Scrape 120M+ patents from USPTO, EPO, WIPO, JPO, China, Korea + 100 other patent offices via Google Patents — no API key required. Search by query, fetch full details, walk family graphs across jurisdictions, crawl N-hop citation networks, or pull every patent owned by a company or filed by an inventor.
Built for prior-art search, IP landscaping, competitive monitoring, citation-network analysis, and AI-agent consumption via Apify MCP.
What you get
| Field | Always | Deep only | Notes |
|---|---|---|---|
| `patentId` | ✅ | | Canonical, e.g. `US10000000B2` |
| `title`, `abstract`, `assignee`, `inventors[]` | ✅ | | |
| `filingDate`, `publicationDate`, `grantDate`, `priorityDate` | ✅ | | ISO 8601 |
| `status`, `type`, `countryCode`, `kindCode` | ✅ | | `GRANT`/`APPLICATION`, `PATENT`/`DESIGN`/`OTHER` |
| `googlePatentsUrl`, `pdfUrl` | ✅ | | Direct links |
| `cpc[]` | | ✅ | Cooperative Patent Classification codes |
| `claims[]`, `claimsCount` | | ✅ | Every claim, plain text |
| `citationsBackward[]`, `citationsForward[]` | | ✅ | Full citation lists (orig + family) |
| `familyId`, `familyMembers[]` | | ✅ | All jurisdictions of the same invention |
| `citationHop`, `citationSeed` | | `citationNetwork` mode | Distance from seed in the citation graph |
Pricing: $0.002 per basic record · $0.005 per deep record · $0.00005 actor start. No setup fees. No rental.
Modes
| Mode | When to use | Required input |
|---|---|---|
| `search` (default) | Free-text query with filters | `searchQuery` |
| `details` | You have known patent IDs and want full data | `patentIds[]` |
| `byAssignee` | All patents owned by a company | `assigneeName` |
| `byInventor` | All patents by a person | `inventorName` |
| `family` | All jurisdictions of one invention starting from any seed | `patentIds[]` |
| `citationNetwork` | N-hop forward/backward citation crawl | `patentIds[]`, `citationDirection`, `citationDepth` |
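The required inputs per mode can be checked client-side before a run is started. The following is a minimal sketch that assumes the field names listed in the Modes table; `REQUIRED_INPUTS` and `missing_inputs` are hypothetical helper names for illustration and are not part of the actor.

```python
# Required input fields per mode, copied from the Modes table.
REQUIRED_INPUTS = {
    "search": ["searchQuery"],
    "details": ["patentIds"],
    "byAssignee": ["assigneeName"],
    "byInventor": ["inventorName"],
    "family": ["patentIds"],
    "citationNetwork": ["patentIds", "citationDirection", "citationDepth"],
}


def missing_inputs(run_input: dict) -> list:
    """Return the required fields that are absent or empty for the chosen mode."""
    mode = run_input.get("mode", "search")  # search is the documented default
    if mode not in REQUIRED_INPUTS:
        raise ValueError(f"Unknown mode: {mode!r}")
    return [
        field
        for field in REQUIRED_INPUTS[mode]
        if run_input.get(field) in (None, "", [])
    ]


# A citationNetwork input that forgot the direction and depth.
print(missing_inputs({"mode": "citationNetwork", "patentIds": ["US10000000B2"]}))
```

Running a check like this locally avoids paying for an actor start on an input that cannot succeed.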
Quick examples
1. Prior-art search for autonomous-vehicle ML
```json
{
  "mode": "search",
  "searchQuery": "machine learning autonomous vehicle perception",
  "dateFrom": "2020-01-01",
  "status": "GRANT",
  "maxResults": 100,
  "enrichmentDepth": "basic"
}
```
Returns 100 granted patents, basic fields. Cost: ~$0.20.
2. Full detail for a known patent
```json
{
  "mode": "details",
  "patentIds": ["US10000000B2", "EP3000000B1"]
}
```
Returns 2 fully enriched records with claims, citations, family. Cost: ~$0.01.
3. Apple's machine-learning portfolio (US-only)
```json
{
  "mode": "byAssignee",
  "assigneeName": "Apple Inc.",
  "searchQuery": "machine learning",
  "countryCodes": ["US"],
  "maxResults": 500,
  "enrichmentDepth": "basic"
}
```
Cost: ~$1.00.
4. Geoffrey Hinton's deep-learning patents
```json
{
  "mode": "byInventor",
  "inventorName": "Geoffrey Hinton",
  "maxResults": 200,
  "enrichmentDepth": "deep"
}
```
Cost: ~$1.00.
5. Worldwide family of one invention
```json
{
  "mode": "family",
  "patentIds": ["US10000000B2"]
}
```
Returns the seed plus every jurisdictional family member (JP, EP, WO, etc.).
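Once a family record is returned, its `familyMembers` list can be grouped by jurisdiction. This is a small sketch that assumes every publication number starts with a two-letter office code, as in the IDs shown in this README; `group_by_jurisdiction` is a hypothetical helper, not something the actor provides.

```python
from collections import defaultdict


def group_by_jurisdiction(patent_ids):
    """Group publication numbers by their two-letter office prefix (assumed)."""
    groups = defaultdict(list)
    for pid in patent_ids:
        groups[pid[:2].upper()].append(pid)
    return dict(groups)


# Family members taken from the sample output record in this README.
family = ["JP2018510362A", "JP6570658B2", "WO2016144528A1"]
print(group_by_jurisdiction(family))
# {'JP': ['JP2018510362A', 'JP6570658B2'], 'WO': ['WO2016144528A1']}
```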
6. 2-hop forward citation network
```json
{
  "mode": "citationNetwork",
  "patentIds": ["US10000000B2"],
  "citationDirection": "forward",
  "citationDepth": 2,
  "maxResults": 500
}
```
Crawls every patent that cites the seed, then every patent that cites those — capped at 500 records.
Pricing
Pay-per-event. Three events, capped:
| Event | Price | Triggered by |
|---|---|---|
| `apify-actor-start` | $0.00005 | Container start (auto, per 1 GB RAM) |
| `patent-found` | $0.002 | Each record returned with basic fields |
| `patent-detailed` | $0.005 | Each record returned with deep fields (claims/citations/family/CPC) |
The `details`, `family`, and `citationNetwork` modes always charge as `patent-detailed`, because a deep fetch is required to extract citations and family. The `search`, `byAssignee`, and `byInventor` modes charge `patent-found` by default and switch to `patent-detailed` only when `enrichmentDepth` is set to `"deep"`.
Typical run costs:
- Prior-art search, 50 patents, basic = $0.10
- Landscape sweep, 1,000 patents, basic = $2.00
- Deep enrichment, 100 records = $0.50
- 3-hop citation network, ~500 patents = $2.50
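The figures above follow directly from the per-event prices. As a rough sketch, the estimate can be computed as below; it assumes a single `apify-actor-start` event per run (the real start charge scales with memory) and that every requested record is actually returned. `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Event prices in USD, taken from the pricing table.
PRICE_START = Decimal("0.00005")  # apify-actor-start (assumed: charged once)
PRICE_BASIC = Decimal("0.002")    # patent-found
PRICE_DEEP = Decimal("0.005")     # patent-detailed

# Modes that always charge the deep price, per the pricing notes.
ALWAYS_DEEP_MODES = {"details", "family", "citationNetwork"}


def estimate_cost(mode, records, enrichment_depth="basic"):
    """Estimate the run cost in USD for `records` returned items."""
    deep = mode in ALWAYS_DEEP_MODES or enrichment_depth == "deep"
    per_record = PRICE_DEEP if deep else PRICE_BASIC
    return PRICE_START + per_record * records


print(estimate_cost("search", 1000))          # 2.00005
print(estimate_cost("citationNetwork", 500))  # 2.50005
```

`Decimal` is used instead of floats so that sub-cent prices add up exactly.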
How it works
The actor hits Google Patents' public XHR query endpoint for search and the standard detail page for full records. No API key. Residential proxies are enabled by default — Google throttles datacenter IPs and returns HTTP 500 "Sorry" on filtered queries, so this is mandatory for reliable runs.
Reliability features built in:
- Canary URL check at run start (US10000000B2)
- Session pool with same proxy + same UA + same cookie jar across requests
- Retry with backoff on 429/5xx + per-status logic
- Circuit breaker — aborts the run if the failure rate exceeds 50% over the last 20 requests, to preserve your proxy budget
- Multi-fallback selectors for every extracted field (so a Google layout change doesn't break the whole run)
- Bot-block detection — recognizes the "Sorry" page and rotates session
- Graceful degradation — when one selector fails, others fill in
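To illustrate the circuit-breaker rule in the list above (abort when more than 50% of the last 20 requests failed), here is a minimal sliding-window sketch. It is not the actor's actual implementation; the `CircuitBreaker` class and the choice to wait for a full window before tripping are assumptions made for the example.

```python
from collections import deque


class CircuitBreaker:
    """Trips when the failure rate over the last `window` requests exceeds `threshold`."""

    def __init__(self, window=20, threshold=0.5):
        self.window = window
        self.threshold = threshold
        # Each entry is True for a failed request, False for a success.
        # maxlen makes the deque drop the oldest entry automatically.
        self.results = deque(maxlen=window)

    def record(self, failed):
        self.results.append(bool(failed))

    def is_open(self):
        # Assumption: do not judge until a full window has been observed.
        if len(self.results) < self.window:
            return False
        return sum(self.results) / len(self.results) > self.threshold


breaker = CircuitBreaker()
for _ in range(9):
    breaker.record(failed=False)
for _ in range(11):
    breaker.record(failed=True)
print(breaker.is_open())  # True: 11 of the last 20 requests failed
```

A caller would check `is_open()` after each request and stop the crawl once it returns `True`, which is what preserves the proxy budget.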
MCP / AI-agent usage
The actor is exposed as khadinakbar--google-patents-scraper in Apify MCP. Drop it into Claude Desktop, Cursor, or any MCP-compatible client and ask things like:
- "Find prior art for US10000000B2 — give me 50 backward citations with claims."
- "What patents has OpenAI filed since 2022?"
- "List every jurisdiction of Apple's M1 chip patent (start from US10916298B2)."
- "Build a 2-hop forward citation network from Tesla's autopilot patent."
The agent picks the right mode, fills the right input, and walks away with structured JSON.
Code examples
JavaScript (Apify Client)
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

const run = await client.actor('khadinakbar/google-patents-scraper').call({
    mode: 'search',
    searchQuery: '"large language model" training',
    dateFrom: '2023-01-01',
    maxResults: 100,
    enrichmentDepth: 'basic',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Found ${items.length} patents`);
```
Python (apify-client)
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run = client.actor("khadinakbar/google-patents-scraper").call(run_input={
    "mode": "byAssignee",
    "assigneeName": "OpenAI",
    "maxResults": 200,
    "enrichmentDepth": "deep",
})

items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
print(f"OpenAI has {len(items)} patents in the dataset")
```
Output shape
Each item is flat (≤500 tokens, agent-friendly):
```json
{
  "patentId": "US10000000B2",
  "title": "Coherent LADAR using intra-pixel quadrature detection",
  "abstract": "A frequency modulated (coherent) laser detection and ranging system...",
  "filingDate": "2015-03-10",
  "publicationDate": "2018-06-19",
  "grantDate": "2018-06-19",
  "priorityDate": "2015-03-10",
  "status": "GRANT",
  "type": "PATENT",
  "countryCode": "US",
  "kindCode": "B2",
  "assignee": "Raytheon Co",
  "inventors": ["Joseph Marron"],
  "cpc": ["G01S7/483", "G01S7/4911", "G01S17/32", "G01S17/89"],
  "claimsCount": 20,
  "claims": ["1. A laser detection and ranging (LADAR) system, comprising...", "..."],
  "citationsBackward": ["US5093563A", "US5751830A", "..."],
  "citationsForward": ["US10845468B2", "US20180172806A1", "..."],
  "familyId": "55456961",
  "familyMembers": ["JP2018510362A", "JP6570658B2", "WO2016144528A1", "..."],
  "pdfUrl": "https://patentimages.storage.googleapis.com/c0/d5/f7/86ad5b42759506/US10000000.pdf",
  "googlePatentsUrl": "https://patents.google.com/patent/US10000000B2/en",
  "mode": "details",
  "scrapedAt": "2026-05-03T19:22:00.000Z"
}
```
FAQ
Q: Does it cover non-US patents?
Yes. Google Patents indexes 120M+ docs from 100+ offices: USPTO (US), EPO (EP), WIPO (WO), JPO (JP), CNIPA (CN), KIPO (KR), UKIPO (GB), DPMA (DE), INPI (FR), CIPO (CA), IP Australia (AU) and many more.
Q: Can I get full claims and citations from search results?
Set enrichmentDepth: "deep" on search/byAssignee/byInventor, and the actor will fetch each patent's detail page (one extra HTTP per record). For known IDs, just use mode: "details" — always returns the full record.
Q: How does the citation-network mode work?
You pass one or more seed patentIds. The actor fetches each seed's detail page, follows the requested citation direction (backward/forward/both), and queues each cited patent for the next hop. Depth caps the chain (1, 2, or 3 hops). maxResults caps the total nodes — set it before launching a 3-hop crawl on a heavily-cited patent.
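The crawl described in this answer is essentially a breadth-first search with a depth limit and a node cap. The sketch below illustrates that idea and is not the actor's code: `get_citations` is a hypothetical callback standing in for the detail-page fetch, and counting seeds toward the `max_results` cap is an assumption.

```python
from collections import deque


def crawl_citations(seeds, get_citations, depth, max_results):
    """Breadth-first crawl; returns {patentId: hop distance from the nearest seed}."""
    hops = {}
    queue = deque()
    for seed in seeds:
        if seed not in hops and len(hops) < max_results:
            hops[seed] = 0
            queue.append(seed)
    while queue:
        current = queue.popleft()
        if hops[current] >= depth:
            continue  # do not expand nodes already at the requested depth
        for cited in get_citations(current):
            if cited in hops:
                continue  # already visited via a shorter or equal path
            if len(hops) >= max_results:
                return hops  # node cap reached, stop the crawl
            hops[cited] = hops[current] + 1
            queue.append(cited)
    return hops


# Toy citation graph with made-up IDs, used only to exercise the function.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"]}
print(crawl_citations(["A"], lambda pid: toy_graph.get(pid, []), depth=2, max_results=500))
# {'A': 0, 'B': 1, 'C': 1, 'D': 2, 'E': 2}
```

The hop distance stored per node corresponds to what the output calls `citationHop`. On a heavily cited seed the queue grows quickly with each hop, which is why the cap matters.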
Q: My run returned 0 patents. What's wrong?
- Verify `searchQuery` syntax: try the same query at patents.google.com first
- Date range too narrow? Remove `dateFrom`/`dateTo`
- Country filter excludes everything? Drop `countryCodes`
- For `details`/`family`/`citationNetwork`: confirm `patentIds` are valid (Google Patents URL format, e.g. `US10000000B2`, no spaces or punctuation)
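A quick local check can catch malformed IDs before a run. The pattern below is a deliberately simplified assumption (two-letter office code, digits, then a kind code of one letter plus an optional digit) that covers the IDs shown in this README. It will reject some valid publication numbers, such as design or reissue patents, so treat it as a first-pass filter only. `invalid_patent_ids` is a hypothetical helper.

```python
import re

# Simplified pattern: office code, serial digits, kind code (letter + optional digit).
PATENT_ID_RE = re.compile(r"[A-Z]{2}\d+[A-Z]\d?")


def invalid_patent_ids(patent_ids):
    """Return the IDs that do not match the simplified pattern."""
    return [pid for pid in patent_ids if not PATENT_ID_RE.fullmatch(pid)]


candidates = ["US10000000B2", "US 10,000,000 B2", "EP3000000B1", "US-10000000-B2"]
print(invalid_patent_ids(candidates))
# ['US 10,000,000 B2', 'US-10000000-B2']
```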
Q: How do I avoid Google blocking my run?
Keep the default residential proxy. Datacenter IPs get the "Sorry" page on filtered queries. The actor automatically rotates sessions on 429/403 and includes a circuit breaker.
Q: What's the difference between priorityDate and filingDate?
priorityDate is the earliest filing date the patent claims priority to (typically the original home-country application). filingDate is when this specific application was filed. The two differ whenever an application claims priority from an earlier one, which is common for international filings.
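To see the distinction on real records, the gap between the two dates can be computed from the `priorityDate` and `filingDate` output fields. This is a small sketch that assumes both fields are present as `YYYY-MM-DD` strings, as in the sample output; `priority_gap_days` is a hypothetical helper.

```python
from datetime import date


def priority_gap_days(item):
    """Days between the priority date and the filing date of one record."""
    priority = date.fromisoformat(item["priorityDate"])
    filing = date.fromisoformat(item["filingDate"])
    return (filing - priority).days


# Dates from the sample US10000000B2 record: filed with no earlier priority claim.
sample = {"priorityDate": "2015-03-10", "filingDate": "2015-03-10"}
print(priority_gap_days(sample))  # 0
```

A gap of zero suggests a first filing; a positive gap indicates the application claims priority from an earlier one.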
Q: Is this legal?
Patent data is public-record data. Google Patents indexes published patents from public patent offices and Google's terms allow accessing this content. We respect rate limits and never bypass authentication walls. You are responsible for compliance with Google's Terms of Service and your local laws. This actor is provided for legitimate research, IP analysis, and educational use.
Pair with these actors
This actor is one of 50+ research and lead-generation actors in the Khadin Akbar portfolio:
- scrape-google-serp: Google SERP results for keyword research
- google-news-scraper: track news mentions of patents
- google-trends-scraper: interest trends for technology areas
- ai-search-brand-monitor: monitor brand/IP mentions across AI surfaces
- linkedin-profile-email-scraper: find inventor / counsel contacts
Changelog
- 1.0 (2026-05-03) — Initial release: search, details, byAssignee, byInventor, family, citationNetwork. Pay-per-event, residential proxy default, canary check, circuit breaker.