Pricing

from $5.99 / 1,000 results

Google Patents Scraper

🔎 Google Patents Scraper extracts structured patent data from Google Patents — titles, abstracts, inventors, assignees, CPC/IPC, citations, claims, dates & PDFs. ⚡ Fast, reliable, and bulk-ready for IP research, competitive intel & R&D landscaping. 📊 CSV/JSON/API.

Developer

Scrapier

Actor stats

Last modified

2 months ago

Google Patents Scraper

The Google Patents Scraper extracts structured patent records from Google Patents at scale: titles, abstracts, inventors, assignees, dates, links, and PDF links, plus optional deep fields such as claims, citations, CPC/IPC classifications, and patent family. It replaces repetitive, error-prone copy-paste by querying Google’s xhr/query and xhr/result endpoints, which suits marketers, developers, data analysts, and researchers who need a Google Patents crawler for bulk runs. With CSV/JSON output and Apify API access, it supports workflows ranging from quick lookups to large datasets for competitive intelligence, IP monitoring, and R&D landscaping.

What data / output can you get?

Data field	Description	Example value
patentNumber	Publication number parsed from search results	"US12438891B1"
title	Cleaned patent title text	"Systems and methods for machine learning–based inference"
abstract	Cleaned snippet/abstract from results	"A system includes a model trained to..."
inventors	List of inventor names	["Jane Doe", "John Smith"]
assignee	Assignee/owner from results	"Example Corp."
filingDate	Filing date (YYYY-MM-DD)	"2021-06-01"
publicationDate	Publication date (YYYY-MM-DD)	"2023-09-14"
grantDate	Grant date when available (YYYY-MM-DD)	"2024-05-10"
url	Canonical Google Patents URL	"https://patents.google.com/patent/US12438891B1"
pdfUrl	Direct link to the patent PDF on Google’s storage	"https://patentimages.storage.googleapis.com/US12438891B1.pdf"
scrapedAt	ISO-8601 timestamp of extraction (UTC, Z)	"2026-04-27T05:27:42Z"
classifications.cpc	CPC codes (when classifications enrichment is enabled)	["G06N20/00"]
classifications.ipc	IPC codes (derived when present; may be empty)	["G06N 20/00"]
citations.citedBy	Forward citations (when citations enrichment is enabled)	["US2020123456A1", "EP3456789B1"]
citations.references	Backward citations (when citations enrichment is enabled)	["US9876543B2"]

Bonus/optional fields when enrichment is toggled on:

fullText.description, fullText.claims — adds long-form description and claims text.
patentFamily — list of related publications in the same family.

Exports are available via the Apify Dataset in CSV or JSON, and accessible through the Apify API for automations and pipelines.
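
To make the dotted field names in the table above concrete (for example classifications.cpc), here is a minimal Python sketch that flattens one nested record into CSV columns locally. The flatten and to_csv helpers are hypothetical names, and joining list values with "; " is an assumption; Apify's own CSV export may lay out nested fields differently.

```python
import csv
import io


def flatten(record, parent=""):
    """Flatten nested dicts into dotted keys; join list values with '; '."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        elif isinstance(value, list):
            flat[name] = "; ".join(str(v) for v in value)
        else:
            flat[name] = value
    return flat


def to_csv(records):
    """Render records as CSV text, tolerating optional fields that are absent."""
    rows = [flatten(r) for r in records]
    columns = []  # union of column names in first-seen order
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample values are taken from the field table above.
sample = {
    "patentNumber": "US12438891B1",
    "inventors": ["Jane Doe", "John Smith"],
    "classifications": {"cpc": ["G06N20/00"], "ipc": ["G06N 20/00"]},
}
print(to_csv([sample]))
```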

Key features

⚡️ Smart proxy ladder & resilience
Automatically tries direct requests first, then falls back to Apify datacenter proxies and finally residential proxies (up to 3 attempts). Successful responses “stick” to the working proxy tier for subsequent requests.
🧠 OR-merged query builder
Combines keywords, publication numbers, and q= terms from search URLs into a single OR-union so you can scrape Google Patents broadly without missing results when mixing inputs.
📦 Enrichment on demand
Toggle includeFullText, includeClaims, includeCitations, includePatentFamily, and includeClassifications to tailor each run. Get exactly the depth you need for IP research and Google Patents text mining.
📈 Bulk-ready pagination
Handles search pagination while respecting your maxResults cap — perfect for Google Patents bulk download workflows and building large datasets.
💾 CSV/JSON and API-friendly
Export data to CSV or JSON and access programmatically via the Apify API — ideal for developers and teams building ingestion pipelines or a Google Patents API alternative.
📎 PDF links included
Each record includes pdfUrl, enabling Google Patents PDF downloader workflows in your own system.
💻 Python-powered for reliability
Built on Python (aiohttp + Apify SDK), with concurrent detail fetching for efficient Google Patents data extraction.
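
The concurrent detail fetching mentioned above could look roughly like the following asyncio sketch. It is not the actor's code: enrich_all, the fetch_detail callback, and the concurrency limit of 5 are assumptions made for illustration. A failed detail fetch keeps the row with whatever the search result already provided.

```python
import asyncio


async def enrich_all(rows, fetch_detail, concurrency=5):
    """Fetch details for all rows in parallel, bounded by a semaphore."""
    semaphore = asyncio.Semaphore(concurrency)

    async def enrich(row):
        async with semaphore:
            try:
                detail = await fetch_detail(row["patentNumber"])
            except Exception:
                detail = {}  # detail page unavailable: keep the base row
        return {**row, **detail}

    results = []
    # as_completed yields each row as soon as it is ready, not in input order.
    for task in asyncio.as_completed([enrich(r) for r in rows]):
        results.append(await task)
    return results


async def fake_detail(number):
    """Stand-in for a real detail request (hypothetical)."""
    if number == "B":
        raise RuntimeError("blocked")
    return {"claims": f"claims for {number}"}


rows = [{"patentNumber": "A"}, {"patentNumber": "B"}]
print(asyncio.run(enrich_all(rows, fake_detail)))
```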

How to use Google Patents Scraper - step by step

1. Sign in to your Apify account.
2. Open the “Google Patents Scraper” actor.
3. Add input data:
   - Use urls for patent page links, search links with q=, or plain keyword lines (plain lines are treated as search phrases).
   - You can also set searchQuery and/or patentNumbers; all inputs are OR-merged so you get a broad combined result.
4. Narrow results (optional): set assignee, inventor, country, dateFrom/dateTo (absolute or relative like “30 days”), and patentType to filter your query.
5. Set limits and enrichment: choose maxResults (use 0 for no cap), and toggle includeFullText, includeClaims, includeCitations, includePatentFamily, includeClassifications.
6. Configure proxyConfiguration only if your workspace requires Apify Proxy for large or repeated runs.
7. Run the actor. It paginates results and logs progress. If enrichment is enabled, details load in parallel and rows appear in the dataset as soon as they’re ready.
8. Download your dataset as CSV or JSON or connect via the Apify API for downstream processing and Google Patents CSV export workflows.

Pro Tip: Automate end-to-end pipelines by pulling results via the Apify API into your Python scripts for further analysis, modeling, or Google Patents text mining.
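
As a rough illustration of the Pro Tip, the sketch below builds the dataset items URL of the Apify API v2 and reads the items with the Python standard library. The dataset ID and token are placeholders you would take from your own run and account. fetch_items is a hypothetical helper and is only defined here, not called, so running the snippet does not touch the network.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Optional

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json",
                      limit: Optional[int] = None) -> str:
    """Build the URL for reading items from an Apify dataset."""
    params = {"format": fmt, "clean": "true"}  # clean=true skips empty items
    if limit is not None:
        params["limit"] = str(limit)
    query = urllib.parse.urlencode(params)
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def fetch_items(dataset_id: str, token: str,
                limit: Optional[int] = None) -> List[dict]:
    """Download items as JSON; the token is sent in the Authorization header."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id, "json", limit),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# "YOUR_DATASET_ID" is a placeholder, not a real dataset.
print(dataset_items_url("YOUR_DATASET_ID", fmt="csv", limit=100))
```

Switching fmt between "json" and "csv" mirrors the two export formats described above.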

Use cases

Use case	Description
Competitive intelligence for R&D	Analyze assignees, claims, and citations to map competitor focus areas and technology trajectories.
Patent landscaping for IP teams	Build comprehensive datasets by country, date range, and document kind to identify white spaces and clusters.
Academic research & text mining	Enable corpus creation with abstracts, optional descriptions, and claims for NLP pipelines.
Rapid patent metadata extraction	Extract titles, inventors, dates, and links at scale for dashboards and reporting.
Google Patents search results scraper	Turn keyword searches into structured datasets without manual export.
Dataset creation & bulk download	Use maxResults and CSV/JSON export for Google Patents dataset download workflows.
PDF library building	Leverage pdfUrl to fetch and archive PDFs in your own storage, tied to each patent record.

Why choose Google Patents Scraper?

The Google Patents Scraper is built for precision, scale, and real-world reliability.

🎯 Structured accuracy: Normalizes Google’s xhr/query results into clean fields for analytics and modeling.
🔀 Smarter queries: OR-merge keywords, search URLs, and publication numbers to avoid missed matches.
🚀 Scalable runs: Handles pagination and parallel detail loading for large batches.
🛠️ Developer access: Export CSV/JSON and integrate via the Apify API — great for Google Patents scraper Python workflows.
🛡️ Robust connectivity: Direct-first requests with automatic fallback to datacenter and residential proxies.
✅ Public-data focus: Designed to extract publicly available records from patents.google.com responsibly.
💸 Operational efficiency: Reduce manual effort and streamline Google Patents data extraction across teams.

In short: a production-grade Google Patents crawler that outperforms brittle, manual, or extension-based alternatives.

Is it legal / ethical to use Google Patents Scraper?

Yes — when used responsibly. This actor extracts public data from patents.google.com and does not access private or authenticated content.

Guidelines for compliant use:

Use only publicly available information and respect Google’s terms of service.
Ensure your usage aligns with applicable data protection laws (e.g., GDPR, CCPA).
Avoid collecting or processing personal data beyond what is publicly provided in patent records.
Validate your specific use case with your legal team, especially for redistribution or commercial reuse.

Input parameters & output format

Example JSON input

{
  "urls": [
    "https://patents.google.com/patent/US12438891B1",
    "machine learning",
    "https://patents.google.com/?q=graph+neural+network"
  ],
  "searchQuery": "computer vision",
  "patentNumbers": ["EP1234567B1", "WO2020123456A1"],
  "assignee": "Example Corp",
  "inventor": "Jane Doe",
  "country": "US",
  "dateFrom": "6 months",
  "dateTo": "",
  "patentType": "ANY",
  "maxResults": 25,
  "includeFullText": false,
  "includeClaims": true,
  "includeCitations": true,
  "includePatentFamily": true,
  "includeClassifications": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

Parameters

urls (array)
Description: One entry per line: patent URLs, search URLs (q=), or plain keywords. Plain lines are treated as search phrases. All entries are OR-merged with searchQuery and patentNumbers.
Default: n/a (prefill shown in UI)
Required: No
searchQuery (string)
Description: Main keyword search. OR-merged with any q= from search URLs and with publication numbers from patent URLs.
Default: ""
Required: No
patentNumbers (array)
Description: Specific patent IDs. Combined with keywords using OR.
Default: []
Required: No
assignee (string)
Description: Focus on patents owned by a particular organization.
Default: ""
Required: No
inventor (string)
Description: Find patents listing a specific inventor.
Default: ""
Required: No
country (string)
Description: Patent office / region filter (e.g., US, EP, WO) or ANY to search everywhere.
Default: "ANY"
Required: No
dateFrom (string)
Description: Published after — absolute (YYYY-MM-DD) or relative (e.g., 30 days, 6 months).
Default: ""
Required: No
dateTo (string)
Description: Published before — absolute or relative.
Default: ""
Required: No
patentType (string)
Description: Limit to grants, applications, or designs — or ANY for all types.
Default: "ANY"
Required: No
maxResults (integer)
Description: Cap how many patents to collect; use 0 for no limit.
Default: 10
Required: No
includeFullText (boolean)
Description: Adds the full written description (large text).
Default: false
Required: No
includeClaims (boolean)
Description: Adds the patent claims text.
Default: true
Required: No
includeCitations (boolean)
Description: Adds backward and forward citation lists where available.
Default: true
Required: No
includePatentFamily (boolean)
Description: Adds related publications in the same family.
Default: true
Required: No
includeClassifications (boolean)
Description: Adds CPC/IPC classification codes.
Default: true
Required: No
proxyConfiguration (object)
Description: Optional Apify Proxy settings. Leave it unset if you don’t need a proxy.
Default: {} (UI may prefill useApifyProxy)
Required: No
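
The dateFrom and dateTo parameters accept either an absolute date or a relative span. One way such values might be resolved is sketched below. This is not the actor's parser: resolve_date is a hypothetical helper, the week and year units are additions beyond the documented "days" and "months", and treating a month as 30 days is an approximation chosen for simplicity.

```python
import re
from datetime import date, timedelta
from typing import Optional

_RELATIVE = re.compile(r"^(\d+)\s*(day|week|month|year)s?$", re.IGNORECASE)
# Approximate unit lengths in days (an assumption, not documented behavior).
_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def resolve_date(value: str, today: date) -> Optional[date]:
    """Turn '30 days', '6 months', or 'YYYY-MM-DD' into a date; '' means no bound."""
    value = value.strip()
    if not value:
        return None
    match = _RELATIVE.match(value)
    if match:
        amount = int(match.group(1))
        unit = match.group(2).lower()
        return today - timedelta(days=amount * _DAYS[unit])
    return date.fromisoformat(value)  # raises ValueError on malformed input


today = date(2026, 4, 27)
for raw in ["30 days", "6 months", "2021-06-01", ""]:
    print(repr(raw), "->", resolve_date(raw, today))
```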

Note: At least one of urls, searchQuery, or patentNumbers must yield a query; otherwise the run exits with a helpful log message.
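
The OR-merge and the empty-query check can be pictured with the sketch below. It is an illustration under assumptions: collect_terms and build_query are hypothetical names, and joining parenthesized terms with " OR " is a guess at the shape of the union, not the query syntax the actor actually sends.

```python
from urllib.parse import parse_qs, urlparse


def collect_terms(urls=(), search_query="", patent_numbers=()):
    """Gather de-duplicated search terms from all three input fields."""
    terms = []

    def add(term):
        term = term.strip()
        if term and term not in terms:
            terms.append(term)

    for entry in urls:
        entry = entry.strip()
        if entry.startswith(("http://", "https://")):
            parsed = urlparse(entry)
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] == "patent":
                add(parts[1])  # publication number from a patent page URL
            for q in parse_qs(parsed.query).get("q", []):
                add(q)  # q= term from a search URL
        else:
            add(entry)  # plain line: treated as a search phrase
    add(search_query)
    for number in patent_numbers:
        add(number)
    return terms


def build_query(terms):
    """Join terms into one OR-union; refuse to run on an empty query."""
    if not terms:
        raise ValueError("no query: set urls, searchQuery, or patentNumbers")
    return " OR ".join(f"({t})" for t in terms)


# Values taken from the example JSON input above.
terms = collect_terms(
    urls=[
        "https://patents.google.com/patent/US12438891B1",
        "machine learning",
        "https://patents.google.com/?q=graph+neural+network",
    ],
    search_query="computer vision",
    patent_numbers=["EP1234567B1", "WO2020123456A1"],
)
print(build_query(terms))
```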

Example JSON output

{
  "patentNumber": "US12438891B1",
  "title": "Systems and methods for machine learning–based inference",
  "abstract": "A system includes a model trained to...",
  "inventors": ["Jane Doe", "John Smith"],
  "assignee": "Example Corp.",
  "filingDate": "2021-06-01",
  "publicationDate": "2023-09-14",
  "grantDate": "2024-05-10",
  "classifications": {
    "cpc": ["G06N20/00", "G06F17/18"],
    "ipc": ["G06N 20/00", "G06F 17/18"]
  },
  "url": "https://patents.google.com/patent/US12438891B1",
  "pdfUrl": "https://patentimages.storage.googleapis.com/US12438891B1.pdf",
  "scrapedAt": "2026-04-27T05:27:42Z",
  "citations": {
    "citedBy": ["EP3456789B1"],
    "references": ["US9876543B2", "WO2020123456A1"]
  },
  "fullText": {
    "claims": "1. A method comprising ...",
    "description": "In some implementations, the system comprises ..."
  },
  "patentFamily": ["EP1234567B1", "WO2020123456A1"]
}

Notes:

Optional fields (fullText, patentFamily) appear only when the corresponding include* flags are enabled.
Some fields may be empty strings or empty arrays when not available from the source or when detail pages can’t be retrieved (the item is still saved with available data).
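
Because optional fields can be absent and other fields can be empty, downstream code benefits from reading items defensively. The sketch below shows one possible approach with a dataclass. PatentRecord and its attribute names are hypothetical, and only a subset of the output fields is modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PatentRecord:
    patent_number: str
    title: str = ""
    grant_date: Optional[str] = None    # None when no grant date is available
    inventors: List[str] = field(default_factory=list)
    cpc: List[str] = field(default_factory=list)
    cited_by: List[str] = field(default_factory=list)
    claims: Optional[str] = None        # None when fullText is absent or empty
    family: Optional[List[str]] = None  # None when patentFamily is absent

    @classmethod
    def from_item(cls, item: dict) -> "PatentRecord":
        """Build a record from one dataset item, tolerating missing keys."""
        classifications = item.get("classifications") or {}
        citations = item.get("citations") or {}
        full_text = item.get("fullText") or {}
        return cls(
            patent_number=item.get("patentNumber", ""),
            title=item.get("title", ""),
            grant_date=item.get("grantDate") or None,  # treat "" as missing
            inventors=list(item.get("inventors") or []),
            cpc=list(classifications.get("cpc") or []),
            cited_by=list(citations.get("citedBy") or []),
            claims=full_text.get("claims") or None,
            family=item.get("patentFamily"),
        )


# An item with enrichment disabled and an empty grant date.
minimal = PatentRecord.from_item({"patentNumber": "US12438891B1", "grantDate": ""})
print(minimal)
```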

FAQ

Do I need to log in or provide cookies to scrape Google Patents?

No. The actor works with publicly available endpoints on patents.google.com and does not require login or cookies. It uses direct requests first and only falls back to proxies if needed.

Can I use this with Python or an API?

Yes. Results are stored in an Apify Dataset, which you can access via the Apify API. This makes it easy to integrate into Google Patents scraper Python workflows or downstream automation.

How many patents can I scrape in one run?

You control this with maxResults. Set a specific number to cap output or use 0 for no limit. The actor paginates results automatically and enriches details in parallel when requested.
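
The interaction between pagination and maxResults can be sketched as follows. This is a simplified model, not the actor's code: paginate and the fetch_page callback are hypothetical, and the page size of 10 in the demo is arbitrary.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int], List[dict]],
             max_results: int = 10) -> Iterator[dict]:
    """Yield rows page by page; max_results == 0 means no cap."""
    yielded = 0
    page = 0
    while True:
        rows = fetch_page(page)
        if not rows:
            return  # an empty page signals the end of the results
        for row in rows:
            yield row
            yielded += 1
            if max_results and yielded >= max_results:
                return  # cap reached, stop before requesting more pages
        page += 1


# Demo source: 30 fake results served in pages of 10.
data = [{"n": i} for i in range(30)]


def fake_page(page: int) -> List[dict]:
    return data[page * 10:(page + 1) * 10]


print(len(list(paginate(fake_page, max_results=25))))
print(len(list(paginate(fake_page, max_results=0))))
```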

Can I export to CSV or JSON?

Yes. You can export your dataset as CSV or JSON directly from Apify. This supports Google Patents CSV export and broader Google Patents dataset download use cases.

Does it include patent PDFs?

Each record includes a pdfUrl pointing to the Google patent images storage. You can use this link to download PDFs externally as part of a Google Patents PDF downloader workflow.
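
A minimal sketch of such an external download step, assuming you archive one file per patentNumber. pdf_target and download_pdf are hypothetical helpers; download_pdf is defined but not called here, so the snippet makes no network request.

```python
import re
import urllib.request
from pathlib import Path
from typing import Optional


def pdf_target(record: dict, root: Path) -> Optional[Path]:
    """Pick a local path keyed by patentNumber; None if either field is empty."""
    url = record.get("pdfUrl") or ""
    number = record.get("patentNumber") or ""
    if not url or not number:
        return None
    safe = re.sub(r"[^A-Za-z0-9]", "_", number)  # keep the file name portable
    return root / f"{safe}.pdf"


def download_pdf(record: dict, root: Path) -> Optional[Path]:
    """Download the PDF unless an earlier run already archived it."""
    target = pdf_target(record, root)
    if target is None:
        return None
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(record["pdfUrl"], timeout=60) as response:
            target.write_bytes(response.read())
    return target


record = {
    "patentNumber": "US12438891B1",
    "pdfUrl": "https://patentimages.storage.googleapis.com/US12438891B1.pdf",
}
print(pdf_target(record, Path("pdfs")))
```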

Can it extract claims, citations, classifications, and family?

Yes. Toggle includeClaims, includeCitations, includeClassifications, and includePatentFamily to add these fields to your output. You can also enable includeFullText to add the long-form description.

What filters are available?

You can filter by assignee, inventor, country (office/region), dateFrom/dateTo (absolute or relative), and patentType (grant, application, design, or ANY). These filters narrow the combined query; the query inputs themselves (urls, searchQuery, and patentNumbers) are OR-merged for broad coverage.

How does it avoid getting blocked?

The actor tries direct requests first. On block or failure, it automatically climbs a proxy ladder: datacenter proxies (e.g., SHADER) and then residential proxies with up to 3 retries. After a successful proxy response, it “sticks” to the working tier.
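
The ladder described above can be modeled in a few lines. This sketch is an interpretation, not the actor's implementation: TieredFetcher and the fetch callback are hypothetical, and it assumes one attempt per tier, which is only one reading of the retry limit.

```python
from typing import Callable, Optional

# Tier order taken from the description: direct, then datacenter, then residential.
TIERS = ["direct", "datacenter", "residential"]


class TieredFetcher:
    def __init__(self, fetch: Callable[[str, str], Optional[str]]):
        # fetch(url, tier) is a hypothetical transport that returns the
        # response body, or None when the request is blocked or fails.
        self._fetch = fetch
        self._start = 0  # index of the tier to try first

    def get(self, url: str) -> Optional[str]:
        """Try tiers in order from the last working one; remember successes."""
        for index in range(self._start, len(TIERS)):
            body = self._fetch(url, TIERS[index])
            if body is not None:
                self._start = index  # stick to the tier that worked
                return body
        return None  # every remaining tier failed


calls = []


def fake_fetch(url: str, tier: str) -> Optional[str]:
    """Simulated transport where only the datacenter tier succeeds."""
    calls.append(tier)
    return "ok" if tier == "datacenter" else None


fetcher = TieredFetcher(fake_fetch)
fetcher.get("https://patents.google.com/")
fetcher.get("https://patents.google.com/")
print(calls)
```

In this model the second request skips the direct attempt entirely, which is the "sticky" behavior: once a tier works, later requests start from it.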

Closing thoughts

The Google Patents Scraper is built for structured, scalable Google Patents data extraction — from quick keyword pulls to large, enriched datasets. With smart query merging, optional deep fields, CSV/JSON exports, and Apify API access, it serves marketers, developers, data analysts, and researchers alike. Build pipelines, power dashboards, or kick off Google Patents text mining with a dependable Google Patents search results scraper. Start extracting smarter patent insights — at scale and on your terms.
