Google Patents Scraper — Claims, Inventors & Citations
Pricing
$8.99/month + usage
Scrape complete patent data from Google Patents. Returns full claims text, abstract, inventors, applicants, IPC/CPC classifications, filing dates, legal status and citation counts. Search by patent ID, keyword, assignee or inventor. RESIDENTIAL proxy included.
Developer
Scrape Pilot
📜 Google Patents Scraper v1 — Full Patent Data & Search
Extract complete patent data from Google Patents — titles, abstracts, claims, inventors, applicants, legal status, classifications, citations, and more. No API key required. Supports both single‑patent lookup and bulk search.
💡 What is Google Patents Scraper?
Google Patents Scraper is a professional‑grade automation tool that extracts real, publicly available patent data from patents.google.com. Whether you need a single patent’s full specification or want to search by keyword, assignee, inventor, or country, this actor delivers structured JSON output – ready for analytics, dashboards, or IP research.
No API key, no login, no hidden fees. Just provide a patent ID or search query, and get back:
- Patent ID, title, abstract, claims, description
- Inventors, applicants, assignee, agent
- Filing, publication, and priority dates
- Legal status (active, pending, expired)
- IPC / CPC classifications
- Cited patents and citations count
- Patent family members
- Country of origin
📦 What Data Can You Extract?
| 🧩 Data Type | 📋 Description |
|---|---|
| 🆔 Patent ID | e.g., US10000000B2, EP1234567A1 |
| 📝 Title | Full patent title |
| 📄 Abstract | Short technical summary |
| ⚖️ Claims | Full claims text + claim count |
| 📖 Description | Detailed specification (when available) |
| 👥 Inventors | List of inventor names |
| 🏢 Applicants / Assignee | Legal applicant(s) and primary assignee |
| 📅 Dates | Filing date, publication date, priority date |
| ⚖️ Legal Status | Active, pending, expired, etc. |
| 🏷️ Classifications | IPC and CPC codes |
| 🔗 Citations | Cited patents count and citations count |
| 👨💼 Agent | Patent attorney/agent (if listed) |
| 🌍 Country | Patent‑issuing country (US, EP, WO, etc.) |
| 👨👩👧👦 Patent Family | Related patents in the same family |
| 🔗 Source URL | Direct link to Google Patents page |
All data is returned as clean JSON with consistent field names.
⚙️ Key Features
- Two Modes – Fetch by specific patent ID(s) OR search by keyword/assignee/inventor/country.
- Full-Text Extraction – Retrieves claims, abstract, and description; very long descriptions are truncated at 3000 characters.
- Structured Classifications – IPC and CPC codes as arrays.
- Citation Metrics – Count of patents cited by this patent and number of times it has been cited.
- Patent Family – List of related patents from the same family.
- Residential Proxy Required – Google Patents blocks datacenter IPs. The actor is designed to work with Apify residential proxies.
- Bulk Support – Provide a list of patent IDs (one per line or JSON array) and scrape them all.
- Demo Data in Search – When search mode is used with `fetch_full_details=true`, the actor automatically fetches the complete detail page for each result.
- Clean Error Handling – Failed patents are marked with `Extraction_Status: Failed ❌` without breaking the run.
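The bulk-support bullet says `patent_ids` may arrive as a list or as a newline-separated string. A minimal sketch of how a caller might normalize either form before building the input is shown here. The helper name `normalize_patent_ids` is hypothetical, and the trimming, upper-casing, and de-duplication rules are assumptions rather than documented actor behavior.

```python
from typing import Iterable, List, Union


def normalize_patent_ids(raw: Union[str, Iterable[str]]) -> List[str]:
    """Return a clean, de-duplicated list of patent IDs (hypothetical helper)."""
    # Accept either a newline-separated string or any iterable of strings.
    items = raw.splitlines() if isinstance(raw, str) else list(raw)
    seen = set()
    cleaned = []
    for item in items:
        # Assumption: IDs are case-insensitive and contain no inner spaces.
        pid = item.strip().replace(" ", "").upper()
        if pid and pid not in seen:
            seen.add(pid)
            cleaned.append(pid)
    return cleaned
```

Normalizing on the client side keeps the order of the first occurrence of each ID, so results can be matched back to the original list.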
📥 Input Parameters
The actor accepts a JSON object with the following fields:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `patent_ids` | array or string | No | – | List of patent IDs (e.g., `["US10000000B2", "EP1234567A1"]`). Can also be a newline-separated string. |
| `search_query` | string | No | – | Keyword or phrase (e.g., "quantum computing"). |
| `assignee` | string | No | – | Filter by assignee name (e.g., "Raytheon"). |
| `inventor` | string | No | – | Filter by inventor name. |
| `country` | string | No | – | Two-letter country code (e.g., US, EP, WO). |
| `date_from` | string | No | – | Earliest publication date (YYYY-MM-DD). |
| `max_results` | integer | No | 20 | Maximum number of patents to return (per mode). |
| `fetch_full_details` | boolean | No | true | For search mode, fetch the full detail page of each result. |
| `proxyConfiguration` | object | No | – | Apify proxy configuration. Residential proxies are required. |
Note: You must provide either `patent_ids` or at least one search parameter (`search_query`, `assignee`, `inventor`).
Example Input (Patent IDs)
{
  "patent_ids": "US10000000B2\nUS9876543B1",
  "fetch_full_details": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
Example Input (Search)
{
  "search_query": "machine learning",
  "assignee": "Google",
  "country": "US",
  "max_results": 10,
  "fetch_full_details": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
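The rule that an input needs either `patent_ids` or at least one search parameter can be checked before a run is started. The following is a sketch only: `build_input` and `SEARCH_KEYS` are hypothetical names, and filling in a residential proxy configuration by default is an assumption based on the proxy requirement stated in this README.

```python
from typing import Any, Dict

SEARCH_KEYS = ("search_query", "assignee", "inventor")


def build_input(**params: Any) -> Dict[str, Any]:
    """Assemble an actor input dict and check the either/or rule (sketch)."""
    # Drop unset values so the actor's own defaults apply.
    payload = {k: v for k, v in params.items() if v not in (None, "", [])}
    has_ids = "patent_ids" in payload
    has_search = any(k in payload for k in SEARCH_KEYS)
    if not (has_ids or has_search):
        raise ValueError("provide patent_ids or one of: " + ", ".join(SEARCH_KEYS))
    # Assumption: default to residential proxies, which the README requires.
    payload.setdefault(
        "proxyConfiguration",
        {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    )
    return payload
```

For example, `build_input(search_query="machine learning", assignee="Google", country="US", max_results=10)` produces the same fields as the search example above, while `build_input(country="US")` raises `ValueError` because `country` alone is not one of the listed search parameters.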
📤 Output Format
Each patent is returned as a JSON object. Fields may be omitted if not available on the page.
Common Fields (All Patents)
| Field | Type | Description |
|---|---|---|
| `Patent_ID` | string | Normalized patent ID (e.g., US10000000B2). |
| `Source_URL` | string | Direct link to Google Patents page. |
| `Extraction_Status` | string | "Verified ✅" or "Failed ❌". |
| `Data_Grade` | string | Always "ENTERPRISE" for success. |
| `Timestamp` | string | ISO 8601 extraction timestamp. |
| `Title` | string | Full patent title. |
| `Abstract` | string | Abstract text. |
| `Claims` | string | Full claims text (may be truncated). |
| `Claims_Count` | integer | Number of claims (if parsed). |
| `Description` | string | Detailed description (when available). |
| `Publication_Date` | string | YYYY-MM-DD. |
| `Filing_Date` | string | YYYY-MM-DD. |
| `Priority_Date` | string | YYYY-MM-DD. |
| `Inventors` | array[string] | List of inventor names. |
| `Applicants` | array[string] | List of applicants/assignees. |
| `Assignee` | string | Primary assignee. |
| `Agent` | string | Patent attorney/agent. |
| `Legal_Status` | string | e.g., "Active", "Pending". |
| `IPC_Classification` | array[string] | International Patent Classification codes. |
| `CPC_Classification` | array[string] | Cooperative Patent Classification codes. |
| `Cited_Patents_Count` | integer | Number of patents cited by this patent. |
| `Cited_By_Count` | integer | Number of times this patent has been cited. |
| `Patent_Family` | array[string] | Related patent IDs in the same family. |
| `Patent_Family_Count` | integer | Size of the patent family. |
| `Country_Code` | string | Two-letter code (e.g., US). |
| `Country` | string | Full country name. |
Example Output (Successful)
[
  {
    "Patent_ID": "US10000000B2",
    "Source_URL": "https://patents.google.com/patent/US10000000B2",
    "Extraction_Status": "Verified ✅",
    "Data_Grade": "ENTERPRISE",
    "Timestamp": "2026-04-06T14:51:18.480881Z",
    "Title": "US10000000B2 - Coherent LADAR using intra-pixel quadrature detection - Google Patents",
    "Abstract": "A frequency modulated (coherent) laser detection and ranging system includes a read-out integrated circuit...",
    "Claims": "Claims (20) What is claimed is: 1. A laser detection and ranging (LADAR) system...",
    "Claims_Count": null,
    "Publication_Date": "2018-06-19",
    "Filing_Date": "2015-03-10",
    "Priority_Date": "2015-03-10",
    "Inventors": ["Joseph Marron"],
    "Applicants": ["Raytheon Co"],
    "Assignee": "Raytheon Co",
    "Legal_Status": "Active",
    "Cited_Patents_Count": 0,
    "Cited_By_Count": 0,
    "Country_Code": "US",
    "Country": "United States"
  }
]
Example Output (Failed)
[
  {
    "Patent_ID": "US9876543B1",
    "Extraction_Status": "Failed ❌",
    "Source_URL": "https://patents.google.com/patent/US9876543B1",
    "Timestamp": "2026-04-06T14:51:22.874258Z"
  }
]
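Because failed records carry only a few fields and successful ones may omit or null others (note `Claims_Count: null` above), downstream code should read fields defensively. A minimal sketch of that pattern follows; the function names `split_by_status` and `summarize` are hypothetical and not part of the actor.

```python
from typing import Any, Dict, Iterable, List, Tuple

Record = Dict[str, Any]


def split_by_status(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate verified records from failed ones."""
    ok, failed = [], []
    for rec in records:
        # Matching on the prefix avoids depending on the trailing emoji.
        status = str(rec.get("Extraction_Status", ""))
        (ok if status.startswith("Verified") else failed).append(rec)
    return ok, failed


def summarize(rec: Record) -> str:
    """One-line summary that tolerates omitted or null fields."""
    inventors = ", ".join(rec.get("Inventors") or []) or "unknown inventors"
    title = rec.get("Title") or "(no title)"
    return f'{rec["Patent_ID"]}: {title} [{inventors}]'
```

The failed list can be fed back into a later run as `patent_ids` to retry only the patents that did not extract.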
🛠 How to Use on Apify
- Create a task with this actor.
- Choose mode – either provide `patent_ids` or fill in search parameters (`search_query`, `assignee`, etc.).
- Configure proxies – you must enable residential proxies (Google Patents blocks datacenter IPs).
- Set max results – limit the number of patents to fetch.
- Run – the actor will scrape data and push it to the Dataset.
- Export – download results as JSON, CSV, or Excel.
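The platform export already covers CSV, but if you download JSON and want a custom column set locally, array fields such as `Inventors` need flattening first. The sketch below shows one way to do that; `records_to_csv` is a hypothetical helper, and joining list values with "; " is an arbitrary choice.

```python
import csv
import io
from typing import Any, Dict, Iterable, List


def records_to_csv(records: Iterable[Dict[str, Any]], columns: List[str]) -> str:
    """Render selected fields as CSV text, joining list values with '; '."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for rec in records:
        row = []
        for col in columns:
            value = rec.get(col)  # omitted fields become empty cells
            if isinstance(value, list):
                value = "; ".join(str(v) for v in value)
            row.append("" if value is None else value)
        writer.writerow(row)
    return buffer.getvalue()
```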
Running via API
curl -X POST "https://api.apify.com/v2/acts/your-username~google-patents-scraper/runs" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{"patent_ids": ["US10000000B2", "EP1234567A1"], "proxyConfiguration": {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]}}'
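The same request can be assembled from Python using only the standard library. This sketch builds the request object without sending it. `build_run_request` is a hypothetical helper, and the actor ID and token are the same placeholders used in the curl command, so replace both before use.

```python
import json
import urllib.request
from typing import Any, Dict


def build_run_request(actor_id: str, token: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"https://api.apify.com/v2/acts/{actor_id}/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "your-username~google-patents-scraper",  # placeholder from the curl example
    "YOUR_API_TOKEN",  # placeholder, replace with a real token
    {
        "patent_ids": ["US10000000B2", "EP1234567A1"],
        "proxyConfiguration": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
    },
)
# To actually start the run, send the request:
# with urllib.request.urlopen(request) as response:
#     run = json.load(response)
```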
🎯 Use Cases
- Patent Attorneys & IP Firms – Quickly retrieve full patent specifications for prior art searches.
- R&D Departments – Monitor competitors’ patents and identify white spaces.
- Tech Scouts – Discover new inventions in emerging fields (AI, biotech, clean energy).
- Investment Analysts – Evaluate a company’s patent portfolio strength.
- Academic Research – Build datasets for patent citation analysis or innovation studies.
- Product Development – Avoid infringement by reviewing existing patents.
❓ Frequently Asked Questions
Q1. Do I need a Google Patents API key?
No. This actor uses the public web interface of patents.google.com. No API key or registration required.
Q2. Why do I need residential proxies?
Google Patents blocks most datacenter IP addresses (AWS, Google Cloud, etc.). Residential proxies mimic real users and are essential to avoid 403 errors.
Q3. How accurate are the extracted claims and descriptions?
The actor extracts exactly what is displayed on the public Google Patents page. Claims are usually complete; descriptions may be truncated if very long (max 3000 characters).
Q4. Can I search by CPC or IPC classification?
Yes. You can include classification codes in your search_query (e.g., cpc:G06N20/00). Google Patents supports advanced search syntax.
Q5. What happens if a patent ID is invalid or not found?
The actor returns an object with Extraction_Status: "Failed ❌" and continues with the next ID.
Q6. How many patents can I scrape in one run?
There is no hard limit, but we recommend keeping max_results under 100 to avoid long run times. For bulk scraping, use multiple runs or increase delays.
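One way to follow this advice for a long ID list is to split it into one input per run. The sketch below assumes a batch size of 50 to stay well under the suggested ceiling, and assumes `max_results` should be set to the batch length so the default of 20 does not cap an ID-mode run. `batch_inputs` is a hypothetical helper.

```python
from typing import Dict, Iterator, Sequence


def batch_inputs(patent_ids: Sequence[str], batch_size: int = 50) -> Iterator[Dict[str, object]]:
    """Yield one actor input per batch of at most batch_size patent IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(patent_ids), batch_size):
        chunk = list(patent_ids[start:start + batch_size])
        yield {
            "patent_ids": chunk,
            # Assumption: max_results should cover every ID in the chunk.
            "max_results": len(chunk),
            "proxyConfiguration": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
        }
```

Each yielded dict can be submitted as a separate run, so a failure or timeout in one batch does not affect the others.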
Q7. Does the actor support downloading PDFs?
No. This actor extracts text and structured metadata only. To get the PDF, open the Source_URL and download it from the Google Patents page.
Q8. What countries are supported?
Google Patents covers patents from the US (US), Europe (EP), World (WO), China (CN), Japan (JP), Korea (KR), Germany (DE), and many others. The actor returns the Country_Code and Country fields.
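If you need the office code before running the actor (for example, to group IDs by country), it can usually be read from the ID prefix. The pattern below is a sketch that covers only the ID shapes shown in this README, such as US10000000B2 and EP1234567A1. It is an assumption that IDs follow this shape, other formats exist, and `parse_patent_id` is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional


class ParsedPatentId(NamedTuple):
    country_code: str
    number: str
    kind_code: Optional[str]


# Assumed shape: two-letter office code, digits, optional kind code (letter plus optional digit).
_ID_PATTERN = re.compile(r"^([A-Z]{2})(\d+)([A-Z]\d?)?$")


def parse_patent_id(patent_id: str) -> Optional[ParsedPatentId]:
    """Split an ID such as 'US10000000B2' into its parts, or return None."""
    match = _ID_PATTERN.match(patent_id.strip().upper())
    if match is None:
        return None
    return ParsedPatentId(match.group(1), match.group(2), match.group(3))
```

IDs that do not match return `None` rather than raising, so they can be logged and skipped.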
🔍 SEO Keywords
Google Patents scraper, patent data extraction, USPTO patent search, patent claims extractor, IPC classification, CPC classification, patent citation analysis, prior art search, intellectual property scraper, Apify patent actor, patent family, patent legal status
🔗 Related Actors
You might also find these useful:
- Grant & Foundation Opportunities Scraper – Direct extraction from official USPTO database.
- [GitHub Scraper: Extract Trending Repos, Stars, Forks & Leads](https://apify.com/scrapepilot/github-scraper-extract-trending-repos-stars-forks-leads) – Build citation networks from patent data.