Unpaywall Scraper
Pricing
Pay per event
Developer: ParseForge
Unpaywall Scraper
Discover open access research articles with the Unpaywall Scraper. Search through millions of articles in the Unpaywall database to find free-to-read scholarly publications. Get open access status, bibliographic information, publisher details, citation data, and direct links to PDFs and landing pages. Perfect for researchers, librarians, and academics who need to find and access open access articles efficiently.
Target Audience: Researchers, academics, librarians, students, publishers, research institutions, open access advocates
Primary Use Cases: Finding free-to-read articles, open access research discovery, bibliographic database building, publication access analysis
What Does Unpaywall Scraper Do?
This tool searches the Unpaywall database to find scholarly publications by title. It delivers:
- Complete article details (title, DOI, year, journal name, publisher)
- Open access status (gold, green, hybrid, bronze, or closed)
- Direct access links (PDF URLs, landing page URLs)
- Author information (names, ORCID IDs)
- Journal information (ISSN, DOAJ status, OA status)
- Best open access location details (version, license, host type)
- All available open access locations
- Citation data (cited by count, referenced works)
- Search relevance score and snippet
- And more
Business Value: Quickly find free-to-read versions of research articles, build open access article databases, analyze publication accessibility, and discover scholarly content without paywalls.
Input
To start a run, fill in the input form. The following fields are available:
- query - Search terms to find articles by title in the Unpaywall database (required). Example: 'software', 'machine learning', 'climate change', 'cancer research'
- is_oa - Filter to show only free-to-read (open access) articles. Leave unchecked to show all articles. Default: Show all articles
- maxItems - Maximum number of articles to collect. Free users: required, at most 50. Paid users: optional, at most 1,000,000; leave empty for no limit. Prefill: 10
Here's what the input configuration looks like in JSON:
{"query": "software","is_oa": true,"maxItems": 10}
Pro Tip: Use the is_oa filter to find only free-to-read articles. This is perfect for researchers who need open access content without subscription barriers.
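The input rules above can be checked before a run is started. The following is a minimal sketch, not part of the Actor itself: the helper name `build_input` and the `paid` flag are hypothetical, and the limits are simply the ones stated in the field list (50 for free users, 1,000,000 for paid users). The Actor performs its own validation, so this only catches mistakes early.

```python
import json

FREE_TIER_MAX_ITEMS = 50          # limit stated above for free users
PAID_TIER_MAX_ITEMS = 1_000_000   # limit stated above for paid users


def build_input(query, is_oa=False, max_items=None, paid=False):
    """Return an input dict shaped like the JSON example above (hypothetical helper)."""
    if not query or not query.strip():
        raise ValueError("query is required")
    limit = PAID_TIER_MAX_ITEMS if paid else FREE_TIER_MAX_ITEMS
    if max_items is None:
        # Only paid users may leave maxItems empty.
        if not paid:
            raise ValueError("maxItems is required for free users")
    elif not 1 <= max_items <= limit:
        raise ValueError(f"maxItems must be between 1 and {limit}")
    run_input = {"query": query.strip(), "is_oa": is_oa}
    if max_items is not None:
        run_input["maxItems"] = max_items
    return run_input


if __name__ == "__main__":
    # Reproduces the JSON configuration shown above.
    print(json.dumps(build_input("software", is_oa=True, max_items=10)))
```

Leaving `max_items` as `None` with `paid=True` omits the `maxItems` key entirely, which corresponds to leaving the field empty in the form.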
Output
After the Actor finishes its run, you'll get a dataset with the output. Results are pushed to the dataset in real time as they are found, so you can watch them appear while the scraper is running. The number of records depends on the maxItems value you set. You can download the results as Excel, HTML, XML, JSON, or CSV.
Note: Articles without a valid DOI are automatically skipped to ensure data quality. All boolean fields (isOa, journalIsOa, journalIsInDoaj, hasRepositoryCopy) always return true or false, never undefined, ensuring consistent data structure.
Here's an example of a record you'll get in the dataset:
{"doi": "10.34190/ecgbl.16.1.481","doiUrl": "https://doi.org/10.34190/ecgbl.16.1.481","title": "Learning Machine Learning with a Game","year": 2022,"journalName": "European Conference on Games Based Learning","journalIssns": "2049-100X,2049-0992","journalIsOa": false,"journalIsInDoaj": false,"publisher": "Academic Conferences International Ltd","isOa": true,"oaStatus": "bronze","hasRepositoryCopy": false,"bestOaLocation": {"url": "https://papers.academic-conferences.org/index.php/ecgbl/article/download/481/704","urlForLandingPage": "https://doi.org/10.34190/ecgbl.16.1.481","urlForPdf": "https://papers.academic-conferences.org/index.php/ecgbl/article/download/481/704","version": "publishedVersion","license": null,"updated": "2023-07-04T02:24:38.731726","hostType": "publisher","evidence": "open (via free pdf)","isBest": true},"authors": [{"given": "Christoph","family": "Meinel","orcid": null,"authenticatedOrcid": false}],"citedByCount": 15,"citedByApiUrl": "https://api.openalex.org/works?filter=cites:W123456789","referencedWorks": ["W123456789","W987654321","W456789123"],"score": 0.09102392,"snippet": "Learning Machine Learning with a Game","scrapedTimestamp": "2025-12-02T22:15:49.459Z"}
What You Get:
- doi - Digital Object Identifier for the article
- doiUrl - Direct link to the article's DOI page
- title - Full article title
- year - Publication year
- journalName - Name of the journal or publication venue
- publisher - Publisher name
- isOa - Whether the article is open access (always true/false, never undefined)
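- oaStatus - Open access status: gold (published in a fully open access journal), green (free copy in a repository), hybrid (free under an open license in a subscription journal), bronze (free to read on the publisher site without an open license), or closed (no free copy found)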
- journalIsOa - Whether the journal is open access (always true/false, never undefined)
- journalIsInDoaj - Whether the journal is in DOAJ directory (always true/false, never undefined)
- hasRepositoryCopy - Whether a repository copy exists (always true/false, never undefined)
- bestOaLocation - Best available open access location with direct PDF and landing page URLs
- authors - List of authors with names and ORCID IDs
- citedByCount - Number of times this article has been cited
- citedByApiUrl - API URL to fetch articles that cite this one
- referencedWorks - Array of article IDs that this article cites
- score - Search relevance score (higher = better match)
- snippet - Clean-text snippet of the article title that matched your query (no HTML tags)
Download Options: CSV, Excel, or JSON formats for easy analysis
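Once the results are downloaded as JSON, the fields listed above can be processed with a few lines of code. The sketch below is illustrative only: the helper names `best_pdf_url` and `summarize` are hypothetical, and it assumes the JSON export is a list of records shaped like the example shown earlier. It also assumes `bestOaLocation` may be missing or null for closed articles, which the source does not confirm, so the code guards against that case.

```python
from collections import Counter


def best_pdf_url(record):
    """Return the PDF link of the best open access location, or None."""
    # Assumption: bestOaLocation can be absent or null, so fall back to {}.
    location = record.get("bestOaLocation") or {}
    return location.get("urlForPdf")


def summarize(records):
    """Count records per oaStatus and map DOI -> PDF link for open access items."""
    status_counts = Counter(record.get("oaStatus") for record in records)
    pdf_links = {}
    for record in records:
        url = best_pdf_url(record)
        if record.get("isOa") and url:
            pdf_links[record["doi"]] = url
    return status_counts, pdf_links


if __name__ == "__main__":
    # Trimmed copy of the example record shown earlier.
    sample = [{
        "doi": "10.34190/ecgbl.16.1.481",
        "isOa": True,
        "oaStatus": "bronze",
        "bestOaLocation": {
            "urlForPdf": "https://papers.academic-conferences.org/index.php/ecgbl/article/download/481/704",
        },
    }]
    counts, links = summarize(sample)
    print(dict(counts))
    print(links)
```

A list exported from the "Dataset" tab can be passed to `summarize` after loading it with `json.load`.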
Why Choose the Unpaywall Scraper?
- Comprehensive Coverage: Access 120+ million scholarly articles from the largest open access database
- Time Savings: Find free-to-read articles in minutes instead of hours of manual searching, saving an estimated 5-10 hours per week
- Direct Access Links: Get immediate PDF and landing page URLs for every open access article
- Citation Data: Access citation counts and referenced works to understand article impact
- No Subscription Required: Find free-to-read articles without needing institutional access
- Complete Metadata: Get full bibliographic information including authors, journals, publishers, and more
How to Use
- Sign Up: Create a free account with $5 credit (takes 2 minutes)
- Find the Scraper: Visit the Unpaywall Scraper page
- Set Input: Enter your search query (e.g., "machine learning") and optionally check "Open Access Only" to filter for free articles
- Run It: Click "Start" and let it collect your data
- Download Data: Get your results in the "Dataset" tab as CSV, Excel, or JSON
Total Time: 5 minutes setup, 10-30 minutes for data collection
No Technical Skills Required: Everything is point and click
Business Use Cases
Researchers & Academics:
- Find free-to-read versions of articles needed for research
- Build comprehensive bibliographic databases
- Track open access availability across research domains
- Discover scholarly content without subscription barriers
Librarians & Information Professionals:
- Build databases of open access articles available to institutions
- Support researchers with free-to-read article discovery
- Track open access trends and availability
- Generate reports on publication accessibility
Students & Graduate Researchers:
- Discover free academic resources without subscription barriers
- Find open access articles for thesis and research work
- Build citation databases with free-to-read articles
- Access scholarly content without institutional subscriptions
Publishers & Journals:
- Analyze open access trends and competitor publication accessibility
- Track open access status across research domains
- Monitor publication accessibility metrics
- Build publication databases with access information
Research Institutions:
- Track open access availability across research domains
- Support researchers with free-to-read article discovery
- Build institutional open access databases
- Analyze publication accessibility trends
Using Unpaywall Scraper with the Apify API
For advanced users who want to automate this process, you can control the scraper programmatically with the Apify API. This allows you to schedule regular data collection and integrate with your existing business tools.
- Node.js: Install the apify-client NPM package
- Python: Use the apify-client PyPI package
- See the Apify API reference for full details
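As a rough illustration of the Python route, the sketch below starts a run with the `apify-client` package (`pip install apify-client`), waits for it to finish, and reads the resulting dataset. Two things here are assumptions rather than facts from this page: the Actor ID `parseforge/unpaywall-scraper`, which is guessed from the naming of ParseForge's other Actors and should be checked against the Actor's page, and the `APIFY_TOKEN` environment variable, which is just one way to supply your API token. The helper names are hypothetical.

```python
import os

# Assumed Actor ID; verify it on the Actor's page before use.
ACTOR_ID = "parseforge/unpaywall-scraper"


def make_run_input(query, is_oa=True, max_items=10):
    """Build the input object using the field names documented above."""
    return {"query": query, "is_oa": is_oa, "maxItems": max_items}


def run_scraper(token, query, is_oa=True, max_items=10, actor_id=ACTOR_ID):
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported lazily so the helpers above work without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=make_run_input(query, is_oa, max_items))
    if run is None:
        raise RuntimeError("The Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")
    if token:
        for item in run_scraper(token, "machine learning"):
            print(item["doi"], item["oaStatus"])
    else:
        print("Set APIFY_TOKEN to run this example.")
```

Keeping the input construction in its own function makes it easy to reuse the same input for scheduled runs or other tooling.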
Frequently Asked Questions
Q: How does it work? A: Unpaywall Scraper is easy to use and requires no technical knowledge. Simply enter your search query and optionally filter for open access articles. The tool automatically searches the database and returns matching articles with complete metadata.
Q: How accurate is the data? A: Data comes from the Unpaywall database, which is continuously updated and covers millions of scholarly articles. Each record includes the article's open access status as recorded in Unpaywall and direct links to free-to-read versions.
Q: Can I filter for only open access articles? A: Yes! Use the is_oa filter to show only free-to-read (open access) articles. This is perfect for researchers who need open access content without subscription barriers.
Q: What citation data is included? A: Each article includes citation count (how many times it's been cited) and referenced works (articles that this article cites). You also get an API URL to fetch articles that cite a specific work.
Q: Can I schedule regular runs? A: Yes! Use the Apify API to schedule daily, weekly, or monthly runs automatically. Perfect for ongoing article discovery and database updates.
Q: What if I need help? A: Our support team is available 24/7. Contact us through the Apify platform.
Q: Is my data secure? A: Absolutely. All data is encrypted in transit and at rest. We never share your data with third parties.
Q: How many articles can I collect? A: Free users can collect up to 50 articles per run. Paid users can collect up to 1,000,000 articles per run, or leave it unlimited for maximum flexibility.
Recommended Actors
Looking for more data collection tools? Check out these related actors:
| Actor | Description | Link |
|---|---|---|
| Crossref Scraper | Collects comprehensive academic publication data including citations and metadata from Crossref | https://apify.com/parseforge/crossref-scraper |
| GSA eLibrary Scraper | Collects government publication data from GSA eLibrary | https://apify.com/parseforge/gsa-elibrary-scraper |
| PR Newswire Scraper | Extracts press releases and news data from PR Newswire | https://apify.com/parseforge/pr-newswire-scraper |
| HubSpot Marketplace Scraper | Extracts business app data from HubSpot marketplace | https://apify.com/parseforge/hubspot-marketplace-scraper |
| Hugging Face Model Scraper | Collects machine learning model data from Hugging Face | https://apify.com/parseforge/hugging-face-model-scraper |
Pro Tip: Browse our complete collection of data collection actors to find the perfect tool for your business needs.
Integrate Unpaywall Scraper with any app and automate your workflow
Unpaywall Scraper can be connected with almost any cloud service or web app through the integrations available on the Apify platform.
Alternatively, you can use webhooks to carry out an action whenever an event occurs, for example to get a notification whenever Unpaywall Scraper successfully finishes a run.
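On the receiving side, a webhook handler mainly needs to pick the dataset ID out of the request body. The sketch below assumes Apify's default webhook payload, in which `eventType` names the event (for example `ACTOR.RUN.SUCCEEDED`) and `resource` holds the run object with its `defaultDatasetId`; if you customize the payload template, the field names will differ. The function name `dataset_id_from_webhook` and the IDs in the demo body are hypothetical.

```python
import json


def dataset_id_from_webhook(body):
    """Return the dataset ID from a successful-run webhook body, else None."""
    payload = json.loads(body)
    # Ignore events other than a successfully finished run.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return (payload.get("resource") or {}).get("defaultDatasetId")


if __name__ == "__main__":
    # Hypothetical body, shaped like the assumed default payload.
    demo_body = json.dumps({
        "eventType": "ACTOR.RUN.SUCCEEDED",
        "eventData": {"actorId": "example-actor-id", "actorRunId": "example-run-id"},
        "resource": {"id": "example-run-id", "defaultDatasetId": "example-dataset-id"},
    })
    print(dataset_id_from_webhook(demo_body))
```

The returned ID can then be used to download the run's results through the Apify API.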
Need Help? Our support team is here to help you get the most out of this tool.
Disclaimer: This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Unpaywall or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.