Unpaywall Scraper

Pricing: Pay per event

Discover open access research articles with our powerful Unpaywall scraper! Search through millions of articles in the Unpaywall database to find free-to-read scholarly publications. Perfect for researchers, librarians, and academics who need to find and access open access articles efficiently.

Developer: ParseForge
Maintained by Community

🔓 Unpaywall Scraper

🚀 Discover open access research articles with our powerful Unpaywall Scraper! Search through millions of articles in the Unpaywall database to find free-to-read scholarly publications. Get comprehensive open access status, bibliographic information, publisher details, citation data, and direct links to PDFs and landing pages. Perfect for researchers, librarians, and academics who need to find and access open access articles efficiently.

Target Audience: Researchers, academics, librarians, students, publishers, research institutions, open access advocates
Primary Use Cases: Finding free-to-read articles, open access research discovery, bibliographic database building, publication access analysis

🎯 What Does Unpaywall Scraper Do?

This tool searches the Unpaywall database to find scholarly publications by title. It delivers:

  • Complete article details (title, DOI, year, journal name, publisher)
  • Open access status (gold, hybrid, bronze, green, or closed)
  • Direct access links (PDF URLs, landing page URLs)
  • Author information (names, ORCID IDs)
  • Journal information (ISSN, DOAJ status, OA status)
  • Best open access location details (version, license, host type)
  • All available open access locations
  • Citation data (cited by count, referenced works)
  • Search relevance score and snippet
  • And more

Business Value: Quickly find free-to-read versions of research articles, build open access article databases, analyze publication accessibility, and discover scholarly content without paywalls.

📥 Input

To start Unpaywall web scraping, simply fill in the input form. You can search for articles based on:

  • query - Search terms used to find articles by title in the Unpaywall database (required). Examples: 'software', 'machine learning', 'climate change', 'cancer research'
  • is_oa - Filter to show only free-to-read (open access) articles. Leave unchecked to show all articles. Default: show all articles
  • maxItems - Maximum number of articles to collect. Free users: required, maximum 50. Paid users: optional, maximum 1,000,000; leave empty for unlimited (paid users only). Prefill: 10

Here's what the input configuration looks like in JSON:

{
  "query": "software",
  "is_oa": true,
  "maxItems": 10
}

Pro Tip: 💡 Use the is_oa filter to find only free-to-read articles. This is perfect for researchers who need open access content without subscription barriers.
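If you build the input programmatically, a small helper can catch invalid values before a run starts. The following is a minimal sketch that only mirrors the limits described above; the function name, the `paid_plan` flag, and the constants are hypothetical and are not part of the Actor itself.

```python
from typing import Any, Optional

FREE_PLAN_MAX_ITEMS = 50          # limit stated above for free users
PAID_PLAN_MAX_ITEMS = 1_000_000   # limit stated above for paid users


def build_run_input(
    query: str,
    is_oa: bool = False,
    max_items: Optional[int] = 10,
    paid_plan: bool = False,
) -> dict[str, Any]:
    """Return an input dict that uses the field names documented above."""
    if not query or not query.strip():
        raise ValueError("query is required")

    limit = PAID_PLAN_MAX_ITEMS if paid_plan else FREE_PLAN_MAX_ITEMS
    if max_items is None:
        # An empty maxItems means "unlimited", which is for paid users only.
        if not paid_plan:
            raise ValueError("maxItems is required on the free plan")
    elif not 1 <= max_items <= limit:
        raise ValueError(f"maxItems must be between 1 and {limit}")

    run_input: dict[str, Any] = {"query": query.strip(), "is_oa": is_oa}
    if max_items is not None:
        run_input["maxItems"] = max_items
    return run_input


print(build_run_input("software", is_oa=True, max_items=10))
```

The helper omits maxItems entirely when it is None, which corresponds to leaving the field empty in the input form.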

📊 Output

After the Actor finishes its run, you'll get a dataset with the output. Results are sent to the dataset in real time as they are found, so you can watch them appear while the scraper is running. The length of the dataset depends on the number of results you've requested. You can download the results in Excel, HTML, XML, JSON, or CSV format.

Note: Articles without a valid DOI are automatically skipped to ensure data quality. All boolean fields (isOa, journalIsOa, journalIsInDoaj, hasRepositoryCopy) always return true or false, never undefined, ensuring consistent data structure.

Here's an example of scraped Unpaywall data you'll get if you decide to scrape article listings:

{
  "doi": "10.34190/ecgbl.16.1.481",
  "doiUrl": "https://doi.org/10.34190/ecgbl.16.1.481",
  "title": "Learning Machine Learning with a Game",
  "year": 2022,
  "journalName": "European Conference on Games Based Learning",
  "journalIssns": "2049-100X,2049-0992",
  "journalIsOa": false,
  "journalIsInDoaj": false,
  "publisher": "Academic Conferences International Ltd",
  "isOa": true,
  "oaStatus": "bronze",
  "hasRepositoryCopy": false,
  "bestOaLocation": {
    "url": "https://papers.academic-conferences.org/index.php/ecgbl/article/download/481/704",
    "urlForLandingPage": "https://doi.org/10.34190/ecgbl.16.1.481",
    "urlForPdf": "https://papers.academic-conferences.org/index.php/ecgbl/article/download/481/704",
    "version": "publishedVersion",
    "license": null,
    "updated": "2023-07-04T02:24:38.731726",
    "hostType": "publisher",
    "evidence": "open (via free pdf)",
    "isBest": true
  },
  "authors": [
    {
      "given": "Christoph",
      "family": "Meinel",
      "orcid": null,
      "authenticatedOrcid": false
    }
  ],
  "citedByCount": 15,
  "citedByApiUrl": "https://api.openalex.org/works?filter=cites:W123456789",
  "referencedWorks": [
    "W123456789",
    "W987654321",
    "W456789123"
  ],
  "score": 0.09102392,
  "snippet": "Learning Machine Learning with a Game",
  "scrapedTimestamp": "2025-12-02T22:15:49.459Z"
}

What You Get:

  • doi - Digital Object Identifier for the article
  • doiUrl - Direct link to the article's DOI page
  • title - Full article title
  • year - Publication year
  • journalName - Name of the journal or publication venue
  • publisher - Publisher name
  • isOa - Whether the article is open access (always true/false, never undefined)
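  • oaStatus - Open access status: gold (published in a fully open access journal), hybrid (free under an open license in a subscription journal), bronze (free to read on the publisher site, without an open license), green (free copy in a repository), or closed (not free)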
  • journalIsOa - Whether the journal is open access (always true/false, never undefined)
  • journalIsInDoaj - Whether the journal is in DOAJ directory (always true/false, never undefined)
  • hasRepositoryCopy - Whether a repository copy exists (always true/false, never undefined)
  • bestOaLocation - Best available open access location with direct PDF and landing page URLs
  • authors - List of authors with names and ORCID IDs
  • citedByCount - Number of times this article has been cited
  • citedByApiUrl - API URL to fetch articles that cite this one
  • referencedWorks - Array of article IDs that this article cites
  • score - Search relevance score (higher = better match)
  • snippet - Article title that matched your query, as clean text with no HTML tags (you can highlight query terms client-side if needed)

Download Options: CSV, Excel, or JSON formats for easy analysis
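Once you have downloaded the results as JSON, the fields above can be processed with a few lines of code. The sketch below counts records per oaStatus and collects the best PDF link for each open access article. The function name is hypothetical, the second sample record is invented purely for illustration, and the guard for a null bestOaLocation is an assumption about how closed articles are returned.

```python
from collections import Counter
from typing import Any, Iterable


def summarize(items: Iterable[dict[str, Any]]) -> tuple[Counter, dict[str, str]]:
    """Count records per oaStatus and map each open access DOI to its PDF URL."""
    status_counts: Counter = Counter()
    pdf_by_doi: dict[str, str] = {}
    for item in items:
        status_counts[item.get("oaStatus") or "unknown"] += 1
        # Assumption: bestOaLocation may be null for closed articles, so guard it.
        best = item.get("bestOaLocation") or {}
        pdf_url = best.get("urlForPdf")
        if item.get("isOa") and pdf_url:
            pdf_by_doi[item["doi"]] = pdf_url
    return status_counts, pdf_by_doi


sample = [
    {
        "doi": "10.34190/ecgbl.16.1.481",
        "isOa": True,
        "oaStatus": "bronze",
        "bestOaLocation": {
            "urlForPdf": "https://papers.academic-conferences.org/index.php/ecgbl/article/download/481/704"
        },
    },
    # Hypothetical closed record, included only to exercise the null guard.
    {"doi": "10.0000/example", "isOa": False, "oaStatus": "closed", "bestOaLocation": None},
]

counts, pdfs = summarize(sample)
print(dict(counts))
print(pdfs)
```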

Why Choose the Unpaywall Scraper?

  • Comprehensive Coverage: Access 120+ million scholarly articles from the largest open access database
  • Time Savings: Find free-to-read articles in minutes instead of hours of manual searching
  • Direct Access Links: Get immediate PDF and landing page URLs for every open access article
  • Citation Data: Access citation counts and referenced works to understand article impact
  • No Subscription Required: Find free-to-read articles without needing institutional access
  • Complete Metadata: Get full bibliographic information including authors, journals, publishers, and more

Time Savings: Save 5-10 hours per week compared to manual article searching
Efficiency: Find relevant open access articles in minutes instead of hours
Data Quality: Complete article metadata with direct access links and citation information

🔧 How to Use

  1. Sign Up: Create a free account with $5 in credit (takes 2 minutes)
  2. Find the Scraper: Visit the Unpaywall Scraper page
  3. Set Input: Enter your search query (e.g., "machine learning") and optionally check "Open Access Only" to filter for free articles
  4. Run It: Click "Start" and let it collect your data
  5. Download Data: Get your results in the "Dataset" tab as CSV, Excel, or JSON

Total Time: 5 minutes setup, 10-30 minutes for data collection
No Technical Skills Required: Everything is point and click

Business Use Cases

Researchers & Academics:

  • Find free-to-read versions of articles needed for research
  • Build comprehensive bibliographic databases
  • Track open access availability across research domains
  • Discover scholarly content without subscription barriers

Librarians & Information Professionals:

  • Build databases of open access articles available to institutions
  • Support researchers with free-to-read article discovery
  • Track open access trends and availability
  • Generate reports on publication accessibility

Students & Graduate Researchers:

  • Discover free academic resources without subscription barriers
  • Find open access articles for thesis and research work
  • Build citation databases with free-to-read articles
  • Access scholarly content without institutional subscriptions

Publishers & Journals:

  • Analyze open access trends and competitor publication accessibility
  • Track open access status across research domains
  • Monitor publication accessibility metrics
  • Build publication databases with access information

Research Institutions:

  • Track open access availability across research domains
  • Support researchers with free-to-read article discovery
  • Build institutional open access databases
  • Analyze publication accessibility trends

Using Unpaywall Scraper with the Apify API

For advanced users who want to automate this process, you can control the scraper programmatically with the Apify API. This allows you to schedule regular data collection and integrate with your existing business tools.

  • Node.js: Install the apify-client NPM package
  • Python: Use the apify-client PyPI package
  • See the Apify API reference for full details
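As a rough illustration of the Python route, the sketch below starts a run with the apify-client package, waits for it to finish, and reads the resulting dataset. The Actor ID `parseforge/unpaywall-scraper` and the `APIFY_TOKEN` environment variable name are assumptions, so check the Actor's store page and your own account settings before relying on them.

```python
import os
from typing import Any

ACTOR_ID = "parseforge/unpaywall-scraper"  # assumed ID; confirm on the Actor's page


def run_unpaywall_scraper(run_input: dict[str, Any], token: str) -> list[dict[str, Any]]:
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported here so this module still loads when apify-client is not installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")  # assumed environment variable name
    if token:
        run_input = {"query": "software", "is_oa": True, "maxItems": 10}
        for item in run_unpaywall_scraper(run_input, token):
            print(item["doi"], item["oaStatus"])
    else:
        print("Set APIFY_TOKEN to run this example")
```

The run input uses the same field names as the JSON input shown earlier, so anything that works in the input form should work here as well.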

โ“ Frequently Asked Questions

Q: How does it work? A: Unpaywall Scraper is easy to use and requires no technical knowledge. Simply enter your search query and optionally filter for open access articles. The tool automatically searches the database and returns matching articles with complete metadata.

Q: How accurate is the data? A: We collect data from the Unpaywall database, which is continuously updated with information from millions of scholarly articles. The data includes current open access status and direct links to free-to-read versions.

Q: Can I filter for only open access articles? A: Yes! Use the is_oa filter to show only free-to-read (open access) articles. This is perfect for researchers who need open access content without subscription barriers.

Q: What citation data is included? A: Each article includes citation count (how many times it's been cited) and referenced works (articles that this article cites). You also get an API URL to fetch articles that cite a specific work.

Q: Can I schedule regular runs? A: Yes! Use the Apify API to schedule daily, weekly, or monthly runs automatically. Perfect for ongoing article discovery and database updates.

Q: What if I need help? A: Our support team is available 24/7. Contact us through the Apify platform.

Q: Is my data secure? A: Absolutely. All data is encrypted in transit and at rest. We never share your data with third parties.

Q: How many articles can I collect? A: Free users can collect up to 50 articles per run. Paid users can collect up to 1,000,000 articles per run, or leave it unlimited for maximum flexibility.

Looking for more data collection tools? Check out these related Actors:

  • Crossref Scraper - Collects comprehensive academic publication data, including citations and metadata, from Crossref: https://apify.com/parseforge/crossref-scraper
  • GSA eLibrary Scraper - Collects government publication data from GSA eLibrary: https://apify.com/parseforge/gsa-elibrary-scraper
  • PR Newswire Scraper - Extracts press releases and news data from PR Newswire: https://apify.com/parseforge/pr-newswire-scraper
  • HubSpot Marketplace Scraper - Extracts business app data from HubSpot marketplace: https://apify.com/parseforge/hubspot-marketplace-scraper
  • Hugging Face Model Scraper - Collects machine learning model data from Hugging Face: https://apify.com/parseforge/hugging-face-model-scraper

Pro Tip: 💡 Browse our complete collection of data collection Actors to find the perfect tool for your business needs.

Integrate Unpaywall Scraper with any app and automate your workflow

Last but not least, Unpaywall Scraper can be connected with almost any cloud service or web app thanks to integrations on the Apify platform.

Alternatively, you can use webhooks to carry out an action whenever an event occurs, for example to get a notification whenever Unpaywall Scraper successfully finishes a run.
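On the receiving side, a webhook handler mostly needs to check the event type and find the dataset to read. The following is a minimal sketch that assumes Apify's default webhook payload, where `eventType` names the event and `resource` carries the run object; if you customize the payload template, the field names may differ. The function name and all IDs in the example are hypothetical.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: str) -> Optional[str]:
    """Return the finished run's default dataset ID, or None for other events."""
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    # Assumption: "resource" holds the run object, as in the default payload template.
    return (payload.get("resource") or {}).get("defaultDatasetId")


example_body = json.dumps(
    {
        "eventType": "ACTOR.RUN.SUCCEEDED",
        "eventData": {"actorId": "hypothetical-actor-id", "actorRunId": "hypothetical-run-id"},
        "resource": {"defaultDatasetId": "hypothetical-dataset-id"},
    }
)
print(dataset_id_from_webhook(example_body))
```

The returned dataset ID can then be used to fetch the run's results through the Apify API.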

Need Help? Our support team is here to help you get the most out of this tool.


โš ๏ธ Disclaimer: This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Unpaywall or any of its subsidiaries. All trademarks mentioned are the property of their respective owners.