HAL Open Science Scraper
Pricing
from $7.49 / 1,000 result items
Export research papers, theses, and preprints from HAL, France's national open science archive. 3M+ full-text records across every scientific discipline. Filter by domain, author, lab, journal, or year. Pull titles, abstracts, authors, DOIs, PDFs, citations.
Developer: ParseForge
Last modified: 5 days ago
🚀 HAL Open Science Scraper
🚀 Export French open-access research from HAL. 3M+ papers and theses by domain, author, lab, journal.
🕒 Last updated: 2026-04-24 · 📊 11+ fields per record · 🔍 5 filters · 🚫 No auth required
Export research papers, theses, and preprints from HAL, France's national open science archive. 3M+ full-text records across every scientific discipline. Filter by domain, author, lab, journal, or year.
Pull titles, abstracts, authors, DOIs, PDFs, citations.
📋 What the HAL Open Science Scraper does
- 🎯 Targeted filtering. Use the input schema to narrow results to what you need.
- 📦 Structured output. Clean, typed records with every field documented.
- 🔄 Live data. Every run fetches fresh data at runtime; no responses are cached.
- 🔌 Easy integration. Consume via Apify API, webhooks, or direct dataset export.
- 📊 Scale on demand. Run once or on a schedule with the same input.
💡 Why it matters: teams that rely on this source no longer need to babysit a custom crawler. Set up your filters once, get updated data on demand.
⚙️ Input
Send a JSON body with any of the documented input fields. All fields are optional unless the schema marks them required.
| Field | Type | Name | Description |
|---|---|---|---|
| maxItems | integer | Max Items | Free users: limited to 10 items (preview). Paid users: optional, max 1,000,000. |
| query | string | Search Query | Free-text query across titles, abstracts, and keywords. Supports wildcards (e.g. `machine*`). |
| documentType | string | Document Type | Restrict to a specific document type. |
| domain | string | Research Domain | Filter by HAL research domain code (e.g. 'info' for computer science, 'shs' for humanities, 'sdv' for life sciences). See hal.science/browse. |
| year | integer | Publication Year | Restrict to a single publication year. |
| openAccessOnly | boolean | Open Access Only | Restrict to records with open-access full text. |
⚠️ Good to Know: free users are limited to 10 items per run for preview purposes. Upgrade to a paid Apify plan for higher limits.
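As a rough illustration of the input fields above, the following Python sketch builds a JSON body and checks each field against the documented type. The helper `build_input` and the sample values (the year in particular) are assumptions made for this example; they are not part of the actor.

```python
import json

# Field names and types are taken from the input table above.
FIELD_TYPES = {
    "maxItems": int,
    "query": str,
    "documentType": str,
    "domain": str,
    "year": int,
    "openAccessOnly": bool,
}


def build_input(**fields):
    """Hypothetical helper: return the input dict after a name and type check."""
    for name, value in fields.items():
        if name not in FIELD_TYPES:
            raise ValueError(f"unknown input field: {name}")
        expected = FIELD_TYPES[name]
        # bool is a subclass of int in Python, so reject it for integer fields.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{name} must be an integer, not a boolean")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")
    return dict(fields)


# Illustrative values: computer science records matching "machine*".
run_input = build_input(
    query="machine*",
    domain="info",
    year=2023,
    openAccessOnly=True,
    maxItems=10,  # the free-tier preview limit
)

print(json.dumps(run_input, indent=2))
```

Every field is optional, so omitting a keyword argument simply leaves that filter out of the JSON body.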
📊 Output
The dataset returns one structured record per item. Each record includes identifiers, descriptive fields, and a link back to the source. Consume the dataset as JSON, CSV, Excel, XML, or RSS via the Apify console or API.
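For API access, one way to fetch a finished dataset is the Apify dataset items endpoint with a `format` query parameter. The sketch below only builds the URL. The helper name `dataset_items_url` and the dataset ID are placeholders for this example, and a non-public dataset would additionally need an API token.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"

# Format codes for the export types listed above (Excel is "xlsx" in the API).
EXPORT_FORMATS = {"json", "csv", "xlsx", "xml", "rss"}


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Hypothetical helper: build the items URL for a dataset in a given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


# "YOUR_DATASET_ID" is a placeholder; use the ID of your own run's dataset.
print(dataset_items_url("YOUR_DATASET_ID", "csv", limit=100))
```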
💼 Business use cases
🌟 Beyond business use cases
Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.
🚀 How to use
- 📝 Create a free account. Sign up at console.apify.com to get $5 in credits.
- 🔍 Open the actor. Enter your filters in the input form in the Apify console.
- ▶️ Click Start. Wait a few seconds for the first records to land.
- 📤 Export the data. Download JSON/CSV or pipe to webhooks, Google Sheets, or Zapier.
- 🔄 Schedule it. Apify Schedules let you rerun on a cron cadence for free.
⏱️ Total time to first data: about 60 seconds.
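The console steps above can also be approximated in code. The following is a minimal sketch using only the Python standard library and the Apify `run-sync-get-dataset-items` endpoint, which starts a run and returns its dataset items in one call. The actor ID and token are placeholders, and `build_run_request` and `run_actor` are hypothetical helper names; copy the real actor ID from the actor's page in the console.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Prepare (but do not send) a POST request that runs the actor synchronously."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def run_actor(actor_id, token, run_input, timeout=300):
    """Send the request and return the dataset items as parsed JSON."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (placeholders, not real credentials or a confirmed actor ID):
# items = run_actor("username~actor-name", "YOUR_APIFY_TOKEN",
#                   {"query": "machine*", "domain": "info", "maxItems": 10})
# for item in items:
#     print(item)
```

The synchronous endpoint suits short runs; for large exports, starting the run asynchronously and reading the dataset afterwards is likely the better fit.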
🤖 Ask an AI assistant about this scraper
Open a ready-to-send prompt about this ParseForge actor in the AI of your choice:
- 💬 ChatGPT
- 🧠 Claude
- 🔍 Perplexity
- 🅒 Copilot
❓ Frequently Asked Questions
🔌 Integrate with any app
Connect the HAL Open Science Scraper to cloud services via Apify integrations:
- Make - visual automation builder
- Zapier - 5000+ app connectors
- Google Sheets - pipe rows directly
- Airbyte - ingest into data warehouses
- Slack - receive run alerts
- HTTP webhooks - custom downstream
🔗 Recommended Actors
Pair the HAL Open Science Scraper with related actors:
- 🌐 Website Content Crawler - crawl any page at scale
- 🔍 Google Search Scraper - harvest SERPs
- 📄 Article Extractor - extract clean article text
- 📊 Google Trends Scraper - capture demand signals
- 📸 Screenshot URL - render any page to image
💡 Pro Tip: browse the complete ParseForge collection for more niche actors.
🆘 Need Help? Open our contact form
⚠️ Disclaimer: This actor retrieves data from publicly available sources. You are responsible for complying with the source website's terms of service and applicable laws in your jurisdiction. ParseForge is not affiliated with the data source.