Project Gutenberg Books Scraper | 70K+ Free eBooks

Export 70,000+ public-domain books from Project Gutenberg via the Gutendex API. Search by keyword, language, topic, or author lifespan, or fetch by book ID. Pull titles, authors, subjects, languages, download links, and full-text formats. Download as CSV, Excel, JSON, or XML.

📚 Project Gutenberg (Gutendex) Scraper

🚀 Export 70,000+ public-domain books with metadata and full-text download links in seconds.

🕒 Last updated: 2026-05-26 · 📊 10 fields per record · 70,000+ books · 60+ languages · 10+ formats per book

This Apify Actor extracts structured data from Project Gutenberg (Gutendex), returning clean JSON / CSV / Excel / XML datasets ready for analytics, integrations, or research workflows. Built by ParseForge for reliability and freshness.

🎯 Target Audience | 💡 Primary Use Cases
Data analysts, engineers, researchers | Analytics pipelines, BI dashboards, datasets
SaaS, fintech, marketing, ops teams | Lead gen, enrichment, monitoring
Hobbyists, journalists, indie devs | Side projects, content, exploration

📋 What the Project Gutenberg (Gutendex) Scraper does

  • Queries the public Gutendex JSON API for Project Gutenberg and structures the response
  • Returns one record per book with 10 normalized fields
  • Supports filters configurable from the input schema
  • Outputs to CSV, Excel, JSON, XML via Apify dataset
  • Auto-limits to 10 items on the free plan; up to 1,000,000 on paid
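As a rough illustration of what "structuring the response" can mean, here is a minimal sketch that flattens one Gutendex-style book object into a row of plain strings. The Actor's real normalization is not documented on this page, so the function name `normalize_book`, the separators, and the sample `download_count` value are assumptions made for the example.

```python
from datetime import datetime, timezone


def normalize_book(raw: dict) -> dict:
    """Flatten one Gutendex-style book object into a row of plain strings.

    Hypothetical helper: the Actor's actual field mapping may differ.
    """
    # Gutendex returns authors as a list of objects with a "name" key.
    authors = "; ".join(a.get("name", "") for a in raw.get("authors", []))
    # Languages arrive as a list of short language codes such as "en".
    languages = ", ".join(raw.get("languages", []))
    scraped_at = (
        datetime.now(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "id": str(raw.get("id", "")),
        "title": raw.get("title", ""),
        "authors": authors,
        "languages": languages,
        "download_count": str(raw.get("download_count", "")),
        "scrapedAt": scraped_at,
        "error": None,
    }


# Sample input shaped like a Gutendex result; the download count is made up.
sample = {
    "id": 1342,
    "title": "Pride and Prejudice",
    "authors": [{"name": "Austen, Jane", "birth_year": 1775, "death_year": 1817}],
    "languages": ["en"],
    "download_count": 12345,
}
row = normalize_book(sample)
print(row["id"], row["title"], row["authors"], row["languages"])
```

Joining list values into strings keeps each row flat, which tends to import more cleanly into CSV and Excel than nested arrays.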

💡 Why it matters: clean, ready-to-query data without manual scraping, parsing, or babysitting an API client.

🎬 Full Demo (🚧 Coming soon)

⚙️ Input

Field | Type | Required | Description
maxItems | integer | No | Max items to return (free: 10, paid: 1,000,000)

{ "maxItems": 50 }
{ "maxItems": 1000 }

⚠️ Good to Know: the free plan caps results at 10 items per run. Upgrade to a paid plan to unlock the full dataset.
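For readers who prefer to start runs from code, the sketch below builds the input and calls the Actor through the `apify-client` Python package. It assumes that package is installed; `build_run_input`, `run_actor`, and the `actor_id` argument are illustrative names, and the Actor's ID is not given on this page, so you would need to supply your own. The client-side clamp only mirrors the limits stated in this section; the Actor applies its own limit regardless.

```python
FREE_PLAN_CAP = 10          # free-plan limit stated in this section
PAID_PLAN_CAP = 1_000_000   # paid-plan limit stated in this section


def build_run_input(max_items: int, paid_plan: bool = False) -> dict:
    """Return an input object with maxItems clamped to the plan's limit."""
    if max_items < 1:
        raise ValueError("maxItems must be at least 1")
    cap = PAID_PLAN_CAP if paid_plan else FREE_PLAN_CAP
    return {"maxItems": min(max_items, cap)}


def run_actor(token: str, actor_id: str, run_input: dict) -> list:
    """Start a run, wait for it to finish, and return the dataset items.

    Requires `pip install apify-client`; imported lazily so the helper
    above can be used without the package.
    """
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


print(build_run_input(50))                   # clamped to the free-plan cap
print(build_run_input(50, paid_plan=True))   # passed through unchanged
```

A call would then look like `run_actor("<YOUR_APIFY_TOKEN>", "<ACTOR_ID>", build_run_input(50))`, with both placeholders replaced by your own values.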

📊 Output

Field | Type | Description
🔹 id | string | Project Gutenberg book ID
🔹 title | string | Book title
🔹 authors | string | Author(s) of the book
🔹 languages | string | Language code(s) of the book
🔹 download_count | string | Download count reported by Gutendex
🕒 scrapedAt | string | ISO timestamp of when the record was collected
❌ error | string | null on success; an error message if the record failed
{
  "id": "...",
  "title": "...",
  "scrapedAt": "2026-05-26T00:00:00.000Z",
  "error": null
}
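
When consuming the dataset, it may help to separate successful records from error rows and to parse `scrapedAt` into a timezone-aware datetime. The following is a minimal sketch under the assumption that records look like the example in this section; `split_records` and `parse_scraped_at` are hypothetical helper names, and the sample rows are invented.

```python
from datetime import datetime


def split_records(items):
    """Separate usable records from rows that carry an error message."""
    ok, failed = [], []
    for item in items:
        (failed if item.get("error") else ok).append(item)
    return ok, failed


def parse_scraped_at(value: str) -> datetime:
    """Parse the ISO timestamp; older Pythons reject a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Invented rows shaped like the example record.
items = [
    {"id": "1342", "title": "Pride and Prejudice",
     "scrapedAt": "2026-05-26T00:00:00.000Z", "error": None},
    {"scrapedAt": "2026-05-26T00:00:01.000Z", "error": "timeout"},
]
ok, failed = split_records(items)
print(len(ok), "ok,", len(failed), "failed")
print(parse_scraped_at(ok[0]["scrapedAt"]).year)
```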

✨ Why choose this Actor

Differentiator | Benefit
🟢 Real-time public data | Always fresh, never cached
🟢 Structured output | Ready for BI, Excel, SQL imports
🟢 Pay-per-result pricing | Pay only for actual data collected
🟢 Apify-hosted runs | No servers, no maintenance
🟢 Free tier preview | Test with 10 items before scaling

📈 How it compares to alternatives

Method | Setup time | Reliability | Maintenance | Cost
Manual scripting | Hours | Medium | High | Dev time
Generic web scrapers | Hours | Low | High | Variable
This Actor | 30 seconds | High | None | Pay-per-result

🚀 How to use

  1. Create a free Apify account with $5 credit
  2. Open this Actor and click Try for free
  3. Configure the input (maxItems and any filters)
  4. Click Start
  5. Download the dataset as CSV / Excel / JSON / XML
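
The last step can also be scripted. The sketch below builds a download URL for a finished run's dataset using the Apify API's dataset items endpoint and its `format` query parameter. `dataset_export_url` is a hypothetical helper and `abc123` is a placeholder ID; note that Excel exports use the format value `xlsx`.

```python
from typing import Optional
from urllib.parse import urlencode

# Format values for the four export types named in the steps above.
EXPORT_FORMATS = {"csv", "xlsx", "json", "xml"}


def dataset_export_url(dataset_id: str, fmt: str = "csv",
                       token: Optional[str] = None) -> str:
    """Build a dataset download URL for the given export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"format": fmt}
    if token:
        # Needed only when the dataset is not publicly readable.
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


url = dataset_export_url("abc123", "xlsx")  # placeholder dataset ID
print(url)
```

The dataset ID for a run is shown in the Apify Console and is also returned by the API as the run's `defaultDatasetId`.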

💼 Business use cases

📊 Analytics & BI — feed Project Gutenberg (Gutendex) records into Looker, Tableau, Metabase for live dashboards.

🧠 Data enrichment — append Project Gutenberg (Gutendex) fields to your CRM / CDP records for better targeting.

🔍 Monitoring — schedule daily runs to detect new records, changes, or anomalies.

🤖 ML training data — use structured Project Gutenberg (Gutendex) data as features in models or test fixtures.
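
For the monitoring use case, one simple approach is to compare the records from two scheduled runs by `id`. This is a sketch of that idea, not a feature of the Actor: `diff_runs` is a hypothetical helper, the sample rows are invented, and `scrapedAt` is ignored during comparison because it changes on every run.

```python
def _comparable(record: dict) -> dict:
    """Drop fields that differ between runs even when the book is unchanged."""
    return {k: v for k, v in record.items() if k != "scrapedAt"}


def diff_runs(previous, current):
    """Return (new_ids, removed_ids, changed_ids) between two runs."""
    # Skip error rows and rows without an id; they cannot be matched.
    prev = {r["id"]: r for r in previous if r.get("id") and not r.get("error")}
    curr = {r["id"]: r for r in current if r.get("id") and not r.get("error")}
    new_ids = sorted(curr.keys() - prev.keys())
    removed_ids = sorted(prev.keys() - curr.keys())
    changed_ids = sorted(
        i for i in curr.keys() & prev.keys()
        if _comparable(curr[i]) != _comparable(prev[i])
    )
    return new_ids, removed_ids, changed_ids


# Invented rows standing in for two daily runs.
yesterday = [
    {"id": "11", "title": "Book A", "download_count": "100", "scrapedAt": "t1"},
    {"id": "84", "title": "Book B", "download_count": "200", "scrapedAt": "t1"},
]
today = [
    {"id": "11", "title": "Book A", "download_count": "150", "scrapedAt": "t2"},
    {"id": "1342", "title": "Book C", "download_count": "300", "scrapedAt": "t2"},
]
new_ids, removed_ids, changed_ids = diff_runs(yesterday, today)
print("new:", new_ids, "removed:", removed_ids, "changed:", changed_ids)
```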

🔌 Automating Project Gutenberg (Gutendex) Scraper

Connect via Apify integrations: Make, Zapier, n8n, Slack, Discord, Airbyte, Google Drive, Google Sheets, GitHub, Webhook, REST API.

🌟 Beyond business use cases

🔬 Research — academics, journalists, policy researchers building public datasets.

🎨 Personal — hobbyists tracking favorite topics, building personal projects, exploring data.

🤝 Non-profit — NGOs, civic-tech, open-data initiatives needing structured Project Gutenberg (Gutendex) extracts.

🧪 Experimentation — students, builders, indie devs prototyping with real-world data.

🤖 Ask an AI assistant about this scraper

Drop a link to this page into ChatGPT, Claude, Perplexity, or Copilot and ask: "What can I do with the Project Gutenberg (Gutendex) Scraper?"

❓ Frequently Asked Questions

❓ Is the data fresh? Yes. Every run queries Project Gutenberg (Gutendex) in real time. No cached responses.

💰 How does pricing work? Pay-per-result: charged once per record collected. Free plan: 10 items per run preview.

🧾 What format is the output? CSV, Excel, JSON, XML — Apify dataset supports all four.

🔐 Do I need an API key? No. Project Gutenberg (Gutendex) public data only, no login required.

📈 What's the max I can scrape? 1,000,000 items per run on paid plans.

⏱️ How long does a run take? Typically seconds to minutes depending on maxItems.

🌍 Is this affiliated with Project Gutenberg (Gutendex)? No. Independent tool. Only publicly accessible data collected.

🛠️ What if a run fails? You're never charged for failed records. Errors appear as {error: "..."} rows.

🤝 Can I use this commercially? Yes, under Apify's standard terms and Project Gutenberg (Gutendex)'s public-data policies.

📞 Where do I get help? Open our contact form.

🔌 Integrate with any app

Apify's full integration list: Make · Zapier · n8n · Slack · Discord · Webhook · REST API · GitHub · Airbyte · Google Drive · Google Sheets · Gmail · HubSpot · Pipedream.

Actor | Description
Wikipedia On This Day Scraper | Daily Wikipedia historical events
Public Holidays Scraper | Holidays for 100+ countries
SPDX Software Licenses Scraper | Open-source license metadata
ISO Country Codes Scraper | IBAN + ISO country codes

💡 Pro Tip: browse the complete ParseForge collection.

⚠️ Disclaimer: independent tool, not affiliated with Project Gutenberg (Gutendex). Only publicly available data collected.