Pricing

from $13.00 / 1,000 result items

Project Gutenberg Books Scraper

Search 75,000+ free public-domain books from Project Gutenberg. Returns title, author with birth/death years, cover image, plain-text and EPUB download URLs, Kindle and HTML formats, subjects, bookshelves, language, copyright status, summaries and download counts. Filter by author or language.

Developer

ParseForge

Last modified

3 days ago

📚 Project Gutenberg Books Scraper

The Project Gutenberg Books Scraper searches the Project Gutenberg catalog and returns structured records for any free public-domain ebook. Output includes title, author with birth/death years, cover image, plain-text and EPUB download URLs, Kindle and HTML formats, subjects, bookshelves, language, copyright status, summaries, and download counts.

Project Gutenberg has been digitizing public-domain texts since 1971 and now hosts 75,000+ books across 60+ languages. Filters run server-side, so a single run can isolate every Shakespeare play, all 19th-century French novels, or the most-downloaded books of all time.

🎯 Target Audience: Researchers, NLP/ML teams, librarians, educators, content creators, ebook app developers
💡 Primary Use Cases: Building text corpora, NLP training datasets, public-domain ebook libraries, literary research, citation generation

📋 What the Project Gutenberg Books Scraper does

Five filtering workflows in a single run:

🔍 Free-text search. Match by title, author, or general keywords.
👤 Author filter. Restrict to one author across all their works.
🏷️ Topic filter. Filter by subject (history, philosophy, science, fiction).
🌐 Language filter. Restrict by two-letter ISO 639-1 language code (en, fr, de, es, zh, ja).
📅 Author year filter. Filter authors by birth/death year for period studies.
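
For readers who want to see what these filters could look like as a request, here is a minimal sketch. It assumes the filters correspond to query parameters of the public Gutendex API (the disclaimer mentions Gutendex, but the Actor's real input schema and internals are not shown on this page). The function name build_query_url is hypothetical. The dedicated author filter is left out because Gutendex has no separate author parameter; its search parameter matches words in titles and author names.

```python
from urllib.parse import urlencode

# Public Gutendex endpoint. Assumption: the Actor's filters resemble these
# query parameters; this is not taken from the Actor's source code.
GUTENDEX_BOOKS_URL = "https://gutendex.com/books"


def build_query_url(search=None, topic=None, languages=None,
                    author_year_start=None, author_year_end=None):
    """Build a Gutendex /books URL from optional filters."""
    params = {}
    if search:
        params["search"] = search            # free-text: titles and author names
    if topic:
        params["topic"] = topic              # subject or bookshelf keyword
    if languages:
        params["languages"] = ",".join(languages)  # e.g. ["en", "fr"]
    if author_year_start is not None:
        params["author_year_start"] = author_year_start
    if author_year_end is not None:
        params["author_year_end"] = author_year_end
    query = urlencode(params)
    return f"{GUTENDEX_BOOKS_URL}?{query}" if query else GUTENDEX_BOOKS_URL


print(build_query_url(search="shakespeare", languages=["en"]))
```

Only the filters that are set end up in the URL, which mirrors the idea of combining several filters in a single run.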

💡 Why it matters: filtering happens server-side, so your team does not have to write parsers or pagination logic, and every run pulls the current catalog.

📊 Data fields

Each record includes: authors, copyright, coverUrl, downloadCount, gutenbergId, gutenbergUrl, languages, mediaType, subjectCount, title. These field names come straight from the actor's dataset schema, so what you see here is what lands in your dataset.
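
A quick way to check an exported record against the field list above is sketched below. The field names are copied from this page; the helper name missing_fields and the sample record are hypothetical, and the sample values are invented purely for illustration.

```python
from typing import Any

# Field names as listed on this page.
EXPECTED_FIELDS = (
    "authors", "copyright", "coverUrl", "downloadCount", "gutenbergId",
    "gutenbergUrl", "languages", "mediaType", "subjectCount", "title",
)


def missing_fields(record: dict[str, Any]) -> list[str]:
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


# Hypothetical, deliberately incomplete record with invented values.
sample = {
    "gutenbergId": 0,
    "title": "Example Title",
    "authors": [],
    "languages": ["en"],
    "downloadCount": 0,
}
print(missing_fields(sample))
```

An empty list means the record carries every listed field; anything else names the gaps to investigate before loading the data downstream.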

🚀 How to use

📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
🌐 Open the Actor. Go to the Project Gutenberg Books Scraper page on the Apify Store.
🎯 Set input. Pick your filters and maxItems.
🚀 Run it. Click Start and let the Actor collect your data.
📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.
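
If you do want to post-process the download, the sketch below ranks records by downloadCount. It assumes the JSON export is a list of record objects using the field names listed under Data fields; the function names and the inline sample records are hypothetical.

```python
import json


def load_export(path):
    """Read a JSON dataset export (assumed to be a list of records) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def top_downloads(records, limit=10):
    """Return the `limit` records with the highest downloadCount."""
    ranked = sorted(
        records,
        key=lambda r: r.get("downloadCount") or 0,  # treat missing/None as 0
        reverse=True,
    )
    return ranked[:limit]


# Invented sample records standing in for load_export("your-export.json").
records = [
    {"title": "Book A", "downloadCount": 120},
    {"title": "Book B", "downloadCount": 4500},
    {"title": "Book C", "downloadCount": None},
]
for book in top_downloads(records, limit=2):
    print(book["downloadCount"], book["title"])
```

Swapping the inline list for load_export pointed at your downloaded file applies the same ranking to a real run.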

🔗 Recommended Actors

📖 Open Library Books - 30M+ books and editions
🌐 Wikidata Entity Search - 100M+ open knowledge-graph entities
🎨 Openverse Media - 800M+ openly licensed images and audio
🎓 arXiv Scraper - Academic preprints
🎬 TVMaze TV Shows - TV show metadata

💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.

⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Project Gutenberg, the Gutendex project, or any contributing volunteers. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.

🆘 Need Help?

If you hit a bug, have questions about setup, or need a scraper we haven't built yet, open our contact form or write to parseforge@protonmail.com. We also take on paid custom data projects.

For faster answers, join our Discord. It's the best place to get support and suggest new actors.

Gutenberg Scraper

velvety_bedbug/gutenberg-scraper

Scrape free public domain books from Project Gutenberg via the Gutendex API. Search by title/author, filter by topic, language, or author birth/death year. Returns book metadata and download URLs for text, HTML, and EPUB formats. 78,000+ books. Free, no auth required.

Peters Bugs

Project Gutenberg Scraper

crawlerbros/project-gutenberg-scraper

Search and download Project Gutenberg's 75,000+ free ebooks. Filter by keyword, topic, language, author era, copyright status, and available format (EPUB, Kindle, PDF, plain text).

Crawler Bros

Project Gutenberg Books Scraper

gio21/gutenberg-books-scraper

Scrape public-domain books from Project Gutenberg via the Gutendex API. Filter by topic, author, language, search query. Returns title, authors, languages, copyright, download_count, formats (EPUB, MOBI, TXT, HTML), subjects, bookshelves. Pay per book returned.

Gio

Gutenberg Books Scraper

fortuitous_pirate/gutenberg-books-scraper

Scrape book metadata from Project Gutenberg: 70,000+ free public domain ebooks. Search by title, author, topic, or language. Returns authors, subjects, formats, and download links.

Fortuitous Pirate

Project Gutenberg Scraper

lulzasaur/gutenberg-scraper

Scrape Project Gutenberg (gutenberg.org). Search 70K+ free public domain ebooks. Extract titles, authors, subjects, download formats (EPUB, Kindle, TXT, HTML), and full metadata.

lulz bot

Gutendex Books Scraper - Gutenberg Metadata

benthepythondev/gutendex-books-scraper

Search Project Gutenberg books and export ID, title, authors, subjects, languages, copyright status, downloads, formats and links.

Ben

Project Gutenberg Books Scraper | 70K+ Free eBooks

parseforge/gutendex-project-gutenberg-books-scraper

Export 70,000+ public-domain books from Project Gutenberg via the Gutendex API. Search by keyword, language, topic, or author lifespan, or fetch by book ID. Pull titles, authors, subjects, languages, download links, and full-text formats. Download as CSV, Excel, JSON, or XML.

ParseForge

Project Gutenberg Top Books Scraper

rambunctious_fingerprint/project-gutenberg-scraper

Casey Marsh

Project Gutenberg Ebook Scraper (Gutendex)

jungle_synthesizer/gutenberg-gutendex-public-domain-ebook-scraper

Scrape the full Project Gutenberg catalog via the Gutendex JSON API. Filter by search, language, subject, author era, and download count. Returns EPUB, Kindle, plain-text, and HTML download URLs — built for AI training corpora, NLP datasets, and TTS pipelines.

BowTiedRaccoon

Project Gutenberg Research Scraper

happyfhantum/project-gutenberg-research-scraper

Exhaustively searches Project Gutenberg's 70,000+ free ebooks using multi-page pagination and smart filtering. Perfect for academic research, finding complete author works, or discovering books on specialized topics. Gets all results, not just the first page.