Project Gutenberg Scraper
Pricing
from $10.00 / 1,000 results
Scrape Project Gutenberg (gutenberg.org). Search 70K+ free public domain ebooks. Extract titles, authors, subjects, download formats (EPUB, Kindle, TXT, HTML), and full metadata.
Developer
lulz bot
Maintained by Community
Scrape the Project Gutenberg free eBook catalog. Search 70,000+ public domain books by title, author, topic, or language. Get complete metadata, subjects, bookshelves, and download links for every format (EPUB, HTML, plain text, Kindle).
Features
- Search by title/author: Find books by any keyword
- Filter by topic: Browse by subject like "science fiction", "philosophy", "children"
- Filter by language: English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese
- Full metadata: Authors with birth/death years, subjects, bookshelves, download counts
- Download links: Direct URLs for EPUB, HTML, plain text, Kindle, and cover images
- Pagination: Automatically follows paginated results up to your limit
Output Fields
| Field | Description |
|---|---|
| id | Gutenberg book ID |
| title | Book title |
| authors | Array of authors with name, birthYear, deathYear |
| subjects | Array of Library of Congress subjects |
| bookshelves | Array of Gutenberg bookshelves |
| languages | Array of language codes (e.g. "en", "fr") |
| downloadCount | Total download count |
| formats | Object with epub, html, txt, kindle, coverImage URLs |
| copyright | Boolean copyright status |
| mediaType | Media type (usually "Text") |
| scrapedAt | ISO timestamp |
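For downstream code, the fields above could be modeled as typed dictionaries. This is a sketch based only on the table: the class names and the `missing_fields` helper are hypothetical, and treating `birthYear`/`deathYear` as nullable and every format key as optional are assumptions.

```python
from typing import List, Optional, TypedDict, get_type_hints


class Author(TypedDict):
    name: str
    birthYear: Optional[int]  # assumption: may be null when unknown
    deathYear: Optional[int]  # assumption: may be null when unknown


class Formats(TypedDict, total=False):
    # total=False: a given book may not offer every format (assumption).
    epub: str
    html: str
    txt: str
    kindle: str
    coverImage: str


class Book(TypedDict, total=False):
    id: int
    title: str
    authors: List[Author]
    subjects: List[str]
    bookshelves: List[str]
    languages: List[str]
    downloadCount: int
    formats: Formats
    copyright: bool
    mediaType: str
    scrapedAt: str  # ISO timestamp kept as a string


def missing_fields(record: dict) -> List[str]:
    """Return the documented field names that are absent from a record."""
    return [name for name in get_type_hints(Book) if name not in record]
```

A helper like `missing_fields` is useful as a sanity check before loading results into a database, since individual records may omit some documented fields.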
Input Options
- Search Query: Search by title or author name
- Topic: Filter by subject/bookshelf
- Language: Filter by language
- Max Results: Limit number of books (default 50, max 5000)
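As a sketch of how these options might be assembled into a run input, the helper below applies the documented default (50) and maximum (5000). The JSON key names (`searchQuery`, `topic`, `language`, `maxResults`) and the `build_input` function are hypothetical; check the Actor's input schema for the real keys.

```python
from typing import Optional

DEFAULT_MAX_RESULTS = 50   # documented default
MAX_RESULTS_CAP = 5000     # documented maximum


def build_input(
    search_query: Optional[str] = None,
    topic: Optional[str] = None,
    language: Optional[str] = None,
    max_results: int = DEFAULT_MAX_RESULTS,
) -> dict:
    """Build a run-input dict, rejecting limits outside 1..MAX_RESULTS_CAP."""
    if not 1 <= max_results <= MAX_RESULTS_CAP:
        raise ValueError(f"max_results must be between 1 and {MAX_RESULTS_CAP}")
    run_input = {"maxResults": max_results}  # hypothetical key name
    # Only include filters that were actually provided.
    if search_query:
        run_input["searchQuery"] = search_query  # hypothetical key name
    if topic:
        run_input["topic"] = topic  # hypothetical key name
    if language:
        run_input["language"] = language  # hypothetical key name
    return run_input


austen_in_english = build_input(search_query="austen", language="en")
```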
Use Cases
- Digital library building: Bulk download public domain books
- Literary research: Analyze authors, subjects, and popularity trends
- NLP/AI training: Gather text corpora by language or topic
- Education: Find free reading materials by subject area
- Data journalism: Analyze most popular public domain works
Example Output
{
  "id": 1342,
  "title": "Pride and Prejudice",
  "authors": [
    {"name": "Austen, Jane", "birthYear": 1775, "deathYear": 1817}
  ],
  "subjects": ["Courtship -- Fiction", "England -- Fiction", "Sisters -- Fiction"],
  "bookshelves": ["Best Books Ever Listings"],
  "languages": ["en"],
  "downloadCount": 75892,
  "formats": {
    "epub": "https://www.gutenberg.org/ebooks/1342.epub3.images",
    "html": "https://www.gutenberg.org/files/1342/1342-h/1342-h.htm",
    "txt": "https://www.gutenberg.org/files/1342/1342-0.txt"
  },
  "copyright": false,
  "scrapedAt": "2026-04-26T12:00:00.000Z"
}
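Records in this shape are straightforward to post-process. The sketch below ranks books by `downloadCount`, counts subjects, and picks a download URL by format preference, which loosely matches the research and corpus-building use cases. The helper names are hypothetical, and the second sample record is a made-up placeholder, not real catalog data.

```python
from collections import Counter
from typing import Iterable, List, Optional, Sequence


def top_books(records: Iterable[dict], n: int = 10) -> List[dict]:
    """Return the n records with the highest downloadCount."""
    return sorted(records, key=lambda r: r.get("downloadCount", 0), reverse=True)[:n]


def subject_counts(records: Iterable[dict]) -> Counter:
    """Count how often each subject appears across all records."""
    return Counter(s for r in records for s in r.get("subjects", []))


def preferred_url(record: dict, order: Sequence[str] = ("txt", "epub", "html", "kindle")) -> Optional[str]:
    """Return the first available download URL following the given format order."""
    formats = record.get("formats", {})
    for key in order:
        if key in formats:
            return formats[key]
    return None


records = [
    {   # placeholder record for illustration only (not real catalog data)
        "id": 0,
        "title": "Placeholder Title",
        "subjects": ["England -- Fiction"],
        "downloadCount": 10,
        "formats": {"html": "https://example.org/placeholder.html"},
    },
    {   # trimmed copy of the example output above
        "id": 1342,
        "title": "Pride and Prejudice",
        "subjects": ["Courtship -- Fiction", "England -- Fiction", "Sisters -- Fiction"],
        "downloadCount": 75892,
        "formats": {
            "epub": "https://www.gutenberg.org/ebooks/1342.epub3.images",
            "txt": "https://www.gutenberg.org/files/1342/1342-0.txt",
        },
    },
]
most_popular = top_books(records, n=1)[0]
```

Preferring `txt` first suits text-corpus work; a reader-facing library might reorder the tuple to put `epub` first.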
Run on Apify
This scraper runs on the Apify platform, a full-stack web scraping and automation cloud. Sign up for a free account to get started with a 30-day trial of all features.
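A run can also be started programmatically. The sketch below assumes the `apify-client` Python package and its `actor(...).call(...)` and `dataset(...).iterate_items()` methods; the run is assumed to be a dict with a `defaultDatasetId` key, which may differ between client versions. The `run_and_collect` helper, the Actor ID placeholder, and the `maxResults` input key are hypothetical.

```python
from typing import Any, Dict, List


def run_and_collect(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[dict]:
    """Start an Actor run, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    dataset_id = run["defaultDatasetId"]  # assumed dict-style run object
    return list(client.dataset(dataset_id).iterate_items())


# Usage (requires `pip install apify-client`, an API token, and network access):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   books = run_and_collect(client, "<username>/<actor-name>", {"maxResults": 10})
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before spending platform credits on a real run.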