Pricing

from $3.50 / 1,000 results

Scribd Document Search Scraper

[💰 $3.50 / 1K] Search Scribd by keyword and export structured metadata for every matching document, book, audiobook, sheet music, or podcast: title, author, type, page count, ratings, views, language, categories, and links.

Developer

SolidCode

Last modified

a month ago

Why This Scraper?

Multi-keyword batch in one run — pass a list of search terms and each keyword runs as its own search, so you cover an entire topic map in a single pass instead of one query at a time.
Up to 10,000 results per keyword — lifts the typical 100-result search ceiling 100× so you can sweep a whole topic, not just the first page.
23 structured fields per document — id, title, author, type, description, page count, release date, views, reading time, ratings, language, categories, and direct links — every row consumption-ready, no raw markup.
Derived 0–5 star rating — a clean star score computed from each document's community upvotes and downvotes, alongside the raw upvoteCount, downvoteCount, and ratingCount.
Author profile URLs for outreach — every row carries the primary author's name and absolute profile link, plus a full authors array with each contributor's id, name, and profile URL.
Engagement signals built in — real view counts (parsed from Scribd's "15K"/"1.2M" shorthand to plain integers) and estimated reading time let you rank documents by popularity, not just relevance.
Direct reader and download links — every row includes the canonical reader URL and a ready-to-use download link when the document is downloadable, so you never have to reconstruct paths.
Result-language preference across 21 languages — bias results toward English, Spanish, Portuguese, French, German, Arabic, Hindi, Japanese, and more so you collect documents in the language your audience reads.
Query provenance on every row — each document carries the exact keyword that surfaced it, so a single mixed dataset stays attributable per search term.
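
The view-count parsing mentioned above can be pictured with a short Python sketch. It is only an illustration of turning "15K"/"1.2M" shorthand into integers, not the actor's actual code; the function name `parse_view_count` and the "B" suffix are assumptions added for the example.

```python
import re

# Multipliers for the suffixes seen in the "15K"/"1.2M" examples.
# "B" is an added assumption; it does not appear in the listing.
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_view_count(text: str) -> int:
    """Convert a shorthand count such as "15K" or "1.2M" to a plain integer."""
    match = re.fullmatch(r"\s*([\d.,]+)\s*([KMB]?)\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised view count: {text!r}")
    # Drop thousands separators before converting ("1,234" -> 1234).
    number = float(match.group(1).replace(",", ""))
    return round(number * _SUFFIX[match.group(2).upper()])
```

The output rows already carry `views` as an integer, so a helper like this is only useful if you merge in counts copied from elsewhere in the same shorthand.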

Use Cases

Market & Content Research

Map how much Scribd content exists around a topic, product, or industry
Surface the most-viewed and highest-rated documents in a niche
Track templates, whitepapers, and guides circulating in your space
Build topic libraries spanning dozens of keywords in one run

Competitive Analysis

See which authors and brands publish the most in your category
Benchmark engagement (views, ratings) against competing documents
Monitor new uploads tied to a brand or product name
Compare document depth by page count across competitors

Lead Generation & Outreach

Collect author profile URLs and names for creator outreach
Identify prolific publishers in a target vertical
Build prospect lists from documents matching buyer-intent keywords
Prioritize outreach by author reach using view counts and ratings

Academic & Reference

Gather reference document metadata across many search terms at once
Filter your reading list by page count and reading time before opening anything
Prefer results in a specific language for non-English literature reviews
Catalog community ratings to triage which documents are worth reading

Content Curation

Power recommendation feeds and resource roundups with fresh metadata
Enrich an existing content database with views, ratings, and categories
Curate by category labels Scribd files each document under
Feed a newsletter or knowledge base with structured document records

Getting Started

Simple — one keyword

{
    "queries": ["business plan template"]
}

Several keywords, more results each

{
    "queries": ["machine learning", "data science", "neural networks"],
    "maxResultsPerQuery": 250
}

Advanced — language preference and a deep sweep

{
    "queries": ["recetas de cocina", "plan de negocios"],
    "maxResultsPerQuery": 1000,
    "language": "4"
}
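
If you prefer to start runs from code, the inputs above can be assembled and submitted with the Apify Python client. The sketch below is a minimal example under stated assumptions: `build_run_input` and `run_search` are hypothetical helper names, the actor ID is left as a parameter because it is not given here, and the client calls follow the documented `apify-client` 1.x usage (`actor(...).call(run_input=...)`, then `dataset(...).iterate_items()`).

```python
from typing import Any, Optional


def build_run_input(
    queries: list[str],
    max_results_per_query: int = 100,
    language: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the actor input from the three documented fields."""
    cleaned = [q.strip() for q in queries if q.strip()]
    if not cleaned:
        # A choice made by this sketch; the actor itself falls back to a default keyword.
        raise ValueError("at least one non-empty keyword is required")
    if max_results_per_query < 0:
        raise ValueError("maxResultsPerQuery must be 0 (everything) or positive")
    run_input: dict[str, Any] = {
        "queries": cleaned,
        "maxResultsPerQuery": max_results_per_query,
    }
    if language is not None:
        # Passed through untouched: the example above uses an option code ("4"),
        # and the code-to-language mapping is not listed in this document.
        run_input["language"] = language
    return run_input


def run_search(token: str, actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start a run and collect its dataset rows (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # lazy import keeps build_run_input dependency-free

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `build_run_input(["recetas de cocina", "plan de negocios"], 1000, "4")` reproduces the advanced input shown above.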

Input Reference

All fields are optional — run with just a keyword and sensible defaults handle the rest.

Parameter	Type	Default	Description
`queries`	string[]	`["business plan template"]`	One or more keywords to search on Scribd. Each keyword runs its own search — add several to cover a whole topic in one run.
`maxResultsPerQuery`	integer	`100`	How many documents to return per keyword. Set to `0` to fetch every available match. Results arrive in pages of 40, so the final page may slightly overshoot rather than cut off mid-page.
`language`	select	`Any language`	Prefer results written in a chosen language — English, Spanish, Portuguese, French, German, Italian, Dutch, Russian, Japanese, Korean, Chinese, Arabic, Hindi, Indonesian, Turkish, Polish, Danish, Romanian, Thai, Swedish, or Czech. Leave on "Any language" for no preference. Coverage depends on how much content Scribd has in that language for your keyword.
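
To budget for the page overshoot described for `maxResultsPerQuery`, a rough estimate is to round the limit up to the next multiple of 40. The sketch below assumes the run keeps whole pages and stops as soon as the limit is reached; `expected_rows` and its `available` argument (how many matches Scribd has for the keyword) are hypothetical names used only for this illustration.

```python
import math

PAGE_SIZE = 40  # page size stated in the parameter table


def expected_rows(max_results_per_query: int, available: int) -> int:
    """Estimate rows returned for one keyword, assuming whole pages are kept."""
    if max_results_per_query == 0:
        return available  # 0 means "fetch every available match"
    pages = math.ceil(max_results_per_query / PAGE_SIZE)
    # Never more than Scribd actually has for the keyword.
    return min(pages * PAGE_SIZE, available)
```

Under these assumptions the default limit of 100 needs three pages, so a keyword with plenty of matches would yield 120 rows rather than exactly 100.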

Output

Each matching document becomes one flat row. Here's a representative result:

{
    "id": "238702049",
    "title": "Sample Business Plan Template",
    "author": "Jane Author",
    "authorUrl": "https://www.scribd.com/user/12345678/jane-author",
    "authors": [
        { "id": 12345678, "name": "Jane Author", "url": "https://www.scribd.com/user/12345678/jane-author" }
    ],
    "type": "document",
    "description": "A complete business plan template covering executive summary, market analysis, and financials...",
    "url": "https://www.scribd.com/document/238702049/Sample-Business-Plan-Template",
    "downloadUrl": "https://www.scribd.com/document_downloads/238702049",
    "imageUrl": "https://imgv2-1-f.scribdassets.com/img/document/238702049/original.jpg",
    "pageCount": 32,
    "releasedAt": "2018-04-12",
    "views": 15000,
    "consumptionTime": 24,
    "isUnlocked": true,
    "rating": 4.5,
    "upvoteCount": 90,
    "downvoteCount": 10,
    "ratingCount": 100,
    "language": "English",
    "languageIso": "en",
    "categories": ["Business", "Templates"],
    "query": "business plan template"
}

Document Fields

Field	Type	Description
`id`	string	Scribd document identifier
`title`	string	Document title
`type`	string	Document type label as classified by Scribd
`description`	string	Description or snippet
`pageCount`	integer	Number of pages (null for non-paged content)
`releasedAt`	string	Publication or upload date
`consumptionTime`	integer	Estimated reading time in minutes
`isUnlocked`	boolean	Whether the document is freely accessible
`categories`	string[]	Category labels Scribd files the document under
`query`	string	The search keyword that surfaced this row

Author Fields

Field	Type	Description
`author`	string	Primary author name (may be null)
`authorUrl`	string	Primary author profile URL
`authors`	object[]	All contributors, each with `id`, `name`, and profile `url`

Engagement, Ratings & Language

Field	Type	Description
`views`	integer	View count, parsed to a plain integer
`rating`	number	Derived 0–5 star rating from community votes
`upvoteCount`	integer	Number of upvotes
`downvoteCount`	integer	Number of downvotes
`ratingCount`	integer	Total ratings cast
`language`	string	Language name
`languageIso`	string	ISO language code
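
The exact formula behind `rating` is not documented here. One formula that is consistent with the sample row (90 upvotes, 10 downvotes, rating 4.5) is the upvote share scaled to five stars. The sketch below uses that assumption; the one-decimal rounding and the `None` result for unrated documents are also guesses, not confirmed behavior.

```python
from typing import Optional


def star_rating(upvotes: int, downvotes: int) -> Optional[float]:
    """Upvote share scaled to a 0-5 star score (assumed formula, see above)."""
    total = upvotes + downvotes
    if total == 0:
        return None  # no votes cast, so there is nothing to derive
    return round(5 * upvotes / total, 1)
```

With the sample row's counts, `star_rating(90, 10)` gives 4.5, matching the `rating` shown in the output example.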

Links & Media

Field	Type	Description
`url`	string	Canonical Scribd reader URL
`downloadUrl`	string	Direct download link when available
`imageUrl`	string	Cover thumbnail image URL

Tips for Best Results

Use specific multi-word phrases to narrow large topics — a broad single word like "business" returns tens of thousands of loosely related documents, while "small business marketing plan" returns a focused, usable set.
Batch related keywords in one run — the query field tags every row with its source keyword, so you can split one mixed dataset back out per term afterward.
Start with a small maxResultsPerQuery (40–100) to confirm the results match your intent, then raise it once you're happy with the keywords.
Set maxResultsPerQuery to 0 only when you genuinely want the full match set — it sweeps deep and is best paired with tight, specific phrases.
Treat language as a preference, not a hard filter — for keywords with little Scribd content in a given language, results fall back to the most available language; pair a language with a keyword written in that language for the best hit rate.
Rank by views and rating together — a high view count with a strong derived star score is the surest sign a document is both popular and well received.
Use pageCount and consumptionTime to pre-screen depth before opening anything — filter out one-page stubs or zero in on long-form references in seconds.
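
Several of these tips can be combined in a few lines of post-processing. The Python sketch below splits a mixed dataset back out per keyword using `query`, drops short stubs using `pageCount`, and ranks what remains by `rating` and then `views`. The function name `top_documents`, the two-page threshold, and the choice to sort by rating first are illustrative assumptions; adjust them to your own criteria.

```python
from collections import defaultdict
from typing import Any

Row = dict[str, Any]


def top_documents(rows: list[Row], min_pages: int = 2, limit: int = 5) -> dict[str, list[Row]]:
    """Group rows by source keyword, drop short stubs, rank by rating then views."""
    by_query: dict[str, list[Row]] = defaultdict(list)
    for row in rows:
        pages = row.get("pageCount")
        # pageCount is null for non-paged content; keep those rows rather than guess.
        if pages is not None and pages < min_pages:
            continue
        by_query[row["query"]].append(row)
    return {
        query: sorted(
            group,
            # Missing ratings or views sort last by treating them as 0.
            key=lambda r: (r.get("rating") or 0, r.get("views") or 0),
            reverse=True,
        )[:limit]
        for query, group in by_query.items()
    }
```

Feed it the list of rows exported from a run (for example the parsed JSON export) and it returns one ranked shortlist per keyword.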

Pricing

From $3.50 per 1,000 results, which undercuts comparable Scribd search scrapers while lifting the result ceiling 100×. There are no compute or time-based charges: you pay per result, plus a small fixed per-run start fee. Bronze, Silver, and Gold subscribers pay progressively less; the table below shows the per-result cost at each discount tier, before the start fee.

Results	No discount	Bronze	Silver	Gold
100	$0.42	$0.40	$0.38	$0.35
1,000	$4.20	$3.95	$3.75	$3.50
10,000	$42.00	$39.50	$37.50	$35.00
100,000	$420.00	$395.00	$375.00	$350.00

A "result" is any document row in the output dataset. The fixed per-run start fee and any platform usage (storage) are additional and depend on your Apify plan.
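
For volumes not listed in the table, the per-result charge can be estimated from the per-1,000 prices. The sketch below reads its rates from the 1,000-result row of the table. The tier keys and the function name `result_cost` are made up for the example, and the start fee and storage are left out because their amounts are not stated here.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-1,000-result prices in USD, taken from the pricing table.
PRICE_PER_1000 = {
    "none": Decimal("4.20"),
    "bronze": Decimal("3.95"),
    "silver": Decimal("3.75"),
    "gold": Decimal("3.50"),
}


def result_cost(results: int, tier: str = "none") -> Decimal:
    """Per-result charge only; the per-run start fee and storage are not included."""
    cost = PRICE_PER_1000[tier] * results / 1000
    # Round to whole cents, half up, which reproduces the table's 100-result row.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For instance, `result_cost(100, "bronze")` returns `Decimal("0.40")`, in line with the Bronze column for 100 results.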

Integrations

Export data in JSON, CSV, Excel, XML, or RSS. Connect to 1,500+ apps via:

Zapier / Make / n8n — Workflow automation
Google Sheets — Direct spreadsheet export
Slack / Email — Notifications on new results
Webhooks — Trigger custom APIs on run completion
Apify API — Full programmatic access

Legal & Ethical Use

This actor is designed for legitimate research, market analysis, content curation, and lead generation. Users are responsible for complying with applicable laws and Scribd's Terms of Service. Only collect publicly available document metadata, respect copyright and authors' rights, and do not use extracted data for spam, harassment, or any unlawful purpose.
