PDF Toolkit — Extract Text, Metadata & Page Count
Pricing
$4.00 / 1,000 PDFs processed
Extract text from PDFs, read metadata (title, author, dates), count pages. Bulk processing from URLs. $0.003 per PDF.
Developer
Manchitt Sanan
Maintained by Community
Actor stats: 0 bookmarks, 2 total users, 1 monthly active user
Last modified: 15 days ago
Process PDFs from URLs in bulk. Extract full text content, read document metadata (title, author, creation date), and count pages. $0.003 per PDF processed.
Operations
| Operation | What it returns |
|---|---|
| `extract-text` | Full text content + page count |
| `get-metadata` | Title, author, subject, creator, producer, creation/modification dates + page count |
| `page-count` | Number of pages only |
Quick start
{
  "items": [
    {
      "url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
      "operation": "extract-text"
    }
  ]
}
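As a rough sketch of how the quick-start input could be submitted programmatically, the snippet below prepares a POST request to Apify's synchronous run endpoint using only the Python standard library. The Actor ID `accurate_pouch~pdf-toolkit` and the token value are placeholders assumed for illustration and are not confirmed by this page; substitute the real values from the Actor's API tab.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) a request that runs an Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


quick_start_input = {
    "items": [
        {
            "url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
            "operation": "extract-text",
        }
    ]
}

# Placeholder Actor ID and token: both are assumptions for illustration.
request = build_run_request("accurate_pouch~pdf-toolkit", "YOUR_APIFY_TOKEN", quick_start_input)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     results = json.loads(response.read())
```

Building the request separately from sending it makes it easy to inspect the URL and body before any PDF is actually processed.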
Input
Each item in the items array:
| Field | Type | Required | Description |
|---|---|---|---|
| `url` | string | Yes | URL to a PDF file |
| `operation` | enum | Yes | `extract-text`, `get-metadata`, or `page-count` |
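For bulk runs, the `items` array can be generated from a list of URLs. The helper below is a minimal sketch: `build_input` is a hypothetical name, and the check that URLs start with `http://` or `https://` is an assumption added for illustration, not a documented rule of the Actor.

```python
import json

# The three operations listed in the Operations table.
VALID_OPERATIONS = {"extract-text", "get-metadata", "page-count"}


def build_input(urls, operation="extract-text"):
    """Build an input payload with one item per URL, all using the same operation."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"Unknown operation: {operation!r}")
    items = []
    for url in urls:
        # Assumed sanity check: only accept absolute HTTP(S) URLs.
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            raise ValueError(f"Not an HTTP(S) URL: {url!r}")
        items.append({"url": url, "operation": operation})
    return {"items": items}


payload = build_input(
    ["https://example.com/a.pdf", "https://example.com/b.pdf"],
    operation="page-count",
)
print(json.dumps(payload, indent=2))
```

Validating the operation name locally catches typos before a run is started.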
Output
{
  "url": "https://example.com/document.pdf",
  "operation": "extract-text",
  "text": "Full extracted text content...",
  "pageCount": 12,
  "fileSize": 245760,
  "status": "success",
  "error": null
}
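Since each record carries its own `status` and `error` fields, results from a bulk run can be split into successes and failures. The sketch below assumes that any record whose `status` is not `"success"` counts as failed; the failing record shown, including its `"error"` status value and message, is hypothetical, because this page only documents the success shape.

```python
def split_results(records):
    """Separate output records into (succeeded, failed) lists."""
    succeeded, failed = [], []
    for record in records:
        if record.get("status") == "success" and record.get("error") is None:
            succeeded.append(record)
        else:
            failed.append(record)
    return succeeded, failed


records = [
    {
        "url": "https://example.com/document.pdf",
        "operation": "extract-text",
        "text": "Full extracted text content...",
        "pageCount": 12,
        "fileSize": 245760,
        "status": "success",
        "error": None,
    },
    # Hypothetical failure record; the real error format may differ.
    {
        "url": "https://example.com/missing.pdf",
        "operation": "extract-text",
        "status": "error",
        "error": "Download failed",
    },
]

succeeded, failed = split_results(records)
for record in failed:
    print(f"Failed: {record['url']} ({record['error']})")
```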
Pricing
$0.003 per PDF processed (pay-per-event pricing).
- Errors and dry runs are never charged.
- 100 PDFs = $0.30
- 1,000 PDFs = $3.00
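The arithmetic above can be reproduced with a small estimator. This is a sketch that uses the $0.003 per-PDF rate quoted in this section and counts only successfully processed PDFs, since errors are not charged; `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

# Per-PDF rate quoted in this section.
PRICE_PER_PDF = Decimal("0.003")


def estimate_cost(successful_pdfs: int) -> Decimal:
    """Estimate the charge in USD for a number of successfully processed PDFs."""
    if successful_pdfs < 0:
        raise ValueError("successful_pdfs must be non-negative")
    return PRICE_PER_PDF * successful_pdfs


for count in (100, 1000):
    print(f"{count} PDFs = ${estimate_cost(count):.2f}")
```

`Decimal` is used instead of `float` so that small per-item prices add up without rounding drift.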
Limitations
- Text extraction only — no OCR. Scanned PDFs (images of text) will return empty or minimal text.
- The maximum file size depends on the memory allocated to the Actor run. The default of 256 MB handles most PDFs.
- No PDF generation — this actor reads PDFs, doesn't create them. Use Apify's official HTML-to-PDF actor for generation.
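Because scanned PDFs come back with empty or minimal text, a simple heuristic on the `extract-text` output can flag documents that probably need OCR elsewhere. The sketch below is illustrative only: `looks_scanned` is a hypothetical helper, and the threshold of 20 characters per page is an arbitrary assumption to tune against your own documents.

```python
def looks_scanned(record, min_chars_per_page=20):
    """Guess whether an extract-text record comes from a scanned (image-only) PDF."""
    text = (record.get("text") or "").strip()
    pages = record.get("pageCount") or 0
    if pages <= 0:
        # Without a page count there is nothing to compare against.
        return False
    return len(text) / pages < min_chars_per_page


scanned_like = {"text": "", "pageCount": 5, "status": "success"}
text_like = {"text": "Lorem ipsum dolor sit amet. " * 200, "pageCount": 12, "status": "success"}

print(looks_scanned(scanned_like))
print(looks_scanned(text_like))
```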
Related actors in this suite
Other tools by accurate_pouch for content + asset processing:
- QR Code Toolkit — Generate + decode, custom colors, logos, SVG/PNG/base64. $0.004/QR.
- TheCrawler — Web scraper + LLM-powered structured extraction, includes PDF + DOCX. AGPL-3.0, also on npm (thecrawler@0.1.1). $0.005/page.
- Google Sheets R/W — Read, append, replace, modify, backup. $0.004/op.
- Broken Link Checker — Recursive crawl, sitemap + robots.txt parsing, webhook, Sheets export. $0.005/page.
Run on Apify
No setup needed. Run the Actor in the cloud from its Apify Store page. $0.003 per operation.
