PDF Toolkit — Extract Text, Metadata & Page Count

Pricing

$4.00 / 1,000 PDFs processed

Extract text from PDFs, read metadata (title, author, dates), count pages. Bulk processing from URLs. $0.003 per PDF.

Developer: Manchitt Sanan (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 15 days ago

Process PDFs from URLs in bulk. Extract full text content, read document metadata (title, author, creation date), and count pages. $0.003 per PDF processed.


Operations

Operation | What it returns
--- | ---
extract-text | Full text content + page count
get-metadata | Title, author, subject, creator, producer, creation/modification dates + page count
page-count | Number of pages only

Quick start

{
  "items": [
    {
      "url": "https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf",
      "operation": "extract-text"
    }
  ]
}
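For bulk runs, the same input shape can be generated from a list of URLs rather than written by hand. The following is a minimal sketch: `build_input` is a hypothetical helper, not part of the actor, and the operation names are copied from the Operations section.

```python
import json

# Operation names as listed in the Operations section.
OPERATIONS = ("extract-text", "get-metadata", "page-count")


def build_input(urls, operation="extract-text"):
    """Return an actor input dict with one item per URL (hypothetical helper)."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    return {"items": [{"url": url, "operation": operation} for url in urls]}


payload = build_input(
    ["https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf"]
)
# Print the JSON that would be pasted into the actor's input editor.
print(json.dumps(payload, indent=2))
```

Every item in one call shares the same operation here. Mixing operations in a single run only requires building the item dicts individually.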

Input

Each item in the items array:

Field | Type | Required | Description
--- | --- | --- | ---
url | string | Yes | URL to a PDF file
operation | enum | Yes | extract-text, get-metadata, or page-count
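Checking items locally before a run can catch typos in the operation name early. This is a sketch of such a check based only on the two fields described above. `validate_item` is a hypothetical helper, and the http(s) scheme check is an assumption, since the listing only says the URL must point to a PDF file.

```python
from urllib.parse import urlparse

# Allowed values for the "operation" field, per the input description.
ALLOWED_OPERATIONS = ("extract-text", "get-metadata", "page-count")


def validate_item(item):
    """Return a list of problems found in one input item (empty if valid)."""
    problems = []
    url = item.get("url")
    if not isinstance(url, str) or not url:
        problems.append("url is required and must be a string")
    elif urlparse(url).scheme not in ("http", "https"):
        # Assumption: the actor fetches over HTTP(S); the listing does not say.
        problems.append("url should be an http(s) URL")
    if item.get("operation") not in ALLOWED_OPERATIONS:
        problems.append(
            "operation must be one of: " + ", ".join(ALLOWED_OPERATIONS)
        )
    return problems
```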

Output

{
  "url": "https://example.com/document.pdf",
  "operation": "extract-text",
  "text": "Full extracted text content...",
  "pageCount": 12,
  "fileSize": 245760,
  "status": "success",
  "error": null
}
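Since each record carries its own `status` and `error` fields, a bulk run can be split into successes and failures after the fact. The sketch below assumes the records have already been loaded as Python dicts. The second sample record is hypothetical: the listing shows only a success record, so the failure values (`"error"` status and the error message) are illustrative guesses.

```python
def split_results(records):
    """Separate records with status "success" from all other records."""
    succeeded = [r for r in records if r.get("status") == "success"]
    failed = [r for r in records if r.get("status") != "success"]
    return succeeded, failed


records = [
    {
        "url": "https://example.com/document.pdf",
        "operation": "extract-text",
        "text": "Full extracted text content...",
        "pageCount": 12,
        "fileSize": 245760,
        "status": "success",
        "error": None,
    },
    {
        # Hypothetical failure record; the field values are assumptions.
        "url": "https://example.com/missing.pdf",
        "operation": "extract-text",
        "text": None,
        "pageCount": None,
        "fileSize": None,
        "status": "error",
        "error": "HTTP 404",
    },
]

succeeded, failed = split_results(records)
total_pages = sum(r["pageCount"] for r in succeeded)
for r in failed:
    print(f"failed: {r['url']} ({r['error']})")
```

Treating anything other than `"success"` as a failure avoids depending on the exact failure status string, which the listing does not document.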

Pricing

$0.003 per PDF processed (pay-per-event pricing).

  • Errors and dry runs are never charged.
  • 100 PDFs = $0.30
  • 1,000 PDFs = $3.00
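The arithmetic above can be reproduced for any batch size. This is a small sketch using the $0.003 rate quoted in this section. `estimate_cost` is a hypothetical helper, and it counts only successfully processed PDFs because errors are not charged.

```python
from decimal import Decimal

# Per-PDF rate quoted in the Pricing section.
PRICE_PER_PDF = Decimal("0.003")


def estimate_cost(successful_pdfs):
    """Return the estimated charge in USD for the given number of processed PDFs."""
    if successful_pdfs < 0:
        raise ValueError("successful_pdfs must be non-negative")
    return PRICE_PER_PDF * successful_pdfs


for count in (100, 1000):
    print(f"{count} PDFs = ${estimate_cost(count):.2f}")
```

`Decimal` is used instead of `float` so that small per-unit prices multiply out without binary rounding noise.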

Limitations

  • Text extraction only — no OCR. Scanned PDFs (images of text) will return empty or minimal text.
  • Max file size depends on the memory allocated to the Apify run. The default of 256 MB handles most PDFs.
  • No PDF generation — this actor reads PDFs, doesn't create them. Use Apify's official HTML-to-PDF actor for generation.
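Because scanned PDFs come back with empty or minimal text rather than an error, it can help to flag them in the results for separate OCR handling. The sketch below is a rough heuristic and not something the actor provides. `looks_scanned` is a hypothetical helper, and the threshold of 20 characters per page is an arbitrary assumption to tune for your documents.

```python
def looks_scanned(record, min_chars_per_page=20):
    """Heuristic: flag a successful extract-text record with almost no text."""
    if record.get("operation") != "extract-text":
        return False
    if record.get("status") != "success":
        return False
    pages = record.get("pageCount") or 0
    if pages == 0:
        return False
    text = (record.get("text") or "").strip()
    # Assumption: a text-based PDF averages more than the threshold per page.
    return len(text) / pages < min_chars_per_page


sample = {
    "operation": "extract-text",
    "status": "success",
    "text": "",
    "pageCount": 5,
}
print(looks_scanned(sample))
```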

Other tools by accurate_pouch for content + asset processing:

  • QR Code Toolkit — Generate + decode, custom colors, logos, SVG/PNG/base64. $0.004/QR.
  • TheCrawler — Web scraper + LLM-powered structured extraction, includes PDF + DOCX. AGPL-3.0, also on npm (thecrawler@0.1.1). $0.005/page.
  • Google Sheets R/W — Read, append, replace, modify, backup. $0.004/op.
  • Broken Link Checker — Recursive crawl, sitemap + robots.txt parsing, webhook, Sheets export. $0.005/page.

Run on Apify

No setup needed. Click above to run in the cloud. $0.003 per operation.