Scrape Pdf To Markdown — Data, Details & Metadata
Pricing
Pay per usage
Convert PDF documents to Markdown at scale with this Apify Actor. It extracts text, details, and metadata, with automatic pagination and proxy rotation. Suited to market research, competitive intelligence, and data-driven decision making.
Developer

Donny Nguyen
Maintained by Community
PDF to Markdown Converter
Convert PDF documents to clean markdown text. Perfect for RAG pipelines, LLM training data, and document processing.
Features
- Fetch PDFs from any public URL
- Extract text and convert to markdown format
- Detect headers, paragraphs, and lists
- Handles multi-page PDFs
- Includes metadata: page count, word count, character count
Input
{
  "pdfUrls": ["https://arxiv.org/pdf/2301.00234"],
  "maxPages": 50
}
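The input above can also be built and submitted from code. The following is a minimal sketch, not the Actor's documented client: `build_run_input` and `run_actor` are hypothetical helper names, the validation rules are assumptions layered on top of the two fields shown above, and the default of 50 simply mirrors the example rather than a documented default. The Actor ID is left as a parameter because this page does not state it. `run_actor` relies on the `apify-client` Python package.

```python
from urllib.parse import urlparse


def build_run_input(pdf_urls, max_pages=50):
    """Build an input object with the two fields shown above.

    The checks are client-side assumptions, not rules documented by the Actor.
    """
    if not pdf_urls:
        raise ValueError("pdf_urls must contain at least one URL")
    for url in pdf_urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a public http(s) URL: {url!r}")
    if not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("max_pages must be a positive integer")
    return {"pdfUrls": list(pdf_urls), "maxPages": max_pages}


def run_actor(token, actor_id, run_input):
    """Start a run and return its dataset items.

    Requires `pip install apify-client`. `actor_id` is whatever ID the
    Apify Store lists for this Actor; it is not given on this page.
    """
    # Imported lazily so build_run_input works without the dependency.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `build_run_input(["https://arxiv.org/pdf/2301.00234"])` yields the same object as the example input; passing it to `run_actor` with a valid API token and Actor ID would start a run and collect its results.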
Output
Each PDF produces a dataset item with:
- url: Source PDF URL
- title: PDF title from metadata or first line
- markdown: Converted markdown content
- pageCount: Number of pages in the PDF
- wordCount: Number of words extracted
- charCount: Number of characters extracted
- convertedAt: ISO timestamp
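For a RAG pipeline, each dataset item usually needs to be loaded into a typed record and split into smaller chunks. The sketch below shows one way to do that with the fields listed above. `PdfItem` and `split_by_heading` are hypothetical names, and the splitter assumes the converted markdown uses `#`-style headings, which this page does not confirm.

```python
import re
from dataclasses import dataclass


@dataclass
class PdfItem:
    """One dataset item, using the field names listed above."""
    url: str
    title: str
    markdown: str
    page_count: int
    word_count: int
    char_count: int
    converted_at: str

    @classmethod
    def from_dict(cls, item):
        return cls(
            url=item["url"],
            title=item["title"],
            markdown=item["markdown"],
            page_count=item["pageCount"],
            word_count=item["wordCount"],
            char_count=item["charCount"],
            converted_at=item["convertedAt"],
        )


def split_by_heading(markdown):
    """Split markdown into (heading, body) chunks at '#'-style headings.

    Text before the first heading is returned with an empty heading.
    """
    chunks = []
    heading, lines = "", []

    def flush():
        # Skip the empty chunk that exists before any content is seen.
        if heading or any(line.strip() for line in lines):
            chunks.append((heading, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        if re.match(r"#{1,6}\s", line):
            flush()
            heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    flush()
    return chunks
```

Each `(heading, body)` pair can then be embedded or indexed on its own, with `url` and `title` carried along as source metadata.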