
PDF Extractor 2.0
Pricing
$30.00/month + usage

💫 Extract PDF document contents, including metadata, images, pages, tables, and attachments.
Monthly users: 5
Runs succeeded: >99%
Last modified: 4 months ago
Welcome to PDF Extractor

🍂 About PDF Format
Portable Document Format (PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.[2][3] Based on the PostScript language, each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, vector graphics, raster images, and other information needed to display it. PDF has its roots in "The Camelot Project", initiated by Adobe co-founder John Warnock in 1991.[4] PDF was standardized as ISO 32000 in 2008.[5] The latest edition, ISO 32000-2:2020, was published in December 2020.
🍂 About This Actor
💫 Extract contents from PDF documents
Features:
- ⭐ Extract PDF pages as text or images (SVG, PNG, JPEG).
- ⭐ Extract PDF metadata.
- ⭐ Extract the PDF table of contents.
- ⭐ Extract PDF tables.
- ⭐ Extract content from encrypted (password-protected) PDFs.
- ⭐ Extract embedded images.
- ⭐ Extract attachments.
- ⭐ Process multiple PDF URLs in a single run.
🍂 Tutorial
Input Parameters
Name | Type | Description |
---|---|---|
url | Array [String] | List of PDF document URLs |
content | String | Output format for pages (text, svg, png, jpg) |
images | Boolean (true/false) | Extract embedded images |
attachments | Boolean (true/false) | Extract embedded files |
tables | Boolean (true/false) | Extract tables |
Note: All extracted resources other than text are saved to the default key-value store.
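As a rough illustration of the parameters above, the following Python sketch assembles and sanity-checks an input object before it is submitted. The helper name `build_input` and its validation rules (including the requirement of at least one URL) are assumptions made for this example; only the field names (`url`, `content`, `images`, `attachments`, `tables`) and the four `content` values come from the table. Default values for the boolean fields are not documented here, so the sketch sets them explicitly.

```python
import json

# Output formats listed in the parameter table above.
ALLOWED_CONTENT = {"text", "svg", "png", "jpg"}


def build_input(urls, content="text", images=False, attachments=False, tables=False):
    """Assemble an input object for the Actor (hypothetical helper, not part of the Actor)."""
    if isinstance(urls, str):
        urls = [urls]  # the `url` field expects an array, so wrap a single URL
    urls = list(urls)
    if not urls:
        # Assumption for this sketch: an empty URL list is treated as an error.
        raise ValueError("at least one PDF URL is required")
    if content not in ALLOWED_CONTENT:
        raise ValueError(f"content must be one of {sorted(ALLOWED_CONTENT)}")
    return {
        "url": urls,
        "content": content,
        "images": bool(images),
        "attachments": bool(attachments),
        "tables": bool(tables),
    }


actor_input = build_input(
    "https://www.w3.org/WAI/WCAG21/working-examples/pdf-table/table.pdf",
    tables=True,
)
print(json.dumps(actor_input, indent=2))
```

The printed JSON should be usable as the run input, whether it is pasted into the input editor or passed through an API client.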
Dataset Output Format:

    [
      # URL-1: Metadata
      { "metadata": { "headers": { ... }, "url": "...", "mime": "..." } },
      # URL-1: Page Contents
      { "index": 0, "content": "...page-0 contents...", "images": [...], "tables": [...] },
      { "index": 1, "content": "...page-1 contents...", "images": [...], "tables": [...] },
      ...
      # URL-2: Metadata
      { "metadata": { "headers": { ... }, "url": "...", "mime": "..." } },
      # URL-2: Page Contents
      { "index": 0, "content": "...page-0 contents...", "images": [...], "tables": [...] },
      { "index": 1, "content": "...page-1 contents...", "images": [...], "tables": [...] },
      ...
    ]
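Because metadata records and page records share one flat list, a consumer has to regroup them per document. The sketch below shows one way to do that in Python, assuming (as the layout above suggests) that each metadata record precedes the pages of its document and that page records can be recognized by their `index` key. The function name `group_by_document` and the sample items are invented for illustration and are not real Actor output.

```python
def group_by_document(items):
    """Regroup a flat list of dataset items into one entry per PDF.

    Assumes each record with a "metadata" key starts a new document and
    that the page records that follow it belong to that document.
    """
    documents = []
    current = None
    for item in items:
        if "metadata" in item:
            # A metadata record opens a new document group.
            current = {"metadata": item["metadata"], "pages": []}
            documents.append(current)
        elif "index" in item:
            if current is None:
                raise ValueError("page record found before any metadata record")
            current["pages"].append(item)
    return documents


# Made-up items that mirror the documented layout (not real Actor output).
sample_items = [
    {"metadata": {"headers": {}, "url": "https://example.com/a.pdf", "mime": "application/pdf"}},
    {"index": 0, "content": "first page of a", "images": [], "tables": []},
    {"index": 1, "content": "second page of a", "images": [], "tables": []},
    {"metadata": {"headers": {}, "url": "https://example.com/b.pdf", "mime": "application/pdf"}},
    {"index": 0, "content": "first page of b", "images": [], "tables": []},
]

for doc in group_by_document(sample_items):
    print(doc["metadata"]["url"], len(doc["pages"]), "page(s)")
```

Grouping by position rather than by a shared key is a choice forced by the format shown here: page records carry no URL of their own, so the order of items is the only link back to their document.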
🍂 Output Samples
PDF Sample #1
URL: https://www.w3.org/WAI/WCAG21/working-examples/pdf-table/table.pdf

    {}
PDF Sample #2
URL: https://apify.com/img/web-scraping/beginners-guide-to-web-scraping.pdf

    {}
✏️ Support
⚡️ Feel free to reach out to the developer for any issues or suggestions for improvement.

Pricing
Pricing model: Rental
To use this Actor, you have to pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for the Apify platform usage.
Free trial: 7 days
Price: $30.00