Pricing

$0.005 / actor start

PDF Page Splitter

Split any PDF into individual pages instantly. Extract all pages, specific pages (1,3,5), or ranges (1-5). Handles up to 50,000 pages. Flat $0.005 per run. Perfect first step for document processing pipelines — chain with OCR, table extraction, and text analysis actors.

Developer

Vivian Ferreira

Last modified

a month ago

✂️ PDF Page Splitter

Split any PDF into individual pages — then chain into 100+ Apify actors for extraction, OCR, or analysis.

What does PDF Page Splitter do?

PDF Page Splitter takes a multi-page PDF and splits it into individual single-page PDFs. You can extract all pages, pick specific pages (e.g., 1,3,5), or select a range (e.g., 1-5).

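Splitting is often the first step in a document processing pipeline: once each page is its own PDF, downstream actors for OCR, table extraction, or text analysis can work on one page at a time.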

Key features

✅ Flexible page selection — all, 1, 1,3,5, 1-5, 1,3-5,7
✅ Two output modes — Dataset (base64 JSON for chaining) or Key-Value Store (binary files for download)
✅ Upload or URL — Upload a PDF directly or provide a URL
✅ Memory efficient — Processes pages one-by-one with periodic cleanup
✅ Lightweight — Runs on just 256 MB of memory
✅ Handles massive PDFs — Up to 50,000 pages / 500 MB
✅ Flat pricing — One flat fee per run, no matter how many pages

Use cases

Use Case	Pages Setting	Output Format
Split batched invoices for processing	`all`	Dataset (base64)
Extract a specific contract page for review	`3`	Key-Value Store
Pull first 5 pages from a report	`1-5`	Dataset (base64)
Cherry-pick pages from a large document	`1,5,10-15,20`	Key-Value Store

Who is this for?

n8n / Make.com automation builders who process documents in workflows
Finance teams splitting batched invoices and payment advices
Legal teams extracting specific pages from large contracts
AI/RAG pipeline developers who need individual pages for processing

Input

Field	Type	Default	Description
`pdfFile`	File Upload	—	Upload a PDF file directly
`pdfUrl`	String	—	URL to a PDF file (for chaining from other actors)
`pages`	String	`all`	Page selection: `all`, `1`, `1,3,5`, `1-5`, `1,3-5,7`
`outputFormat`	Enum	`dataset_base64`	`dataset_base64` (JSON with base64) or `key_value_store` (binary files)

Note: Provide either `pdfFile` or `pdfUrl`. If both are provided, `pdfFile` takes priority.

Page selection examples

all          → Extract every page
1            → Just the first page
1,3,5        → Pages 1, 3, and 5
1-5          → Pages 1 through 5
1,3-5,7      → Pages 1, 3, 4, 5, and 7
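
The selection syntax above can be sketched as a small parser. This is an illustrative sketch, not the Actor's actual implementation: the function name `parse_pages` is hypothetical, and the choices to sort, de-duplicate, and reject out-of-range pages are assumptions about reasonable behavior.

```python
def parse_pages(spec: str, total_pages: int) -> list[int]:
    """Turn a selection such as "1,3-5,7" into sorted 1-based page numbers."""
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, total_pages + 1))

    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as "3-5" includes both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            # int() raises ValueError on empty or non-numeric parts.
            start = end = int(part)
        # Assumption: invalid ranges are an error rather than silently clipped.
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"invalid page selection: {part!r}")
        selected.update(range(start, end + 1))
    return sorted(selected)


print(parse_pages("1,3-5,7", total_pages=10))  # [1, 3, 4, 5, 7]
```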

Output

Dataset output (`dataset_base64`)

Each page becomes one row in the dataset:

{
    "page_number": 1,
    "filename": "page_1_invoice.pdf",
    "size_bytes": 12345,
    "content_base64": "JVBERi0xLjQK...",
    "original_filename": "invoice.pdf",
    "total_pages": 10
}
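
To turn a dataset row back into a PDF file, decode the `content_base64` field. The sketch below assumes a row shaped like the example above; the helper names `decode_page` and `save_page` are hypothetical, and the `%PDF-` header check is only a sanity check added for illustration.

```python
import base64
from pathlib import Path


def decode_page(item: dict) -> bytes:
    """Decode one dataset row into raw PDF bytes."""
    data = base64.b64decode(item["content_base64"])
    # PDF files begin with the "%PDF-" marker; anything else is suspicious.
    if not data.startswith(b"%PDF-"):
        raise ValueError(f"page {item.get('page_number')} is not a PDF")
    return data


def save_page(item: dict, out_dir: Path) -> Path:
    """Write the decoded page to out_dir under the row's filename."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a row cannot write outside out_dir.
    target = out_dir / Path(item["filename"]).name
    target.write_bytes(decode_page(item))
    return target
```

Calling `save_page(row, Path("pages"))` for each row of the run's dataset would then write one file per page, for example `pages/page_1_invoice.pdf`.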

Key-Value Store output (`key_value_store`)

Each page is saved as a binary PDF file in the default Key-Value Store:

page_1_invoice.pdf
page_2_invoice.pdf
page_3_invoice.pdf
etc.
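
Because the keys follow the `page_<number>_<original filename>` pattern shown above, they can be predicted for a given selection. The sketch below builds the keys and the matching record URLs. The helper names and the store ID are hypothetical, and the URL layout is my understanding of the Apify API v2 key-value store endpoint, so check it against the API reference before relying on it. A private store also needs an API token, which is omitted here.

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def page_keys(original_filename: str, pages: list[int]) -> list[str]:
    """Build the expected record keys for the selected page numbers."""
    return [f"page_{number}_{original_filename}" for number in pages]


def record_url(store_id: str, key: str) -> str:
    """Build the download URL for one record (store_id is a placeholder)."""
    return f"{API_BASE}/key-value-stores/{store_id}/records/{quote(key, safe='')}"


for key in page_keys("invoice.pdf", [1, 2, 3]):
    print(record_url("STORE_ID", key))
```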

Pricing

This Actor uses flat per-run pricing:

Event	Price
`run-completed` (per run, any number of pages)	$0.005

Example costs:

Split a 10-page PDF → $0.005
Extract 3 specific pages → $0.005
Split a 10,000-page document → $0.005

Same price whether you split 1 page or 10,000. Platform compute costs vary by memory and run time.
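
As a quick worked example of the flat fee, the sketch below multiplies the per-run price by a run count. It covers only the $0.005 fee from the table above and leaves out platform compute costs; the function name `flat_fee` is hypothetical.

```python
from decimal import Decimal

PRICE_PER_RUN = Decimal("0.005")  # flat fee per run, from the pricing table


def flat_fee(runs: int) -> Decimal:
    """Total flat fee in USD for a number of runs, regardless of page count."""
    return PRICE_PER_RUN * runs


print(flat_fee(1))     # 0.005
print(flat_fee(1000))  # 5.000
```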

Chaining with other actors

PDF Page Splitter is designed as the gateway to document processing pipelines:

                    ┌─→ Resume Text Extractor
                    │
PDF Page Splitter ──┼─→ Indian Payment Advice Parser
                    │
                    ├─→ Document Table Extractor
                    │
                    └─→ PDF to PNG Converter

Input sources

Any file download actor
Gmail attachment scraper
Website crawler (PDF links)
Manual upload

Output destinations

Resume Text Extractor — Extract structured text from resume PDFs
Indian Payment Advice Parser — Parse payment details from bank advices
Document Table Extractor — Extract tables from document pages
PDF to PNG Converter — Convert pages to images for OCR or vision AI

Integration with n8n / Make.com

n8n HTTP Request Node

Set Method to POST
Use the Apify API endpoint to run this actor
Pass your PDF as binary data or provide a URL
Use `dataset_base64` output for easy downstream processing
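
The request that an HTTP Request node sends can be sketched in Python as follows. This only builds the request and does not send it. The actor ID and token are placeholders, the function name is hypothetical, and the `run-sync-get-dataset-items` path reflects my understanding of the Apify API v2, so confirm it in the API reference. The input fields match the Input table above.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, pdf_url: str,
                      pages: str = "all",
                      output_format: str = "dataset_base64") -> urllib.request.Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    payload = {"pdfUrl": pdf_url, "pages": pages, "outputFormat": output_format}
    return urllib.request.Request(
        f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values: replace with your own actor ID, token, and PDF URL.
request = build_run_request(
    actor_id="your-username~pdf-page-splitter",
    token="YOUR_APIFY_TOKEN",
    pdf_url="https://example.com/invoices.pdf",
    pages="1-5",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would start the run. In n8n, the same URL, headers, and JSON body go into the HTTP Request node's fields.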

Make.com

Use the Apify module to run this actor
Pass your PDF URL in the `pdfUrl` input field
Iterate over dataset results for downstream processing

Limits

Resource	Limit
File size	500 MB
Page count	50,000 pages
Default memory	256 MB
Max memory	4096 MB

Memory recommendations

PDF Size	Pages	Recommended Memory
≤ 10 MB	≤ 100	256 MB
≤ 50 MB	≤ 500	512 MB
≤ 200 MB	≤ 2,000	1024 MB
≤ 500 MB	≤ 5,000	2048 MB
≤ 500 MB	> 5,000 (up to 50,000)	4096 MB

Tip: For PDFs with 1,000+ pages, use the `key_value_store` output format instead of `dataset_base64`. Base64 encoding inflates each page by roughly a third, which adds up across thousands of dataset rows.
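
The recommendations above can be expressed as a lookup that picks the first tier satisfying both the size and the page limit. This is a sketch for choosing a memory setting before starting a run, not logic taken from the Actor; the function name is hypothetical, and rejecting inputs beyond the limits in the Limits table is an assumption.

```python
# Tiers from the table above: (max size in MB, max pages, memory in MB).
TIERS = [(10, 100, 256), (50, 500, 512), (200, 2_000, 1024), (500, 5_000, 2048)]
MAX_SIZE_MB = 500
MAX_PAGES = 50_000
MAX_MEMORY_MB = 4096


def recommend_memory_mb(size_mb: float, pages: int) -> int:
    """Return the smallest recommended memory that fits both size and pages."""
    if size_mb > MAX_SIZE_MB or pages > MAX_PAGES:
        raise ValueError("PDF exceeds the documented limits")
    for max_size, max_pages, memory in TIERS:
        if size_mb <= max_size and pages <= max_pages:
            return memory
    # Within the limits but above every tier: use the maximum memory.
    return MAX_MEMORY_MB


print(recommend_memory_mb(8, 50))       # 256
print(recommend_memory_mb(8, 400))      # 512
print(recommend_memory_mb(120, 20000))  # 4096
```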

Changelog

v0.2 (Large PDF Support)

Flat per-run pricing ($0.005 per run, unlimited pages)
Raised limits: 50,000 pages, 500 MB file size, 4 GB memory
Streaming download for large files
Batch dataset pushes (50 rows per API call)
Aggressive memory cleanup with gc for 10K+ page PDFs
Progress logging every 100 pages
Memory recommendations in logs

v0.1 (Initial Release)

PDF splitting with page selection (all, ranges, specific pages)
Two output modes: Dataset (base64) and Key-Value Store (binary)
File upload and URL input support
Memory-efficient processing with periodic cleanup
