Under maintenance

Pricing

from $0.01 / 1,000 results

Extractor from PDF URL

Extracts text and tables from a PDF at a given URL and returns them in a readable format. The Actor cleans up messy spacing and tidies table rows, so the content is easy to view, copy, or share directly from a PDF link.

Rating

0.0

(0)

Developer

Muhammad Zain Abid

Last modified: 3 months ago

PDF Text & Table Extractor (JavaScript)

Extract readable text and clean tables from a PDF by providing its URL. This Actor downloads the PDF, parses its text content, and normalizes messy spacing so the result is easy to view, copy, or process.

Included features

PDF fetching from a provided URL
Text extraction using pdf-parse
Cleaning & formatting to improve table readability
JSON output to Apify key-value store
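
The README does not document the exact cleaning rules, so the following is only a minimal sketch of what a "cleaning & formatting" step could look like. The function name cleanExtractedText and the specific rules (normalize line endings, collapse runs of spaces and tabs, limit consecutive blank lines) are assumptions for illustration, not the Actor's actual implementation.

```javascript
// Hypothetical helper: normalize whitespace in text extracted from a PDF.
// These rules are assumptions; the Actor's real cleaning logic is not documented.
function cleanExtractedText(raw) {
  return raw
    .replace(/\r\n?/g, '\n') // normalize Windows/old-Mac line endings to \n
    .split('\n')
    .map((line) => line.replace(/[ \t]+/g, ' ').trim()) // collapse runs of spaces and tabs
    .join('\n')
    .replace(/\n{3,}/g, '\n\n') // keep at most one blank line in a row
    .trim();
}

const sample = 'Table 1\r\n\r\n\r\n\r\nColumn 1    Column 2\t\tColumn 3  ';
console.log(cleanExtractedText(sample));
```

Collapsing whitespace makes rows easier to copy but discards column alignment, so a cleaner that needs to keep columns apart would probably replace wide gaps with a delimiter instead of a single space.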

Use cases

Convert PDF content into clean text
Make tables readable for reporting or copying
Prepare data for further processing or automation
Quickly inspect PDF content without opening the file

Input

Provide the following input JSON:

{
    "pdfUrl": "https://assets.accessible-digital-documents.com/uploads/2017/01/sample-tables.pdf"
}
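
Before downloading anything, it can help to check the input up front. The sketch below assumes the input object has the single pdfUrl field shown above; the function name validateInput and its error messages are hypothetical and are not taken from the Actor's source.

```javascript
// Hypothetical helper: check that the input carries a usable http(s) PDF URL.
function validateInput(input) {
  if (!input || typeof input.pdfUrl !== 'string' || input.pdfUrl.trim() === '') {
    throw new Error('Input must contain a non-empty "pdfUrl" string.');
  }
  let url;
  try {
    // The WHATWG URL constructor throws on strings it cannot parse.
    url = new URL(input.pdfUrl.trim());
  } catch {
    throw new Error(`"pdfUrl" is not a valid URL: ${input.pdfUrl}`);
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`"pdfUrl" must use http or https, got ${url.protocol}`);
  }
  return url.href; // normalized URL string
}

console.log(validateInput({
  pdfUrl: 'https://assets.accessible-digital-documents.com/uploads/2017/01/sample-tables.pdf',
}));
```

Failing early with a clear message is usually cheaper than starting a download that cannot succeed.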

Output

The Actor returns:

Full extracted text saved to: KEY_VALUE_STORE/result
Short preview in the API response

Example:

{
  "status": "success",
  "extracted": "Table 1\nColumn 1 Column 2 Column 3..."
}
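
One way to read the full result after a run is through the Apify API's key-value store record endpoint. The sketch below assumes the record key is result, as stated above, and that the record is stored as JSON; the store ID in the usage line is a placeholder, and whether a token is needed depends on how the store is shared. It relies on the global fetch available in Node 18 and later.

```javascript
// Base URL of the Apify API v2.
const API_BASE = 'https://api.apify.com/v2';

// Build the URL of a record in a key-value store.
// The default key 'result' follows the README; storeId must come from your own run.
function recordUrl(storeId, key = 'result') {
  return `${API_BASE}/key-value-stores/${encodeURIComponent(storeId)}/records/${encodeURIComponent(key)}`;
}

// Fetch and parse the stored record. The token is optional in this sketch
// and, when given, is sent as a Bearer token.
async function fetchResult(storeId, token) {
  const headers = token ? { Authorization: `Bearer ${token}` } : {};
  const response = await fetch(recordUrl(storeId), { headers });
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return response.json(); // assumes the record was saved as JSON
}

// 'YOUR_STORE_ID' is a placeholder, not a real store.
console.log(recordUrl('YOUR_STORE_ID'));
```

Calling fetchResult with a real store ID should resolve to the stored object, from which the extracted field can be read if the record has the shape shown in the example.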

Getting started

Build the Actor in Apify Console
Supply a PDF URL as input
Run and view extracted output in the key-value store

Local development (optional)

Use Apify CLI to pull and edit locally:

npm -g install apify-cli
apify pull <ActorId>

Dependencies

Apify SDK – actor environment & input/output handling
node-fetch – PDF downloading
pdf-parse – text extraction
