{
    "pdfUrl": [
        {
            "url": "http://www.pdf995.com/samples/pdf.pdf"
        }
    ],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
}

We recommend enabling the proxy if the sites hosting your PDFs block or detect automated requests.
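If you need to submit several PDFs, the input above can be generated rather than written by hand. The following is a minimal sketch that assumes only the field names shown in the example (`pdfUrl`, `url`, `proxyConfiguration`, `useApifyProxy`); the helper name `build_input` is hypothetical and not part of the Actor.

```python
import json


def build_input(pdf_urls, use_apify_proxy=True):
    """Build an input object shaped like the example above.

    Hypothetical helper: the field names come from the sample input,
    the function itself is only an illustration.
    """
    return {
        # One {"url": ...} object per PDF, as in the sample input.
        "pdfUrl": [{"url": url} for url in pdf_urls],
        # Toggle the proxy flag shown in the sample input.
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


if __name__ == "__main__":
    run_input = build_input(["http://www.pdf995.com/samples/pdf.pdf"])
    # Print JSON that can be pasted into the Actor's input editor.
    print(json.dumps(run_input, indent=4))
```

The printed JSON has the same shape as the sample input, so it can be pasted into the input editor as is.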

Output

The extracted text is saved to the dataset, and it looks like this:

[
    {
        "pdfUrl": "http://www.pdf995.com/samples/pdf.pdf",
        "extractedText": "\n\n\n\n\n\n\n\n\nThe pdf995 suite of products - Pdf995, PdfEdit995, and Signature995 - is a complete solution for your document publishing needs. It provides ease of use, flexibility in format, and industry-standard security- and all at no cost to you.\nPdf995 makes it easy and affordable to create professional-quality documents in the popular PDF file format. Its easy-to-use interface helps you to create PDF files by simply selecting the \"print\" command from any application, creating documents which can be viewed on any computer with a PDF viewer. Pdf995 supports network file saving, fast user switching on XP, Citrix/Terminal Server, custom page sizes and large format printing. Pdf995 is a printer...",
        "extractedTextFileUrl": ""
    }
]
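As the sample shows, `extractedText` can begin with a run of newline characters. The sketch below is one way to tidy exported dataset items before further processing. It assumes the items have been exported as a JSON array with the fields shown above; the helper names `tidy_item` and `load_items` are hypothetical.

```python
import json


def tidy_item(item):
    """Return a trimmed view of one dataset item.

    Assumes the item has the fields shown in the sample output;
    missing fields fall back to None or an empty string.
    """
    # Remove leading and trailing whitespace, including the newline run.
    text = (item.get("extractedText") or "").strip()
    return {
        "pdfUrl": item.get("pdfUrl"),
        "text": text,
        "chars": len(text),
    }


def load_items(raw_json):
    """Parse a JSON array of dataset items and tidy each one."""
    return [tidy_item(item) for item in json.loads(raw_json)]


if __name__ == "__main__":
    sample = '[{"pdfUrl": "http://www.pdf995.com/samples/pdf.pdf", "extractedText": "\\n\\n\\nThe pdf995 suite", "extractedTextFileUrl": ""}]'
    for row in load_items(sample):
        print(row["pdfUrl"], row["chars"])
```

Only surrounding whitespace is removed; line breaks inside the text are kept so the paragraph structure of the PDF survives.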
