Pricing

from $40.00 / 1,000 audio minutes generated

PDF to MP3 - Convert PDF, EPUB, DOCX & Text to Audiobook

Convert PDF, EPUB, DOCX, Markdown, HTML, TXT, and RTF to MP3 audiobooks. Free Microsoft Edge TTS (no API key) with OCR for scanned PDFs, 70+ languages, and optional OpenAI or ElevenLabs voices. ~$0.04/min.

Developer

Marielise

Last modified

a month ago

Changelog

[0.1.1] - 2026-06-08

Production-hardening: OCR, modern PDF engine, security, billing safety

New capabilities

OCR fallback for scanned / image-only PDFs: pages with no text layer are auto-rendered (poppler pdftoppm), OCR'd (Tesseract: EN, ES, FR, DE, IT, PT, NL), and narrated. New ocr-page-processed billing event ($0.10/page), charged only for pages that actually need OCR. Toggle with enableOcr.
Encrypted PDF support — decrypt password-protected PDFs via pdfPassword.
Proxy support — optional proxyConfiguration for the Document URL fetch.
ID3 tags on every MP3 part (title / album / track / genre=Audiobook).
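
As a rough illustration of how these options fit together, the sketch below builds a run input and estimates the OCR surcharge. Only enableOcr, pdfPassword, proxyConfiguration, and the $0.10/page rate come from the notes above; the documentUrl key, the useApifyProxy shape, and both helper names are assumptions made for illustration, not the actor's confirmed schema.

```python
from decimal import Decimal

OCR_RATE_USD = Decimal("0.10")  # per OCR'd page, as stated in the 0.1.1 notes


def build_run_input(url, password=None, enable_ocr=True):
    """Assemble a run input dict (key names partly hypothetical)."""
    run_input = {
        "documentUrl": url,  # hypothetical key name for the Document URL field
        "enableOcr": enable_ocr,
        "proxyConfiguration": {"useApifyProxy": True},  # assumed shape
    }
    if password is not None:
        run_input["pdfPassword"] = password
    return run_input


def ocr_surcharge(pages_have_text_layer):
    """Charge only for pages without a text layer (one bool per page)."""
    ocr_pages = sum(1 for has_text in pages_have_text_layer if not has_text)
    return ocr_pages * OCR_RATE_USD
```

With a three-page PDF where only the first page has a text layer, `ocr_surcharge([True, False, False])` bills two OCR pages.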

Engineering

Replaced pdf-parse (2018 PDF.js, unmaintained) with unpdf — current, serverless-friendly PDF.js. Per-page extraction via a direct array index (no render-hook page-order invariant to desync).
Real unit test suite (Node test runner) for chunking, page-range parsing, SSRF address checks, format detection, strippers, key sanitization, voice + OCR-language mapping. ESLint flat config added.

Security

SSRF guard on Document URL fetch: rejects non-http(s) schemes and any host resolving to private / loopback / link-local / CGNAT ranges (incl. cloud metadata 169.254.169.254). Redirects are followed manually and re-validated at every hop.
Added .dockerignore so .env / secrets / local state never enter image layers.
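
The address checks behind an SSRF guard like the one described above could look roughly like this. It is a minimal Python sketch, not the actor's own code: the function names are hypothetical, and DNS resolution is left to the caller so the check itself stays pure.

```python
import ipaddress
from urllib.parse import urlsplit

# Carrier-grade NAT range; Python's is_private does not cover it.
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def is_allowed_scheme(url: str) -> bool:
    """Accept only http and https URLs."""
    return urlsplit(url).scheme.lower() in ("http", "https")


def is_blocked_address(addr: str) -> bool:
    """True for private, loopback, link-local, unspecified, or CGNAT IPs."""
    ip = ipaddress.ip_address(addr)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) before checking.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local  # includes 169.254.169.254 (cloud metadata)
        or ip.is_unspecified
        or (ip.version == 4 and ip in CGNAT)
    )


def validate_hop(url: str, resolved_addresses) -> bool:
    """Check one URL in a redirect chain against its resolved addresses."""
    if not is_allowed_scheme(url):
        return False
    return bool(resolved_addresses) and not any(
        is_blocked_address(a) for a in resolved_addresses
    )
```

When redirects are followed manually, `validate_hop` would be called again for every `Location` target, with that host's freshly resolved addresses.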

Billing safety

maxCostUsd now also clamps the actual audio-minute charge (not just the pre-flight estimate), so slow-speech / CJK runs can't bill past the cap.
maxCostUsd: 0 (or below the $0.02 floor) is now rejected instead of silently disabling the cap.
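
One way to picture the two billing rules above is the sketch below. The $0.02 floor and the maxCostUsd name come from the notes; the helper names, the use of Decimal, and the idea of tracking an already-charged running total are assumptions for illustration.

```python
from decimal import Decimal

MIN_CAP_USD = Decimal("0.02")  # floor mentioned in the notes above


def validate_cap(max_cost_usd):
    """Reject caps below the floor instead of treating them as 'no cap'."""
    if max_cost_usd is None:
        return None  # the cap is optional
    if max_cost_usd < MIN_CAP_USD:
        raise ValueError(f"maxCostUsd must be at least {MIN_CAP_USD}")
    return max_cost_usd


def clamp_audio_charge(audio_minutes, rate_per_minute, already_charged, max_cost_usd):
    """Limit the audio-minute charge to whatever budget is left under the cap."""
    charge = audio_minutes * rate_per_minute
    if max_cost_usd is None:
        return charge
    remaining = max(Decimal("0"), max_cost_usd - already_charged)
    return min(charge, remaining)
```

For example, if a run produced 100 audio minutes at an assumed $0.03/min but $0.52 of a $2.00 cap is already spent, the clamped audio charge is the remaining $1.48 rather than $3.00.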

[0.1.0] - 2026-06-02

Initial public release as Text to Audio Narrator

Multi-format document narration: PDF, DOCX, EPUB, Markdown, plain text, HTML, and RTF in, MP3 out.

Supported inputs (7 formats)

PDF (.pdf) — native text-layer extraction (no OCR)
DOCX (.docx) — Word documents via mammoth: styles, lists, tables, footnotes
EPUB (.epub) — ebooks via epub2: spine-ordered chapter walk, HTML stripped per chapter
Markdown (.md, .markdown, .mdx) — syntax stripped before TTS so the voice reads natural prose
Plain text (.txt, .text) — UTF-8 with BOM handling
HTML (.html, .htm, .xhtml) — tags stripped, entities decoded
RTF (.rtf) — control codes stripped, unicode + hex escapes decoded
Raw text paste (new text input field) for blog drafts, ChatGPT replies, READMEs
All four input modes: URL fetch, file upload, base64 paste, raw text paste
ZIP-based formats (DOCX vs EPUB) distinguished by mimetype sniff + extension hint
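
Since DOCX and EPUB are both ZIP containers, telling them apart needs a look inside the archive. The sketch below shows one plausible way to do it in Python; it relies on the EPUB convention of a `mimetype` entry containing `application/epub+zip` and on the standard DOCX parts, and falls back to the extension hint. The function name and return values are hypothetical, not the actor's code.

```python
import io
import zipfile


def sniff_zip_format(data: bytes, extension_hint=None):
    """Return 'epub', 'docx', or None for a ZIP-based document."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
        # EPUB containers carry a 'mimetype' entry with a fixed value.
        if "mimetype" in names:
            mimetype = zf.read("mimetype").decode("ascii", "replace").strip()
            if mimetype == "application/epub+zip":
                return "epub"
        # DOCX packages carry a content-types manifest and a main document part.
        if "[Content_Types].xml" in names and "word/document.xml" in names:
            return "docx"
    # Nothing conclusive inside the archive: fall back to the extension hint.
    hint = (extension_hint or "").lower().lstrip(".")
    return hint if hint in ("epub", "docx") else None
```

Sniffing the contents first means a mislabeled upload (an EPUB saved with a .docx extension, say) is still routed to the right extractor.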

TTS engines

Edge TTS (free, no API key) — 400+ neural voices, 70+ languages, the recommended default
OpenAI BYOK: gpt-4o-mini-tts (steerable), tts-1, tts-1-hd
ElevenLabs BYOK: flash-v2_5, turbo-v2_5

Quality & reliability

Auto language detection + Edge voice picker
Provider-adaptive concurrency (Edge 8, OpenAI 10, ElevenLabs 2) to avoid 429 storms
Word-boundary-aware chunking — no mid-word audio cuts in long technical paragraphs
Edge TTS WebSocket guarded with 90 s timeout to avoid silent hangs
ffmpeg concat with correct MP3 duration metadata
Resume cache: re-runs skip already-synthesized chunks (no re-paying for TTS already done)
Skip-failed-chunks mode for messy documents (auth / quota errors still abort cleanly)
Friendly errors for scanned / encrypted / password-protected PDFs
Collision-safe KV store key sanitization for long Apify run IDs
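
Word-boundary-aware chunking, mentioned in the list above, can be sketched as follows. This is a simplified Python illustration under stated assumptions: the function name and the max_chars parameter are hypothetical, whitespace is normalized to single spaces, and a single word longer than the limit is hard-split as a last resort.

```python
def chunk_text(text: str, max_chars: int) -> list[str]:
    """Split text into chunks of at most max_chars, breaking only between words."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    current = ""
    for word in text.split():
        # A single oversized word cannot fit anywhere: hard-split it.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding the word would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

A production chunker would likely prefer sentence or paragraph boundaries before falling back to word boundaries, so the voice pauses naturally between chunks.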

Output

Chapter-sized MP3 parts for long books (configurable via maxPartMb)
Shareable INDEX.html page with inline players + download links for every part
PREVIEW key written before TTS starts with pages-to-process + estimated cost
OUTPUT JSON record with indexUrl, audioUrl, parts[], durations, voice, model, cost, status

Pricing & safety

Pay-per-event: actor-start $0.02, per-page $0.05, per-audio-minute $0.03
No provider markups, no premium-voice surcharges
Optional maxCostUsd hard cap aborts cleanly before any TTS if pre-flight estimate exceeds it
For non-PDF formats: ~3000 chars = 1 pseudo-page for fair billing parity with PDF inputs
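
The pre-flight arithmetic implied by the list above can be sketched like this. The three event prices and the 3000-character pseudo-page come from the 0.1.0 notes; rounding pseudo-pages up, the helper names, and the estimated-minutes input are assumptions, since the notes do not say how audio length is predicted.

```python
import math
from decimal import Decimal

ACTOR_START_USD = Decimal("0.02")
PER_PAGE_USD = Decimal("0.05")
PER_AUDIO_MINUTE_USD = Decimal("0.03")
CHARS_PER_PSEUDO_PAGE = 3000


def pseudo_pages(char_count: int) -> int:
    """Convert non-PDF text length to billable pages (rounding up is assumed)."""
    return math.ceil(char_count / CHARS_PER_PSEUDO_PAGE)


def preflight_estimate(pages: int, estimated_minutes) -> Decimal:
    """Sum the start, per-page, and per-audio-minute events."""
    minutes = Decimal(str(estimated_minutes))
    return ACTOR_START_USD + PER_PAGE_USD * pages + PER_AUDIO_MINUTE_USD * minutes


def exceeds_cap(estimate: Decimal, max_cost_usd) -> bool:
    """True when a cap is set and the estimate is above it."""
    return max_cost_usd is not None and estimate > max_cost_usd
```

A 7,500-character Markdown file counts as 3 pseudo-pages; with an estimated 10 minutes of audio, the total comes to $0.02 + $0.15 + $0.30 = $0.47, so a maxCostUsd of 0.40 would abort the run before any TTS call.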

