Pricing

$0.01 / actor start

Pandoc Document Converter

Convert documents between formats (HTML, Markdown, DOCX, EPUB, PDF, LaTeX, RST, ODT, PPTX) using Pandoc. Accepts raw text or URL input.

Rating

0.0

(0)

Developer

Monkey Coder

Actor stats

Last modified

a month ago

📄 Pandoc Document Converter

Convert documents between multiple formats using the powerful Pandoc document conversion engine. Supports HTML, Markdown, DOCX, EPUB, PDF, LaTeX, RST, ODT, PPTX, and more.

✨ Features

20+ format support — Convert between HTML, Markdown, GFM, CommonMark, LaTeX, RST, DOCX, EPUB, ODT, PPTX, PDF, plain text, AsciiDoc, MediaWiki, Org-mode, and more
URL input — Fetch content directly from a URL and convert it
Raw text input — Paste HTML, Markdown, or any supported format directly
Binary output — DOCX, EPUB, ODT, PPTX, and PDF files are saved to the key-value store for easy download
PDF generation — Powered by WeasyPrint (no heavy LaTeX installation needed)
Standalone mode — Produce complete documents with proper headers and footers

🔧 How It Works

You provide content (raw text or a URL to fetch from)
You specify the input format and desired output format
The Actor runs Pandoc CLI to perform the conversion
Text output (HTML, Markdown, etc.) is returned in the dataset
Binary output (DOCX, EPUB, PDF, etc.) is saved to the key-value store and base64-encoded in the dataset
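The steps above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual source: the function names `build_pandoc_command` and `convert` are hypothetical, and routing PDF through intermediate HTML with `--pdf-engine=weasyprint` is an assumption based on the WeasyPrint note in the feature list. It expects a `pandoc` binary on the PATH.

```python
import subprocess

# Output formats the README says are saved as binary files.
BINARY_FORMATS = {"docx", "epub", "odt", "pptx", "pdf"}


def build_pandoc_command(from_format, to_format, output_path=None, standalone=False):
    """Return the argv list for one Pandoc conversion that reads from stdin."""
    cmd = ["pandoc", "--from", from_format]
    if to_format == "pdf":
        # Assumption: PDF is rendered from intermediate HTML by WeasyPrint.
        # Pandoc expects output_path to end in ".pdf" in this case.
        cmd += ["--to", "html", "--pdf-engine=weasyprint"]
    else:
        cmd += ["--to", to_format]
    if standalone:
        cmd.append("--standalone")
    if to_format in BINARY_FORMATS:
        if output_path is None:
            raise ValueError(f"{to_format} output needs a file path")
        cmd += ["--output", output_path]
    return cmd


def convert(content, from_format, to_format, output_path=None, standalone=False):
    """Run Pandoc; return text output, or None when a binary file was written."""
    cmd = build_pandoc_command(from_format, to_format, output_path, standalone)
    result = subprocess.run(
        cmd, input=content.encode("utf-8"), capture_output=True, check=True
    )
    if to_format in BINARY_FORMATS:
        return None
    return result.stdout.decode("utf-8")
```

For example, `convert("<h1>Hello</h1>", "html", "markdown")` would return the Markdown text, while a `docx` conversion writes to `output_path` and returns `None`.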

🚀 How to Use

Set input — Either paste content in the "Content" field or enter a URL in "Source URL"
Choose formats — Set "Input Format" (e.g., html) and "Output Format" (e.g., markdown)
Run the Actor
Get results — Check the dataset for text output, or download binary files from the key-value store

Common Conversions

From	To	Use Case
`html`	`markdown`	Convert web pages to Markdown
`markdown`	`html`	Render Markdown as HTML
`html`	`docx`	Save web content as Word document
`markdown`	`docx`	Create Word documents from Markdown
`html`	`epub`	Convert articles to e-book format
`markdown`	`pdf`	Generate PDF from Markdown
`html`	`plain`	Strip HTML tags, extract plain text
`latex`	`html`	Convert LaTeX papers to web format
`html`	`rst`	Convert to reStructuredText

📊 Sample Output (text conversion)

{
    "from_format": "html",
    "to_format": "markdown",
    "input_size_bytes": 245,
    "output_size_bytes": 128,
    "output_type": "text",
    "output": "# Hello World\n\nThis is a **sample HTML** document for conversion.\n\n-   Item 1\n-   Item 2\n-   Item 3\n",
    "converted_at": "2026-03-20T08:30:00.000000"
}

📊 Sample Output (binary conversion)

{
    "from_format": "html",
    "to_format": "docx",
    "input_size_bytes": 245,
    "output_size_bytes": 8432,
    "output_type": "binary",
    "output_base64": "UEsDBBQAAAAI...",
    "download_key": "output.docx",
    "converted_at": "2026-03-20T08:30:00.000000"
}

Binary files (DOCX, EPUB, ODT, PPTX, PDF) are also saved to the key-value store with the key output.<format> for direct download.
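A consumer of the dataset might handle both record shapes as in the sketch below. The field names come from the sample outputs above; the helper name `extract_output` and the choice to write into a local directory are illustrative assumptions.

```python
import base64
from pathlib import Path


def extract_output(item, directory="."):
    """Return the converted text, or the path of the decoded binary file."""
    if item["output_type"] == "text":
        return item["output"]
    data = base64.b64decode(item["output_base64"])
    # Use only the final path component of the key as the local file name.
    target = Path(directory) / Path(item["download_key"]).name
    target.write_bytes(data)
    return target
```

Text records yield a string directly; binary records are decoded from `output_base64` and written under the name given in `download_key`.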

📝 Input Formats

html, markdown, gfm (GitHub Flavored Markdown), commonmark, latex, rst, textile, org, mediawiki, json (Pandoc AST)

📤 Output Formats

html, markdown, gfm, commonmark, latex, rst, plain, docx, epub, odt, pptx, asciidoc, mediawiki, org, pdf
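Because the input and output lists differ (for instance, `textile` is input-only and `pdf` is output-only), it can help to check a format pair before starting a run. The sketch below hard-codes the two lists as given here; `check_formats` is a hypothetical helper, not part of the Actor.

```python
# Format names copied from the two lists above.
INPUT_FORMATS = {
    "html", "markdown", "gfm", "commonmark", "latex",
    "rst", "textile", "org", "mediawiki", "json",
}
OUTPUT_FORMATS = {
    "html", "markdown", "gfm", "commonmark", "latex", "rst", "plain", "docx",
    "epub", "odt", "pptx", "asciidoc", "mediawiki", "org", "pdf",
}


def check_formats(from_format, to_format):
    """Return a list of problems; an empty list means the pair is supported."""
    problems = []
    if from_format not in INPUT_FORMATS:
        problems.append(f"unsupported input format: {from_format}")
    if to_format not in OUTPUT_FORMATS:
        problems.append(f"unsupported output format: {to_format}")
    return problems
```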

⚠️ Notes

Input size limit: 10 MB maximum
PDF output: Uses WeasyPrint engine (supports CSS styling, no LaTeX needed)
Binary output: Files are base64-encoded in the dataset AND saved to the key-value store for direct download
URL fetching: Basic HTTP GET with a browser-like User-Agent. Sites with advanced anti-bot protection may block the request.
Memory: 1 GB is recommended for large documents or PDF generation
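The URL fetching and size limit described in these notes could look something like the sketch below. It uses only the Python standard library. The User-Agent string, the interpretation of 10 MB as 10 × 1024 × 1024 bytes, and the function names are all assumptions made for illustration.

```python
import urllib.request

# Assumption: "10 MB" is treated as 10 * 1024 * 1024 bytes.
MAX_INPUT_BYTES = 10 * 1024 * 1024

# Hypothetical browser-like User-Agent; the Actor's real value is not documented.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)"


def build_request(url):
    """Create a plain GET request carrying a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def check_size(data):
    """Return data unchanged, or raise if it exceeds the input size limit."""
    if len(data) > MAX_INPUT_BYTES:
        raise ValueError(f"input exceeds {MAX_INPUT_BYTES} bytes")
    return data


def fetch(url, timeout=30):
    """Download the body of url, rejecting responses above the size limit."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        # Read one byte past the limit so oversized bodies are detected
        # without downloading the whole response.
        data = response.read(MAX_INPUT_BYTES + 1)
    return check_size(data)
```

A fetcher this simple does not run JavaScript or solve challenges, which is consistent with the note that sites with anti-bot protection may not work.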

Pandoc Document Converter - HTML to Markdown, DOCX, EPUB, PPTX

scrapeworks/pandoc-document-converter

Convert documents between formats with Pandoc in the cloud: HTML to Markdown for LLMs and RAG, Markdown to Word DOCX, EPUB e-books, PowerPoint PPTX, LaTeX, reStructuredText and more. Feed it URLs or raw text, get one converted document per input.

Nicolas van Arkens

Document Format Converter

junipr/document-format-converter

Convert documents between Markdown, HTML, EPUB, DOCX, PPTX, PDF, and more. Powered by Pandoc. Supports 50+ format pairs, TOC generation, and custom Pandoc args.

junipr

Pandoc Universal Mcp

whitewalk/pandoc-universal-mcp

Convert documents between 40+ formats via MCP. Markdown, DOCX, PDF, HTML, LaTeX, EPUB, PPTX & more. Academic support with citations, bibliography & math. Batch conversion. Perfect for AI agents & Claude Desktop integration.

seena Singh

Pandoc Document Converter

incredible_moment/pandoc-actor

Universal document converter. Transform Markdown, HTML, and text to PDF, DOCX, EPUB, and more. High-performance Rust wrapper for the Pandoc engine ensures fast execution and low memory footprint.

Daniel Rosen

Universal Document Format Transformer

actorify/universal-document-format-transformer

Universal Document Format Transformer: a cloud-based Apify Actor that converts documents (PDF, DOCX, PPTX, HTML, TXT) into Markdown, JSON, CSV, HTML or TXT using Pandoc. Easy REST API for automations (n8n, Zapier, Make), production-ready error handling, and security controls.

fanio zilla

Document Parser — PDF/DOCX to Markdown & JSON for RAG

genuine_qa/document-parser

Convert PDF, DOCX, PPTX, XLSX, HTML and images into clean Markdown or JSON for RAG and LLM pipelines. Powered by IBM's open-source Docling.

Rahul Bhiwagade

RAG Document Converter

web.harvester/rag-document-converter

Convert PDF, DOCX, PPTX, and other documents to clean Markdown optimized for RAG pipelines. Preserves structure, tables, and headers. Powered by IBM Docling.

Web Harvester

PDF to MP3 - Convert PDF, EPUB, DOCX & Text to Audiobook

marielise.dev/pdf-to-mp3

Convert PDF, EPUB, DOCX, Markdown, HTML, TXT, and RTF to MP3 audiobooks. Free Microsoft Edge TTS (no API key) with OCR for scanned PDFs, 70+ languages, and optional OpenAI or ElevenLabs voices. ~$0.04/min.