Any-to-Markdown
Pricing
from $150.00 / 1,000 pages
Developer: AbotAPI
What does Any-to-Markdown do?
Any-to-Markdown converts images and scanned PDFs into structured Markdown using AI-powered OCR. Unlike basic OCR tools, it understands complex document layouts and recognizes mathematical formulas (LaTeX), tables, figures, and text in their correct reading order.
It supports 80+ languages and handles academic papers, technical documents, invoices, and any document where layout matters.
Key features
- Formula recognition — Detects mathematical formulas and converts them to LaTeX
- Table extraction — Recognizes table structures and outputs them as Markdown tables
- Layout analysis — Understands document structure (titles, paragraphs, figures, captions)
- Multi-language OCR — Supports 80+ languages including Chinese, Japanese, Korean, Arabic, and more
- PDF support — Process multi-page PDF documents with page selection
- Multiple input methods — Upload files directly or provide URLs
Use cases
- Academic paper digitization — Convert scanned research papers with equations into editable Markdown/LaTeX
- Technical document processing — Parse engineering specs, datasheets, and manuals preserving tables and formulas
- Invoice and receipt parsing — Extract structured data from scanned financial documents
- Book digitization — Convert scanned book pages into searchable, editable text
- Data pipeline integration — Use the Apify API to automate document parsing in your workflows
- RAG / LLM preparation — Convert documents to Markdown for use as context in AI applications
How to use
- Upload files or provide URLs to images (PNG, JPEG, BMP, WebP) or PDF documents
- Select a recognition mode based on your content type
- Choose your languages and configure options
- Run the Actor and get structured Markdown output
Recognition modes
| Mode | Best for | Description |
|---|---|---|
| Page (default) | General documents | Full layout analysis — detects titles, text, formulas, tables, and figures |
| Text + Formula | Math-heavy content | Mixed text and formula recognition without layout analysis |
| Formula Only | Equations | Pure mathematical formula to LaTeX conversion |
| Text Only | Simple documents | Standard OCR without formula detection |
| PDF | Multi-page docs | Processes PDF files with optional page selection |
Supported file formats
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Multi-page documents, scanned or digital. Use pageNumbers to select specific pages. |
| PNG | .png | Recommended for screenshots and scanned documents |
| JPEG | .jpg, .jpeg | Photos, scans, camera captures |
| BMP | .bmp | Uncompressed bitmap images |
| TIFF | .tiff, .tif | High-quality scans, multi-layer images |
| WebP | .webp | Modern web image format |
| GIF | .gif | Static GIF images (first frame only) |
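Before submitting a batch, it can help to filter out files the Actor will not accept. The following is a minimal sketch that checks a local path or URL against the extensions listed in the table above. The function name `is_supported` is illustrative and not part of the Actor, and the check looks only at the file extension, not at the file's actual contents.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the supported-formats table.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".tif", ".webp", ".gif",
}


def is_supported(source: str) -> bool:
    """Return True if a path or URL ends in a supported extension (hypothetical helper)."""
    # urlparse separates out any query string or fragment, so
    # "scan.png?download=1" is still recognized as a PNG.
    path = urlparse(source).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS


candidates = [
    "https://example.com/document.png?download=1",
    "scan.TIFF",
    "notes.docx",
]
accepted = [c for c in candidates if is_supported(c)]
print(accepted)
```

Lower-casing the suffix means that files such as `scan.TIFF` pass the check.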
Input
The Actor accepts uploaded files or URLs pointing to images or PDFs. All other parameters are optional with sensible defaults.
The most important input fields are:
- Upload Files — Drag and drop image or PDF files directly
- URLs — List of URLs pointing to documents to parse
- Recognition Mode — Select the type of content analysis (default: page)
- Languages — Comma-separated language codes, e.g. en,ch_sim (default: en,ch_sim)
See the Input Schema tab for the full list of configuration options including table recognition, formula detection, image resize settings, and output format.
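As a rough illustration of the fields described above, the sketch below assembles an input object in Python. Only `pageNumbers`, the `page` default mode, and the `en,ch_sim` default languages come from this page. The key names `urls`, `recognitionMode`, and `languages`, and the list format used for `pageNumbers`, are assumptions, so check them against the Input Schema tab before use.

```python
import json


def build_input(urls, mode="page", languages=("en", "ch_sim"), page_numbers=None):
    """Assemble an Actor input dict (hypothetical helper; key names are assumptions)."""
    if not urls:
        raise ValueError("provide at least one URL")
    run_input = {
        "urls": list(urls),                # assumed key name
        "recognitionMode": mode,           # assumed key name; "page" is the documented default
        "languages": ",".join(languages),  # assumed key name; comma-separated codes
    }
    if page_numbers is not None:
        # pageNumbers is named on this page; the list-of-integers format is an assumption.
        run_input["pageNumbers"] = list(page_numbers)
    return run_input


example = build_input(["https://example.com/paper.pdf"], page_numbers=[1, 2, 3])
print(json.dumps(example, indent=2))
```

A dict like this could then be passed as the `run_input` argument when calling the Actor through the Apify Python client.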
Output
The Actor outputs results to the Dataset (JSON) and/or Key-Value Store (Markdown files), depending on the outputFormat setting.
Dataset output example
{
  "source": "https://example.com/document.png",
  "success": true,
  "recognitionMode": "page",
  "markdown": "## Introduction\n\nThe equation $E = mc^2$ describes...\n\n| Column A | Column B |\n|----------|----------|\n| Value 1 | Value 2 |",
  "format": "png",
  "sizeBytes": 245000,
  "processingTimeMs": 35000,
  "elements": [
    { "type": "TITLE", "text_preview": "Introduction" },
    { "type": "TEXT", "text_preview": "The equation..." },
    { "type": "FORMULA", "text_preview": "E = mc^2" },
    { "type": "TABLE", "text_preview": "| Column A |..." }
  ],
  "metadata": {
    "parsed_at": "2026-02-19T03:08:04.944383",
    "processing_time_ms": 35000,
    "file_size_bytes": 245000,
    "format": "PNG"
  }
}
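Once dataset items are downloaded, a short script can summarize them. The sketch below uses only field names that appear in the example above (`source`, `success`, `elements`, `type`). The shape of a failed item is an assumption: it is taken to carry `success: false` and a `source`, which this page does not confirm.

```python
from collections import Counter


def summarize(items):
    """Count parsed documents, list failed sources, and tally element types."""
    ok = [item for item in items if item.get("success")]
    failed = [item.get("source") for item in items if not item.get("success")]
    element_counts = Counter(
        element["type"] for item in ok for element in item.get("elements", [])
    )
    return {
        "parsed": len(ok),
        "failed_sources": failed,
        "element_counts": dict(element_counts),
    }


items = [
    {
        "source": "https://example.com/document.png",
        "success": True,
        "elements": [
            {"type": "TITLE", "text_preview": "Introduction"},
            {"type": "TEXT", "text_preview": "The equation..."},
            {"type": "FORMULA", "text_preview": "E = mc^2"},
            {"type": "TABLE", "text_preview": "| Column A |..."},
        ],
    },
    # Assumed shape of a failed item; not shown on this page.
    {"source": "https://example.com/broken.png", "success": False},
]
summary = summarize(items)
print(summary)
```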
Key-Value Store output
When outputFormat is set to keyValueStore or both, each successfully parsed document is saved as a .md file in the Key-Value Store, ready to download or use in downstream workflows.
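If you work from dataset items instead, a similar set of .md files can be produced locally. The sketch below writes the `markdown` field of each successful item to a file named after its `source` URL. The helper name `save_markdown` and the naming scheme are illustrative assumptions and do not reflect how the Actor names its Key-Value Store records.

```python
import tempfile
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def save_markdown(items, out_dir):
    """Write each successful item's Markdown to <out_dir>/<source stem>.md."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, item in enumerate(items):
        if not item.get("success"):
            continue  # skip documents that failed to parse
        # Derive a file name from the URL path; fall back to a numbered name.
        # Two sources with the same stem would overwrite each other.
        stem = PurePosixPath(urlparse(item["source"]).path).stem or f"document-{index}"
        target = out / f"{stem}.md"
        target.write_text(item["markdown"], encoding="utf-8")
        written.append(target)
    return written


sample_items = [
    {
        "source": "https://example.com/document.png",
        "success": True,
        "markdown": "## Introduction\n",
    },
    {"source": "https://example.com/broken.png", "success": False},
]

with tempfile.TemporaryDirectory() as tmp:
    paths = save_markdown(sample_items, tmp)
    print([p.name for p in paths])  # ['document.md']
```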
Tips to optimize costs
- Use Text Only mode for simple documents without formulas — it's faster
- Set Disable Formula Recognition to true if you don't need LaTeX output
- Use Page Numbers to process only specific pages of large PDFs
- Lower the Resized Shape value (e.g., 512) for faster processing at the cost of some accuracy
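To see how much page selection can matter, here is a small arithmetic sketch based on the listed starting price of $150.00 per 1,000 pages. It assumes that price applies uniformly per page. Actual billing may differ, since the price is quoted as a "from" figure.

```python
from decimal import Decimal

# Listed starting price; treated here as a flat per-page rate (assumption).
PRICE_PER_1000_PAGES = Decimal("150.00")


def estimate_cost(pages: int) -> Decimal:
    """Estimate the cost in USD of processing the given number of pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return (PRICE_PER_1000_PAGES * pages / 1000).quantize(Decimal("0.01"))


print(estimate_cost(200))  # whole 200-page PDF: 30.00
print(estimate_cost(12))   # only 12 selected pages: 1.80
```

Under this assumption, processing 12 selected pages of a 200-page PDF costs about $1.80 instead of $30.00.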
Supported languages
The Actor supports 80+ languages via its built-in text recognition engine. When using only English (en) or Simplified Chinese (ch_sim), a dedicated fast OCR engine is used automatically. For all other languages, the full multilingual engine is activated.
Provide multiple languages as comma-separated codes, e.g. en,ch_sim,ja
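The engine selection rule described above can be sketched as a simple set check. This mirrors the stated behavior only and is not the Actor's actual implementation. It assumes that any combination of `en` and `ch_sim` (either one alone, or both) qualifies for the fast engine, and the function names are illustrative.

```python
# Languages served by the dedicated fast OCR engine, per the text above.
FAST_ENGINE_LANGUAGES = {"en", "ch_sim"}


def parse_languages(value: str) -> list[str]:
    """Split a comma-separated language string into trimmed codes."""
    return [code.strip() for code in value.split(",") if code.strip()]


def uses_fast_engine(value: str) -> bool:
    """Return True if every requested language is covered by the fast engine."""
    codes = parse_languages(value)
    return bool(codes) and set(codes) <= FAST_ENGINE_LANGUAGES


print(uses_fast_engine("en,ch_sim"))     # True
print(uses_fast_engine("en,ch_sim,ja"))  # False
```

Adding a single language outside that pair, such as `ja`, would switch the run to the full multilingual engine, so it is worth listing only the languages a document actually contains.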
Common language codes
| Code | Language | Code | Language |
|---|---|---|---|
| en | English | ch_sim | Chinese (Simplified) |
| ch_tra | Chinese (Traditional) | ja | Japanese |
| ko | Korean | vi | Vietnamese |
| fr | French | de | German |
| es | Spanish | pt | Portuguese |
| it | Italian | nl | Dutch |
| ru | Russian | uk | Ukrainian |
| ar | Arabic | fa | Persian (Farsi) |
| hi | Hindi | ta | Tamil |
| th | Thai | id | Indonesian |
| ms | Malay | tl | Filipino |
| tr | Turkish | pl | Polish |
| cs | Czech | ro | Romanian |
| hu | Hungarian | el | Greek |
| sv | Swedish | da | Danish |
| no | Norwegian | fi | Finnish |
| bn | Bengali | ur | Urdu |
| ne | Nepali | mn | Mongolian |
| my | Myanmar | ka | Georgian |