Pricing

Pay per event + usage

Markitdown Mcp Server

Cloud-hosted MCP server converting 29+ document formats (PDF, DOCX, PPTX, images, audio) to AI-ready Markdown. Zero Python setup. Perfect for RAG pipelines and AI agents. Pay-per-use: $0.02/conversion. Built on Microsoft's Markitdown (82k+ ⭐).

Developer

RECTOR SOL

Last modified: 4 months ago

Markitdown MCP Server ⚡

Convert any document to AI-ready Markdown in seconds.

Cloud-hosted Model Context Protocol server powered by Microsoft's Markitdown.


🎯 What is This?

Markitdown MCP Server is a cloud-hosted service that converts documents into clean, AI-optimized Markdown. Built on Microsoft's Markitdown library (82k+ ⭐), it eliminates the need for local Python installations and provides instant, scalable document conversion through the Model Context Protocol.

Perfect for RAG pipelines, knowledge bases, AI agents, and document processing workflows.

✨ Key Features

🚀 Universal Format Support

Convert 29+ file formats to clean Markdown:

Documents: PDF, DOCX, PPTX, XLSX
Images: PNG, JPG, GIF (with OCR)
Web: HTML, XML
Audio: MP3, WAV (with transcription)
Archives: ZIP (extract and convert contents)
And many more!

☁️ Zero Setup Required

No Python installation needed
No dependency management
No local configuration
Just call the API and get Markdown

🎭 MCP Native

First-class Model Context Protocol support
Works seamlessly with Claude Desktop, Cursor, Aider
AI agents can discover and use it automatically

⚡ Lightning Fast

Direct Python library integration (no subprocess overhead)
Typical conversion: < 3 seconds
Cloud-scale infrastructure via Apify

💰 Pay-Per-Use

$0.01 per Actor start
$0.02 per document conversion
No subscriptions, no minimums

🎬 Quick Start

📖 INSTALLATION.md - Complete setup for Claude Code CLI, Claude Desktop, Cursor, VS Code, and more

Claude Code CLI (Recommended)

# Add the server with one command
claude mcp add --transport http markitdown \
  https://api.apify.com/v2/acts/rector_labs~markitdown-mcp-server/mcp/latest

# Authenticate: inside a Claude Code session, run the /mcp slash command (opens browser for OAuth)
/mcp

Then in Claude Code:

Convert this PDF to markdown: https://example.com/document.pdf

Claude Desktop

macOS: Edit ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: Edit %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "markitdown": {
      "url": "https://api.apify.com/v2/acts/rector_labs~markitdown-mcp-server/mcp/latest",
      "transport": {
        "type": "http",
        "headers": {
          "Authorization": "Bearer YOUR_APIFY_TOKEN"
        }
      }
    }
  }
}

Restart Claude Desktop and start converting!
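
If claude_desktop_config.json already lists other MCP servers, the entry needs to be merged into the file rather than pasted over it. The following is a minimal Python sketch of that merge. The helper name `add_markitdown_server` is hypothetical, and the entry shape simply mirrors the JSON shown above; it is not an official Apify or Claude tool.

```python
import copy
import json

MCP_URL = "https://api.apify.com/v2/acts/rector_labs~markitdown-mcp-server/mcp/latest"


def add_markitdown_server(config: dict, apify_token: str) -> dict:
    """Return a copy of `config` with the markitdown entry added or replaced."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = merged.setdefault("mcpServers", {})
    servers["markitdown"] = {
        "url": MCP_URL,
        "transport": {
            "type": "http",
            "headers": {"Authorization": f"Bearer {apify_token}"},
        },
    }
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server entry.
    existing = {"mcpServers": {"other": {"command": "example"}}}
    print(json.dumps(add_markitdown_server(existing, "YOUR_APIFY_TOKEN"), indent=2))
```

Load the existing file with `json.load`, pass the result through the helper, and write it back with `json.dump` so that other server entries survive.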

Cursor IDE

Open Settings → MCP Servers
Click Add new MCP server
Paste configuration (see INSTALLATION.md#cursor-ide)
Enable and look for green dot ✅

Get Your Apify Token

Sign up at apify.com (free tier available)
Go to Settings → Integrations
Copy your API Token

📖 INSTALLATION.md

For Developers (API)

Direct HTTP Request

curl -X POST https://api.apify.com/v2/acts/rector_labs~markitdown-mcp-server/runs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "fileUrl": "https://example.com/document.pdf"
  }'

Python Example

from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')
run = client.actor('rector_labs/markitdown-mcp-server').call(
    run_input={
        'fileUrl': 'https://example.com/document.pdf'
    }
)

# Get markdown output
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item['markdown'])

JavaScript/TypeScript Example

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('rector_labs/markitdown-mcp-server').call({
  fileUrl: 'https://example.com/document.pdf'
});

// Get markdown output
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0].markdown);

📚 Supported Formats

Documents & Spreadsheets

Format	Extension	Notes
PDF	`.pdf`	Text extraction, OCR support
Word	`.docx`, `.doc`	Preserves formatting
PowerPoint	`.pptx`, `.ppt`	Slide text extraction
Excel	`.xlsx`, `.xls`	Table to Markdown
CSV	`.csv`	Table formatting
TSV	`.tsv`	Table formatting

Images

Format	Extension	Notes
PNG	`.png`	OCR text extraction
JPEG	`.jpg`, `.jpeg`	OCR text extraction
GIF	`.gif`	OCR text extraction
BMP	`.bmp`	OCR text extraction

Web & Markup

Format	Extension	Notes
HTML	`.html`, `.htm`	Clean conversion
XML	`.xml`	Structured data
Markdown	`.md`	Pass-through

Audio & Video

Format	Extension	Notes
MP3	`.mp3`	Speech-to-text transcription
WAV	`.wav`	Speech-to-text transcription
YouTube	URLs	Transcript extraction

Archives

Format	Extension	Notes
ZIP	`.zip`	Extract and convert contents

💡 Use Cases

🤖 RAG Pipelines

PDF Documents → Markitdown → Clean Markdown → Vector DB → LLM

Perfect for preparing documents for semantic search and retrieval.
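
Between "Clean Markdown" and "Vector DB", the Markdown usually has to be split into chunks before embedding. One common approach is to split on headings; this is an assumption about the downstream pipeline, not something the Actor does for you. The function name `split_by_headings` is illustrative.

```python
import re


def split_by_headings(markdown: str) -> list[dict]:
    """Split Markdown into chunks, one per ATX heading (#, ##, ...)."""
    chunks = []
    title, lines = "", []
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # A new heading closes the previous chunk, if it holds any text.
            if any(l.strip() for l in lines):
                chunks.append({"title": title, "text": "\n".join(lines).strip()})
            title, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    # Flush whatever follows the last heading.
    if any(l.strip() for l in lines):
        chunks.append({"title": title, "text": "\n".join(lines).strip()})
    return chunks
```

This sketch does not special-case fenced code blocks, so a `#` comment line inside converted code would start a new chunk. Sections with no body text are dropped.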

📖 Knowledge Base Migration

Convert legacy documentation (PDFs, Word docs) to modern Markdown format for wikis, documentation sites, or content management systems.

🎓 Research & Academia

Extract text from research papers, presentations, and datasets for analysis and processing.

📊 Data Extraction

Convert invoices, reports, and spreadsheets into structured Markdown for further processing.

🔄 Batch Processing

Process hundreds of documents in parallel using Apify's infrastructure.
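
A rough sketch of client-side fan-out for batch work, using only the Python standard library. The name `convert_many` is hypothetical, and `convert_one` stands for whatever single-document call you already use (for instance, a wrapper around the `client.actor(...).call(...)` snippet from the Python Example).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Union

Result = Union[str, Exception]


def convert_many(
    urls: List[str],
    convert_one: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, Result]:
    """Run `convert_one` for every URL in parallel; failures are captured, not raised."""

    def safe(url: str) -> Result:
        try:
            return convert_one(url)
        except Exception as exc:  # keep one bad file from aborting the batch
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        return dict(zip(urls, pool.map(safe, urls)))
```

Threads fit here because each call spends its time waiting on the network. If every document is submitted as its own Actor run, each run also incurs the per-start fee.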

🔌 Integrations

MCP Clients

Supported clients:

✅ INSTALLATION.md#claude-code-cli - Native HTTP transport with OAuth
✅ INSTALLATION.md#claude-desktop - JSON configuration
✅ INSTALLATION.md#cursor-ide - UI-based installation
✅ INSTALLATION.md#vs-code - Via MCP extensions
✅ INSTALLATION.md#other-mcp-clients - Windsurf, Zed, etc.

📖 INSTALLATION.md

Workflow Automation

n8n Workflow

Add Apify node
Select Markitdown MCP Server actor
Configure file URL input
Connect to downstream nodes

Make.com (Integromat)

Add Apify module
Select actor: rector_labs/markitdown-mcp-server
Map file URL from trigger
Use output in next steps

Zapier

Choose Apify app
Action: Run Actor
Actor: markitdown-mcp-server
Map data from previous steps

⚙️ Configuration

Input Parameters

Parameter	Type	Required	Description
`fileUrl`	string	✅ (or base64)	URL of the document to convert
`fileBase64`	string	✅ (or URL)	Base64-encoded file content

Note: Provide either fileUrl or fileBase64, not both.

Example Inputs

URL-based:

{
  "fileUrl": "https://example.com/document.pdf"
}

Base64-based:

{
  "fileBase64": "JVBERi0xLjQKJeLjz9MKMyAwIG9iago8PC..."
}
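
The "either fileUrl or fileBase64, not both" rule can be enforced on the client before a run is started. A small sketch, assuming you hold either a URL or the raw file bytes; `build_input` is a hypothetical helper, not part of any Apify SDK.

```python
import base64
from typing import Optional


def build_input(file_url: Optional[str] = None, file_bytes: Optional[bytes] = None) -> dict:
    """Build the Actor input with exactly one of fileUrl / fileBase64."""
    if (file_url is None) == (file_bytes is None):
        raise ValueError("Provide exactly one of file_url or file_bytes")
    if file_url is not None:
        return {"fileUrl": file_url}
    # Standard base64, decoded to str so the payload is JSON-serializable.
    return {"fileBase64": base64.b64encode(file_bytes).decode("ascii")}
```

For a local file, something like `build_input(file_bytes=open(path, "rb").read())` produces the Base64-based input shown above.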

📊 Output Format

The actor outputs clean Markdown text with metadata:

{
  "event": "conversion_success",
  "file_size": 153600,
  "markdown_length": 5234,
  "file_type": ".pdf"
}

The Markdown content is returned as the tool response.
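
As an illustration of consuming the metadata record above, here is a hedged sketch that turns it into a one-line summary. It assumes `file_size` is in bytes, which the source does not state, and it only recognises the `conversion_success` event shown above; `summarize` is a made-up name.

```python
def summarize(record: dict) -> str:
    """One-line summary of a conversion metadata record."""
    if record.get("event") != "conversion_success":
        return f"not a success record: event={record.get('event')!r}"
    size_kib = record["file_size"] / 1024  # assumption: file_size is in bytes
    return (
        f"{record['file_type']}: {size_kib:.1f} KiB in, "
        f"Markdown length {record['markdown_length']}"
    )
```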

💲 Pricing

Pay-Per-Event Model

Event	Price	Description
Actor Start	$0.01	One-time fee per Actor run
Document Conversion	$0.02	Per successful conversion

Example Costs

Single document: $0.03 total ($0.01 start + $0.02 conversion)
100 documents: $2.01 ($0.01 start + $2.00 conversions)
1,000 documents: $20.01 ($0.01 start + $20.00 conversions)

No subscriptions. No minimums. Pay only for what you use.
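
The totals follow directly from the two event prices in the table. A small estimator, written as a sketch with `Decimal` to avoid floating-point rounding; the function name and the `actor_starts` parameter are illustrative.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.01")  # one-time fee per Actor run
CONVERSION = Decimal("0.02")   # per successful conversion


def estimate_cost(conversions: int, actor_starts: int = 1) -> Decimal:
    """Estimated charge in USD for the given number of conversions and Actor starts."""
    return ACTOR_START * actor_starts + CONVERSION * conversions
```

With a single start, `estimate_cost(100)` gives 2.01. If each of the 100 documents is submitted as its own run, `estimate_cost(100, actor_starts=100)` gives 3.00, so how the work is grouped into runs matters.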

🚀 Performance

Metric	Value
Average conversion time	< 3 seconds
Small files (< 1MB)	< 2 seconds
Large files (10MB+)	< 10 seconds
Concurrent processing	Unlimited (cloud-scaled)
Uptime	99.95% (Apify SLA)

🛠️ Advanced Features

Error Handling

The actor gracefully handles:

Invalid file URLs (404, network errors)
Unsupported file formats (clear error messages)
Corrupted files (validation before processing)
Large files (automatic timeout handling)

Logging & Debugging

All conversions are logged with:

File type and size
Conversion duration
Success/failure status
Error details (if any)

Custom Options

Coming soon:

Azure Document Intelligence integration
OpenAI image description
Custom OCR settings
Batch processing mode

🔒 Security & Privacy

No data retention: Files are processed and immediately deleted
Encrypted transport: All transfers use HTTPS
Isolated execution: Each conversion runs in a sandboxed container
No logging of content: Only metadata is logged
GDPR compliant: Hosted on Apify's secure infrastructure

❓ FAQ

Q: What's the difference between this and running Markitdown locally?

A: This is a cloud-hosted service with:

✅ No Python installation required
✅ No dependency management
✅ Automatic scaling for batch processing
✅ MCP integration for AI agents
✅ 99.95% uptime guarantee
✅ Pay-per-use (no server costs)

Q: Can I convert password-protected PDFs?

A: Not currently. Password-protected documents will return an error. Remove protection before conversion.

Q: What's the maximum file size?

A: 100 MB hard limit. Files over 50 MB may take longer to process. For larger files, consider splitting them first.
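
A pre-flight size check can save a failed run. The sketch below assumes "MB" means 1024 × 1024 bytes, which the source does not specify; `classify_size` is a hypothetical helper.

```python
MB = 1024 * 1024  # assumption: binary megabytes
HARD_LIMIT = 100 * MB
SLOW_THRESHOLD = 50 * MB


def classify_size(size_bytes: int) -> str:
    """Return 'too_large', 'slow', or 'ok' for a file of the given size."""
    if size_bytes > HARD_LIMIT:
        return "too_large"  # over the hard limit: split the file first
    if size_bytes > SLOW_THRESHOLD:
        return "slow"       # accepted, but may take longer to process
    return "ok"
```

For a local file, `classify_size(os.path.getsize(path))` gives the answer before anything is uploaded.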

Q: Does it work with scanned PDFs (images)?

A: Yes! OCR (Optical Character Recognition) is supported for image-based PDFs and image files.

Q: Can I use this in production?

A: Absolutely! The actor runs on Apify's production infrastructure with 99.95% uptime SLA.

Q: How accurate is the Markdown output?

A: Markitdown preserves:

✅ Headings and structure
✅ Bold and italic formatting
✅ Lists (ordered and unordered)
✅ Tables
✅ Links
✅ Code blocks

Complex layouts may need manual review.

Q: Can I convert multiple files at once?

A: Yes! Run multiple Actor instances in parallel. A dedicated batch processing mode is listed under Custom Options as coming soon; contact support about enterprise pricing.

🐛 Troubleshooting

"File download failed: HTTP 404"

Cause: The URL is invalid or the file doesn't exist.

Solution:

Verify the URL is correct and publicly accessible
Ensure the file hasn't been deleted or moved
Check for authentication requirements

"Unsupported file format"

Cause: The file extension is not in the supported formats list.

Solution:

Check the Supported Formats section
Convert the file to a supported format first
Contact support if you need a specific format added

"Conversion timeout"

Cause: The file is too large or complex.

Solution:

Split large files into smaller chunks
Simplify complex documents
Increase timeout (contact support for enterprise plans)

"Invalid base64 content"

Cause: The base64 string is malformed or incomplete.

Solution:

Verify base64 encoding is correct
Ensure no truncation occurred during transfer
Use fileUrl instead if possible
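
To check a base64 string locally before sending it, Python's standard library can decode it in strict mode. This is a sketch of a client-side check only; the Actor's own validation may differ.

```python
import base64
import binascii


def is_valid_base64(data: str) -> bool:
    """True if `data` is strictly valid standard base64 (alphabet and padding)."""
    try:
        base64.b64decode(data, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False
```

Strict mode also rejects whitespace, so strip any line breaks that a tool inserted into the encoded string before checking it.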

📖 Documentation

MCP Protocol: modelcontextprotocol.io
Microsoft Markitdown: github.com/microsoft/markitdown
Apify Platform: docs.apify.com
Python SDK: docs.apify.com/sdk/python

🤝 Support

Need Help?

📧 Email: support@apify.com
💬 Discord: apify.com/discord
📚 Documentation: docs.apify.com
🐛 Bug Reports: GitHub Issues

Community

⭐ Star on GitHub: RECTOR-LABS/markitdown-mcp-server
🐦 Follow Updates: @apify
💡 Feature Requests: Open a GitHub issue

🚀 Get Started Now

Deploy to Apify

Log in to Apify

$ apify login

Deploy the Actor

$ apify push

Enable Standby Mode

Go to Actor settings and enable standby mode.

Get Your Actor URL

Your MCP endpoint will be: https://rector-labs--markitdown-mcp-server.apify.actor/mcp

Connect AI Agents

Add the endpoint to Claude Desktop, Cursor, or your favorite MCP client!

📜 License

This project is built on:

Microsoft Markitdown: MIT License
Apify SDK: Apache 2.0 License
MCP SDK: MIT License

Actor code: MIT License

🙏 Credits

Built with:

Microsoft Markitdown - Document conversion library (82k+ ⭐)
Apify Platform - Serverless cloud infrastructure
MCP Protocol - AI agent integration standard

Made with ❤️ for the AI developer community
