Markitdown Mcp Server
Pricing
Pay per event
Cloud-hosted MCP server converting 29+ document formats (PDF, DOCX, PPTX, images, audio) to AI-ready Markdown. Zero Python setup. Perfect for RAG pipelines and AI agents. Pay-per-use: $0.02/conversion. Built on Microsoft's Markitdown (82k+ GitHub stars).
Developer

RECTOR SOL
23 days ago
Last modified
Markitdown MCP Server
Convert any document to AI-ready Markdown in seconds.
A cloud-hosted Model Context Protocol server powered by Microsoft's Markitdown.
What is This?
Markitdown MCP Server is a cloud-hosted service that converts documents into clean, AI-optimized Markdown. Built on Microsoft's Markitdown library (82k+ GitHub stars), it eliminates the need for local Python installations and provides instant, scalable document conversion through the Model Context Protocol.
Perfect for RAG pipelines, knowledge bases, AI agents, and document processing workflows.
Key Features
Universal Format Support
Convert 29+ file formats to clean Markdown:
- Documents: PDF, DOCX, PPTX, XLSX
- Images: PNG, JPG, GIF (with OCR)
- Web: HTML, XML
- Audio: MP3, WAV (with transcription)
- Archives: ZIP (extract and convert contents)
- And many more!
Zero Setup Required
- No Python installation needed
- No dependency management
- No local configuration
- Just call the API and get Markdown
MCP Native
- First-class Model Context Protocol support
- Works seamlessly with Claude Desktop, Cursor, Aider
- AI agents can discover and use it automatically
Lightning Fast
- Direct Python library integration (no subprocess overhead)
- Typical conversion: < 3 seconds
- Cloud-scale infrastructure via Apify
Pay-Per-Use
- $0.01 per Actor start
- $0.02 per document conversion
- No subscriptions, no minimums
Quick Start
See INSTALLATION.md for complete setup instructions for Claude Code CLI, Claude Desktop, Cursor, VS Code, and more.
Claude Code CLI (Recommended)
# Add the server with one command
claude mcp add --transport http markitdown \
  https://api.apify.com/v2/acts/rector_labs~markitdown-mcp-server/mcp/latest

# Authenticate from inside Claude Code (opens browser for OAuth)
/mcp
Then in Claude Code:
Convert this PDF to markdown: https://example.com/document.pdf
Claude Desktop
macOS: Edit ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: Edit %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "markitdown": {
      "url": "https://api.apify.com/v2/acts/rector_labs~markitdown-mcp-server/mcp/latest",
      "transport": {
        "type": "http",
        "headers": {
          "Authorization": "Bearer YOUR_APIFY_TOKEN"
        }
      }
    }
  }
}
Restart Claude Desktop and start converting!
Cursor IDE
- Open Settings → MCP Servers
- Click Add new MCP server
- Paste configuration (see INSTALLATION.md#cursor-ide)
- Enable the server and look for the green dot
Get Your Apify Token
- Sign up at apify.com (free tier available)
- Go to Settings → Integrations
- Copy your API Token
For Developers (API)
Direct HTTP Request
curl -X POST https://api.apify.com/v2/acts/rector_labs~markitdown-mcp-server/runs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"fileUrl": "https://example.com/document.pdf"}'
Python Example
from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')
run = client.actor('rector_labs/markitdown-mcp-server').call(
    run_input={'fileUrl': 'https://example.com/document.pdf'}
)

# Get markdown output
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item['markdown'])
JavaScript/TypeScript Example
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
const run = await client.actor('rector_labs/markitdown-mcp-server').call({
  fileUrl: 'https://example.com/document.pdf',
});

// Get markdown output
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0].markdown);
Supported Formats
Documents & Spreadsheets
| Format | Extension | Notes |
|---|---|---|
| PDF | .pdf | Text extraction, OCR support |
| Word | .docx, .doc | Preserves formatting |
| PowerPoint | .pptx, .ppt | Slide text extraction |
| Excel | .xlsx, .xls | Table to Markdown |
| CSV | .csv | Table formatting |
| TSV | .tsv | Table formatting |
Images
| Format | Extension | Notes |
|---|---|---|
| PNG | .png | OCR text extraction |
| JPEG | .jpg, .jpeg | OCR text extraction |
| GIF | .gif | OCR text extraction |
| BMP | .bmp | OCR text extraction |
Web & Markup
| Format | Extension | Notes |
|---|---|---|
| HTML | .html, .htm | Clean conversion |
| XML | .xml | Structured data |
| Markdown | .md | Pass-through |
Audio & Video
| Format | Extension | Notes |
|---|---|---|
| MP3 | .mp3 | Speech-to-text transcription |
| WAV | .wav | Speech-to-text transcription |
| YouTube | URLs | Transcript extraction |
Archives
| Format | Extension | Notes |
|---|---|---|
| ZIP | .zip | Extract and convert contents |
Use Cases
RAG Pipelines
PDF Documents → Markitdown → Clean Markdown → Vector DB → LLM
Perfect for preparing documents for semantic search and retrieval.
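The "Clean Markdown → Vector DB" step usually means splitting the converted text into chunks before embedding. The sketch below is one possible approach, not part of the Actor: the function name `chunk_markdown` and the `max_chars` default are illustrative assumptions. It starts a new chunk at each Markdown heading and splits oversized sections on paragraph boundaries.

```python
import re
from typing import Dict, List

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(markdown: str, max_chars: int = 1000) -> List[Dict[str, str]]:
    """Split Markdown into chunks, starting a new chunk at each heading.

    Sections longer than max_chars are split further on blank lines.
    Hypothetical helper: the name and defaults are not defined by the Actor.
    """
    chunks: List[Dict[str, str]] = []
    heading = ""
    buffer: List[str] = []

    def flush() -> None:
        text = "\n".join(buffer).strip()
        buffer.clear()
        if not text:
            return
        piece = ""
        for para in re.split(r"\n\s*\n", text):
            # Close the current piece when adding this paragraph would overflow.
            if piece and len(piece) + len(para) + 2 > max_chars:
                chunks.append({"heading": heading, "text": piece})
                piece = para
            else:
                piece = f"{piece}\n\n{para}" if piece else para
        if piece:
            chunks.append({"heading": heading, "text": piece})

    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()  # emit the previous section under its own heading
            heading = match.group(2).strip()
        else:
            buffer.append(line)
    flush()
    return chunks
```

Each chunk keeps its heading so it can be stored as metadata next to the embedding. This simple version treats any line starting with `#` as a heading, so a `#` comment inside a code block would also start a new chunk.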
Knowledge Base Migration
Convert legacy documentation (PDFs, Word docs) to modern Markdown format for wikis, documentation sites, or content management systems.
Research & Academia
Extract text from research papers, presentations, and datasets for analysis and processing.
Data Extraction
Convert invoices, reports, and spreadsheets into structured Markdown for further processing.
Batch Processing
Process hundreds of documents in parallel using Apify's infrastructure.
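One way to run conversions in parallel from client code is a thread pool that fans out one call per document. This is a minimal sketch, not the Actor's own batch mode: `convert_many` and `convert_with_apify` are hypothetical helper names, and the `markdown` dataset field is taken from the Python example in this README.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def convert_many(
    urls: Iterable[str],
    convert_one: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, Dict[str, str]]:
    """Run convert_one over urls in parallel, collecting results and errors."""
    results: Dict[str, Dict[str, str]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(convert_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = {"status": "ok", "markdown": future.result()}
            except Exception as exc:  # one failed document should not abort the batch
                results[url] = {"status": "error", "error": str(exc)}
    return results


def convert_with_apify(url: str, token: str = "YOUR_API_TOKEN") -> str:
    """Hypothetical worker that mirrors the Python example in this README."""
    # Imported lazily so the module loads without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor("rector_labs/markitdown-mcp-server").call(
        run_input={"fileUrl": url}
    )
    items = client.dataset(run["defaultDatasetId"]).list_items().items
    return items[0]["markdown"]  # assumes the dataset item has a "markdown" field
```

Calling `convert_many(urls, convert_with_apify)` starts one Actor run per URL, so each document would incur both the start fee and the conversion fee.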
Integrations
MCP Clients
Supported clients:
- Claude Code CLI (INSTALLATION.md#claude-code-cli): native HTTP transport with OAuth
- Claude Desktop (INSTALLATION.md#claude-desktop): JSON configuration
- Cursor IDE (INSTALLATION.md#cursor-ide): UI-based installation
- VS Code (INSTALLATION.md#vs-code): via MCP extensions
- Other MCP clients (INSTALLATION.md#other-mcp-clients): Windsurf, Zed, etc.
Workflow Automation
n8n Workflow
- Add Apify node
- Select Markitdown MCP Server actor
- Configure file URL input
- Connect to downstream nodes
Make.com (Integromat)
- Add Apify module
- Select actor: rector_labs/markitdown-mcp-server
- Map file URL from trigger
- Use output in next steps
Zapier
- Choose Apify app
- Action: Run Actor
- Actor: markitdown-mcp-server
- Map data from previous steps
Configuration
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| fileUrl | string | Yes (or fileBase64) | URL of the document to convert |
| fileBase64 | string | Yes (or fileUrl) | Base64-encoded file content |
Note: Provide either fileUrl or fileBase64, not both.
Example Inputs
URL-based:
{"fileUrl": "https://example.com/document.pdf"}
Base64-based:
{"fileBase64": "JVBERi0xLjQKJeLjz9MKMyAwIG9iago8PC..."}
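Since the input takes either `fileUrl` or `fileBase64` but not both, a small client-side helper can enforce that rule and handle the encoding of local files. This is a sketch under that assumption; `build_actor_input` is a hypothetical name, not part of the Actor or the Apify client.

```python
import base64
from pathlib import Path
from typing import Dict, Optional


def build_actor_input(
    file_url: Optional[str] = None,
    file_path: Optional[str] = None,
) -> Dict[str, str]:
    """Return an input dict containing exactly one of fileUrl / fileBase64."""
    # Reject both "neither given" and "both given".
    if (file_url is None) == (file_path is None):
        raise ValueError("Provide exactly one of file_url or file_path")
    if file_url is not None:
        return {"fileUrl": file_url}
    # Read the local file and encode it as ASCII base64 text.
    raw = Path(file_path).read_bytes()
    return {"fileBase64": base64.b64encode(raw).decode("ascii")}
```

The returned dict can be passed as `run_input` in the Python example above, or serialized as the JSON body of the HTTP request.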
Output Format
The actor outputs clean Markdown text with metadata:
{"event": "conversion_success","file_size": 153600,"markdown_length": 5234,"file_type": ".pdf"}
The Markdown content is returned as the tool response.
Pricing
Pay-Per-Event Model
| Event | Price | Description |
|---|---|---|
| Actor Start | $0.01 | One-time fee per Actor run |
| Document Conversion | $0.02 | Per successful conversion |
Example Costs
- Single document: $0.03 total ($0.01 start + $0.02 conversion)
- 100 documents: $2.01 ($0.01 start + $2.00 conversions)
- 1,000 documents: $20.01 ($0.01 start + $20.00 conversions)
No subscriptions. No minimums. Pay only for what you use.
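The two event prices combine linearly, so a cost estimate is simple arithmetic. The sketch below uses the prices from the table above; the function name `estimate_cost` is illustrative, and how many Actor starts a batch incurs depends on how the runs are launched, so that count is left as a parameter.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.01")  # per Actor start, from the pricing table
CONVERSION_USD = Decimal("0.02")   # per successful conversion


def estimate_cost(conversions: int, actor_starts: int = 1) -> Decimal:
    """Estimate the total charge in USD for a number of conversions."""
    if conversions < 0 or actor_starts < 0:
        raise ValueError("counts must be non-negative")
    return ACTOR_START_USD * actor_starts + CONVERSION_USD * conversions
```

For example, `estimate_cost(100)` assumes a single start for all 100 documents, while `estimate_cost(100, actor_starts=100)` models one separate run per document.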
Performance
| Metric | Value |
|---|---|
| Average conversion time | < 3 seconds |
| Small files (< 1MB) | < 2 seconds |
| Large files (10MB+) | < 10 seconds |
| Concurrent processing | Unlimited (cloud-scaled) |
| Uptime | 99.95% (Apify SLA) |
Advanced Features
Error Handling
The actor gracefully handles:
- Invalid file URLs (404, network errors)
- Unsupported file formats (clear error messages)
- Corrupted files (validation before processing)
- Large files (automatic timeout handling)
Logging & Debugging
All conversions are logged with:
- File type and size
- Conversion duration
- Success/failure status
- Error details (if any)
Custom Options
Coming soon:
- Azure Document Intelligence integration
- OpenAI image description
- Custom OCR settings
- Batch processing mode
Security & Privacy
- No data retention: Files are processed and immediately deleted
- Encrypted transport: All transfers use HTTPS
- Isolated execution: Each conversion runs in a sandboxed container
- No logging of content: Only metadata is logged
- GDPR compliant: Hosted on Apify's secure infrastructure
FAQ
Q: What's the difference between this and running Markitdown locally?
A: This is a cloud-hosted service with:
- No Python installation required
- No dependency management
- Automatic scaling for batch processing
- MCP integration for AI agents
- 99.95% uptime guarantee
- Pay-per-use (no server costs)
Q: Can I convert password-protected PDFs?
A: Not currently. Password-protected documents will return an error. Remove protection before conversion.
Q: What's the maximum file size?
A: 100 MB hard limit. Files over 50 MB may take longer to process. For larger files, consider splitting them first.
Q: Does it work with scanned PDFs (images)?
A: Yes! OCR (Optical Character Recognition) is supported for image-based PDFs and image files.
Q: Can I use this in production?
A: Absolutely! The actor runs on Apify's production infrastructure with 99.95% uptime SLA.
Q: How accurate is the Markdown output?
A: Markitdown preserves:
- Headings and structure
- Bold and italic formatting
- Lists (ordered and unordered)
- Tables
- Links
- Code blocks
Complex layouts may need manual review.
Q: Can I convert multiple files at once?
A: Yes! Run multiple Actor instances in parallel, or use batch mode (contact for enterprise pricing).
Troubleshooting
"File download failed: HTTP 404"
Cause: The URL is invalid or the file doesn't exist.
Solution:
- Verify the URL is correct and publicly accessible
- Ensure the file hasn't been deleted or moved
- Check for authentication requirements
"Unsupported file format"
Cause: The file extension is not in the supported formats list.
Solution:
- Check the Supported Formats section
- Convert the file to a supported format first
- Contact support if you need a specific format added
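A client-side pre-check can catch unsupported extensions before a run is started. This is a rough sketch: the set below only copies the extensions listed in the Supported Formats tables of this README, and the Actor itself may accept more formats or detect types differently. `is_supported` is a hypothetical helper name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the Supported Formats tables; not an authoritative list.
SUPPORTED_EXTENSIONS = frozenset({
    ".pdf", ".docx", ".doc", ".pptx", ".ppt", ".xlsx", ".xls", ".csv", ".tsv",
    ".png", ".jpg", ".jpeg", ".gif", ".bmp",
    ".html", ".htm", ".xml", ".md",
    ".mp3", ".wav",
    ".zip",
})


def is_supported(file_url: str) -> bool:
    """Return True if the URL path ends in one of the listed extensions."""
    path = urlparse(file_url).path  # ignore query strings and fragments
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

URLs without a file extension, such as YouTube links, return False here even though the tables list them as supported, so treat a negative result as a prompt to double-check rather than a hard block.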
"Conversion timeout"
Cause: The file is too large or complex.
Solution:
- Split large files into smaller chunks
- Simplify complex documents
- Increase timeout (contact support for enterprise plans)
"Invalid base64 content"
Cause: The base64 string is malformed or incomplete.
Solution:
- Verify base64 encoding is correct
- Ensure no truncation occurred during transfer
- Use fileUrl instead if possible
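To verify a base64 string before sending it, it can be decoded strictly on the client. The sketch below uses Python's standard `base64` module; `validate_base64` is a hypothetical helper, and the whitespace stripping assumes line breaks may have been added in transit.

```python
import base64
import binascii


def validate_base64(payload: str) -> bytes:
    """Decode strictly, raising ValueError on malformed input."""
    # Drop whitespace and newlines that some tools insert when wrapping lines.
    cleaned = "".join(payload.split())
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(cleaned, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"Invalid base64 content: {exc}") from exc
```

This catches bad characters and broken padding. A string truncated exactly on a 4-character boundary still decodes, so comparing the decoded length with the original file size is a useful extra check.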
Documentation
- MCP Protocol: modelcontextprotocol.io
- Microsoft Markitdown: github.com/microsoft/markitdown
- Apify Platform: docs.apify.com
- Python SDK: docs.apify.com/sdk/python
Support
Need Help?
- Email: support@apify.com
- Discord: apify.com/discord
- Documentation: docs.apify.com
- Bug Reports: GitHub Issues
Community
- Star on GitHub: RECTOR-LABS/markitdown-mcp-server
- Follow Updates: @apify
- Feature Requests: Open a GitHub issue
Get Started Now
Deploy to Apify
- Log in to Apify
$ apify login
- Deploy the Actor
$ apify push
- Enable Standby Mode
Go to Actor settings and enable standby mode.
- Get Your Actor URL
Your MCP endpoint will be: https://rector-labs--markitdown-mcp-server.apify.actor/mcp
- Connect AI Agents
Add the endpoint to Claude Desktop, Cursor, or your favorite MCP client!
License
This project is built on:
- Microsoft Markitdown: MIT License
- Apify SDK: Apache 2.0 License
- MCP SDK: MIT License
Actor code: MIT License
Credits
Built with:
- Microsoft Markitdown - Document conversion library (82k+ GitHub stars)
- Apify Platform - Serverless cloud infrastructure
- MCP Protocol - AI agent integration standard
Made with ❤️ for the AI developer community

