MIME Type Detector
Pricing
Pay per event
Detect MIME types from file extensions, URLs, or magic bytes (base64). Batch process thousands of files. Uses mime-types + file-type packages. Zero proxy, pure utility.
Developer
Stas Persiianenko
Detect MIME types from file extensions, URLs, or raw file bytes (base64-encoded magic bytes). Supports batch processing — analyze thousands of files in a single run.
Uses the mime-types npm package for extension-based lookup and file-type for magic-byte detection. Zero proxy, no HTTP requests — pure utility.
What does MIME Type Detector do?
MIME Type Detector takes a list of filenames, URLs, or base64-encoded file content and returns the correct MIME type for each input. It uses three detection strategies in order of confidence:
- Magic bytes (base64 content): reads the file's binary signature for definitive identification (e.g., PDF files always start with `%PDF-`)
- Filename extension: maps extensions like `.pdf`, `.xlsx`, `.mp4` to their MIME types using the `mime-types` database
- URL path: extracts the file extension from the URL pathname for fast lookup
Each result includes the MIME type, character set (for text formats), canonical extension, detection method, and confidence level.
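The ordering of the three strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses Python's standard `mimetypes` module and a tiny hand-written signature table in place of the `mime-types` and `file-type` npm packages the Actor relies on, and `detect` and `SIGNATURES` are hypothetical names, not part of the Actor.

```python
import base64
import mimetypes
import posixpath
from urllib.parse import urlparse

# Hypothetical, deliberately tiny signature table (file-type knows far more).
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def detect(filename=None, url=None, base64_content=None):
    """Try magic bytes first, then the filename, then the URL path."""
    if base64_content:
        head = base64.b64decode(base64_content)
        for magic, mime in SIGNATURES:
            if head.startswith(magic):
                return {"mimeType": mime, "method": "magic-bytes", "confidence": "high"}
    if filename:
        mime, _ = mimetypes.guess_type(filename)
        if mime:
            return {"mimeType": mime, "method": "extension", "confidence": "medium"}
    if url:
        # Only the pathname is inspected; the query string is ignored.
        name = posixpath.basename(urlparse(url).path)
        mime, _ = mimetypes.guess_type(name)
        if mime:
            return {"mimeType": mime, "method": "url", "confidence": "medium"}
    return {"mimeType": "application/octet-stream", "method": "unknown", "confidence": "low"}
```

In this sketch, an item that carries both a filename and base64 content is resolved by its bytes whenever a signature matches, so the filename only matters as a fallback.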
Who is MIME Type Detector for?
Backend developers building file upload services — validate that uploaded files match their declared content type before processing or storing them.
- Verify file types without running a browser or calling an external API
- Batch-validate hundreds of file names from a database in one run
- Cross-check user-provided content types against actual file signatures
Data pipeline engineers processing mixed media — when pulling files from S3, FTP, or crawl datasets, MIME types determine how to route, transform, or index each file.
- Classify thousands of file records from a manifest or URL list
- Route images, documents, and videos to different processing queues
- Generate accurate `Content-Type` headers for re-serving files
DevOps and infosec teams auditing file uploads — identify files that have been renamed to evade extension-based filters by checking their actual magic bytes.
- Detect `.exe` files disguised as `.txt` or `.jpg`
- Audit S3 buckets or CDN origins for mismatched content types
- Integrate MIME verification into CI/CD pipelines via the Apify API
Automation builders and no-code users — use MIME detection as a step in larger workflows without writing any code.
- Connect to Zapier or Make to add MIME detection to file processing automations
- Export results to Google Sheets for team review
- Schedule recurring audits of file repositories
Why use MIME Type Detector?
- 🔍 Three detection strategies — magic bytes are the gold standard; extension fallback handles the common case; URL extraction works for cloud storage links
- ⚡ Extremely fast — pure CPU, no network requests. Thousands of items process in seconds
- 📦 Batch processing — submit up to thousands of items in one run
- 💯 High accuracy for common formats: PDF, JPEG, PNG, GIF, WebP, MP4, ZIP, DOCX, XLSX, and the other binary formats supported by the `file-type` library
- 🔧 Structured output: every result includes method, confidence, charset, and extension, not just the MIME string
- 💰 Pay-per-item pricing — no monthly subscription; you pay only for items analyzed
- 🔌 Apify API + scheduling — automate MIME checks on a schedule, trigger via webhook, or integrate into any tech stack
- 🤖 MCP-ready — use directly from Claude Code, Claude Desktop, or any MCP-enabled AI agent
What data can you extract?
Each result contains the following fields:
| Field | Type | Description |
|---|---|---|
| `input` | string | The original input value (filename, URL, or truncated base64 label) |
| `mimeType` | string \| null | The detected MIME type, e.g. `application/pdf` |
| `charset` | string \| null | Character set for text formats (e.g. `UTF-8` for `text/html`) |
| `extension` | string \| null | Canonical file extension including the dot (e.g. `.pdf`) |
| `method` | string | Detection method: `magic-bytes`, `extension`, `url`, or `unknown` |
| `confidence` | string | Confidence level: `high` (magic bytes), `medium` (extension/URL), `low` (unknown) |
| `error` | string \| null | Error message if detection failed, `null` otherwise |
Supported input types:
| Input type | Field | Example |
|---|---|---|
| Filename | filename | "report.pdf", "photo.JPEG", "archive.tar.gz" |
| URL | url | "https://cdn.example.com/video.mp4" |
| Base64 content | base64Content | First 4 KB of any file encoded as base64 |
How much does it cost to detect MIME types?
This Actor uses pay-per-event pricing — you pay only for what you detect. No monthly subscription. All platform costs are included.
| | Free | Starter ($29/mo) | Scale ($199/mo) | Business ($999/mo) |
|---|---|---|---|---|
| Per detection | $0.00115 | $0.001 | $0.00078 | $0.0006 |
| 1,000 detections | $1.15 | $1.00 | $0.78 | $0.60 |
| 10,000 detections | $11.50 | $10.00 | $7.80 | $6.00 |
Plus a one-time start fee of $0.005 per run (same across all tiers).
Real-world cost examples:
| Input | Items | Duration | Cost (Free tier) |
|---|---|---|---|
| 10 filenames | 10 | < 1s | ~$0.017 |
| 100 URLs | 100 | < 1s | ~$0.12 |
| 1,000 mixed items | 1,000 | ~2s | ~$1.16 |
| 10,000 filenames | 10,000 | ~5s | ~$11.51 |
With the free $5 credit Apify gives every new account, you can detect over 4,000 MIME types for free.
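The arithmetic behind these figures is a fixed start fee plus a per-item charge. A minimal sketch using the Free-tier rates quoted above (the function names `run_cost` and `max_items` are illustrative, not part of any Apify API):

```python
from decimal import Decimal

# Free-tier figures copied from the pricing section above.
START_FEE = Decimal("0.005")        # charged once per run
PER_DETECTION = Decimal("0.00115")  # charged per analyzed item


def run_cost(items: int) -> Decimal:
    """Cost in USD of a single run that analyzes `items` inputs."""
    return START_FEE + PER_DETECTION * items


def max_items(budget: Decimal) -> int:
    """Largest single-run batch that fits within `budget` USD."""
    if budget < START_FEE:
        return 0
    return int((budget - START_FEE) // PER_DETECTION)
```

With these rates, `run_cost(1000)` is $1.155 and `max_items(Decimal("5"))` is 4,343, which is where the "over 4,000" figure for the free credit comes from when everything is sent in one run.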
How to detect MIME types with this Actor
- Open MIME Type Detector on Apify Store
- Click Try for free
- In the Items to detect field, enter your list of filenames, URLs, or base64 content
- Click Start — the run completes in seconds
- View results in the Dataset tab, or export to JSON/CSV/Excel
Example: detect from filenames only

    {
      "items": [
        { "filename": "invoice.pdf" },
        { "filename": "product_photo.jpg" },
        { "filename": "data_export.csv" },
        { "filename": "backup.tar.gz" }
      ]
    }
Example: detect from URLs

    {
      "items": [
        { "url": "https://cdn.example.com/assets/logo.svg" },
        { "url": "https://files.example.com/report_2024.xlsx" },
        { "url": "https://storage.googleapis.com/bucket/video.webm" }
      ]
    }
Example: magic-byte detection from base64 content

    {
      "items": [
        { "base64Content": "JVBERi0xLjQKJeLjz9MKCg==" },
        { "base64Content": "iVBORw0KGgoAAAANSUhEUgA=" },
        { "filename": "unknown.bin", "base64Content": "UEsDBBQAAAAIA..." }
      ]
    }
When both filename and base64Content are provided, magic-byte detection runs first — the filename is used as a label if magic-byte detection fails.
Input parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `items` | array | Yes | List of items to analyze. Each item must have at least one of: `filename`, `url`, `base64Content` |
| `items[].filename` | string | No | A filename (e.g. `photo.jpg`) or just the extension (e.g. `.pdf`) |
| `items[].url` | string | No | A full URL. The path component is parsed for the file extension |
| `items[].base64Content` | string | No | Base64-encoded file bytes. The first 4,100 bytes are sufficient for magic-byte detection |
Tips for inputs:
- Filenames are case-insensitive: `.JPEG`, `.jpeg`, `.Jpeg` all resolve to `image/jpeg`
- URLs with query parameters (e.g. `?token=abc`) are parsed correctly; only the pathname is used
- When providing base64 content, you don't need the full file; the first 512 bytes are enough for most formats. Use 4,100 bytes to cover all formats supported by `file-type`
- You can mix all three input types in the same batch
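Preparing `base64Content` from a file on disk only requires reading its header. A minimal sketch, assuming the 4,100-byte figure above; `build_item` and `HEADER_BYTES` are hypothetical names introduced for illustration:

```python
import base64
from pathlib import Path

HEADER_BYTES = 4100  # upper bound quoted above for magic-byte detection


def build_item(path: str) -> dict:
    """Return an input item carrying only the first HEADER_BYTES of the file."""
    with open(path, "rb") as fh:
        head = fh.read(HEADER_BYTES)
    return {
        "filename": Path(path).name,
        "base64Content": base64.b64encode(head).decode("ascii"),
    }
```

A batch can then be assembled as `{"items": [build_item(p) for p in paths]}`, which keeps the payload small regardless of how large the files themselves are.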
Output examples
Successful detection from filename:

    {
      "input": "invoice.pdf",
      "mimeType": "application/pdf",
      "charset": null,
      "extension": ".pdf",
      "method": "extension",
      "confidence": "medium",
      "error": null
    }
Magic-byte detection (highest confidence):

    {
      "input": "base64:JVBERi0xLjQKJe...",
      "mimeType": "application/pdf",
      "charset": null,
      "extension": ".pdf",
      "method": "magic-bytes",
      "confidence": "high",
      "error": null
    }
Text format with charset:

    {
      "input": "styles.css",
      "mimeType": "text/css",
      "charset": "UTF-8",
      "extension": ".css",
      "method": "extension",
      "confidence": "medium",
      "error": null
    }
Unknown format (fallback):

    {
      "input": "unknown_file",
      "mimeType": "application/octet-stream",
      "charset": null,
      "extension": null,
      "method": "unknown",
      "confidence": "low",
      "error": null
    }
Malformed item (error):

    {
      "input": "(empty)",
      "mimeType": null,
      "charset": null,
      "extension": null,
      "method": "unknown",
      "confidence": "low",
      "error": "Each item must have at least one of: filename, url, base64Content"
    }
Tips for best results
- 🎯 Use magic bytes for security-critical checks: extension-based detection can be spoofed by renaming files. Always use `base64Content` when verifying file uploads in security-sensitive contexts
- 📉 Keep base64 content short: you only need the first 4,100 bytes for reliable detection. Sending full file contents wastes bandwidth and makes no difference to accuracy
- 🚀 Batch all items in one run: the actor is optimized for bulk processing. Submitting 10,000 items in one run is more efficient than 10,000 separate runs
- 📊 Use the `method` field to filter results: if you need only high-confidence detections, filter for `method === "magic-bytes"`. For general use, `extension` results are reliable for well-known formats
- ⚠️ `application/octet-stream` means unknown: this is the RFC 2046 fallback when no MIME type could be determined. Check the `method` field: `unknown` means no detection succeeded
- 🔗 URLs must have a file extension in the path: URLs like `https://api.example.com/files/12345` (no extension) cannot be detected by extension lookup. Provide `base64Content` instead
- 📋 Start small: test with a handful of items first to verify the results match your expectations before submitting large batches
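Filtering on `method` and `error` can be done after downloading the dataset. The following is a rough sketch that relies only on the output fields documented above; the `triage` helper and its three-way split are assumptions made for illustration:

```python
def triage(results):
    """Split Actor output items into trusted, weak, and unresolved groups."""
    trusted, weak, unresolved = [], [], []
    for item in results:
        if item.get("error") or item.get("method") == "unknown":
            unresolved.append(item)   # malformed input or no detection succeeded
        elif item.get("method") == "magic-bytes":
            trusted.append(item)      # identified from the file's own bytes
        else:
            weak.append(item)         # "extension" or "url" lookups
    return trusted, weak, unresolved
```

Items in the unresolved group are the ones worth resubmitting with `base64Content`, since an extension or URL lookup has nothing more to offer for them.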
Integrations
MIME Type Detector → Google Sheets — export results to Google Sheets for team review of a file repository audit. Use Apify's native Google Sheets integration or the Google Sheets API Actor.
MIME Type Detector → Make (Integromat) — trigger MIME detection in a Make scenario when new files arrive in Dropbox, S3, or Google Drive. Route files to different processing flows based on the detected MIME type.
MIME Type Detector → Zapier — chain with Zapier's file processing actions to validate uploads before storing them in Airtable, Notion, or your CRM.
Scheduled audit — set a daily or weekly schedule to run MIME detection on a list of URLs from your CDN or file storage. Get alerts when file types change unexpectedly.
Webhook-triggered validation — call the Actor via Apify API webhook whenever a new file is uploaded to your system. Return the MIME type to your backend for routing decisions without running a Node.js process yourself.
CI/CD pipeline integration — call the actor from a GitHub Actions workflow or Jenkins job to validate that build artifacts have the expected MIME types before deployment.
Using the Apify API
Run MIME Type Detector programmatically from any language using the Apify API.
Node.js

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

    // Start the Actor and wait for the run to finish.
    const run = await client.actor('automation-lab/mime-type-detector').call({
        items: [
            { filename: 'document.pdf' },
            { filename: 'photo.jpg' },
            { url: 'https://cdn.example.com/video.mp4' },
            { base64Content: 'JVBERi0xLjQKJeLjz9MKCg==' },
        ],
    });

    // Read the detection results from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
Python

    from apify_client import ApifyClient

    client = ApifyClient(token='YOUR_APIFY_TOKEN')

    # Start the Actor and wait for the run to finish.
    run = client.actor('automation-lab/mime-type-detector').call(run_input={
        'items': [
            {'filename': 'document.pdf'},
            {'filename': 'photo.jpg'},
            {'url': 'https://cdn.example.com/video.mp4'},
            {'base64Content': 'JVBERi0xLjQKJeLjz9MKCg=='},
        ],
    })

    # Read the detection results from the run's default dataset.
    items = client.dataset(run['defaultDatasetId']).list_items().items
    print(items)
cURL

    curl -X POST "https://api.apify.com/v2/acts/automation-lab~mime-type-detector/runs?token=YOUR_APIFY_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"items": [{"filename": "document.pdf"}, {"filename": "photo.jpg"}, {"url": "https://cdn.example.com/video.mp4"}]}'
Use with AI agents via MCP
MIME Type Detector is available as a tool for AI assistants that support the Model Context Protocol (MCP).
Add the Apify MCP server to your AI client — this gives you access to all Apify actors, including this one:
Setup for Claude Code

    claude mcp add --transport http apify "https://mcp.apify.com"
Setup for Claude Desktop, Cursor, or VS Code
Add this to your MCP config file:

    {
      "mcpServers": {
        "apify": {
          "url": "https://mcp.apify.com"
        }
      }
    }
Your AI assistant will use OAuth to authenticate with your Apify account on first use.
Example prompts
Once connected, try asking your AI assistant:
- "Use automation-lab/mime-type-detector to detect the MIME types of: document.pdf, photo.jpg, archive.tar.gz, and script.js"
- "I have a list of CDN URLs from our S3 bucket — use the MIME Type Detector to classify each file by type"
- "Check what MIME type this base64-encoded file header corresponds to: JVBERi0xLjQKJeLjz9MKCg=="
Learn more in the Apify MCP documentation.
Is it legal to use MIME Type Detector?
Yes. This actor performs no web scraping and makes no HTTP requests to third-party websites. It processes only the data you supply — filenames, URL strings, and base64 content — entirely locally. There are no terms of service concerns, no robots.txt considerations, and no privacy implications unless you supply personally identifiable filenames.
MIME type detection is a standard software operation, equivalent to running the `file` command on Linux or reading `URLResponse.mimeType` on iOS. Use it responsibly as part of lawful file processing pipelines.
FAQ
What MIME types does it support?
Extension-based detection uses the mime-types package, which covers 750+ MIME types including all common document, image, video, audio, archive, and code formats. Magic-byte detection uses the file-type package, which supports 150+ binary formats including PDF, JPEG, PNG, GIF, WebP, HEIC, MP4, ZIP, RAR, DOCX, XLSX, and more.
How many items can I process in one run?
There is no hard limit set by the actor. In practice, runs with tens of thousands of items complete in under a minute. Very large batches (100K+ items) can take a few minutes, which is longer than the 60-second default timeout, so increase `timeoutSecs` when processing batches of that size.
How much does it cost per item?
On the Free plan, each detection costs $0.00115 plus a $0.005 start fee per run. For 1,000 items that's about $1.16. Paid plans (Starter, Scale, Business) offer significant discounts; see the pricing table above.
Is extension-based detection reliable?
For well-known formats (PDF, JPEG, MP4, DOCX, ZIP), yes — extensions are standardized and the mime-types database is comprehensive. For security-sensitive use cases (e.g., blocking malicious uploads), always use magic-byte detection via base64Content because extensions can be renamed.
Why does my URL return application/octet-stream?
URLs that don't have a file extension in the path (e.g., https://api.example.com/files/12345) can't be detected by extension lookup. Provide the file's bytes as base64Content for reliable detection.
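Before submitting a batch, you can check locally which URLs carry no extension and will therefore need `base64Content`. A small sketch using only the Python standard library; `url_extension` is a hypothetical helper, and the Actor's own parsing may differ in edge cases:

```python
import posixpath
from urllib.parse import urlparse


def url_extension(url: str) -> str:
    """Return the lower-cased extension of the URL pathname, or "" if none."""
    path = urlparse(url).path  # drops the query string and fragment
    return posixpath.splitext(posixpath.basename(path))[1].lower()
```

For example, `[u for u in urls if not url_extension(u)]` lists the URLs that an extension lookup cannot resolve.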
Why does the actor return method: "unknown" for some filenames?
Files with no extension (e.g., Makefile, .gitignore, README) or unrecognized extensions don't have a MIME type in the database. The actor returns application/octet-stream as the RFC 2046 safe default.
Can it detect files disguised with wrong extensions?
Yes — if you provide base64Content, the actor checks magic bytes first regardless of the filename. A .txt file that is actually a PDF will be detected as application/pdf with confidence: "high".
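Flagging such mismatches in your own code means comparing the type implied by the original filename with the type reported for the bytes. The sketch below assumes you keep track of which filename belongs to which result, and it uses Python's `mimetypes` module for the filename side, whose mappings can differ slightly from the `mime-types` database; `is_disguised` is a hypothetical name.

```python
import mimetypes


def is_disguised(filename: str, result: dict) -> bool:
    """True when a magic-byte result disagrees with the filename's extension."""
    if result.get("method") != "magic-bytes":
        return False  # nothing trustworthy to compare against
    claimed, _ = mimetypes.guess_type(filename)
    return claimed is not None and claimed != result.get("mimeType")
```

Only `magic-bytes` results are compared, because an `extension` or `url` result is derived from the name itself and would always agree with it.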
Other utility actors
Looking for more file and data processing tools? Check out these automation-lab actors:
- Color Contrast Checker (WCAG) — validate color pairs against WCAG 2.1 AA/AAA accessibility standards
- JSON Schema Generator — generate JSON Schema from sample JSON data
- Lazada Scraper — extract product listings and reviews from Lazada

