ZIP Extractor
Under maintenance

Pricing

from $8.00 / 1,000 results

Upload any file type or provide a URL, automatically extract archives (ZIP, RAR, TAR, 7Z), categorize files intelligently, and get structured output in milliseconds with download links. Supports flexible storage policies (permanent or expiry-based) with automatic cleanup.

Developer

mikolabs

Maintained by Community

Zip Extractor Actor (Apify)

A comprehensive Apify Actor that accepts any file type (uploaded or via URL), automatically extracts archives, categorizes the extracted files, and returns structured output with download links and image previews.

🌟 Key Features

  • Universal File Support: Upload or provide URLs for any file type
  • Smart Archive Extraction: Automatically detects and extracts ZIP, RAR, TAR, 7Z, GZIP, BZIP2, XZ archives
  • Intelligent Categorization: Files are grouped into categories (images, videos, documents, data, code, etc.)
  • Flexible Storage Policies: Choose permanent storage or set expiration times (TTL or absolute date)
  • Public Download Links: Every file gets a direct download URL
  • Preview Generation: Automatic thumbnail generation for images
  • Structured Output: JSON output grouped by file type for easy UI integration
  • Cleanup Automation: Scheduled cleanup Actor to delete expired files
  • Size Protection: Configurable limits to prevent storage abuse

📦 Supported Archive Formats

| Format | Extensions | Notes |
|--------|------------|-------|
| ZIP | .zip | Full support (built-in) |
| TAR | .tar, .tar.gz, .tgz | Full support (built-in) |
| GZIP | .gz | Single file compression |
| BZIP2 | .bz2 | Single file compression |
| XZ | .xz | Single file compression |
| RAR | .rar | Requires unrar (included in Docker) |
| 7-Zip | .7z | Full support via py7zr |
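
As a rough illustration of how the table above maps file names to formats, here is a minimal Python sketch. The function name `detect_archive_format` and the suffix mapping are hypothetical; the Actor's real detection logic is not shown in this README and may inspect file contents rather than names.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping built from the table above (assumption: detection
# by file extension only).
ARCHIVE_SUFFIXES = {
    ".zip": "zip",
    ".tar": "tar",
    ".tgz": "tar",
    ".gz": "gzip",
    ".bz2": "bzip2",
    ".xz": "xz",
    ".rar": "rar",
    ".7z": "7z",
}


def detect_archive_format(filename: str) -> Optional[str]:
    """Return the archive format implied by the file name, or None."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    if not suffixes:
        return None
    # ".tar.gz" is a TAR archive, not a single gzip-compressed file.
    if suffixes[-2:] == [".tar", ".gz"]:
        return "tar"
    return ARCHIVE_SUFFIXES.get(suffixes[-1])
```

With this mapping, `backup.tar.gz` resolves to `tar` while `data.csv.gz` resolves to `gzip`, matching the distinction the table draws between TAR archives and single-file compression.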

📂 File Categories

Files are automatically categorized into:

  • Images: JPG, PNG, GIF, WebP, SVG, BMP, TIFF, ICO
  • Videos: MP4, AVI, MKV, MOV, WMV, FLV, WebM
  • Audio: MP3, WAV, OGG, FLAC, AAC, M4A
  • Documents: PDF, DOC, DOCX, XLS, XLSX, PPT, PPTX, TXT, RTF
  • Data: CSV, JSON, XML, YAML
  • Archives: ZIP, RAR, 7Z, TAR, GZ, BZ2, XZ
  • Code: PY, JS, Java, C, C++, HTML, CSS, SH
  • Other: Everything else
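
The grouping above can be sketched as a lookup from file extension to category. This is only an illustration under stated assumptions: the name `categorize`, the lowercase category keys, and reading the "Code" entries as extensions (for example C++ as `.cpp`) are not confirmed by the README.

```python
from pathlib import PurePosixPath

# Extension sets copied from the list above. Mapping "C++" to "cpp" is an
# assumption; the Actor may recognize more extensions than shown here.
CATEGORY_EXTENSIONS = {
    "images": {"jpg", "png", "gif", "webp", "svg", "bmp", "tiff", "ico"},
    "videos": {"mp4", "avi", "mkv", "mov", "wmv", "flv", "webm"},
    "audio": {"mp3", "wav", "ogg", "flac", "aac", "m4a"},
    "documents": {"pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx", "txt", "rtf"},
    "data": {"csv", "json", "xml", "yaml"},
    "archives": {"zip", "rar", "7z", "tar", "gz", "bz2", "xz"},
    "code": {"py", "js", "java", "c", "cpp", "html", "css", "sh"},
}


def categorize(filename: str) -> str:
    """Return the category for a file name, falling back to "other"."""
    ext = PurePosixPath(filename).suffix.lower().lstrip(".")
    for category, extensions in CATEGORY_EXTENSIONS.items():
        if ext in extensions:
            return category
    return "other"
```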

🚀 How to Use

Option 1: Upload File via Apify Console

  1. Go to the Actor in Apify Console
  2. Click "Upload file" in the input section
  3. Select your file (it will be stored as key INPUT in the input KVS)
  4. Configure storage policy and other options
  5. Click "Start"

Option 2: Provide a Public URL

  1. Enter the file URL in the fileUrl field
  2. Configure storage policy and other options
  3. Click "Start"

Option 3: Via API

const { ApifyClient } = require('apify-client');

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Top-level await is not available in CommonJS, so wrap the calls in an async function.
(async () => {
    // Start the Actor and wait for the run to finish.
    const run = await client.actor('YOUR_ACTOR_ID').call({
        fileUrl: 'https://example.com/archive.zip',
        extractArchives: true,
        storePolicy: 'expire',
        retainSeconds: 604800, // 7 days
        generatePreviews: true,
        maxFileSize: 500, // MB
        maxTotalSize: 2048, // MB
    });

    // Each extracted file has a record in the run's default dataset.
    const { defaultDatasetId } = run;
    const { items } = await client.dataset(defaultDatasetId).listItems();
    console.log('Extracted files:', items);
})();

📥 Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| fileUrl | string | No | - | Public URL to download the file |
| uploadedFileKey | string | No | INPUT | Key for file uploaded to input KVS |
| extractArchives | boolean | No | true | Auto-extract archive files |
| storePolicy | enum | No | expire | Storage policy: permanent or expire |
| retainSeconds | integer | No | 604800 | TTL in seconds (7 days default) |
| expireAt | string | No | - | Absolute expiration (ISO datetime) |
| generatePreviews | boolean | No | true | Generate image thumbnails |
| maxFileSize | integer | No | 500 | Max file size in MB |
| maxTotalSize | integer | No | 2048 | Max total extraction size in MB |
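
Before starting a run from your own code, it can help to apply the defaults from the table and reject obviously invalid values on the client side. The sketch below is a hypothetical helper (`normalize_input` is not part of the Actor or of any Apify library), and the validation rules are assumptions derived from the table.

```python
# Defaults copied from the parameter table above.
DEFAULTS = {
    "uploadedFileKey": "INPUT",
    "extractArchives": True,
    "storePolicy": "expire",
    "retainSeconds": 604800,
    "generatePreviews": True,
    "maxFileSize": 500,
    "maxTotalSize": 2048,
}
OPTIONAL_WITHOUT_DEFAULT = {"fileUrl", "expireAt"}


def normalize_input(raw: dict) -> dict:
    """Merge user input with the documented defaults and run basic checks."""
    unknown = set(raw) - set(DEFAULTS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    merged = {**DEFAULTS, **raw}
    if merged["storePolicy"] not in ("permanent", "expire"):
        raise ValueError("storePolicy must be 'permanent' or 'expire'")
    for key in ("retainSeconds", "maxFileSize", "maxTotalSize"):
        value = merged[key]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
    return merged
```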

📤 Output Structure

The Actor produces a structured JSON output saved to the Key-Value Store under the OUTPUT key:

{
  "runId": "abc123xyz",
  "totalFiles": 25,
  "totalSize": 12345678,
  "storagePolicy": "expire",
  "expiresAt": "2026-02-01T12:00:00Z",
  "grouped": {
    "images": [
      {
        "filename": "photo1.jpg",
        "size": 123456,
        "downloadUrl": "https://api.apify.com/v2/key-value-stores/.../records/...",
        "previewUrl": "https://api.apify.com/v2/key-value-stores/.../records/...",
        "expiresAt": "2026-02-01T12:00:00Z"
      }
    ],
    "videos": [...],
    "documents": [...],
    "data": [...],
    "code": [...],
    "other": [...]
  },
  "archives": [
    {
      "filename": "photos.zip",
      "extractedFiles": 15
    }
  ]
}
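
To show how a consumer might walk this structure, here is a small Python sketch that flattens the `grouped` section into rows. The helper name `list_downloads` is hypothetical, and it assumes every entry carries the `filename` and `downloadUrl` fields shown in the example above.

```python
def list_downloads(output: dict) -> list:
    """Flatten output["grouped"] into (category, filename, downloadUrl) tuples."""
    rows = []
    for category, files in output.get("grouped", {}).items():
        for entry in files:
            rows.append((category, entry["filename"], entry["downloadUrl"]))
    return rows


# Usage with a trimmed-down OUTPUT record (URLs are placeholders).
sample_output = {
    "grouped": {
        "images": [{"filename": "photo1.jpg", "downloadUrl": "https://example.com/photo1.jpg"}],
        "documents": [],
    }
}
for category, filename, url in list_downloads(sample_output):
    print(f"{category}: {filename} -> {url}")
```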

Dataset Records

Each extracted file also has a detailed record in the Dataset:

{
  "runId": "abc123xyz",
  "filename": "photo.jpg",
  "relativePath": "photos/photo.jpg",
  "originalArchive": "photos.zip",
  "kvsKey": "files/abc123xyz/photos/photo.jpg",
  "category": "images",
  "mimeType": "image/jpeg",
  "size": 123456,
  "downloadUrl": "https://api.apify.com/v2/...",
  "isPermanent": false,
  "expiresAt": "2026-02-01T12:00:00Z",
  "savedAt": "2026-01-24T18:01:52Z",
  "hasPreview": true,
  "previewUrl": "https://api.apify.com/v2/..."
}

🗑️ Cleanup Actor

To automatically delete expired files, deploy the cleanup Actor and schedule it:

Deploy Cleanup Actor

  1. Create a new Actor in Apify Console
  2. Copy the contents of src/cleanup.py to the new Actor's main.py
  3. Use the same Dockerfile and requirements.txt
  4. Deploy the Actor

Schedule Cleanup

  1. Go to the cleanup Actor's page
  2. Click "Schedules" → "Create new"
  3. Set frequency (e.g., daily at 2 AM)
  4. The cleanup Actor will scan for expired files and delete them

Cleanup Output

{
  "cleanupRun": "2026-01-25T02:00:00Z",
  "totalRecords": 150,
  "deleted": 23,
  "skipped": 125,
  "errors": 2
}
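
The contents of src/cleanup.py are not reproduced in this README, so the following is only a sketch of the selection step a cleanup pass might perform: picking the dataset records whose `expiresAt` has passed. The function names are hypothetical, and treating a timestamp without a timezone as UTC is an assumption.

```python
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2026-02-01T12:00:00Z"."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed


def is_expired(record: dict, now: datetime) -> bool:
    """Permanent records and records without expiresAt never expire."""
    if record.get("isPermanent") or not record.get("expiresAt"):
        return False
    return parse_utc(record["expiresAt"]) <= now


def select_expired(records: list, now: datetime) -> list:
    """Return the kvsKey of every record that should be deleted."""
    return [r["kvsKey"] for r in records if is_expired(r, now)]
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the helper, keeps the selection deterministic and easy to test.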

💡 Use Cases

1. Media Library Management

Upload ZIP archives of photos/videos, get organized output with previews and download links.

2. Document Processing

Extract and categorize documents from archives for further processing.

3. Data Pipeline

Download data files (CSV, JSON) from URLs, extract if archived, and feed to other Actors.

4. Backup & Archive Service

Store files with expiration policies for temporary backup needs.

5. File Conversion Workflows

Extract files, categorize them, and route to appropriate conversion Actors.

🔒 Security & Privacy

⚠️ Important: All files stored in the Key-Value Store will have public download URLs. Do not upload sensitive or confidential data unless you understand the security implications.

Best Practices:

  • Use expiration policies for sensitive data (don't use permanent storage)
  • Consider encrypting sensitive files before upload
  • Monitor access logs if handling user data
  • Implement additional authentication layers if needed

🎯 Storage Policies Explained

Permanent Storage

  • Files are stored until manually deleted
  • No automatic cleanup
  • Use for: Long-term archives, public assets, reference files

Expiry-Based Storage

  • Files are deleted once they expire, but only when the cleanup Actor runs
  • Set via retainSeconds (relative) or expireAt (absolute)
  • The cleanup Actor must be scheduled to run periodically; without it, expired files stay in storage
  • Use for: Temporary uploads, processing pipelines, time-limited sharing

Example: 24-hour temporary storage

{
  "fileUrl": "https://example.com/temp.zip",
  "storePolicy": "expire",
  "retainSeconds": 86400
}

Example: Expire on specific date

{
  "fileUrl": "https://example.com/event.zip",
  "storePolicy": "expire",
  "expireAt": "2026-02-01T00:00:00Z"
}
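
The two examples above can be reduced to a single expiry timestamp. The sketch below shows one way to do that; `resolve_expiry` is a hypothetical name, and letting `expireAt` take precedence over `retainSeconds` when both are set is an assumption, since this README does not say which one wins.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def resolve_expiry(run_input: dict, now: datetime) -> Optional[str]:
    """Return the expiry as an ISO 8601 UTC string, or None for permanent storage."""
    if run_input.get("storePolicy", "expire") == "permanent":
        return None
    if run_input.get("expireAt"):
        # Assumption: an absolute date overrides the relative TTL.
        expires = datetime.fromisoformat(run_input["expireAt"].replace("Z", "+00:00"))
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    else:
        # 604800 seconds (7 days) is the documented default for retainSeconds.
        expires = now + timedelta(seconds=run_input.get("retainSeconds", 604800))
    return expires.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```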

📊 Size Limits & Protection

To prevent abuse and control costs:

  • Max File Size: Default 500MB per file (configurable up to 2GB)
  • Max Total Size: Default 2GB per run (configurable up to 10GB)
  • Archive Bomb Protection: Extraction stops if limits are exceeded
  • Memory Efficient: Streaming downloads and extraction
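
As an illustration of the kind of check that archive bomb protection involves, the sketch below inspects a ZIP file's declared uncompressed sizes before extracting anything. It uses only Python's standard `zipfile` module; the function name `check_zip_limits` and the choice of 1 MB = 1024 * 1024 bytes are assumptions, and this is not the Actor's actual implementation. Declared sizes can be forged, so a robust extractor would also count bytes while streaming each member.

```python
import io
import zipfile

MB = 1024 * 1024  # assumption: the limits are expressed in binary megabytes


def check_zip_limits(archive, max_file_mb: int = 500, max_total_mb: int = 2048) -> int:
    """Return the total declared size in bytes, or raise ValueError if a limit is hit."""
    total = 0
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            if info.file_size > max_file_mb * MB:
                raise ValueError(f"{info.filename} exceeds {max_file_mb} MB")
            total += info.file_size
            if total > max_total_mb * MB:
                raise ValueError(f"Archive exceeds {max_total_mb} MB in total")
    return total


# Usage with a small in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("a.txt", b"x" * 1000)
    zf.writestr("b.txt", b"y" * 500)
buffer.seek(0)
print(check_zip_limits(buffer))
```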

🛠️ Development

Local Testing

# Install dependencies
pip install -r requirements.txt
# Set up Apify token
export APIFY_TOKEN=your_token_here
# Run locally
apify run

Input for Local Testing

Create .actor/INPUT.json:

{
  "fileUrl": "https://example.com/test.zip",
  "extractArchives": true,
  "storePolicy": "expire",
  "retainSeconds": 3600
}

🐛 Troubleshooting

"No file found with key 'INPUT'"

  • Make sure you uploaded a file in the Apify Console
  • Or provide a fileUrl instead

"Unsupported archive format"

  • Check if the file extension is supported
  • Some formats (like password-protected archives) are not supported

"File exceeds maximum size"

  • Increase maxFileSize or maxTotalSize parameters
  • Or split your archive into smaller parts

Extraction fails silently

  • Check Actor logs for detailed error messages
  • Verify the archive is not corrupted
  • Ensure the archive format is supported

📚 API Reference

Apify SDK Documentation

  • Use with file conversion Actors
  • Chain with data processing Actors
  • Integrate with notification Actors

🛠️ Apify API Endpoints

Run Actor (Async)

POST https://api.apify.com/v2/acts/mikolabs~zip-extractor/runs?token=***

Run Actor Sync + Get OUTPUT

POST https://api.apify.com/v2/acts/mikolabs~zip-extractor/run-sync?token=***

Run Sync + Return Dataset Items

POST https://api.apify.com/v2/acts/mikolabs~zip-extractor/run-sync-get-dataset-items?token=***

Get Actor Details

GET https://api.apify.com/v2/acts/mikolabs~zip-extractor?token=***

OpenAPI Definition

GET https://api.apify.com/v2/acts/mikolabs~zip-extractor/builds/default/openapi.json
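
For callers without the Apify client library, the run-sync-get-dataset-items endpoint listed above can be called with Python's standard library alone. This is a minimal sketch: the helper names are hypothetical, the token is passed as a query parameter as in the URLs above, and the 300-second timeout is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2/acts/mikolabs~zip-extractor"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a synchronous run."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}/run-sync-get-dataset-items?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token: str, run_input: dict, timeout: float = 300) -> list:
    """Send the request and return the dataset items as parsed JSON."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_and_fetch_items("YOUR_API_TOKEN", {"fileUrl": "https://example.com/archive.zip"})` would block until the run finishes and then return one record per extracted file.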

📝 License

This Actor is provided as-is for use on the Apify platform.

🀝 Support

For issues, questions, or feature requests:

  • Check the Apify Documentation
  • Contact support via Apify Console
  • Review Actor logs for detailed error messages

Built with ❤️ for the Apify platform