7-Zip Recursive Archive Extractor: Enterprise-Grade Archive Automation

Extract, process, and index 100GB+ archives with recursive nesting, incremental updates, and CRC-based change detection - all powered by 7-Zip on Apify's cloud infrastructure. Configure download/file limits and Actor memory to match your target sizes.


The Powerhouse Archive Solution You've Been Looking For

Most unzip tools choke on nested archives, can't skip unchanged files, or force you to download everything locally. This Apify Actor is different: it's a high-performance 7-Zip extractor designed for automated data pipelines, legacy archive migration, and security-first filtering at cloud scale.

What makes this a "powerhouse"?

  • 30+ Format Support: ZIP, RAR, 7Z, TAR, GZIP, BZIP2, XZ, ISO, CAB, ARC, ZIPX, and every format 7-Zip recognizes
  • Recursive Extraction: Automatically detects and processes archives inside archives (up to configurable depth)
  • Incremental Intelligence: CRC32+size signatures let you skip unchanged files between runs - saving up to 90% of compute costs
  • Dual-Storage Architecture: Raw files go to KV Store (direct download links), metadata goes to Dataset (structured queries)
  • Security Hardened: Blocks executables by default, validates paths against traversal attacks, enforces size limits

How It Works: The Data Flow

┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐
│  URL Input   │    │  URL List    │    │ Dataset (URL Field)  │
└──────┬───────┘    └──────┬───────┘    └──────────┬───────────┘
       │                   │                       │
       └───────────────────┴───────────────────────┘
                           │
                           ▼
            ┌────────────────────────────┐
            │   7-Zip Extractor Engine   │
            │   • Download with guards   │
            │   • Detect format          │
            │   • Incremental check      │
            │   • Extract files          │
            │   • Recurse if nested      │
            └─────────────┬──────────────┘
                          │
            ┌─────────────┴──────────────┐
            ▼                            ▼
┌─────────────────────┐      ┌──────────────────────────┐
│      KV Store       │      │         Dataset          │
│    (Raw Files)      │      │    (Structured Index)    │
│                     │      │                          │
│ • files             │      │ • archiveUrl             │
│ • SUMMARY           │      │ • path                   │
│ • INCR::{sha1}      │      │ • kvKey                  │
│ • OUTPUT_POINTERS   │      │ • status                 │
└─────────────────────┘      │ • incrementalStatus      │
                             │ • pointer rows           │
                             └──────────────────────────┘

Result: Both downloadable files (KV Store) and a searchable index (Dataset) are provided with per-file metadata including extraction status, size, extension, and incremental comparison results.


The Format Wall: Every Extension Supported

7-Zip's comprehensive format support is leveraged to handle virtually any compressed or archived file:

Primary Archive Formats

ZIP • RAR • 7Z • TAR • GZIP • BZIP2 • XZ • LZMA

Disk Image & Container Formats

ISO • VHD • VHDX • WIM • SWM • DMG • HFS • NTFS • FAT • SquashFS • UDF

Legacy & Specialty Formats

CAB • CHM • MSI • ARJ • LZH • CPIO • RPM • DEB • ARC • ZIPX • SWF/SWFC • NSIS

Compressed TAR Variants

TGZ (tar.gz) • TBZ2 (tar.bz2) • TXZ (tar.xz) • TAR.LZMA • TAR.Z

Single-File Compression

Z • LZW • LZIP • LZOP • ZSTD • BROTLI • BASE64 • HASH

Can't find your format? If 7-Zip can list it with 7z l, it can be extracted. Check the debug logs for supported format detection.


Key Advantages: The Technical Moat

1. Incremental Extraction: The Compute Cost Killer

Traditional extractors re-process every file on every run. This Actor tracks file signatures (CRC32 + size) per archive URL and builds an incremental index:

How it works:

  1. First run: All files are extracted, CRC32 checksums computed, signatures stored in KV Store under INCR::{sha1(archiveUrl)}
  2. Subsequent runs: Current archive contents are compared against stored signatures
  3. Unchanged files are skipped (marked SKIPPED_UNCHANGED in dataset)
  4. Only new or modified files are re-extracted (marked new or changed)

Real-world impact: A daily job processing a 10GB archive with 5,000 files where only 100 files change will:

  • Original approach: Extract 5,000 files every day
  • Incremental approach: Extract 100 files after first run (98% reduction)
  • Cost savings: ~90% reduction in compute units

Configuration:

{
  "incremental": {
    "enabled": true,
    "strategy": "crc+size",
    "onlyNewOrChanged": true
  }
}

Note: sizeOnly strategy can be used for massive archives where CRC computation overhead outweighs accuracy benefits.
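
For intuition, here is a minimal Python sketch of the comparison idea, assuming a stored index that maps entry paths to crc:size signatures (the Actor's actual index layout may differ):

def classify_entries(entries, stored_index):
    """Compare the current archive listing against the previous run's signature index.

    entries: list of dicts such as {"path": "reports/jan.csv", "crc": "A1B2C3D4", "size": 1048576}
    stored_index: dict mapping path -> "crc:size" signature saved by the previous run
    Returns a dict mapping path -> "new" | "changed" | "unchanged".
    """
    result = {}
    for entry in entries:
        signature = f'{entry["crc"]}:{entry["size"]}'   # the crc+size strategy
        previous = stored_index.get(entry["path"])
        if previous is None:
            result[entry["path"]] = "new"               # extract
        elif previous != signature:
            result[entry["path"]] = "changed"           # re-extract
        else:
            result[entry["path"]] = "unchanged"         # skip (SKIPPED_UNCHANGED)
    return result

With sizeOnly, the signature would drop the CRC component, trading accuracy for faster listing on very large archives.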


2. Recursive Nested Archive Extraction

Legacy systems often store "archives within archives" (e.g., daily backups as backup-2025-01-15.zip inside monthly-archives.tar.gz inside year-2025.7z). Most tools require manual multi-pass extraction.

Automatic nesting handling:

  • Nested archives are detected by extension (zip, rar, 7z, tar, gz, bz2, xz, tgz, iso, cab, arc, zipx)
  • Parent is extracted → contents scanned → child archives extracted → repeated up to configured depth
  • Nesting level is tracked in dataset (nestedDepth: 0, 1, 2, ...)
  • archiveUrl includes #path suffix to trace file origins: https://ex.com/data.zip#backup/2025/jan.tar.gz#reports/
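
Because the suffix chain is "#"-delimited, downstream consumers can recover a file's origin with a simple split; a small illustrative Python sketch (field names here are hypothetical):

def parse_archive_origin(archive_url: str) -> dict:
    """Split a nested archiveUrl such as
    'https://ex.com/data.zip#backup/2025/jan.tar.gz#reports/'
    into the root download URL and the chain of nested origin segments."""
    root, *segments = archive_url.split("#")
    return {"rootUrl": root, "originChain": segments}

print(parse_archive_origin("https://ex.com/data.zip#backup/2025/jan.tar.gz#reports/"))
# {'rootUrl': 'https://ex.com/data.zip', 'originChain': ['backup/2025/jan.tar.gz', 'reports/']}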

Configuration:

{
  "formats": {
    "extractNestedArchives": true,
    "nestedArchiveDepth": 2,
    "nestedBeyondDepthBehavior": "skip"
  }
}

Use cases:

  • Legacy tape backups: Multi-level TAR archives from enterprise backup systems
  • Software distributions: Installers containing compressed packages containing archives
  • Data dumps: Database exports compressed multiple times for size reduction

3. Data Cleaning & UTF-8 Normalization

International datasets often contain mixed encodings (Windows-1252, ISO-8859-1, Shift-JIS, etc.), causing "mojibake" (corrupt characters) when parsed.

Text normalization features:

  • Common text extensions are treated as candidates for UTF-8 verification (.txt, .csv, .json, .xml, .html, .md, .log, .ini, .yaml)
  • When enabled, files are decoded as UTF-8 and re-stored in UTF-8 form
  • If decoding fails (file is not valid UTF-8), original bytes are stored to avoid corruption
  • Normalized content is stored in KV Store when outputMode includes KV

Configuration:

{
  "textOptions": {
    "convertTextToUtf8": true,
    "textExtensions": [".txt", ".csv", ".json", ".xml", ".html", ".log"]
  }
}

Result: Best-effort UTF-8 normalization for files already encoded in UTF-8; other encodings are preserved unchanged to avoid data loss.
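
The behavior described above boils down to a decode-or-preserve rule; a minimal Python sketch (not the Actor's exact code):

def normalize_text_bytes(raw: bytes) -> bytes:
    """Best-effort UTF-8 normalization: keep the decoded form only if the bytes
    are already valid UTF-8; otherwise return the original bytes untouched."""
    try:
        return raw.decode("utf-8").encode("utf-8")
    except UnicodeDecodeError:
        return raw   # not valid UTF-8: store original bytes to avoid corruption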


4. Security-First Architecture

Archives from untrusted sources can contain malware, path traversal exploits, or resource exhaustion attacks. Multiple defensive layers are implemented:

Security features:

  • Extension blocklist: Executables are rejected by default (.exe, .dll, .bat, .cmd, .sh, .ps1, .msi)
  • Path validation: Absolute paths, drive letters, .. segments, and self-referential entries are blocked
  • Size guards: maxDownloadBytes (500 MB default) and maxFileSizeBytes (500 MB default) are enforced
  • File count limits: Processing stops after maxFiles (1000 default) to prevent zip bombs
  • Timeout protection: 7-Zip operations exceeding listTimeoutMillis are aborted

Customization:

{
  "filters": {
    "blockedExtensions": [".exe", ".dll", ".bat", ".sh", ".ps1", ".msi", ".scr"],
    "excludedPatterns": ["__MACOSX/", ".DS_Store", "Thumbs.db"]
  },
  "limits": {
    "maxDownloadBytes": 524288000,
    "maxFileSizeBytes": 524288000,
    "maxFiles": 1000
  }
}
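
For illustration, the path checks listed above roughly correspond to the following Python sketch (simplified; the Actor's actual validation may be stricter):

from pathlib import PurePosixPath

def is_safe_entry_path(entry_path: str) -> bool:
    """Reject archive entry paths that could escape the extraction directory."""
    if not entry_path or entry_path in (".", "./"):
        return False                                   # self-referential entries
    if entry_path.startswith(("/", "\\")):
        return False                                   # absolute paths
    if len(entry_path) > 1 and entry_path[1] == ":":
        return False                                   # Windows drive letters, e.g. C:\
    parts = PurePosixPath(entry_path.replace("\\", "/")).parts
    return ".." not in parts                           # no traversal segments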

Use Cases: Real-World Applications

1. Automated Data Pipeline: Daily Archive Ingestion

Scenario: Your company receives daily data exports as ZIP files on an FTP server. CSVs need extraction, schema validation, and loading into a data warehouse.

Implementation:

{
  "url": "https://ftp.partner.com/exports/daily-2025-01-15.zip",
  "filters": {
    "allowedExtensions": [".csv", ".json"],
    "blockedExtensions": [".exe", ".dll"]
  },
  "incremental": {
    "enabled": true,
    "onlyNewOrChanged": true
  },
  "outputOptions": {
    "datasetName": "daily-extracts",
    "storeName": "raw-files"
  },
  "webhook": {
    "url": "https://pipeline.yourcompany.com/archive-complete",
    "secret": "your-hmac-secret"
  }
}

Workflow:

  1. CSVs are extracted to KV Store raw-files
  2. File index is written to Dataset daily-extracts
  3. Webhook triggers downstream validation Actor
  4. Unchanged files are skipped on subsequent runs (incremental mode)

Benefits: Fully automated, cost-optimized, with webhook integration for pipeline orchestration.


2. Legacy Archive Migration: Enterprise Data Modernization

Scenario: 20 years of legacy backups (nested TARs and RARs) need migration from on-premise storage to cloud object storage. Archives are deeply nested (3-4 levels) with mixed compression.

Implementation:

{
  "datasetId": "legacy-archive-inventory",
  "urlField": "backupUrls",
  "formats": {
    "extractNestedArchives": true,
    "nestedArchiveDepth": 4,
    "archiveTypes": ["tar", "rar", "zip", "7z", "gz", "bz2"]
  },
  "limits": {
    "maxDownloadBytes": 2000000000,
    "maxFileSizeBytes": 2000000000,
    "maxFiles": 0
  },
  "concurrency": 5,
  "errorHandling": {
    "mode": "lenient",
    "maxPerArchiveErrors": 100
  }
}

Workflow:

  1. Archive URLs are read from inventory dataset
  2. 4 levels deep recursive extraction is performed
  3. Errors are logged without stopping (lenient mode)
  4. Per-archive success rates are shown in final summary

Benefits: Corrupted/incomplete backups are handled gracefully, folder hierarchy is preserved, audit trail is provided via dataset.


3. Security-First Filtering: Malware-Free Document Extraction

Scenario: Documents (PDFs, DOCs) need extraction from user-uploaded archives while blocking executables and scripts. Archives may come from untrusted sources.

Implementation:

{
  "url": "https://uploads.example.com/user-123/documents.zip",
  "filters": {
    "allowedExtensions": [".pdf", ".doc", ".docx", ".txt", ".md"],
    "blockedExtensions": [".exe", ".dll", ".bat", ".cmd", ".sh", ".ps1", ".msi", ".scr", ".vbs", ".js"],
    "excludedPatterns": ["__MACOSX/", ".DS_Store", "desktop.ini"]
  },
  "limits": {
    "maxDownloadBytes": 104857600,
    "maxFileSizeBytes": 524288000,
    "maxFiles": 500
  },
  "formats": {
    "extractNestedArchives": false
  },
  "errorHandling": {
    "mode": "strict"
  }
}

Workflow:

  1. User archive is downloaded with size guard
  2. Executables and scripts are rejected
  3. Only documents are extracted to KV Store
  4. Processing aborts if malicious content is detected (strict mode)

Benefits: Downstream systems are protected from malware, attack surface is reduced, audit trail is provided.


Outputs: Understanding the Dual-Storage Model

Apify's two-storage architecture is used to provide maximum flexibility:

KV Store (Key-Value Store)

Purpose: Raw binary content storage for direct file downloads

Contents:

  • Extracted files: Stored with archive paths as keys (e.g., reports/2025/january.pdf)
  • Flattened mode: Optional deterministic keys like january.pdf-a1b2c3d4 to avoid collisions
  • Summary record: JSON at SUMMARY key (or custom key) with run statistics
  • Incremental indexes: State records at INCR::{sha1(archiveUrl)} for change tracking
  • Output pointers: When custom names are used, OUTPUT_POINTERS contains destination metadata

Access: Apify Console → Storage → Key-Value Stores → (your store name) → Records

API endpoint: {{links.apiDefaultKeyValueStoreUrl}}/records/{KEY}


Dataset (Structured Index)

Purpose: Queryable per-file metadata for filtering, searching, and analytics

Schema:

{
  "archiveUrl": "https://example.com/data.zip#nested.tar.gz",
  "path": "reports/2025/january.csv",
  "kvKey": "reports/2025/january.csv",
  "sizeBytes": 1048576,
  "extension": ".csv",
  "status": "EXTRACTED",
  "nestedDepth": 1,
  "incrementalStatus": "changed",
  "errorCode": null,
  "errorMessage": null
}

Statuses:

  • EXTRACTED: Successfully extracted and stored
  • SKIPPED_UNCHANGED: Incremental mode detected no changes
  • SKIPPED_FILTERED: Blocked by extension/path filters
  • SKIPPED_MAX_FILES: Exceeded file count limit
  • TOO_LARGE: File size exceeds maxFileSizeBytes
  • ERROR: Extraction failed (see errorCode/errorMessage)
  • DOWNLOAD_ERROR: Archive download failed
  • ARCHIVE_ERROR: Archive-level error (corrupt, unsupported)
  • SKIPPED_NESTED_TOO_DEEP: Nested beyond nestedArchiveDepth

Incremental statuses: new, changed, unchanged, or null (when incremental disabled)

Access: Apify Console → Storage → Datasets → (your dataset name) → Items

API endpoint: {{links.apiDefaultDatasetUrl}}/items
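
For example, a downstream script can pull the index and keep only files that were actually extracted and changed since the last run (Python with the requests library; the dataset ID and token are placeholders):

import requests

DATASET_ID = "xyz789"         # your run's dataset ID (or named dataset)
APIFY_TOKEN = "your-token"    # required for non-public storages

resp = requests.get(
    f"https://api.apify.com/v2/datasets/{DATASET_ID}/items",
    params={"format": "json", "clean": "true"},
    headers={"Authorization": f"Bearer {APIFY_TOKEN}"},
)
resp.raise_for_status()

changed = [
    item for item in resp.json()
    if item.get("status") == "EXTRACTED" and item.get("incrementalStatus") == "changed"
]
print(f"{len(changed)} changed files to process")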


Custom Output Destinations

When custom storeName or datasetName is specified in outputOptions, a pointer row is written to help locate the data:

Dataset pointer row:

{
  "type": "pointer",
  "kvStoreName": "my-custom-store",
  "kvStoreId": "abc123",
  "kvUrl": "https://api.apify.com/v2/key-value-stores/abc123/records",
  "datasetName": "my-custom-dataset",
  "datasetId": "xyz789",
  "datasetUrl": "https://api.apify.com/v2/datasets/xyz789/items",
  "summaryKey": "SUMMARY",
  "summaryUrl": "https://api.apify.com/v2/key-value-stores/abc123/records/SUMMARY",
  "incrementalIndexPrefix": "INCR::"
}

KV Store pointer (OUTPUT_POINTERS key): Same metadata is contained in the default KV store even when custom destinations are used.
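
A consumer script can therefore always start from the run's default store, read the pointer, and follow it to the custom destinations; a Python sketch assuming the pointer fields shown above:

import requests

DEFAULT_STORE_ID = "<default-kv-store-id>"   # the run's default key-value store
APIFY_TOKEN = "your-token"
headers = {"Authorization": f"Bearer {APIFY_TOKEN}"}

# Read the pointer record written to the default store
pointer = requests.get(
    f"https://api.apify.com/v2/key-value-stores/{DEFAULT_STORE_ID}/records/OUTPUT_POINTERS",
    headers=headers,
).json()

# Follow it to the custom dataset and the run summary
items = requests.get(pointer["datasetUrl"], params={"format": "json"}, headers=headers).json()
summary = requests.get(pointer["summaryUrl"], headers=headers).json()
print(pointer["kvStoreName"], len(items), summary["totals"]["filesExtracted"])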


Summary Payload: Run Statistics & Webhook Format

Stored in KV Store at SUMMARY key (or custom key) and optionally POSTed to webhook URL:

{
  "startedAt": "2025-01-15T10:00:00.000Z",
  "finishedAt": "2025-01-15T10:15:32.128Z",
  "totals": {
    "archivesProcessed": 25,
    "archivesFailed": 1,
    "downloadsFailed": 0,
    "unsupportedArchives": 0,
    "archiveErrors": 1,
    "filesDiscovered": 12483,
    "filesExtracted": 11250,
    "filesSkipped": 1180,
    "filesErrored": 53,
    "skippedTooLarge": 45,
    "skippedFiltered": 320,
    "skippedUnchanged": 815,
    "skippedMaxFiles": 0,
    "skippedNestedTooDeep": 0,
    "incrementalNew": 10200,
    "incrementalChanged": 1050,
    "incrementalUnchanged": 815,
    "nestedArchivesProcessed": 78
  },
  "byExtension": {
    ".csv": { "files": 3500, "extracted": 3450 },
    ".json": { "files": 2800, "extracted": 2700 },
    ".pdf": { "files": 1950, "extracted": 1925 },
    ".xml": { "files": 1200, "extracted": 1180 },
    ".txt": { "files": 800, "extracted": 780 }
  },
  "byArchive": {
    "https://example.com/data-2025-01-15.zip": {
      "filesDiscovered": 523,
      "filesExtracted": 480,
      "filesSkipped": 38,
      "filesErrored": 5,
      "error": null
    }
  },
  "incremental": {
    "enabled": true,
    "strategy": "crc+size",
    "indexKeyPrefix": "INCR::"
  },
  "nestedArchives": {
    "enabled": true,
    "maxDepth": 2
  }
}

Webhook signature: If webhook.secret is provided, the POST includes header:

x-universal-archive-signature: HMAC_HEX

The value is the HMAC-SHA256 of the raw request body, computed with the secret, so the receiver can verify authenticity.
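
On the receiving end, verification means recomputing the HMAC over the raw body; a minimal, framework-agnostic Python sketch:

import hashlib
import hmac

def verify_webhook(raw_body: bytes, header_signature: str, secret: str) -> bool:
    """Return True if the x-universal-archive-signature header matches the body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header_signature.strip().lower())

Use the body bytes exactly as received; re-serializing the JSON before hashing will usually change the byte sequence and break verification.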


Input Configuration: Quick Reference

Required (choose exactly one)

// Option 1: Single URL
{ "url": "https://example.com/data.zip" }
// Option 2: Multiple URLs
{ "urls": ["https://ex.com/a.zip", "https://ex.com/b.tar.gz"] }
// Option 3: Dataset source
{
  "datasetId": "abc123",
  "urlField": "archiveUrl"
}
Common optional settings (example):

{
  "outputMode": "both",
  "concurrency": 10,
  "incremental": {
    "enabled": true,
    "strategy": "crc+size"
  },
  "formats": {
    "extractNestedArchives": true,
    "nestedArchiveDepth": 2
  },
  "filters": {
    "blockedExtensions": [".exe", ".dll", ".bat", ".sh"]
  },
  "limits": {
    "maxDownloadBytes": 524288000,
    "maxFiles": 1000
  },
  "errorHandling": {
    "mode": "lenient"
  }
}

Advanced: High-Volume Processing

{
  "urls": [
    "https://archives.example.com/dump-01.tar.gz",
    "https://archives.example.com/dump-02.tar.gz",
    "https://archives.example.com/dump-03.tar.gz"
  ],
  "concurrency": 25,
  "limits": {
    "maxDownloadBytes": 2000000000,
    "maxFileSizeBytes": 0,
    "maxFiles": 0
  },
  "incremental": {
    "enabled": true,
    "strategy": "sizeOnly"
  },
  "httpOptions": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Running on Apify Platform

Getting Started:

Running the Actor:

  • Navigate to Apify Console → Actors → (your actor)
  • Configure input via UI or JSON
  • Click "Start"
  • Monitor run logs in real-time

Accessing Outputs:

  • Go to Storage → Key-Value Stores / Datasets
  • Or use API endpoints from output schema

Performance & Cost Optimization

Compute Unit Usage

  • Base cost: Depends on archive type, nesting, and memory; CU usage can be monitored per run in Apify Console
  • Incremental savings: 70-90% reduction on subsequent runs
  • Nested archives: +20% overhead per nesting level (recursive extraction)

Optimization Strategies

  1. Incremental mode should be enabled for recurring jobs
  2. sizeOnly strategy can be used for archives >1 GB with frequent updates
  3. maxFiles and maxFileSizeBytes should be set to prevent runaway costs
  4. concurrency should be reduced if memory limits are hit (default: 10)
  5. allowedExtensions should be used instead of blockedExtensions for targeted extraction

Memory Considerations

  • Default limits: Suitable for archives up to 500 MB with 1000 files
  • Large archives: Actor memory can be increased in Apify Console (1 GB, 2 GB, 4 GB)
  • Nested archives: Each nesting level requires temporary disk space (factor 2-3x archive size)

Troubleshooting

"UNSUPPORTED_ARCHIVE_TYPE" errors

Cause: Archive format not in formats.archiveTypes allowlist

Solution:

  1. Enable debug mode: "debug": true
  2. Check logs for "Supported formats: ..."
  3. Add detected format to allowlist or use ["auto"]

"DOWNLOAD_ERROR: Size limit exceeded"

Cause: Archive larger than maxDownloadBytes (500 MB default)

Solution:

{
  "limits": {
    "maxDownloadBytes": 2000000000
  }
}

Incremental mode not skipping files

Cause: Archive URL changed (different domain/path/query params)

Solution: The incremental index is keyed by the SHA-1 of the archive URL, so keep URLs consistent between runs, or manually copy the INCR:: keys from the previous run's store.
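
To see why, the per-archive index key is derived roughly like this (Python sketch; exact URL normalization may differ):

import hashlib

def incremental_index_key(archive_url: str) -> str:
    """Derive the KV Store key holding an archive's incremental index."""
    return "INCR::" + hashlib.sha1(archive_url.encode("utf-8")).hexdigest()

# URLs that point at the same archive but differ in host, path, or query string
# produce different keys, so the previous run's signatures are never found.
print(incremental_index_key("https://example.com/data.zip"))
print(incremental_index_key("https://example.com/data.zip?v=2"))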

Out of memory errors

Cause: Archive too large for available Actor memory

Solutions:

  • Actor memory can be increased in Apify Console
  • concurrency can be reduced to free memory per archive
  • maxFiles can be set to limit extraction scope
  • Nested archive extraction can be disabled: "extractNestedArchives": false

Webhook not receiving POST

Cause: Webhook URL unreachable or HMAC signature validation failing

Debug:

  1. Actor logs should be checked for webhook POST details
  2. Webhook URL accessibility should be verified
  3. HMAC signature should be validated: HMAC-SHA256(requestBody, secret)
  4. Reverse proxy/firewall rules blocking Apify IPs should be checked

API Integration Examples

Trigger run via Apify API

curl -X POST "https://api.apify.com/v2/acts/{ACTOR_ID}/runs" \
  -H "Authorization: Bearer {APIFY_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/data.zip",
    "incremental": { "enabled": true }
  }'

Fetch extracted files

# Get file index
curl "https://api.apify.com/v2/datasets/{DATASET_ID}/items?format=json"

# Download specific file
curl "https://api.apify.com/v2/key-value-stores/{STORE_ID}/records/reports/2025/data.csv" \
  -o data.csv

Check run status

curl "https://api.apify.com/v2/actor-runs/{RUN_ID}" \
-H "Authorization: Bearer {APIFY_TOKEN}"

Frequently Asked Questions

Q: Can password-protected archives be extracted?
A: Not currently supported.

Q: What's the maximum archive size?
A: Default 500 MB via maxDownloadBytes. Higher values can be configured, but Actor memory/disk should be sufficient and timeouts adjusted for very large archives.

Q: How long are extracted files stored?
A: KV Store/Dataset records are retained according to data retention policy (default: 7 days for free tier, configurable for paid plans).

Q: Can specific files be extracted without downloading the entire archive?
A: Not directly. Full archive download is required by 7-Zip for extraction. allowedExtensions or excludedPatterns can be used to filter post-download.

Q: Is streaming extraction supported?
A: No. Archives are downloaded to disk, then extracted. Streaming extraction is incompatible with nested archive detection and incremental logic.


Feature Requests & Roadmap

Feedback is actively collected to improve this Actor. Features under consideration:

Potential Future Enhancements:

  • Password-protected archive support (with secure credential management)
  • Streaming extraction for extremely large archives
  • Direct S3/Azure Blob storage integration (skip KV Store for huge datasets)
  • Archive repair/recovery for corrupted files
  • Custom extraction callbacks for advanced filtering logic
  • Multi-part archive support (.zip.001, .zip.002, etc.)
  • Archive metadata extraction (comments, timestamps, permissions)

Submit Your Request:

Bug reports and feature requests can be submitted via the Issues tab on this Actor's page in the Apify Console. Navigate to the Actor → Issues to view existing requests or create new ones.

Features are prioritized based on user demand and technical feasibility.


License & Support

This Actor is available on the Apify Platform.

Support Channels:

  • Actor Issues: Use the Issues tab on this Actor's page for bug reports and feature requests
  • Apify Community Forum: https://community.apify.com for general questions and discussions
  • Apify Support: support@apify.com for platform-related issues (Apify customers)

Keywords for Search Optimization

7-Zip extractor, recursive archive extraction, incremental unzip, nested archive processor, bulk archive downloader, RAR extractor API, archive automation, 7z batch extractor, ISO file extractor, archive change detection, TAR extractor, GZIP extractor, BZIP2 extractor, XZ extractor, automated archive processing, cloud archive extraction, Apify archive actor, unzip API, zip extractor, archive decompression, legacy archive migration, archive pipeline automation, CRC-based incremental extraction, nested ZIP extractor, recursive TAR extraction, multi-level archive processing, enterprise archive solution, archive format converter, batch unzip tool, automated file extraction, archive metadata indexing, secure archive extraction, malware-free extraction, archive filtering, compression format support, archive validation, incremental file processing, archival data extraction, backup archive extraction, data pipeline automation, archive security filtering, compute cost optimization, cloud-native extraction