Yupoo Bulk Image Downloader & Category Scraper

Pricing

from $0.20 / 1,000 image assets processed

Extracts high-resolution images from Yupoo albums and entire categories. Features auto-pagination, WhatsApp link detection, parallel downloads, and dynamic generation of downloadable ZIP archives for each album.

Rating

0.0 (0)

Developer

Ahmed Jasarevic

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 6
  • Monthly active users: 2
  • Last modified: 9 days ago

A fast, lightweight Apify Actor that scrapes individual albums or entire categories from Yupoo. It is built on got-scraping and cheerio, so it parses raw HTML instead of launching a headless browser. It also detects WhatsApp links, downloads images in parallel, and builds each album's ZIP archive in memory.


Features

  • Dual Input Modes: Accepts both individual album URLs and paginated category URLs.
  • Auto-Pagination: Detects category pages, maps all internal pages, and bulk-extracts album references automatically.
  • Parallel Image Processing: Downloads high-res images concurrently using simulated worker streams.
  • In-Memory Compression: Compresses assets directly into an optimized ZIP archive saved into your Key-Value Store.
  • Sanitized Filenames: Automatically cleans album titles to prevent illegal filesystem characters and path length overflows.
  • Contact Extraction: Extracts contact pointers like WhatsApp/wa.me strings directly from the source HTML.
  • Structured Live Dataset: Feeds individual image logs and global album summary records into distinct Apify dataset views.
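
The filename cleaning and in-memory compression features can be illustrated with a minimal Python sketch. It is not the Actor's own code (the Actor runs on Node.js), and the 100-character cap, the replacement character, and the function names `sanitize_album_name`, `zip_file_name`, and `build_zip` are assumptions chosen for illustration.

```python
import io
import re
import zipfile

MAX_NAME_LENGTH = 100  # assumed cap; the Actor's real limit is not documented


def sanitize_album_name(title: str, max_length: int = MAX_NAME_LENGTH) -> str:
    """Replace characters that are illegal on common filesystems and cap the length."""
    # Characters rejected by Windows, plus ASCII control characters.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', "_", title)
    # Collapse runs of whitespace and underscores into a single underscore.
    cleaned = re.sub(r"[\s_]+", "_", cleaned).strip("._")
    # Fall back to a placeholder if nothing usable is left.
    return cleaned[:max_length] or "album"


def zip_file_name(user_id: str, album_title: str) -> str:
    """Build a name in the documented userId_sanitized_album_name.zip pattern."""
    return f"{user_id}_{sanitize_album_name(album_title)}.zip"


def build_zip(images: dict[str, bytes]) -> bytes:
    """Pack {filename: content} into a ZIP archive held entirely in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in images.items():
            archive.writestr(name, content)
    return buffer.getvalue()
```

Because the archive is built in a `BytesIO` buffer, nothing touches the local disk; the resulting bytes could be written to a key-value store in a single call.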

Input Configuration (JSON)

The Actor accepts a standard Apify input configuration containing an array of target URLs. You can pass single album links or category index pages.

Input Example

{
  "startUrls": [
    {
      "url": "https://yupoolyp.x.yupoo.com/categories/3125933"
    },
    {
      "url": "https://yupoolyp.x.yupoo.com/albums/101441230"
    }
  ]
}

Input Fields Description

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| startUrls | Array (Object/String) | Yes | A list of target URLs. Each entry can point to a specific product album or to a category page containing multiple paginated listings. |
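
Since startUrls entries may be objects or plain strings, a client preparing input might normalize them first. The helper below is a hypothetical sketch of that step, not the Actor's own validation logic; `normalize_start_urls` is an illustrative name.

```python
def normalize_start_urls(start_urls: list) -> list[str]:
    """Flatten a mixed list of {"url": ...} objects and plain strings into URL strings."""
    urls = []
    for entry in start_urls:
        # Object entries carry the URL under the "url" key; strings are used as-is.
        url = entry.get("url") if isinstance(entry, dict) else entry
        # Skip anything that is missing, empty, or not a string.
        if isinstance(url, str) and url.strip():
            urls.append(url.strip())
    return urls
```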

How It Works

  1. Routing & Detection: The scraper checks whether each entry in startUrls is a standalone album or a /categories/ endpoint.
  2. Category Expansion: For a category, it fetches the first page, reads the .pagination__number elements to determine the last page number, scrapes every page, and adds the albums it finds to the execution queue.
  3. Data Extraction: For each album, it fetches the raw HTML, extracts metadata (userId, albumName, whatsapp), and locates image elements in the .image__main and .showalbum__children containers.
  4. Download & Package: Images are downloaded concurrently with custom Referer headers that get past the image host's referer check, packed into a ZIP archive with a sanitized filename (userId_sanitized_album_name.zip), and saved to the default Key-Value Store.
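
Steps 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Actor's code: the `/albums/` path check is inferred from the example URLs, the `?page=N` query parameter is an assumed pagination scheme, and `pagination_labels` stands in for the text of the `.pagination__number` elements after they have been extracted from the HTML.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def classify_url(url: str) -> str:
    """Route a start URL to the category or album handler based on its path."""
    path = urlparse(url).path
    if "/categories/" in path:
        return "category"
    if "/albums/" in path:
        return "album"
    return "unknown"


def max_page(pagination_labels: list[str]) -> int:
    """Return the largest numeric pagination label, or 1 when there is none."""
    stripped = (label.strip() for label in pagination_labels)
    numbers = [int(text) for text in stripped if text.isdecimal()]
    return max(numbers, default=1)


def category_page_urls(category_url: str, last_page: int) -> list[str]:
    """List one URL per category page (assumes pages are addressed as ?page=N)."""
    parts = urlparse(category_url)
    query = dict(parse_qsl(parts.query))
    urls = []
    for page in range(1, last_page + 1):
        query["page"] = str(page)
        urls.append(urlunparse(parts._replace(query=urlencode(query))))
    return urls
```

Non-numeric labels such as an ellipsis or a "next" arrow are ignored, so a category with a single page falls back to a depth of 1.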

Output Architecture

Results are pushed to the dataset progressively as two record types: one record per image asset and one summary record per album.

1. Image Record (image_record)

Emitted as soon as a worker finishes downloading an image asset.

{
  "type": "image_record",
  "albumId": "101441230",
  "albumName": "LV",
  "imageIndex": 2,
  "imageName": "95a9567a.jpg",
  "imageUrl": "https://photo.yupoo.com/yupoolyp/57afeaf5/95a9567a.jpg",
  "whatsapp": "https://x.yupoo.com/external?url=https%253A%252F%252Fwa.me%252Fmessage%252F3CCGSKIWVMIKJ1",
  "zipUrl": "https://api.apify.com/v2/key-value-stores/default/records/3125933_LV.zip?disableRedirect=true"
}
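
In the example above, the whatsapp value is a Yupoo redirect URL whose url parameter is percent-encoded twice. A consumer who wants the direct wa.me link could unwrap it along these lines. This is a sketch that assumes the field keeps this redirect form; `resolve_whatsapp_link` is a hypothetical helper, not part of the Actor.

```python
from urllib.parse import parse_qs, unquote, urlparse


def resolve_whatsapp_link(value: str) -> str:
    """Unwrap a redirect URL of the form .../external?url=<encoded target>."""
    target = parse_qs(urlparse(value).query).get("url")
    if not target:
        # No wrapper found: assume the value is already a direct link.
        return value
    decoded = target[0]  # parse_qs has already removed one encoding layer
    # Remove any remaining layers, with a small bound to avoid surprises.
    for _ in range(3):
        once_more = unquote(decoded)
        if once_more == decoded:
            break
        decoded = once_more
    return decoded
```

Applied to the value in the record above, this yields `https://wa.me/message/3CCGSKIWVMIKJ1`.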

2. Summary Record (summary_record)

Appended to the dataset as soon as an album's ZIP archive has been compressed and saved to the Key-Value Store.

{
  "type": "summary_record",
  "albumId": "101441230",
  "albumName": "LV",
  "userId": "3125933",
  "whatsapp": "https://x.yupoo.com/external?url=https%253A%252F%252Fwa.me%252Fmessage%252F3CCGSKIWVMIKJ1",
  "totalPhotosProcessed": 9,
  "zipFileName": "3125933_LV.zip",
  "downloadZipUrl": "https://api.apify.com/v2/key-value-stores/default/records/3125933_LV.zip?disableRedirect=true",
  "status": "SUCCESS"
}
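
Because both record types share one dataset, a consumer usually separates them by the type field. The sketch below groups already-downloaded dataset items by album and compares the number of image records with totalPhotosProcessed. Treating those two numbers as comparable is an assumption, and `summarize_albums` is a hypothetical helper name.

```python
from collections import defaultdict


def summarize_albums(items: list[dict]) -> dict[str, dict]:
    """Build a per-album report from a mixed list of image and summary records."""
    images_seen = defaultdict(int)
    summaries = {}
    for item in items:
        if item.get("type") == "image_record":
            images_seen[item["albumId"]] += 1
        elif item.get("type") == "summary_record":
            summaries[item["albumId"]] = item

    report = {}
    for album_id, summary in summaries.items():
        report[album_id] = {
            "status": summary.get("status"),
            "expected": summary.get("totalPhotosProcessed"),
            "seen": images_seen.get(album_id, 0),
            "zip": summary.get("downloadZipUrl"),
        }
    return report
```

Albums that have image records but no summary record yet are left out of the report, which matches the progressive publishing order described above.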