Shopee Api Scraper

14 days trial then $50.00/month - No credit card required now

marc_plouhinec/shopee-api-scraper

Query Shopee's unofficial API for product searches by keyword, category, or shop. Access detailed information including prices, orders, stock levels, and ratings. Also retrieve related entities like the category tree, shop listings, and keyword suggestions.

Change Log

Version 1.16 - Released 2024-05-04

Dependencies Update:

Version 1.15 - Released 2024-04-19

Fixes:

  • Login Redirection: Updated the scraper to prevent unintended redirects to the login page introduced by a recent Shopee update.

Version 1.14 - Released 2024-04-07

Enhancements:

  • Product Search with brandids: Allow products to be searched by brand IDs. Example:
    {
      "requests": [
        { "url": "https://shopee.co.id/api/v4/search/search_items?brandids=2387597" },
        { "url": "https://shopee.co.id/search?brands=2387597" }
      ]
    }

Version 1.13 - Released 2024-04-03

Fixes and Enhancements:

  • HTTP Response Caching Optimization: Improved efficiency by removing unnecessary data format conversion when caching HTTP responses.
  • Product Search without keyword or match_id Fix: Resolved a bug that prevented product searches using only categoryids. Products can now be searched without specifying a keyword or match_id. Example:
    {
      "requests": [
        { "url": "https://shopee.co.id/api/v4/search/search_items?by=relevancy&page_type=search&scenario=PAGE_OTHERS&categoryids=11043509%2C11043508" }
      ]
    }
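
As a rough illustration, a category-only search URL like the one above could be assembled with Python's standard library. The helper name build_search_url is hypothetical and not part of the scraper; the query parameters are copied from the example.

```python
from urllib.parse import urlencode


def build_search_url(domain: str, category_ids: list) -> str:
    """Build a category-only search_items URL (hypothetical helper)."""
    query = {
        "by": "relevancy",
        "page_type": "search",
        "scenario": "PAGE_OTHERS",
        # Multiple IDs are comma-separated; urlencode escapes the comma as %2C.
        "categoryids": ",".join(str(c) for c in category_ids),
    }
    return f"https://{domain}/api/v4/search/search_items?{urlencode(query)}"


# Scraper input in the same shape as the example above.
actor_input = {
    "requests": [{"url": build_search_url("shopee.co.id", [11043509, 11043508])}]
}
```

Building the URL this way avoids hand-encoding the comma between category IDs.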

New Features:

  • Simplified Shop and Product Scraping: Users can now scrape shop products using the shop's page URL, simplifying the process for new users. Example:
    {
      "requests": [
        {
          "url": "https://shopee.co.id/pompurinstore?page=0&sortBy=sales"
        }
      ],
      "shopDetail_crawlShopProducts": true
    }
  • Sales-Based Page Navigation Stop: Added a new parameter, shopProducts_crawlNextPages_minSales, that stops navigation to the next pages of a shop's product listing as soon as a product with sales equal to or below the specified threshold is encountered. With the listing sorted by sales (sort_type=13 in the example), this scrapes all of a shop's selling products while skipping those with no sales. Example:
    {
      "requests": [
        {
          "url": "https://shopee.co.id/api/v4/shop/rcmd_items?shop_id=51925611&sort_type=13"
        }
      ],
      "shopProducts_enrichUrlQuery_pageSize": 30,
      "shopProducts_crawlNextPages": true,
      "shopProducts_crawlNextPages_minSales": 0
    }

    Note: the above features can be combined for enhanced functionality:

    {
      "requests": [
        {
          "url": "https://shopee.co.id/pompurinstore?page=0&sortBy=sales"
        }
      ],
      "shopDetail_crawlShopProducts": true,
      "shopProducts_enrichUrlQuery_pageSize": 30,
      "shopProducts_crawlNextPages": true,
      "shopProducts_crawlNextPages_minSales": 0
    }
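
The stop condition behind shopProducts_crawlNextPages_minSales can be sketched roughly as follows. This is not the scraper's actual code: the fetch_page callback, the "sales" key, and the choice to keep every product of the page where the threshold is hit are all assumptions made for illustration.

```python
from typing import Callable, List


def crawl_shop_pages(
    fetch_page: Callable[[int], List[dict]],
    min_sales: int,
    max_pages: int = 100,
) -> List[dict]:
    """Fetch pages in order and stop once a low-sales product appears.

    Assumes the listing is sorted by sales, highest first.
    """
    products: List[dict] = []
    for page_index in range(max_pages):
        page = fetch_page(page_index)
        if not page:
            break  # no more products in the listing
        products.extend(page)
        # A product at or below the threshold means later pages only hold
        # products that sell as little or less, so navigation stops here.
        if any(p["sales"] <= min_sales for p in page):
            break
    return products


# Simulated listing: three pages, sorted by sales in descending order.
fake_pages = [
    [{"sales": 5}, {"sales": 3}],
    [{"sales": 1}, {"sales": 0}],
    [{"sales": 0}],
]
fetched = []


def fake_fetch(index: int) -> List[dict]:
    fetched.append(index)
    return fake_pages[index] if index < len(fake_pages) else []


result = crawl_shop_pages(fake_fetch, min_sales=0)
```

In this simulation the third page is never requested, because the second page already contains a product with no sales.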

Version 1.12 - Released 2024-04-02

Fixes:

  • Improved Scraping Speed: Fixed a bug that added 5 seconds to every scraping request, regardless of the actual response time. The scraper now waits only as long as the response takes to arrive.
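
The difference between a fixed wait and an adaptive one can be illustrated with a small sketch. It is not the scraper's implementation: the function name and the use of a threading.Event to signal the response are assumptions, and the 5-second cap simply mirrors the delay mentioned above.

```python
import threading
import time


def wait_for_response(response_ready: threading.Event, max_wait_sec: float = 5.0) -> bool:
    """Return as soon as the response arrives, or after max_wait_sec at most."""
    # Event.wait returns True as soon as the event is set, whereas
    # time.sleep(max_wait_sec) would always block for the full duration.
    return response_ready.wait(timeout=max_wait_sec)


ready = threading.Event()
threading.Timer(0.05, ready.set).start()  # simulate a response after 50 ms

started = time.monotonic()
arrived = wait_for_response(ready)
elapsed = time.monotonic() - started
```

Here the wait ends shortly after the simulated response arrives instead of lasting the full 5 seconds.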

Version 1.11 - Released 2024-04-02

Adaptations to Shopee Updates:

  • Shop Product API update: Use the POST method to scrape shop products.

Version 1.10 - Released 2024-04-02

Fixes:

Version 1.9 - Released 2024-04-01

Enhancements and Optimizations:

  • Re-enabled Key Optimizations: Following the switch from Puppeteer to the Chrome DevTools Protocol (CDP), several key optimizations have been re-enabled to improve performance. These include:
    • Shared Cache: Utilizing a shared browser cache that persists across sessions, reducing load times and proxy traffic.
    • Web Browser Fingerprint Injection: To mitigate detection risks, enhancing scraping effectiveness.
    • Chromium Flags Adjustment: Restored the use of specific Chromium flags that contribute to more efficient browser automation and resource usage.

These adjustments are designed to optimize the scraper's performance, making it faster and more efficient while maintaining the necessary precautions to avoid detection by Shopee's enhanced anti-bot measures.

Dependencies Update:

Version 1.8 - Released 2024-03-29

Updates in Response to Shopee’s Enhanced Anti-Bot Protection:

Shopee has recently upgraded their anti-bot defenses, affecting the functionality of the Shopee Api Scraper. They can now detect Puppeteer, as well as certain aspects of the Chrome DevTools Protocol (CDP), the technology underlying automation tools like Puppeteer. Notably, they can now identify use of the Runtime domain, which is crucial for executing JavaScript.

Additionally, Shopee can now detect when the Developer Tools panel is open, making debugging and reverse engineering even harder.

  • Switching to Chrome DevTools Protocol (CDP) for Browser Automation: To avoid Shopee's detection of Puppeteer, Shopee Api Scraper now uses CDP for browser automation instead of Puppeteer.

Version 1.7 - Released 2024-03-26

Adaptations to Shopee Updates:

  • Browser Navigation Mode Adjustment: Shopee now detects browsers running in incognito mode, so the web browser now runs in normal mode. To maintain session isolation, a unique random directory is generated to store the user profile.
  • Fingerprint Generation Workaround: There is an issue with the injection of the navigator.userAgentData parameter by the fingerprint injector. As a workaround, the injection of this parameter has been disabled.
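
Session isolation through a unique profile directory, as described in the first item above, might look roughly like this. The helper name and directory prefix are hypothetical; --user-data-dir is the standard Chromium switch for choosing the profile location, but how the scraper actually passes it is not documented here.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def new_profile_args() -> Tuple[Path, List[str]]:
    """Create a fresh profile directory and the matching Chromium flag."""
    # mkdtemp creates a new directory with a random, unique name.
    profile_dir = Path(tempfile.mkdtemp(prefix="shopee-profile-"))
    return profile_dir, [f"--user-data-dir={profile_dir}"]


dir_a, args_a = new_profile_args()
dir_b, args_b = new_profile_args()
```

Each browser launch gets its own directory, so cookies and storage are not shared between sessions even though the browser is not in incognito mode. The directories can be removed with shutil.rmtree once a session ends.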

Dependencies Update:

Version 1.6 - Released 2024-03-11

Performance Enhancements:

  • Optimized Timeout Parameters and CPU Usage: Improved the scraper's responsiveness and efficiency under heavy CPU load by tuning timeout parameters and implementing a waiting loop. The scraper is now more stable when the system is pushed to its limits.

Version 1.5 - Released 2024-03-10

Major Improvements:

  • Startup Time and Efficiency: Implemented significant enhancements to reduce the scraper's startup time and improve overall efficiency:
    • Login Page Initialization: Shifted from loading Shopee's homepage to opening the login page for initial checks. The login page is lighter, leading to quicker load times.
    • Enhanced Bot Detection Handling: Optimized the handling of bot detections. If the scraper is detected as a bot, it now reloads the current page instead of closing it and opening a new one, which is resource-intensive. Reloading saves CPU usage and proxy traffic, benefits from browser caching, and allows multiple attempts with the same IP but different browser fingerprints.
    • Persistent Browser Cache: Developed a shared browser cache system that retains cached resources even after closing and reopening incognito pages or the browser itself. This saves further time and reduces proxy traffic.

Dependencies Update:

  • Updated Puppeteer to 22.4.1: Upgraded Puppeteer to version 22.4.1, along with other routine dependency updates.

Version 1.4 - Released 2024-03-07

Fixes:

  • Resolved Security SDK Access Issue: Addressed a challenge resulting from Shopee's recent website updates, specifically their changes in integrating the anti-bot Security SDK. Previously, our scraper relied on a global function to interact with this SDK, a method rendered ineffective after Shopee removed this global function.

    To circumvent this, we now intercept and modify one of Shopee's scripts to reinstate the Security SDK's global function.

Version 1.3 - Released 2024-03-04

Fixes:

  • productRatings_autofixError10002 Feature Repair: Fixed an issue where the productRatings_autofixError10002 feature was not functioning as intended. This feature, when enabled, now correctly addresses the error 10002 from the product ratings API by splitting problematic requests into multiple smaller ones, ensuring reliable data retrieval.
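
The idea of splitting one request into several smaller ones can be sketched as below. This is only an illustration: it assumes the ratings request is paginated by an offset and a limit, and the function name and chunk size are hypothetical rather than taken from the scraper.

```python
from typing import List, Tuple


def split_request(offset: int, limit: int, max_chunk: int) -> List[Tuple[int, int]]:
    """Split one (offset, limit) window into consecutive smaller windows."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    chunks = []
    end = offset + limit
    while offset < end:
        size = min(max_chunk, end - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


# A request for 50 ratings becomes three smaller requests.
windows = split_request(offset=0, limit=50, max_chunk=20)
# windows == [(0, 20), (20, 20), (40, 10)]
```

The smaller windows cover exactly the same range as the original request, so no ratings are skipped or fetched twice.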

Version 1.2 - Released 2024-03-04

Major Changes:

  • Switch to Ungoogled-Chromium: Transitioned from Google Chrome to ungoogled-chromium for web scraping operations. This build matches the latest version of Chrome (122.x), which roughly halves how often the scraper is blocked, leading to more efficient scraping sessions and less proxy traffic.

Performance Enhancements:

  • Browser Flags Optimization: Fine-tuned browser flags to slightly improve the overall performance of the scraping tasks. These adjustments contribute to faster data retrieval and processing.

Dependencies Update:

Version 1.1 - Released 2024-02-22

New Features:

  • Simplified Page URL Scraping: Enhanced the user experience by supporting product search and detail scraping directly from page URLs. Users can now simply copy and paste product URLs from their browser for scraping. This eliminates the need to manually construct Shopee Unofficial API URLs.
    • Example: Directly use https://shopee.sg/Japanese-Samurai-T-Shirt-New-Samurai-T-Shirt-Distro-T-Shirts-i.414243778.23600133176 instead of crafting an API URL.
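
For readers curious about what such a page URL contains, the sketch below extracts the two numbers at the end of the example URL. Reading them as a shop ID followed by an item ID is an assumption based on the URL's shape, not something this change log states, and the helper is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the trailing "-i.<number>.<number>" part of a product page URL.
PRODUCT_URL_PATTERN = re.compile(r"-i\.(\d+)\.(\d+)(?:[/?#]|$)")


def parse_product_url(url: str) -> Optional[Tuple[int, int]]:
    """Return the two numeric IDs from a product page URL, or None."""
    match = PRODUCT_URL_PATTERN.search(url)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


ids = parse_product_url(
    "https://shopee.sg/Japanese-Samurai-T-Shirt-New-Samurai-T-Shirt-Distro-T-Shirts-i.414243778.23600133176"
)
# ids == (414243778, 23600133176)
```

With the scraper itself none of this is needed, since the page URL can be passed as-is in the requests list.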

Improvements:

  • Automated Browser Restart: Introduced an automated browser restart mechanism after four consecutive scraping blockages, improving resilience against anti-scraping measures. Customize this threshold with the homepageLoadMaxConsecutiveRetries parameter.
  • Max Consecutive Blockages Control: Added maxConsecutiveBlockages parameter to halt operations when encountering excessive consecutive blocks, safeguarding against low-quality proxies or updated anti-bot technologies.
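
The interplay of the two thresholds above can be sketched as a small counter. This is an illustration rather than the scraper's code: the class, its return values, and the default of 10 for the halt threshold are assumptions; only the restart after four consecutive blockages comes from the description.

```python
class BlockageTracker:
    """Decide what to do after each scraping attempt."""

    def __init__(self, restart_after: int = 4, max_consecutive_blockages: int = 10):
        self.restart_after = restart_after
        self.max_consecutive_blockages = max_consecutive_blockages
        self.consecutive = 0

    def record(self, blocked: bool) -> str:
        """Return 'continue', 'restart_browser', or 'halt'."""
        if not blocked:
            self.consecutive = 0  # any success resets the streak
            return "continue"
        self.consecutive += 1
        if self.consecutive >= self.max_consecutive_blockages:
            return "halt"
        if self.consecutive % self.restart_after == 0:
            return "restart_browser"
        return "continue"


tracker = BlockageTracker(restart_after=4, max_consecutive_blockages=6)
actions = [tracker.record(blocked=True) for _ in range(6)]
# actions == ['continue', 'continue', 'continue', 'restart_browser', 'continue', 'halt']
```

Restarting the browser gives the scraper a fresh start, while the halt threshold stops a run that keeps getting blocked, for example because of low-quality proxies.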

Enhancements for Troubleshooting:

  • Verbose Startup Failures: Expanded log details during the startup phase to better diagnose failures, such as redirections or proxy issues.
  • Homepage Load Timeout Detection: Optimized detection of homepage load timeouts by identifying a lack of network activity within 20 seconds, adjustable via homepageNoTrafficTimeoutInSec.
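
A no-traffic timeout of this kind can be sketched as a watchdog that remembers when network activity was last seen. The class and method names are hypothetical; only the 20-second default comes from the description of homepageNoTrafficTimeoutInSec above.

```python
import time
from typing import Callable


class TrafficWatchdog:
    """Report a timeout when no network activity is seen for a while."""

    def __init__(
        self,
        no_traffic_timeout_sec: float = 20.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._timeout = no_traffic_timeout_sec
        self._clock = clock
        self._last_activity = clock()

    def on_network_activity(self) -> None:
        # Call this for every request or response observed on the page.
        self._last_activity = self._clock()

    def timed_out(self) -> bool:
        return self._clock() - self._last_activity >= self._timeout


# Simulated clock so the example runs instantly.
now = [0.0]
watchdog = TrafficWatchdog(clock=lambda: now[0])

now[0] = 15.0
quiet_for_15s = watchdog.timed_out()   # 15 s of silence: not yet
watchdog.on_network_activity()
now[0] = 30.0
still_loading = watchdog.timed_out()   # only 15 s since last activity
now[0] = 35.0
stalled = watchdog.timed_out()         # 20 s of silence: timeout
```

Measuring silence rather than total load time lets a slow but active page keep loading while a stalled one is detected quickly.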

Fixes:

  • Puppeteer Main Frame Request Timing: Resolved the Requesting main frame too early error by implementing a forced page reload strategy.

Dependencies Update:

Version 1.0 - Released 2024-02-03

  • Launch: Initial release of the Shopee Api Scraper, introducing core functionality for scraping product information from Shopee.
Developer
Maintained by Community
Actor metrics
  • 39 monthly users
  • 0 stars
  • 99.9% runs succeeded
  • 9.4 hours response time
  • Created in Jan 2024
  • Modified 24 days ago