Made In China Scraper

Pricing

$2.50/month + usage

Extract comprehensive product data from made-in-china.com supplier stores including pricing, specifications, certifications, and supplier details.

Rating

0.0

(0)

Developer

Akash Kumar Naik

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 19 hours ago

Made-in-China.com Product Scraper

An Apify actor that extracts detailed product data from made-in-china.com supplier stores.

Features

  • Comprehensive Data Extraction: Extracts product titles, descriptions, pricing tiers, specifications, images, supplier details, MOQ, lead times, and certifications
  • Pagination Support: Automatically handles multi-page product listings
  • Error Handling: Robust retry mechanism with exponential backoff
  • Proxy Support: Built-in proxy rotation for reliable scraping
  • Rate Limiting: Respects website rate limits and robots.txt guidelines
  • Multiple Export Formats: JSON and CSV output options
  • Session Management: Intelligent session rotation on persistent failures
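The retry behavior listed above can be pictured with a small sketch. This is not the actor's actual implementation: the function names, the 1-second base delay, the 30-second cap, and the 10% jitter are all assumptions chosen for illustration. It shows only the general shape of exponential backoff, where each retry waits roughly twice as long as the previous one.

```python
import random
import time


def backoff_delays(max_retries=3, base=1.0, cap=30.0):
    """Yield the wait in seconds before each retry: base * 2**attempt, capped."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def fetch_with_retries(fetch, max_retries=3, base=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_retries retries are used up.

    `fetch` is any zero-argument callable (hypothetical; stands in for a
    page request). `sleep` is injectable so the logic can be tested quickly.
    """
    delays = backoff_delays(max_retries, base)
    while True:
        try:
            return fetch()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the last error
            # A little random jitter keeps parallel requests from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

With the documented default of `maxRetries: 3`, this sketch would wait about 1, 2, and 4 seconds before giving up.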

Input Configuration

{
  "startUrls": [
    {
      "url": "https://sunlink-furniture.en.made-in-china.com/product-group/zegaGQnxWIVp/Recliner-Sofa-1.html"
    }
  ],
  "maxItems": 0,
  "maxConcurrency": 5,
  "maxRetries": 3,
  "exportFormat": "JSON",
  "includeImages": true,
  "respectRobotsTxt": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

Input Parameters

  • startUrls (array, required): List of product group or product detail page URLs to scrape
  • maxItems (integer): Maximum number of products to scrape (0 = unlimited)
  • maxConcurrency (integer): Number of concurrent requests (default: 5)
  • maxRetries (integer): Retry attempts for failed requests (default: 3)
  • exportFormat (string): Output format - "JSON", "CSV", or "BOTH" (default: JSON)
  • includeImages (boolean): Extract all product images (default: true)
  • respectRobotsTxt (boolean): Follow robots.txt guidelines (default: true)
  • proxyConfiguration (object): Proxy settings (recommended: use Apify proxy)
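As a rough illustration of how the parameters above fit together, the following sketch fills in the documented defaults and rejects obviously invalid input before a run is started. The helper name `normalize_input` is hypothetical and not part of the actor. The default for `maxItems` is not stated above, so 0 (unlimited) is assumed here.

```python
# Defaults as documented in the parameter list; maxItems is an assumption.
DEFAULTS = {
    "maxItems": 0,            # assumed default: 0 means unlimited
    "maxConcurrency": 5,
    "maxRetries": 3,
    "exportFormat": "JSON",
    "includeImages": True,
    "respectRobotsTxt": True,
}
EXPORT_FORMATS = {"JSON", "CSV", "BOTH"}


def normalize_input(raw):
    """Return a copy of `raw` with documented defaults applied."""
    if not raw.get("startUrls"):
        raise ValueError("startUrls is required and must not be empty")
    config = {**DEFAULTS, **raw}  # user-supplied values override defaults
    if config["exportFormat"] not in EXPORT_FORMATS:
        raise ValueError(f"exportFormat must be one of {sorted(EXPORT_FORMATS)}")
    if config["maxItems"] < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")
    return config
```

Checking the input locally like this can catch a misspelled export format before any platform usage is spent.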

Output Data Schema

Each scraped product includes:

{
  "url": "Product URL",
  "scrapedAt": "ISO timestamp",
  "title": "Product title",
  "modelNumber": "Model number",
  "description": "Product description",
  "fobPrice": "FOB price range",
  "pricingTiers": [
    { "quantity": "1-9 pcs", "price": "US $220" }
  ],
  "moq": "Minimum order quantity",
  "leadTime": "Delivery lead time",
  "images": ["Array of image URLs"],
  "specifications": {
    "Material": "PU",
    "Frame Material": "Metal",
    ...
  },
  "certifications": ["SGS", "CE"],
  "supplierName": "Supplier company name",
  "supplierRating": "Rating score",
  "supplierYears": "Years in business",
  "supplierVerified": true/false,
  "supplierContact": {
    "email": "contact@example.com",
    "phone": "+86 xxx",
    "website": "https://..."
  }
}
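Because `pricingTiers` values arrive as display strings, downstream code usually needs numbers. The sketch below is one possible way to convert a tier, assuming the strings look like the `"1-9 pcs"` and `"US $220"` examples in the schema. Other formats on the site may need different handling, and `parse_tier` is a hypothetical helper, not part of the actor's output.

```python
import re


def parse_tier(tier):
    """Convert one pricingTiers entry into numeric fields.

    Assumes quantity looks like "1-9 pcs" (or a single number) and price
    contains one number, as in "US $220". Missing parts become None.
    """
    quantity = tier["quantity"].replace(",", "")  # tolerate "1,000"
    numbers = [int(n) for n in re.findall(r"\d+", quantity)]
    price = re.search(r"\d+(?:\.\d+)?", tier["price"].replace(",", ""))
    return {
        "min_qty": numbers[0] if numbers else None,
        "max_qty": numbers[1] if len(numbers) > 1 else None,
        "unit_price": float(price.group()) if price else None,
    }
```

For the schema's sample tier this yields a minimum quantity of 1, a maximum of 9, and a unit price of 220.0. The currency is dropped in this sketch, so mixed-currency data would need an extra field.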

Robots.txt Compliance

This scraper respects made-in-china.com's robots.txt file by default. The following paths are avoided:

  • /getValidateimage
  • /validateimage/
  • /human_verify.action
  • /ref/*
  • /print/
  • And other restricted paths
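To make the path filtering concrete, here is a simplified sketch of checking a URL against the paths listed above. It is an assumption about how such a check could work, not the actor's code: it uses plain prefix matching, treats a trailing `*` as "anything after this prefix", and covers only the five paths named in this section, not the full robots.txt file.

```python
from urllib.parse import urlsplit

# Only the paths listed above; the site's real robots.txt contains more rules.
DISALLOWED = [
    "/getValidateimage",
    "/validateimage/",
    "/human_verify.action",
    "/ref/*",
    "/print/",
]


def is_allowed(url, disallowed=DISALLOWED):
    """Return False if the URL's path starts with any disallowed prefix."""
    path = urlsplit(url).path or "/"
    for rule in disallowed:
        prefix = rule.rstrip("*")  # "/ref/*" is matched as the prefix "/ref/"
        if path.startswith(prefix):
            return False
    return True
```

A product group URL such as the one in the input example passes this check, while anything under `/print/` or `/ref/` is skipped.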

Best Practices

  • Use proxy rotation (Apify proxy recommended) to avoid IP blocks
  • Keep maxConcurrency at 5 or lower to respect rate limits
  • Enable respectRobotsTxt for ethical scraping
  • Monitor your runs and adjust retry settings if needed

Troubleshooting

Issue: Getting blocked or rate limited

  • Solution: Enable proxy configuration and reduce concurrency

Issue: Missing data fields

  • Solution: The website structure may have changed; check whether the actor's selectors need updating

Issue: Timeout errors

  • Solution: Increase maxRetries or reduce maxConcurrency
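For the missing-fields case, a quick check over the exported items can show which fields are empty and how often, which helps tell a one-off gap from a site-wide layout change. The sketch below assumes the items follow the output schema above. The list of fields to check and both helper names are illustrative choices, not part of the actor.

```python
from collections import Counter

# A subset of schema fields chosen for illustration; adjust as needed.
EXPECTED_FIELDS = [
    "url", "title", "fobPrice", "moq",
    "images", "specifications", "supplierName",
]


def missing_fields(item, expected=EXPECTED_FIELDS):
    """List expected fields that are absent or empty in one scraped item."""
    return [name for name in expected if item.get(name) in (None, "", [], {})]


def missing_field_report(items):
    """Count, per field, how many items are missing it."""
    counts = Counter()
    for item in items:
        counts.update(missing_fields(item))
    return dict(counts)
```

If one field is missing from nearly every item, a changed page layout is the likely cause. Scattered gaps more often mean the supplier simply did not publish that detail.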