Jumia Scraper

This Actor is deprecated

This Actor is unavailable because the developer has decided to deprecate it.

crucial_binoculars/jumia-scraper

Jumia Scraper extracts data from any Jumia product, catalog, or category page, then parses and converts it to a structured format: HTML table, JSON, CSV, Excel, or XML.

By providing a search term (keyword) and/or a category URL, you can extract prices, product descriptions, images, stock availability, brand, category, SKU, and more.

Why scrape Jumia products?

The extracted data can be useful in the following ways:

  • Optimize product prices
  • Monitor practices employed by your competitors
  • Understand market dynamics to boost productivity
  • Harness the power of prevailing trends, etc.

Cost of usage

Apify offers $5 in free usage credits each month. In our tests, you can scrape up to 400,000 results. This should be enough for most use cases, but you can upgrade to a higher plan to extract even more data.

Sample input parameter

Below is a sample input:

{
  "startUrls": ["https://www.jumia.com.ng/health-beauty/"],
  "maxItems": 2000,
  "proxyConfig": {
    "useApifyProxy": false
  },
  "maxConcurrency": 1,
  "handlePageTimeoutSecs": 30000,
  "maxRequestRetries": 3
}

  • maxItems: The maximum number of results to extract in a given run.
  • proxyConfig: Controls whether the run uses a proxy. Set the value to {"useApifyProxy": true} if you are getting blocked. By default, the scraper uses Apify's datacenter proxies, but you can supply your own proxy manually instead.
  • maxConcurrency: The number of concurrent operations. Set a higher number to get your results faster.
  • handlePageTimeoutSecs: The number of seconds after which a page is marked as timed out.
  • maxRequestRetries: The number of times the scraper retries a failed page. The default is 3.
  • startUrls: An array of static URLs to add to the request queue. Each URL can be a product or category page.
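
The input fields described above can also be assembled and sanity-checked in code before starting a run. The following is a minimal Python sketch using only the standard library. The helper name `build_input` and its validation rules are illustrative assumptions, not part of the Actor.

```python
import json


def build_input(start_urls, max_items=2000, use_apify_proxy=False,
                max_concurrency=1, handle_page_timeout_secs=30000,
                max_request_retries=3):
    """Assemble an input object using the field names from the sample input.

    The checks below are illustrative assumptions, not rules enforced by the Actor.
    """
    if not start_urls:
        raise ValueError("startUrls needs at least one product or category URL")
    if max_items < 1 or max_concurrency < 1 or max_request_retries < 0:
        raise ValueError("a numeric limit is out of range")
    return {
        "startUrls": list(start_urls),
        "maxItems": max_items,
        "proxyConfig": {"useApifyProxy": use_apify_proxy},
        "maxConcurrency": max_concurrency,
        "handlePageTimeoutSecs": handle_page_timeout_secs,
        "maxRequestRetries": max_request_retries,
    }


# Example: enable the proxy and raise concurrency for a faster run.
run_input = build_input(["https://www.jumia.com.ng/health-beauty/"],
                        use_apify_proxy=True, max_concurrency=5)
print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the input editor as-is. Keeping the defaults in one function makes it harder to start a run with a missing or misspelled field.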

Sample output

{
  "id": 104317151,
  "sku": "NI930ST1WKIRWNAFAMZ",
  "title": "NIVEA Perfect & Radiant Body Lotion For Women - 400ml (Pack Of 2)",
  "name": "Perfect & Radiant Body Lotion For Women - 400ml (Pack Of 2)",
  "price": "6550.00",
  "currency": "₦",
  "rating": {
    "average": 4.1,
    "totalRatings": 4784
  },
  "url": "https://www.jumia.com.ng/natural-fairness-clarifiant-for-women-400ml-pack-of-2-nivea-mpg1657699.html",
  "categories": [
    "Health & Beauty",
    "Beauty & Personal Care",
    "Personal Care",
    "Skin Care",
    "Body",
    "Moisturizers",
    "Lotions"
  ],
  "images": [
    "https://ng.jumia.is/unsafe/fit-in/300x300/filters:fill(white)/product/15/1713401/1.jpg?0547",
    "https://ng.jumia.is/unsafe/fit-in/680x680/filters:fill(white)/product/15/1713401/1.jpg?0547",
    "https://ng.jumia.is/unsafe/fit-in/680x680/filters:fill(white)/product/15/1713401/2.jpg?0488",
    "https://ng.jumia.is/unsafe/fit-in/680x680/filters:fill(white)/product/15/1713401/3.jpg?0498",
    "https://ng.jumia.is/unsafe/fit-in/680x680/filters:fill(white)/product/15/1713401/4.jpg?0508",
    "https://ng.jumia.is/unsafe/fit-in/680x680/filters:fill(white)/product/15/1713401/5.jpg?0519",
    "https://ng.jumia.is/unsafe/fit-in/680x680/filters:fill(white)/product/15/1713401/6.jpg?0530",
    "https://ng.jumia.is/unsafe/fit-in/680x680/filters:fill(white)/product/15/1713401/7.jpg?0538"
  ]
}
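
A record shaped like the sample output usually needs light post-processing before it goes into a spreadsheet or report, since the price is a string and the categories are a list. The sketch below shows one way to flatten it in Python. The `summarize` helper and the choice of columns are assumptions made for illustration.

```python
from decimal import Decimal

# One record, trimmed from the sample output.
record = {
    "sku": "NI930ST1WKIRWNAFAMZ",
    "title": "NIVEA Perfect & Radiant Body Lotion For Women - 400ml (Pack Of 2)",
    "price": "6550.00",
    "currency": "₦",
    "rating": {"average": 4.1, "totalRatings": 4784},
    "categories": ["Health & Beauty", "Beauty & Personal Care", "Personal Care",
                   "Skin Care", "Body", "Moisturizers", "Lotions"],
}


def summarize(item):
    """Flatten one scraped record into a small row (hypothetical helper)."""
    return {
        "sku": item["sku"],
        "title": item["title"],
        # The price arrives as a string; Decimal avoids float rounding on money.
        "price": Decimal(item["price"]),
        "currency": item["currency"],
        "rating": item["rating"]["average"],
        # Join the category list into a single breadcrumb-style path.
        "category_path": " > ".join(item["categories"]),
    }


row = summarize(record)
print(row["price"], row["category_path"])
```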

Supported URL types

Limitation and solution

Category and catalog pages return a maximum of 2,000 results (this is a hard limit imposed by Jumia). To extract more data, apply filters (e.g. by price or brand) so that each filtered listing stays under the limit, then scrape each filtered URL separately.
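
One way to work around the limit is to split a category into non-overlapping price bands and pass one start URL per band. The sketch below generates such URLs in Python. The query parameter name `price` and the `min-max` value format are placeholders assumed for illustration, so copy the real filter syntax from a filtered listing URL in your browser. The band width is also an arbitrary choice.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def price_bands(low, high, step):
    """Yield (min, max) pairs that cover low..high inclusive without overlap."""
    start = low
    while start <= high:
        end = min(start + step - 1, high)
        yield start, end
        start = end + 1


def with_price_filter(url, band, param="price"):
    """Append a price filter to a category URL.

    ASSUMPTION: the parameter name and the "min-max" value format are
    placeholders, not confirmed Jumia URL syntax.
    """
    parts = urlsplit(url)
    extra = urlencode({param: f"{band[0]}-{band[1]}"})
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


# One start URL per band; each band should hold fewer than 2,000 products.
start_urls = [
    with_price_filter("https://www.jumia.com.ng/health-beauty/", band)
    for band in price_bands(0, 20000, 5000)
]
for url in start_urls:
    print(url)
```

If a band still returns close to 2,000 results, narrow it further or combine it with a second filter such as brand.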

Pagination

The scraper handles pagination of category and catalog pages automatically to extract as much data as possible. Use the maxItems parameter to cap the number of results returned.
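
The interaction between automatic pagination and a result cap can be pictured with a small loop. This is a generic Python sketch, not the Actor's actual implementation: `fetch_page` is a hypothetical callable, and the fake catalog and page size of 40 are stand-ins so the example runs offline.

```python
def collect(fetch_page, max_items):
    """Walk pages 1, 2, 3, ... until a page is empty or max_items is reached.

    fetch_page is a hypothetical callable: page number -> list of items.
    """
    items = []
    page = 1
    while len(items) < max_items:
        batch = fetch_page(page)
        if not batch:  # an empty page means the listing is exhausted
            break
        # Take only as many items as the cap still allows.
        items.extend(batch[: max_items - len(items)])
        page += 1
    return items


# Stand-in data source: 95 fake products served 40 per page.
catalog = [f"product-{n}" for n in range(95)]


def fake_fetch(page, page_size=40):
    start = (page - 1) * page_size
    return catalog[start:start + page_size]


print(len(collect(fake_fetch, max_items=50)))   # stops at the cap
print(len(collect(fake_fetch, max_items=500)))  # stops when pages run out
```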

Developer
Maintained by Community