Redbubble Image Scraper From Keywords

1 day trial then $10.00/month - No credit card required now

This Actor is under maintenance.

lime_incline/redbubble-keyword-scraper

Scrapes exact Redbubble image URLs for the given keywords and outputs them page by page (ordered by popularity).

Redbubble Scraper

This project is an Actor for the Apify platform that crawls Redbubble search results for the specified search terms and extracts data from them. It uses Puppeteer for scraping and stores the extracted data in an Apify Dataset.

What do I need to use this?

  1. An Apify account
  2. A list of search keywords

Example Input

Search Terms

  • A list of terms to search for on Redbubble
  • For example: ["art", "poster", "t-shirt"]

Maximum Pages

  • Maximum number of pages to scrape for each search term
  • Default: 100

Maximum Retries

  • Maximum number of retries for each request
  • Default: 50

Proxy Configuration

  • Select proxies to be used by your Actor
  • Default: Apify Proxy

Input Schema

The input schema for this Actor is defined in the .actor/input_schema.json file. Here's a brief overview:

{
    "title": "Redbubble Scraper Input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "searchList": {
            "title": "Search Terms",
            "type": "array",
            "description": "List of terms to search for on Redbubble",
            "editor": "stringList",
            "prefill": ["art"],
            "items": {
                "type": "string"
            }
        },
        "maxPage": {
            "title": "Maximum Pages",
            "type": "integer",
            "description": "Maximum number of items to scrape",
            "minimum": 1,
            "default": 100
        },
        "maxRequestRetries": {
            "title": "Maximum Retries",
            "type": "integer",
            "description": "Maximum number of retries for each request",
            "minimum": 1,
            "default": 50
        },
        "proxyConfiguration": {
            "title": "Proxy Configuration",
            "type": "object",
            "description": "Select proxies to be used by your actor",
            "editor": "proxy",
            "default": { "useApifyProxy": true },
            "sectionCaption": "Proxy",
            "sectionDescription": "The actor will use Apify Proxy by default. You can customize the proxy settings or disable proxy usage altogether."
        }
    },
    "required": ["searchList"]
}
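
The defaults and minimums in this schema can also be applied in code before a crawl starts. The following is a minimal sketch, not the Actor's actual source: normalizeInput is a hypothetical helper, and only the property names, minimums, and default values are taken from the schema.

```javascript
// Hypothetical helper: applies the schema's defaults and basic checks.
// Property names and default values come from the input schema;
// the function itself is an illustration, not the Actor's code.
const DEFAULTS = {
    maxPage: 100,
    maxRequestRetries: 50,
    proxyConfiguration: { useApifyProxy: true },
};

function normalizeInput(input) {
    // Values supplied by the user override the defaults.
    const merged = { ...DEFAULTS, ...(input || {}) };

    // "searchList" is the only required property and must hold strings.
    if (!Array.isArray(merged.searchList)) {
        throw new Error('searchList is required and must be an array');
    }
    if (!merged.searchList.every((term) => typeof term === 'string')) {
        throw new Error('every item in searchList must be a string');
    }

    // Both numeric options are integers with a minimum of 1.
    for (const key of ['maxPage', 'maxRequestRetries']) {
        if (!Number.isInteger(merged[key]) || merged[key] < 1) {
            throw new Error(`${key} must be an integer >= 1`);
        }
    }
    return merged;
}

const input = normalizeInput({ searchList: ['art', 'poster'], maxPage: 3 });
console.log(input.maxPage, input.maxRequestRetries); // prints: 3 50
```

On the Apify platform the schema itself drives the input form, so a helper like this would mainly matter for local runs or inputs passed through the API.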

How it works

  1. The Actor starts by getting the input parameters.
  2. It creates a proxy configuration to work around IP blocking.
  3. A PuppeteerCrawler instance is created with the specified options.
  4. For each search term, the crawler:
    • Opens the Redbubble search page
    • Extracts child URLs from the search results
    • Pushes the extracted data to the Apify Dataset
    • Enqueues the next page if the maximum page limit hasn't been reached
  5. The process continues until all search terms and pages have been processed.
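
The paging logic in step 4 can be sketched without a browser. This is an illustration under stated assumptions, not the Actor's source: the search URL shape (https://www.redbubble.com/shop/?query=...&page=...) is assumed, and buildSearchUrl, startRequests, and nextRequest are hypothetical names.

```javascript
// ASSUMPTION: the URL shape below is illustrative; the Actor's real
// search URL may differ.
const SEARCH_BASE = 'https://www.redbubble.com/shop/';

function buildSearchUrl(term, page) {
    const params = new URLSearchParams({ query: term, page: String(page) });
    return `${SEARCH_BASE}?${params.toString()}`;
}

// One start request (page 1) per search term.
function startRequests(searchList) {
    return searchList.map((term) => ({
        url: buildSearchUrl(term, 1),
        userData: { term, page: 1 },
    }));
}

// Returns the request for the following page, or null once maxPage is reached.
function nextRequest(current, maxPage) {
    const { term, page } = current.userData;
    if (page >= maxPage) return null;
    return {
        url: buildSearchUrl(term, page + 1),
        userData: { term, page: page + 1 },
    };
}

// Simulate the crawl order with a simple FIFO queue (no network access).
const queue = startRequests(['art', 't-shirt']);
const maxPage = 2;
const visited = [];
while (queue.length > 0) {
    const request = queue.shift();
    visited.push(request.url);
    const next = nextRequest(request, maxPage);
    if (next) queue.push(next);
}
console.log(visited);
```

In the Actor itself, a crawler's request queue would play the role of the array used here. A real crawl would likely also need to stop when a results page comes back empty, which this sketch leaves out.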

Customization

  • Concurrency: Adjust maxConcurrency in the PuppeteerCrawler options to control how many pages are processed in parallel.
  • Proxy: Modify the proxyConfiguration to use your own proxies or Apify's proxy service.
  • Error Handling: The Actor includes custom error handling for proxy-related issues and general request failures.
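
As a rough sketch of these customization points: maxConcurrency and maxRequestRetries are option names that Crawlee's PuppeteerCrawler accepts, while buildCrawlerOptions, isProxyError, and the error patterns below are hypothetical and are not taken from the Actor's code.

```javascript
// Hypothetical sketch: assembles a plain options object with the
// concurrency and retry settings discussed above.
function buildCrawlerOptions(input, { maxConcurrency = 5 } = {}) {
    return {
        maxConcurrency,                             // pages processed in parallel
        maxRequestRetries: input.maxRequestRetries, // retries per request
    };
}

// ASSUMPTION: these patterns are illustrative examples of proxy-related
// Chromium network errors, not the list the Actor actually checks.
const PROXY_ERROR_PATTERNS = [
    /ERR_PROXY_CONNECTION_FAILED/,
    /ERR_TUNNEL_CONNECTION_FAILED/,
];

// Classifies an error as proxy-related based on its message.
function isProxyError(error) {
    const message = String(error && error.message ? error.message : error);
    return PROXY_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}

const options = buildCrawlerOptions({ maxRequestRetries: 50 }, { maxConcurrency: 2 });
console.log(options);
console.log(isProxyError(new Error('net::ERR_PROXY_CONNECTION_FAILED')));
console.log(isProxyError(new Error('Navigation timeout exceeded')));
```

An object like this would typically be passed to the PuppeteerCrawler constructor together with a request handler and a proxy configuration. A lower maxConcurrency puts less load on the proxy pool but makes the run slower.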

Tips for Effective Use

  • Start with a small number of search terms to test the Actor's performance.
  • Use specific search terms to get more targeted results.
  • Regularly check Redbubble's robots.txt and terms of service to ensure compliance.
  • Monitor your Apify storage usage to ensure you have enough capacity for the extracted data.

Deployment

You can deploy this Actor to the Apify platform using the following steps:

  1. Log in to your Apify account:

    apify login
  2. Deploy your Actor:

    apify push

This will deploy and build the Actor on the Apify Platform. You can find your newly created Actor under Actors -> My Actors.

By following these steps and customizing the input as needed, you can easily use this Actor to extract data from Redbubble based on your specific search terms.

Developer
Maintained by Community
Actor metrics
  • 2 monthly users
  • 0 stars
  • Created in Jun 2024
  • Modified 21 days ago