Pricing

Pay per usage

Rumble Category Scraper

Walks /category/{slug}/videos pages and extracts unique channel URLs until the target volume is hit, pagination returns a 404, or the safety cap on pages is reached.

Developer

Fiodar Tarasenka

Last modified

a month ago

infla/rumble-category-scraper

Walks https://rumble.com/category/{slug}/videos paginated listing pages and extracts unique channel URLs. Powers Infla's rumble_category job type (PLAN-009).

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `category` | string (required) | — | Rumble category slug (`crypto`, `news`, `finance`, …). |
| `mode` | enum (`channels` \| `videos`) | `channels` | Output shape. See Videos mode (PLAN-011) below. |
| `targetUniqueCreators` | integer | 100 | Stop after this many unique channels. In videos mode this caps the number of distinct creators emitted. |
| `maxVideosPerCreator` | integer | 5 | Videos-mode only: cap on videos emitted per creator. Range 1-50. Ignored in channels mode. |
| `maxPages` | integer | 50 | Safety cap on pagination depth. |
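
The defaults and ranges in the table can be applied in one place before the crawl starts. The following is a minimal sketch, not the actor's actual source: `CategoryInput` and `normalizeInput` are hypothetical names, and the lower bound of 1 on `targetUniqueCreators` and `maxPages` is an assumption (the table only states a range for `maxVideosPerCreator`).

```typescript
// Hypothetical input shape mirroring the Input table; not taken from the actor's code.
type Mode = "channels" | "videos";

interface CategoryInput {
  category: string;
  mode: Mode;
  targetUniqueCreators: number;
  maxVideosPerCreator: number;
  maxPages: number;
}

// Applies the documented defaults and rejects values outside the documented ranges.
function normalizeInput(raw: Partial<CategoryInput>): CategoryInput {
  const category = (raw.category ?? "").trim();
  if (category === "") {
    throw new Error("category is required");
  }

  const mode = raw.mode ?? "channels";
  if (mode !== "channels" && mode !== "videos") {
    throw new Error("mode must be 'channels' or 'videos'");
  }

  const input: CategoryInput = {
    category,
    mode,
    targetUniqueCreators: raw.targetUniqueCreators ?? 100,
    maxVideosPerCreator: raw.maxVideosPerCreator ?? 5,
    maxPages: raw.maxPages ?? 50,
  };

  // Assumption: every numeric field must be a positive integer.
  for (const key of ["targetUniqueCreators", "maxVideosPerCreator", "maxPages"] as const) {
    if (!Number.isInteger(input[key]) || input[key] < 1) {
      throw new Error(`${key} must be a positive integer`);
    }
  }
  // Documented range for maxVideosPerCreator is 1-50.
  if (input.maxVideosPerCreator > 50) {
    throw new Error("maxVideosPerCreator must be in the range 1-50");
  }
  return input;
}
```

Calling `normalizeInput({ category: "crypto" })` yields channels mode with the table's defaults; failing fast on a bad value avoids spending a browser run on an input that could never succeed.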

Output

Channels mode (default)

One dataset item per unique channel:

{
    "mode": "channels",
    "channelHandle": "russellbrand",
    "channelURL": "https://rumble.com/c/RussellBrand",
    "page": 1,
    "position": 3
}

channelHandle is lowercased so callers can dedupe across runs without re-normalising. page + position preserve discovery order.
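
As an illustration of the lowercasing and dedupe described above, here is a small sketch. The helper names are hypothetical, the regex assumes the three channel URL prefixes listed under Design notes, and treating `position` as the 1-based index of the link on the page is an assumption.

```typescript
// Hypothetical item shape matching the channels-mode example.
interface ChannelItem {
  mode: "channels";
  channelHandle: string;
  channelURL: string;
  page: number;
  position: number;
}

// Extracts the lowercased handle from an absolute or site-relative channel URL.
// Returns null for anything that is not a Rumble channel link.
function channelHandleFromURL(channelURL: string): string | null {
  const match = /^(?:https?:\/\/(?:www\.)?rumble\.com)?\/(?:c|user|channel)\/([^/?#]+)/i.exec(channelURL);
  return match?.[1]?.toLowerCase() ?? null;
}

// Emits one item per handle not seen before; `seen` persists across pages.
function dedupeChannels(urls: string[], page: number, seen: Set<string>): ChannelItem[] {
  const items: ChannelItem[] = [];
  urls.forEach((channelURL, index) => {
    const handle = channelHandleFromURL(channelURL);
    if (handle === null || seen.has(handle)) {
      return; // not a channel link, or already collected
    }
    seen.add(handle);
    // Assumption: position is the 1-based index of the link on the page.
    items.push({ mode: "channels", channelHandle: handle, channelURL, page, position: index + 1 });
  });
  return items;
}
```

Because the handle is the dedupe key, `https://rumble.com/c/RussellBrand` and `/c/russellbrand` collapse into a single item while `channelURL` keeps the casing found on the page.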

Videos mode (PLAN-011)

When mode: "videos", the actor emits one dataset item per video card instead of per channel, with the parent channel handle attached so the Go side can group by handle downstream. This is how Infla's PLAN-011 door-knock + content-only outreach pipelines get pre-fetched video context without an extra per-creator round-trip to Rumble.

Use channels mode when you only need a list of creators to enrich later via a separate per-channel scrape. Use videos mode when you want the listing-page video data inline — saves one round-trip per creator and is the default for Infla's rumble_category jobs after PLAN-011.

Per-item shape in videos mode:

{
    "mode": "videos",
    "channelHandle": "philgodlewski",
    "channelURL": "https://rumble.com/c/philgodlewski",
    "videoURL": "https://rumble.com/v123-some-slug.html",
    "videoExternalID": "v123",
    "title": "Title from the listing card",
    "durationSeconds": 1234,
    "publishedAt": "2026-05-30T00:00:00Z",
    "viewCount": 1234,
    "thumbnailURL": "https://.../thumbnail.jpg",
    "isShort": false,
    "isLive": false,
    "page": 1,
    "position": 7
}

Field notes:

videoExternalID is the ID prefix of the Rumble permalink (e.g. v123 from /v123-some-slug.html). It is stable across the site and usable as a primary key on the Go side.
durationSeconds parses both mm:ss and hh:mm:ss card text. 0 when the card doesn't expose a duration (e.g. live streams).
viewCount handles 1.2K / 3M / 4.5B abbreviations. 0 on parse failure.
publishedAt converts relative dates ("2 days ago", "3 hours ago", "1 month ago") to ISO 8601 UTC at midnight. null when the card has no date or it's in an unrecognised shape — listing cards don't carry sub-day precision so midnight is the honest anchor.
isShort / isLive are best-effort badge detection. Default false when the badge isn't present.
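
The parsing rules in the field notes can be sketched as pure functions. This is an illustration under stated assumptions, not the actor's implementation: the function names, the regexes, the set of accepted time units, and the explicit `now` parameter are all hypothetical.

```typescript
// "mm:ss" or "hh:mm:ss" -> seconds; 0 when the text is not a duration (e.g. a live badge).
function parseDurationSeconds(text: string): number {
  const match = /^(?:(\d+):)?(\d{1,2}):(\d{2})$/.exec(text.trim());
  if (!match) return 0;
  return Number(match[1] ?? 0) * 3600 + Number(match[2]) * 60 + Number(match[3]);
}

// "1.2K" / "3M" / "4.5B" / "1,234" -> integer; 0 on parse failure.
function parseViewCount(text: string): number {
  const match = /^([\d.,]+)\s*([KMB])?/i.exec(text.trim());
  if (!match) return 0;
  const base = Number((match[1] ?? "").replace(/,/g, ""));
  if (!Number.isFinite(base)) return 0;
  const multipliers: Record<string, number> = { K: 1e3, M: 1e6, B: 1e9 };
  const suffix = (match[2] ?? "").toUpperCase();
  return Math.round(base * (multipliers[suffix] ?? 1));
}

// "2 days ago" -> ISO 8601 UTC at midnight; null for missing or unrecognised text.
// Month and year steps use calendar arithmetic, so results are approximate near month ends.
function parsePublishedAt(text: string, now: Date): string | null {
  const match = /^(\d+)\s+(minute|hour|day|week|month|year)s?\s+ago$/i.exec(text.trim());
  if (!match) return null;
  const amount = Number(match[1]);
  const unit = (match[2] ?? "").toLowerCase();
  const date = new Date(now.getTime());
  switch (unit) {
    case "minute": date.setUTCMinutes(date.getUTCMinutes() - amount); break;
    case "hour": date.setUTCHours(date.getUTCHours() - amount); break;
    case "day": date.setUTCDate(date.getUTCDate() - amount); break;
    case "week": date.setUTCDate(date.getUTCDate() - 7 * amount); break;
    case "month": date.setUTCMonth(date.getUTCMonth() - amount); break;
    case "year": date.setUTCFullYear(date.getUTCFullYear() - amount); break;
  }
  date.setUTCHours(0, 0, 0, 0); // listing cards carry no sub-day precision
  return date.toISOString().replace(".000Z", "Z");
}

// "/v123-some-slug.html" -> "v123"; null when the URL is not a video permalink.
function parseVideoExternalID(videoURL: string): string | null {
  const match = /\/(v[0-9a-z]+)-[^/]*\.html/i.exec(videoURL);
  return match?.[1] ?? null;
}
```

Passing `now` explicitly keeps the date conversion deterministic, which makes the relative-date logic straightforward to unit test.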

Stop semantics in videos mode: the crawl halts when the number of distinct creators reaches targetUniqueCreators (matching the channels-mode "unique creators reached" condition), when the next page returns a non-2xx response, or when maxPages is reached. maxVideosPerCreator caps the videos emitted per creator but does NOT contribute to the stop condition.
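
One way to picture the two caps working together is a per-card filter. The sketch below is an assumption about how they interact, not the actor's code: `acceptVideo` is a hypothetical helper, and dropping cards from new creators once the creator target is full is inferred from the Input table's note that `targetUniqueCreators` caps the distinct creators emitted.

```typescript
// Decides whether one video card should be emitted, and records it if so.
// `perCreator` maps channelHandle -> number of videos already emitted.
function acceptVideo(
  perCreator: Map<string, number>,
  channelHandle: string,
  targetUniqueCreators: number,
  maxVideosPerCreator: number,
): boolean {
  const isNewCreator = !perCreator.has(channelHandle);
  // Creator cap: no new creators once the target is reached.
  if (isNewCreator && perCreator.size >= targetUniqueCreators) {
    return false;
  }
  // Per-creator cap: limits output only, never stops the crawl.
  const emitted = perCreator.get(channelHandle) ?? 0;
  if (emitted >= maxVideosPerCreator) {
    return false;
  }
  perCreator.set(channelHandle, emitted + 1);
  return true;
}

// Stop-condition input: only the number of distinct creators matters.
function creatorTargetReached(perCreator: Map<string, number>, targetUniqueCreators: number): boolean {
  return perCreator.size >= targetUniqueCreators;
}
```

Note that a creator who hits `maxVideosPerCreator` still counts as one distinct creator, so a page dominated by a single prolific channel does not bring the crawl closer to stopping.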

Termination

The actor stops on the first of:

targetUniqueCreators unique handles collected
Next page returns a non-2xx response (pagination exhausted)
maxPages reached
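
The three termination conditions can be expressed as a single check run after each page. This is a hedged sketch: `CrawlProgress`, `stopReason`, and the reason strings are hypothetical, and evaluating the conditions in the listed order is an assumption for the case where several hold at once.

```typescript
type StopReason = "target_reached" | "pagination_exhausted" | "max_pages";

// Hypothetical snapshot of crawl state after a page has been handled.
interface CrawlProgress {
  uniqueCreators: number; // distinct handles collected so far
  pagesFetched: number;   // pages requested so far
  lastStatus: number;     // HTTP status of the most recent page response
}

// Returns the first matching stop reason, or null when the crawl should continue.
function stopReason(
  progress: CrawlProgress,
  targetUniqueCreators: number,
  maxPages: number,
): StopReason | null {
  if (progress.uniqueCreators >= targetUniqueCreators) {
    return "target_reached";
  }
  if (progress.lastStatus < 200 || progress.lastStatus >= 300) {
    return "pagination_exhausted"; // non-2xx means there is no further page
  }
  if (progress.pagesFetched >= maxPages) {
    return "max_pages";
  }
  return null;
}
```

Returning a named reason instead of a boolean lets the final log line say why the run ended, which helps when tuning `targetUniqueCreators` and `maxPages`.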

Local development

cd apify-actors/rumble-category-scraper
npm install
# npm install runs `playwright install chromium` via postinstall;
# if the postinstall step is skipped (e.g. CI), run it manually:
#   npx playwright install chromium

mkdir -p .actor
echo '{"category":"crypto","targetUniqueCreators":20,"maxPages":3}' > .actor/INPUT.json
apify run --input-file=.actor/INPUT.json --purge

apify run --purge (Apify CLI v0.x+) wipes the local key-value store and dataset between runs so each invocation starts clean.

Cloudflare on Rumble category pages

Rumble protects category listing pages with Cloudflare's JS challenge — every raw HTTP fetch is met with HTTP 403 from CF's edge. The actor uses PlaywrightCrawler with a real headless Chromium so the challenge resolves automatically. CheerioCrawler (raw HTTP) was tried first and confirmed not viable for this target.

Local runs without an Apify token use direct connections — your laptop's IP solves the challenge once and Chromium reuses the clearance cookie across pages. If your IP has been flagged by CF for any reason, you'll see persistent 403s; either rotate IPs or test on the cloud (apify push + run from the console).

On Apify cloud, the actor calls Actor.createProxyConfiguration({groups:['RESIDENTIAL']}). RESIDENTIAL requires a paid Apify plan; on the free tier the call falls back to datacenter proxy, which is sufficient for most CF challenges on listing pages.

Deploy

Requires the Apify CLI (npm i -g apify-cli) and an Apify account with apify login completed.

cd apify-actors/rumble-category-scraper
apify push

apify push reads .actor/actor.json, builds the image remotely, and publishes a new actor version under your account. The actor will appear in the Apify Console with the technical name from actor.json (rumble-category-scraper). The full ID is <your-username>/rumble-category-scraper.

After deploy:

Open the actor in the Apify Console and copy its ID (username/rumble-category-scraper).
Set APIFY_ACTOR_RUMBLE_CATEGORY=<your-username>/rumble-category-scraper in Infla's .env.production (and reload the app container).
Infla's discovery worker reads cfg.ApifyActorRumbleCategory when dispatching a rumble_category job.

Design notes

PlaywrightCrawler over CheerioCrawler. Rumble's category pages are gated by Cloudflare's JS challenge. A real Chromium resolves the challenge transparently; raw HTTP (CheerioCrawler) is met with 403 on every request. Cost: a Playwright run is ~3-5x the compute units of a Cheerio run, but at one actor run per discovery job this is still <$0.50 of operator-visible cost.
maxConcurrency: 1. Sequential pagination keeps the per-page log line meaningful and avoids tripping Rumble's anti-bot protection. The expected target is 100-300 creators, which finishes in well under a minute even sequentially.
No retries on 404. Rumble's pagination doesn't expose a total-pages count; the canonical "you've reached the end" signal is a 404 on the next page request. The actor treats failed requests as the stop signal and exits cleanly.
Conservative selectors. Match a[href^="/c/"], a[href^="/user/"], and a[href^="/channel/"]. These three prefixes cover every Rumble channel-URL scheme observed during PLAN-009 research. New schemes (if Rumble adds any) would need a code change here; the conservative approach beats a regex that might catch outbound advertising links.
useFingerprints: true. Apify's browser pool rotates user-agent + canvas/font fingerprints across sessions, which combined with the Playwright JS-challenge solver is the canonical CF bypass recipe documented by Apify itself.
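
To illustrate the conservative-selectors note, the prefix list can drive both the CSS selector and a plain-string check. This is a sketch with hypothetical names (`CHANNEL_PREFIXES`, `channelLinkSelector`, `isChannelHref`); it shows the prefix-matching idea and is not the actor's actual selector code.

```typescript
// The three channel URL prefixes named in the design note.
const CHANNEL_PREFIXES = ["/c/", "/user/", "/channel/"] as const;

// Builds a CSS selector that matches only site-relative channel links.
function channelLinkSelector(): string {
  return CHANNEL_PREFIXES.map((prefix) => `a[href^="${prefix}"]`).join(", ");
}

// Same rule as the selector, for hrefs that are already extracted as strings.
// Requires at least one character after the prefix so a bare "/c/" is rejected.
function isChannelHref(href: string): boolean {
  return CHANNEL_PREFIXES.some(
    (prefix) => href.startsWith(prefix) && href.length > prefix.length,
  );
}
```

Because the match is anchored at the start of a site-relative path, absolute outbound links and look-alike paths such as `/category/crypto` are excluded without any extra filtering.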

Schema version

This actor's output JSON is consumed by internal/services/discovery/rumble_category.go (PLAN-009 C10). Any output-field rename requires a coordinated change there.
