Pricing

from $1.20 / 1,000 threads

Reddit Comment Tree Scraper — Full Threads + Scores

Premium: scrape full nested comment trees, with upvote scores and depth, from any Reddit thread or subreddit, using a real browser to get the canonical data that RSS feeds cannot provide.

Developer

James Taylor

Last modified: a month ago

Why this one

Reddit blocks its .json/API endpoints to ordinary scrapers (you get a 403), which is why cheaper actors fall back to RSS and return comments without scores or nesting. This actor uses a real browser (headless Chromium) through a residential proxy to pass Reddit's anti-bot checks, then reads the canonical thread data, so you get:

Per-comment upvote scores and post score / upvote ratio / comment count
Full nested structure — each comment's depth and parentId to rebuild the tree
Author, body, timestamp, and permalink for every comment

What it does

Scrapes specific thread URLs, and/or discovers threads from subreddits you name.
Returns one record per thread: the post plus a flat-but-tree-preserving comments[] array (each comment carries depth + parentId, so you can reconstruct the hierarchy).

Input

| Field | Type | Default | Description |
|---|---|---|---|
| `postUrls` | array | `[]` | Specific Reddit thread URLs to scrape in full. |
| `subreddits` | array | `[]` | Discover threads from these subreddits and scrape each. |
| `sort` | string | `hot` | Sort for subreddit discovery: `hot`/`new`/`rising`/`top`. |
| `maxPosts` | integer | `25` | Total threads to scrape (caps spend). |
| `maxComments` | integer | `200` | Cap comments per thread (Reddit serves ~200/page). |
| `maxConcurrency` | integer | `3` | Parallel browser contexts (kept low, since these are real browsers). |
| `proxyConfiguration` | object | Apify residential | Required: Reddit blocks datacenter IPs. |

Provide postUrls, subreddits, or both.

Example input

{
  "subreddits": ["SaaS"],
  "sort": "hot",
  "maxPosts": 10,
  "maxComments": 200,
  "postUrls": ["https://www.reddit.com/r/Entrepreneur/comments/abc123/some_thread/"]
}
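
As a rough illustration of the input rules above, the sketch below builds an input object and checks it before a run. The function name `build_input` and its validation rules are assumptions made for this example, not part of the actor; only the field names, defaults, and allowed `sort` values come from the table. `proxyConfiguration` is left out so the default applies.

```python
from typing import Any, Optional

# Sort values listed in the input table.
ALLOWED_SORTS = {"hot", "new", "rising", "top"}


def build_input(
    post_urls: Optional[list] = None,
    subreddits: Optional[list] = None,
    sort: str = "hot",
    max_posts: int = 25,
    max_comments: int = 200,
) -> dict:
    """Hypothetical helper: assemble and sanity-check the actor input."""
    post_urls = post_urls or []
    subreddits = subreddits or []
    # The README asks for postUrls, subreddits, or both.
    if not post_urls and not subreddits:
        raise ValueError("Provide postUrls, subreddits, or both.")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")
    # Assumed rule: both caps should be positive integers.
    if max_posts < 1 or max_comments < 1:
        raise ValueError("maxPosts and maxComments must be at least 1")
    run_input: dict[str, Any] = {
        "postUrls": post_urls,
        "subreddits": subreddits,
        "sort": sort,
        "maxPosts": max_posts,
        "maxComments": max_comments,
    }
    return run_input


example = build_input(subreddits=["SaaS"], max_posts=10)
print(example["maxPosts"])  # 10
```

Validating locally like this is only a convenience; the actor's own input schema remains the authority on what is accepted.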

Output

One dataset item per thread:

{
  "type": "post",
  "subreddit": "SaaS",
  "author": "founder_jane",
  "title": "How we cut churn 30%",
  "score": 142,
  "upvoteRatio": 0.97,
  "numComments": 88,
  "commentCount": 200,
  "postUrl": "https://www.reddit.com/r/SaaS/comments/abc123/how_we_cut_churn_30",
  "createdAt": "2026-06-01T12:00:00.000Z",
  "comments": [
    {
      "type": "comment",
      "id": "opcuxfu",
      "postId": "abc123",
      "parentId": "t3_abc123",
      "author": "growth_greg",
      "body": "What did your onboarding look like before?",
      "score": 24,
      "depth": 0,
      "createdAt": "2026-06-01T12:30:00.000Z",
      "url": "https://www.reddit.com/r/SaaS/comments/abc123/_/opcuxfu"
    }
  ]
}

Rebuild the tree from depth + parentId, or use the flat list as-is.
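
A minimal sketch of that reconstruction is shown below. It assumes `parentId` follows the prefixed form seen in the sample (`t3_<postId>` for top-level comments) and, by extension, `t1_<commentId>` for replies; the `replies` key and the function name `build_comment_tree` are hypothetical names chosen for this example.

```python
from typing import Any


def build_comment_tree(comments: list) -> list:
    """Nest a flat comments[] list into a tree using each comment's parentId."""
    # Copy every comment and attach an empty "replies" list (hypothetical key).
    nodes: dict[str, dict[str, Any]] = {
        c["id"]: {**c, "replies": []} for c in comments
    }
    roots = []
    for comment in comments:
        node = nodes[comment["id"]]
        # Assumption: parentId looks like "t1_<commentId>" or "t3_<postId>";
        # drop the type prefix to get the bare id.
        parent_key = comment["parentId"].split("_", 1)[-1]
        parent = nodes.get(parent_key)
        if parent is not None and parent is not node:
            parent["replies"].append(node)
        else:
            # Parent is the post itself, or was not returned: keep as top-level.
            roots.append(node)
    return roots


flat = [
    {"id": "a1", "parentId": "t3_abc123", "depth": 0, "body": "top"},
    {"id": "b2", "parentId": "t1_a1", "depth": 1, "body": "reply"},
    {"id": "c3", "parentId": "t1_b2", "depth": 2, "body": "nested reply"},
]
tree = build_comment_tree(flat)
print(tree[0]["replies"][0]["id"])  # b2
```

Comments whose parent is missing from the list (for example, under a skipped "load more" stub) end up as top-level nodes in this sketch, so nothing is silently dropped.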

Pricing & cost control

Pay-Per-Event: charged per thread, with all of that thread's comments included. This is a premium tier: it runs a real browser through a residential proxy (Reddit hard-blocks datacenter IPs), so it costs more than the RSS-based Reddit Scraper, but of the two it is the only one that returns scores and nested trees. Set maxPosts to cap spend.

Two cost levers:

Bring your own residential proxy. In the proxy input, choose Custom proxies and paste your own residential proxy URLs ($1–2/GB) instead of using Apify's residential proxy ($8/GB); at those rates, proxy traffic is roughly 4–8× cheaper per GB.
threadsPerSession amortises browser startup: one warmed session fetches many threads' .json before rotating IP, so you mostly pay for the lightweight JSON payloads, not page renders.
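
For budgeting, the arithmetic behind these levers can be sketched as follows. The per-thread rate uses the listed "from $1.20 / 1,000 threads" figure, which is a starting price and may be higher in practice. The 5 GB traffic volume is a made-up number for illustration only, since the page does not state how much proxy traffic a thread uses. Both function names are hypothetical.

```python
def thread_charge_usd(max_posts: int, usd_per_1000_threads: float = 1.20) -> float:
    """Upper bound on per-thread charges for a run capped at max_posts."""
    return max_posts / 1000 * usd_per_1000_threads


def proxy_cost_usd(traffic_gb: float, usd_per_gb: float) -> float:
    """Proxy traffic cost for a given volume and per-GB rate."""
    return traffic_gb * usd_per_gb


# Per-thread charge if maxPosts is 10,000, at the listed starting rate.
charge = thread_charge_usd(10_000)

# Hypothetical 5 GB of traffic: Apify residential vs. a custom proxy at $1.50/GB.
apify_proxy = proxy_cost_usd(5, 8.0)
custom_proxy = proxy_cost_usd(5, 1.5)

print(f"threads: ${charge:.2f}")
print(f"proxy (Apify residential): ${apify_proxy:.2f}")
print(f"proxy (custom): ${custom_proxy:.2f}")
```

Whether proxy traffic is billed on top of the per-thread charge is not stated on this page, so treat the two figures as separate estimates, not a combined total.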

Limitations

~200 comments per thread per Reddit page; very large threads truncate the deepest branches (the collapsed "load more" stubs are skipped).
Residential proxy required. Datacenter IPs are blocked.
Slower and pricier than the RSS scraper by design — use that one when you don't need scores/trees.

Compliance

Reads public Reddit data only, identifies itself, and never logs in, posts, votes, or messages. Use the data in line with Reddit's terms and any laws that apply to you.

FAQ

How is this different from your Reddit Scraper? That one is RSS-based — fast and cheap, but no upvote scores and only flat top-level comments. This one uses a real browser to get full nested trees + scores.

Do I need a Reddit account or API key? No — just the (default) residential proxy.

Why a browser? Reddit fingerprints and challenges non-browser clients; a real browser passes, then reads the canonical thread data.

Want this turned into action, not just data?

If you want Reddit conversations turned into leads and AI-drafted replies automatically, that's SignalEngine — this actor is a piece of the engine behind it.
