Reddit Post Comments Scraper | $2.5/1K Comments

Pricing

$2.50 / 1,000 results

Extract Reddit post comment data: text, author, score, and more. Scrape by post URL or ID. Export to JSON, CSV, or Excel, use the API, schedule runs, and integrate. No code required.

Rating

0.0 (0)

Developer

Jackie Chen

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: a day ago

Reddit Post Comments Scraper

reddit-post-comments

Scrape the full comment forest of any Reddit post. Give one or more post IDs and the Actor returns one clean, structured item per comment: the comment text (markdown), score, author, depth, parent ID, timestamps, and removed / locked / stickied flags. It can optionally follow "load more replies" to expand deep nested threads and attach the post's metadata to every comment.

Unofficial. This Actor is not affiliated with, authorized, or endorsed by Reddit, Inc. It is an independent tool that retrieves publicly available data via a third-party API. Use it in compliance with Reddit's terms and all applicable laws; you are responsible for how you use the retrieved data.

What it does

  • Comment forest — for each post, fetches the comment tree sorted by Best / Top / New / Controversial / Old / Q&A.
  • Nested replies (optional) — follows each thread's "load more replies" cursor to pull deeper comments that aren't in the initial response.
  • Post metadata (optional) — fetches the post's title, subreddit, author, score and comment count once and attaches a post object to every comment item.

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| postIds | string[] | ["t3_bawfcs"] | Reddit post fullnames (t3_…). A bare ID or a full post URL is also accepted. |
| sortType | enum | CONFIDENCE | CONFIDENCE (Best) / TOP / NEW / CONTROVERSIAL / OLD / QA. |
| maxItems | integer | 50 | Max total comments across all posts. |
| expandReplies | boolean | false | Follow "load more" cursors to fetch deeper nested replies. |
| includePostInfo | boolean | false | Attach post metadata (title, subreddit, author, score) to each comment. |

Example input

{
  "postIds": ["t3_bawfcs", "https://www.reddit.com/r/AskReddit/comments/abc123/some_thread/"],
  "sortType": "TOP",
  "maxItems": 500,
  "expandReplies": true,
  "includePostInfo": true
}
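
If you assemble the input in code, a small helper can catch typos before a run is started. The sketch below is only an illustration: build_run_input is a hypothetical helper, and its checks are assumptions derived from the Input table, not the Actor's own validation.

```python
from typing import Any

# Allowed sortType values, copied from the Input table.
SORT_TYPES = {"CONFIDENCE", "TOP", "NEW", "CONTROVERSIAL", "OLD", "QA"}


def build_run_input(
    post_ids: list[str],
    sort_type: str = "CONFIDENCE",
    max_items: int = 50,
    expand_replies: bool = False,
    include_post_info: bool = False,
) -> dict[str, Any]:
    """Return an input object using the field names from the Input table.

    Hypothetical helper: the checks are assumptions, not the Actor's rules.
    """
    if not post_ids:
        raise ValueError("postIds needs at least one post ID or URL")
    if sort_type not in SORT_TYPES:
        raise ValueError(f"sortType must be one of {sorted(SORT_TYPES)}")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    return {
        "postIds": list(post_ids),
        "sortType": sort_type,
        "maxItems": max_items,
        "expandReplies": expand_replies,
        "includePostInfo": include_post_info,
    }
```

Calling build_run_input(["t3_bawfcs"]) with no other arguments yields the defaults listed in the table.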

Output

One dataset item per comment:

{
  "id": "t1_ekf5nop",
  "postId": "t3_bawfcs",
  "parentId": null,
  "depth": 0,
  "childCount": 12,
  "content": "There was one I saw a while ago where it was ...",
  "score": 3633,
  "author": "Cobalt-Royal",
  "authorId": "t2_pr5g8",
  "createdAt": "2019-04-08T21:29:42.750000+0000",
  "editedAt": "2019-04-09T00:06:28.303000+0000",
  "isRemoved": false,
  "isLocked": false,
  "isStickied": false,
  "isArchived": true,
  "distinguishedAs": null,
  "permalink": "/r/AskReddit/comments/bawfcs/.../ekf5nop/",
  "url": "https://www.reddit.com/r/AskReddit/comments/bawfcs/.../ekf5nop/",
  "hasMoreReplies": true,
  "post": {
    "id": "t3_bawfcs",
    "title": "What's the creepiest Ask Reddit thread you have come across?",
    "subreddit": "r/AskReddit",
    "author": null,
    "score": null,
    "commentCount": 1234
  }
}

(post is present only when includePostInfo is enabled.)
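
The timestamps in the sample use a "+0000" offset without a colon, which some ISO 8601 parsers reject. Below is a minimal Python sketch for loading an item into a typed record. The Comment class and both function names are hypothetical, and the format string assumes every timestamp carries fractional seconds as in the sample above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional

# Matches the sample values, e.g. "2019-04-08T21:29:42.750000+0000".
# Assumption: fractional seconds and a numeric UTC offset are always present.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse a createdAt / editedAt string; pass None through unchanged."""
    if value is None:
        return None
    return datetime.strptime(value, TIMESTAMP_FORMAT)


@dataclass(frozen=True)
class Comment:
    """A subset of the output fields, chosen for illustration."""

    id: str
    post_id: str
    parent_id: Optional[str]
    depth: int
    score: Optional[int]
    author: Optional[str]
    content: str
    created_at: Optional[datetime]
    edited_at: Optional[datetime]


def parse_comment(item: dict[str, Any]) -> Comment:
    """Convert one dataset item (a dict decoded from JSON) into a Comment."""
    return Comment(
        id=item["id"],
        post_id=item["postId"],
        parent_id=item.get("parentId"),
        depth=item["depth"],
        score=item.get("score"),
        author=item.get("author"),
        content=item.get("content") or "",
        created_at=parse_timestamp(item.get("createdAt")),
        edited_at=parse_timestamp(item.get("editedAt")),
    )
```

The parsed datetimes are timezone-aware, so they can be compared or subtracted directly.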

Notes

  • Comment IDs are de-duplicated within a run.
  • Data is sourced live; Reddit's edge occasionally rate-limits, so the Actor retries transient blocks with exponential backoff.
  • Each postIds entry is resolved to a t3_ post fullname; the Actor normalizes bare IDs and URLs for you.
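
To store or join post references in one consistent form on your side, the same kind of normalization can be sketched locally. This is not the Actor's implementation: to_post_fullname is a hypothetical helper, and the accepted shapes (a t3_ fullname, a bare base-36 ID, or a URL containing /comments/<id>/) are assumptions based on the examples in this README.

```python
import re

# Assumed input shapes, based on the README's examples.
_FULLNAME = re.compile(r"^t3_([0-9a-z]+)$", re.IGNORECASE)
_BARE_ID = re.compile(r"^[0-9a-z]+$", re.IGNORECASE)
_URL_ID = re.compile(r"/comments/([0-9a-z]+)(?:/|\?|$)", re.IGNORECASE)


def to_post_fullname(value: str) -> str:
    """Return the t3_ fullname for a fullname, bare ID, or post URL."""
    text = value.strip()

    match = _FULLNAME.match(text)
    if match:
        return f"t3_{match.group(1).lower()}"

    match = _URL_ID.search(text)
    if match:
        return f"t3_{match.group(1).lower()}"

    if _BARE_ID.match(text):
        return f"t3_{text.lower()}"

    raise ValueError(f"not a recognizable Reddit post reference: {value!r}")
```

Anything that matches none of the three shapes raises ValueError rather than being guessed at.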

Quick start

  1. Open the Actor and press Run — the default input works out of the box.
  2. Adjust the input fields to your target (post IDs or URLs) and set maxItems to cap spend.
  3. Grab results from the Dataset tab as JSON / CSV / Excel, or pull them from your own code via the Apify API or MCP.

No proxies to configure, no cookies to paste, no login — the Actor handles everything server-side.

Why teams switch to this Reddit comments scraper

Comment trees are where Reddit's real signal lives, but they're the part most scrapers handle worst — browser-based actors time out on big threads and charge $15–20 per 1,000 comments. This Actor walks the thread via a direct HTTP API and returns the tree as flat, parent-linked JSON at $2.50 per 1,000 comments.

What people build with it

  • Sentiment analysis — feed a product-launch thread's comments to an LLM and get a structured verdict on how it landed, weighted by comment score.
  • Voice-of-customer mining — the complaints and praise under reviews and comparison threads are unfiltered product feedback you can't survey for.
  • Conversation datasets — parent-child comment pairs are natural dialogue data for fine-tuning chat models.
  • Crisis monitoring — when a thread about your brand takes off, pull the tree hourly and track which concerns are gaining score.
  • Summarized digests — schedule the Actor + an LLM step to turn daily megathreads into a morning brief.
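
For the conversation-dataset idea, parent and reply text can be paired directly from the flat items. The sketch below assumes that a reply's parentId equals the id of another comment in the same dataset and that removed comments should be skipped; reply_pairs is a hypothetical helper name.

```python
from typing import Any, Iterator


def reply_pairs(items: list[dict[str, Any]]) -> Iterator[tuple[str, str]]:
    """Yield (parent text, reply text) for each reply whose parent was scraped.

    Top-level comments (parentId is null) and replies whose parent is missing
    from the dataset produce no pair. Removed comments are skipped on either
    side, which is an assumption about what a dialogue dataset should contain.
    """
    by_id = {item["id"]: item for item in items}
    for item in items:
        parent = by_id.get(item.get("parentId"))
        if parent is None:
            continue
        if parent.get("isRemoved") or item.get("isRemoved"):
            continue
        yield parent["content"], item["content"]
```

Because parents are looked up by ID, the order of items in the dataset does not matter.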

Tips for better results

  • The Actor accepts either the full post URL or the bare post ID.
  • Each comment carries parentId, depth, and score, so you can rebuild the tree, filter to top-level only, or weight by votes.
  • Find the threads worth expanding with Reddit Post Search or Reddit Subreddit Posts.
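
Rebuilding the tree from parentId can be sketched as follows. It assumes top-level comments have a null parentId, as in the sample output, and treats any comment whose parent is missing from the dataset (for example when maxItems cut the run short) as a root. build_forest and the added replies key are illustrative names, not part of the Actor's output.

```python
from collections import defaultdict
from typing import Any


def build_forest(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Nest flat comment items under a 'replies' key, highest score first."""
    known_ids = {item["id"] for item in items}
    children: dict[Any, list[dict[str, Any]]] = defaultdict(list)

    for item in items:
        parent_id = item.get("parentId")
        # Comments whose parent was not scraped are grouped with the roots.
        key = parent_id if parent_id in known_ids else None
        children[key].append(item)

    def by_score(comment: dict[str, Any]) -> int:
        # A missing or null score sorts as 0.
        return comment.get("score") or 0

    def attach(item: dict[str, Any]) -> dict[str, Any]:
        replies = sorted(children.get(item["id"], []), key=by_score, reverse=True)
        # Copy the item so the input list is left untouched.
        return {**item, "replies": [attach(reply) for reply in replies]}

    roots = sorted(children.get(None, []), key=by_score, reverse=True)
    return [attach(root) for root in roots]
```

The helper recurses once per nesting level, so extremely deep threads could need an iterative variant. Filtering to top-level comments only needs no tree at all: keep the items whose depth is 0.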

Why this Actor

  • Direct API, no headless browser — fast, stable runs with nothing to babysit.
  • No login, no cookies — we never touch your accounts, so there's no ban risk.
  • Fresh, real-time data — every run reads the source live, not a stale cache.
  • Pay per result — you're billed only for the rows actually delivered.
  • Structured JSON — export to CSV, Excel, or JSON, or pull straight from the API / MCP.

Use cases

  • Mine audience sentiment and feature requests from real comment threads.
  • Surface the highest-scored replies and frequent questions under any post.
  • Build moderation, UGC, or social-listening datasets at scale.
  • Spot superfans and detractors by author and engagement.

FAQ

Do I need an account, cookies, or to log in anywhere? No. The Actor talks to a fast, direct HTTP API server-side — you just provide inputs and run it.

How am I billed? Pay-per-result: a fixed price per row returned, with no separate platform/compute charge. Caps like maxItems keep spend predictable.
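
As a rough budgeting aid, the worst-case charge for a run follows from maxItems and the listed price. The sketch assumes billing is strictly linear at $2.50 per 1,000 results with no minimum charge or rounding rule; estimate_cost is a hypothetical helper, and the actual invoice is determined by Apify.

```python
from decimal import Decimal

# Listed price: $2.50 per 1,000 results.
PRICE_PER_1000 = Decimal("2.50")


def estimate_cost(result_count: int, price_per_1000: Decimal = PRICE_PER_1000) -> Decimal:
    """Estimate the charge in USD for a given number of returned comments.

    Assumes linear pay-per-result pricing; treat the figure as an estimate.
    """
    if result_count < 0:
        raise ValueError("result_count cannot be negative")
    cost = Decimal(result_count) * price_per_1000 / Decimal(1000)
    return cost.quantize(Decimal("0.0001"))
```

Under these assumptions, a run capped at maxItems = 500 costs at most $1.25, and the default cap of 50 costs at most $0.125.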

Can I run it on a schedule or call it from my app? Yes — use Apify Schedules, the REST API, the JavaScript / Python clients, or the MCP server. See the API tab.
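
A standard-library sketch of calling the Actor over the REST API is shown below. It assumes the Apify API v2 run-sync-get-dataset-items endpoint and a bearer token; the actor_id argument is a placeholder you must take from the API tab, since this README does not state it, and both function names are hypothetical.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"


def build_run_request(
    actor_id: str, token: str, run_input: dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST that runs an Actor and returns its items.

    actor_id is a placeholder such as "<username>~<actor-name>"; copy the real
    value from the Actor's API tab.
    """
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(
    actor_id: str, token: str, run_input: dict[str, Any], timeout: float = 300.0
) -> list[dict[str, Any]]:
    """Send the request and decode the JSON array of dataset items."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Synchronous runs suit small jobs. For large threads with expandReplies enabled, starting the run asynchronously and reading the dataset afterwards, for example through the official Apify client, is likely the safer choice.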

Is this affiliated with Reddit? No. It's an independent tool that collects publicly available data. Use it in line with the platform's terms and applicable law.

More Reddit scrapers by us

Browse the full fleet → https://apify.com/ethereal_wool