Reddit Comment Scraper - Posts, Subreddits & Keywords

Pricing

from $2.00 / 1,000 comments scraped


Scrape Reddit comments from any post URL, subreddit feed, or keyword search. Full nested threads, 24 metadata fields per comment, built-in analytics report. No API key, no browser, 512 MB RAM.


Rating: 0.0 (0)

Developer: Yuliia Kulakova

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 4 days ago


Extract Reddit comments at scale. Full nested threads, rich metadata, analytics report. No API key, no browser, fits in 512 MB RAM.



🚀 What this scraper does

Reddit's comments are gold for market research, brand sentiment, content ideas, AI training data, and social listening. But Reddit's official API is rate-limited, requires authentication, and skips half the metadata you actually want.

This actor gives you everything Reddit shows on the page, in clean structured JSON, without an API key, without spinning up a browser, and at a price that scales.

✅ Three ways to find comments: by post URL, subreddit feed, or keyword search across all of Reddit
✅ Full nested threads: every reply, every depth level, parent-child relationships preserved
✅ Rich metadata: 24 fields per comment including score, controversiality, awards, edits, deletion status, and more
✅ Built-in analytics report: community insights computed automatically and saved alongside the data
✅ Fast and lean: runs in 512 MB RAM, no headless browser, no slow Selenium-style waits


💡 Use cases

| Who | How they use it |
|---|---|
| Market researchers | Track what real users say about products, competitors, pricing |
| Brand managers | Monitor mentions of your brand across thousands of subreddits |
| Content creators | Find trending discussions, popular questions, hot takes |
| AI / ML teams | Build training datasets of authentic human conversations |
| SEO specialists | Discover what your audience actually asks and cares about |
| Investors | Track retail sentiment on stocks, crypto, IPOs in real time |
| Academics | Sociology, linguistics, political science research |

📥 Three ways to give it a target

You can mix-and-match any of these in a single run.

Mode 1: Direct URLs

Paste any Reddit URL. The scraper figures out whether it's a post, a subreddit feed, or a search page.

{
  "startUrls": [
    { "url": "https://www.reddit.com/r/AskReddit/comments/1u003hr/" },
    { "url": "https://www.reddit.com/r/programming/" }
  ]
}
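
To make the URL detection concrete, here is a minimal Python sketch of how a Reddit URL could be sorted into post, subreddit feed, or search page. It is an illustration only: the function name `classify_reddit_url` and the path rules are assumptions, not the actor's actual implementation.

```python
from urllib.parse import urlparse


def classify_reddit_url(url: str) -> str:
    """Return 'post', 'search', 'subreddit', or 'unknown' for a Reddit URL.

    Hypothetical helper: the rules below are assumed from Reddit's public
    URL layout, not taken from the actor's source.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]

    # redd.it short links point at a single post.
    if host == "redd.it":
        return "post" if parts else "unknown"
    if host != "reddit.com" and not host.endswith(".reddit.com"):
        return "unknown"

    # /search or /r/<sub>/search
    if parts == ["search"]:
        return "search"
    if len(parts) == 3 and parts[0] == "r" and parts[2] == "search":
        return "search"

    # /comments/<id>/... or /r/<sub>/comments/<id>/...
    if "comments" in parts:
        idx = parts.index("comments")
        if idx in (0, 2) and idx + 1 < len(parts):
            return "post"

    # /r/<sub>/, optionally followed by a sort such as /hot or /new
    if len(parts) >= 2 and parts[0] == "r":
        return "subreddit"
    return "unknown"
```

With this sketch, the two URLs in the example input above classify as a post and a subreddit feed respectively.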

Mode 2: Subreddit feeds

Just list the subreddit names. The actor pulls posts from each feed, ordered by postSort and postTime, and scrapes their comments.

{
  "subreddits": ["AskReddit", "MachineLearning", "investing"],
  "postSort": "hot",
  "postTime": "week",
  "maxPostsPerSource": 25
}

Mode 3: Keyword search

Search across all of Reddit, or within specific subreddits.

{
  "keywords": ["chatgpt", "claude code", "anthropic"],
  "subreddits": ["LocalLLaMA", "OpenAI"],
  "maxPostsPerSource": 10
}

If you only provide keywords without subreddits, the actor searches globally across Reddit.
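
One plausible reading of the global versus per-subreddit behaviour is sketched below in Python: each keyword becomes either one global search URL or one search URL per listed subreddit. The helper name `build_search_urls` is hypothetical, and whether the actor expands keywords exactly this way is an assumption. The `restrict_sr=1` query parameter is Reddit's own flag for limiting a search to one subreddit.

```python
from urllib.parse import quote_plus


def build_search_urls(keywords, subreddits=None):
    """Expand keywords (and optional subreddits) into Reddit search URLs.

    Hypothetical helper illustrating the described behaviour; the actor's
    real expansion logic may differ.
    """
    urls = []
    for keyword in keywords:
        query = quote_plus(keyword)
        if subreddits:
            # One search per (keyword, subreddit) pair.
            for sub in subreddits:
                urls.append(
                    f"https://www.reddit.com/r/{sub}/search/?q={query}&restrict_sr=1"
                )
        else:
            # No subreddits given: search all of Reddit.
            urls.append(f"https://www.reddit.com/search/?q={query}")
    return urls
```

Under this assumption, three keywords combined with two subreddits would yield six search sources, which is worth keeping in mind when choosing maxPostsPerSource.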


🎛️ Filters that actually work

Every filter has been tested end-to-end.

| Filter | What it does |
|---|---|
| minScore | Skip low-quality comments below this upvote count |
| maxDepth | Limit nesting (1 = top-level only, 2 = top + one reply level, 0 = unlimited) |
| maxCommentsPerPost | Cap comments per post so a megathread doesn't blow your budget |
| excludeDeleted | Drop [deleted] and [removed] comments |
| includeNSFW | Opt in to NSFW subreddits (default: skipped) |
| commentSort | top / new / controversial / old / qa / confidence |
| postSort | hot / top / new / relevance / comments / controversial |
| postTime | hour / day / week / month / year / all |
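
If you want to re-filter a dataset you already downloaded, the semantics of minScore, maxDepth, and excludeDeleted can be approximated client-side. The Python sketch below is an assumption about how those three filters behave, based only on the descriptions in the table above. The function name `apply_filters` is hypothetical, and the items are expected to carry the `score`, `depth`, and `isDeleted` output fields.

```python
def apply_filters(comments, min_score=None, max_depth=0, exclude_deleted=False):
    """Filter scraped comment items the way the input filters are described.

    Hypothetical helper; it mirrors the documented meaning of the filters
    but is not the actor's own code.
    """
    kept = []
    for comment in comments:
        # minScore: skip comments below the threshold.
        if min_score is not None and comment["score"] < min_score:
            continue
        # maxDepth counts levels (1 = top-level only, 0 = unlimited),
        # while the depth field is 0-based, so depth must stay below it.
        if max_depth and comment["depth"] >= max_depth:
            continue
        # excludeDeleted: drop deleted or removed comments.
        if exclude_deleted and comment["isDeleted"]:
            continue
        kept.append(comment)
    return kept
```

Note the off-by-one between the two conventions: maxDepth of 2 keeps comments whose depth field is 0 or 1.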

📤 What you get back

One dataset item per comment with 24 fields:

| Field | Type | Description |
|---|---|---|
| id | string | Reddit comment ID |
| postId | string | ID of the parent post |
| postTitle | string | Title of the parent post |
| postUrl | string | Full URL of the parent post |
| postScore | number | Score of the parent post |
| subreddit | string | e.g. AskReddit |
| subredditPrefixed | string | e.g. r/AskReddit |
| author | string | Reddit username |
| body | string | Comment text (markdown) |
| score | number | Net upvotes (upvotes minus downvotes) |
| controversiality | number | 1 if Reddit flagged the comment as controversial |
| totalAwards | number | Awards received |
| depth | number | Nesting depth (0 = top-level reply to the post) |
| parentId | string | ID of the parent comment or post |
| isTopLevel | boolean | True if this is a direct reply to the post |
| isSubmitter | boolean | True if the author is the post's OP |
| isStickied | boolean | True for moderator-pinned comments |
| distinguished | string or null | "moderator", "admin", or null |
| isDeleted | boolean | True if the comment was deleted by the user or removed by moderators |
| postedAt | ISO date | When the comment was posted |
| editedAt | ISO date or null | When the comment was last edited (null if never) |
| scrapedAt | ISO date | When the scraper saw it |
| permalink | string | Direct link to the comment on Reddit |
| matchedKeyword | string or null | Which of your keywords matched (for keyword-search mode) |

Example output

{
  "id": "oqen70z",
  "postId": "1u003hr",
  "postTitle": "What movie plot hole is so massive that it completely ruins the story?",
  "postUrl": "https://www.reddit.com/r/AskReddit/comments/1u003hr/...",
  "postScore": 1300,
  "subreddit": "AskReddit",
  "subredditPrefixed": "r/AskReddit",
  "author": "TheAmazingSealo",
  "body": "The Butterfly Effect breaks its own rules and logic...",
  "score": 3003,
  "controversiality": 0,
  "totalAwards": 0,
  "depth": 0,
  "parentId": "1u003hr",
  "isTopLevel": true,
  "isSubmitter": false,
  "isStickied": false,
  "distinguished": null,
  "isDeleted": false,
  "postedAt": "2026-06-08T07:02:55.000Z",
  "editedAt": "2026-06-08T08:10:16.000Z",
  "scrapedAt": "2026-06-08T13:21:29.916Z",
  "permalink": "https://www.reddit.com/r/AskReddit/comments/1u003hr/.../oqen70z/",
  "matchedKeyword": null
}
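
Because each item carries `id`, `parentId`, and `isTopLevel`, the flat dataset can be folded back into nested threads. The Python sketch below shows one way to do that. It assumes `parentId` holds the bare ID of the parent comment (as it holds the bare post ID in the example above); the function name `build_threads` and the added `replies` key are hypothetical.

```python
from collections import defaultdict


def build_threads(comments):
    """Group flat comment items into nested threads using id / parentId.

    Hypothetical helper: returns a list of root comments, each with a
    'replies' list holding its children, recursively.
    """
    ids = {c["id"] for c in comments}
    children = defaultdict(list)
    roots = []
    for c in comments:
        # A comment is a root when it replies to the post directly, or
        # when its parent was not scraped (for example, filtered out).
        if c["isTopLevel"] or c["parentId"] not in ids:
            roots.append(c)
        else:
            children[c["parentId"]].append(c)

    def attach(node):
        # Copy the item and add its replies, preserving input order.
        return {**node, "replies": [attach(ch) for ch in children[node["id"]]]}

    return [attach(r) for r in roots]
```

Treating orphaned replies as roots keeps them visible when filters such as minScore or excludeDeleted have removed their parents.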

📊 Bonus: analytics report

Set includeAnalytics: true (it's on by default) and the actor computes a community insights report alongside the raw data. The report is saved under the key ANALYTICS in the run's key-value store.

What's inside:

  • 📈 Total posts and comments processed
  • ⭐ Average comment score
  • 🏆 Top commenters by volume and by score
  • 🔥 Hottest threads (highest engagement)
  • 😡 Most controversial threads
  • 📊 Distribution of comments by depth, score buckets, time-of-day

Perfect for one-glance overviews and dashboards. No extra setup, just one toggle.
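
For a sense of what such metrics involve, here is a minimal Python sketch that derives a few of them from the dataset items. It is not the format of the actor's ANALYTICS record: the function name `summarize` and the output keys are hypothetical, chosen only for illustration.

```python
from collections import Counter


def summarize(comments, top_n=3):
    """Compute a handful of report-style metrics from comment items.

    Hypothetical helper; key names are illustrative and do not describe
    the structure of the actor's own report.
    """
    scores = [c["score"] for c in comments]
    return {
        "totalComments": len(comments),
        "totalPosts": len({c["postId"] for c in comments}),
        "averageScore": sum(scores) / len(scores) if scores else 0.0,
        # (author, comment count) pairs, most active first.
        "topCommentersByVolume": Counter(
            c["author"] for c in comments
        ).most_common(top_n),
        # Mapping of depth level to number of comments at that depth.
        "commentsByDepth": dict(Counter(c["depth"] for c in comments)),
    }
```

The same pattern extends to score buckets or time-of-day by swapping the key passed to `Counter`.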


⚙️ Quick start

  1. Pick a target. A subreddit name, a post URL, a keyword: anything that interests you.
  2. Set limits. maxPostsPerSource: 5 and maxCommentsPerPost: 100 are reasonable starting values.
  3. Click Run. Results stream into the dataset as they're scraped.

That's it. No tokens to fetch, no proxies to configure (the default works), no schema to learn.


💰 Pricing

Pay only for what you use:

| Event | Price |
|---|---|
| Actor start | $0.01 per run |
| Comment scraped | $2.00 per 1,000 comments ($0.002 each) |

Example runs:

  • 100 comments → $0.21
  • 500 comments → $1.01
  • 5,000 comments → $10.01
  • 100,000 comments → $200.01
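
The arithmetic behind these figures is the per-run start fee plus the per-comment fee. A small Python sketch, using the two prices from the table above, lets you estimate other run sizes. The function name `estimate_cost` and the `runs` parameter are hypothetical, and platform usage is not included.

```python
from decimal import Decimal

# Event prices as listed in the pricing table.
ACTOR_START_USD = Decimal("0.01")   # charged once per run
PER_COMMENT_USD = Decimal("0.002")  # $2.00 per 1,000 comments


def estimate_cost(comments: int, runs: int = 1) -> Decimal:
    """Estimate actor event charges in USD, excluding Apify platform usage.

    Hypothetical helper for budgeting; Decimal avoids float rounding noise.
    """
    if comments < 0 or runs < 1:
        raise ValueError("comments must be >= 0 and runs must be >= 1")
    return ACTOR_START_USD * runs + PER_COMMENT_USD * comments
```

Splitting the same number of comments across more runs only adds one start fee per extra run.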

Apify platform usage (compute, proxy, storage) is billed separately by Apify at standard platform rates. The actor runs in 512 MB RAM with no headless browser, so platform costs stay low.


❓ FAQ

Do I need a Reddit account or API key? No. The scraper works on Reddit's public data: anything visible without logging in.

Will it get my IP banned? The actor ships configured to use Apify Residential proxies, which rotate IPs automatically. You can override with your own proxy in the input.

How fresh is the data? Real-time. The scraper reads Reddit's live data, so comments posted minutes ago are already in the results.

Does it handle deleted comments? Yes. Deleted comments are flagged with isDeleted: true so you keep the tree structure intact. Opt out with excludeDeleted: true.

Can I scrape NSFW subreddits? Yes, but you have to opt in with includeNSFW: true. By default they're skipped.

What about Reddit's "load more comments" buttons? Top comments are always returned. Very deep tail threads (beyond what Reddit returns in one shot) are noted in the log but not auto-expanded by default. Increase maxCommentsPerPost to pull more.

Can I run scheduled scrapes? Yes. Use Apify's built-in Schedules. Great for daily brand-mention digests or weekly sentiment snapshots.

Will old.reddit.com / new.reddit.com URLs both work? Both. Also redd.it short links and mobile .compact URLs.

Does the actor work for private subreddits? No. Private subreddits require authenticated access. Public and quarantined-but-visible subs work fine.


🛠️ Author

Maintained by brilliant_gum.

Bug reports, feature requests, custom-scrape needs → open an issue on this actor.

If this saved you time, leave a ⭐. It helps other Reddit researchers find the tool.