Pricing

from $2.99 / 1,000 results

Reddit Comments Scraper

🔎 Extract valuable Reddit comments with this Comments Scraper—fast, accurate, and built for research, sentiment, and community insights. 📊✨ Perfect for marketers, analysts, and data teams wanting actionable results.

Rating

0.0

(0)

Developer

SolidScraper

Last modified: a month ago

Reddit Comments Scraper 📣

Reddit Comments Scraper automatically collects comments (including nested replies, when enabled) from one or more Reddit posts and returns one flat record per comment, complete with path and depth metadata. Use it to scrape Reddit comments in bulk or to extract full thread comments for analysis, and get structured comment data at scale without manually copying threads one by one. Whether you’re a marketer, data analyst, researcher, or developer, it speeds up collection and saves hours of manual work.

Why choose Reddit Comments Scraper?

Feature	Benefit
✅ Comments + Nested Replies Collection	Extracts top-level comments and (optionally) the full reply tree for each post
✅ All-in-One Batch Input	Lets you scrape comments from multiple post URLs in a single run
✅ Reliable Scraping with Fallback Logic	Includes retries and handles access challenges using a real browser session
✅ Proxy Support for Stability	Supports configurable proxy settings to improve scraping reliability
✅ Structured Flat Output	Returns one JSON record per comment with path/depth metadata for easy downstream processing
✅ Scales with Concurrency Controls	Uses configurable parallelism via maximum concurrent posts to fit your throughput needs

Key features

📊 Flat comment data with tree metadata: Produces one record per comment with commentPath and commentDepth so you can analyze conversation structure.
💬 Optional nested reply extraction: When enabled, replies to comments are also collected (full thread tree); when disabled, only top-level comments are returned.
🔍 Sort-controlled comment ordering: Supports top, best, new, controversial, old, and qa sorting to match your research needs.
🧠 Top-level vs reply awareness: Adds isTopLevel and parentPath so you can distinguish roots from replies in your analysis.
🛡️ Resilient runs with retries: Uses multiple attempts per post to reduce the chance of partial failures.
🌐 Post URL support: Accepts one or more Reddit post URLs and normalizes them for collection.
💾 Dataset-ready results: Pushes extracted comment records to the Apify dataset as JSON (one item list per successful post).
⚙️ Concurrency controls: Uses maxConcurrentPosts so you can balance speed against memory usage.

Input

Provide input via an input.json file. Example structure:

{
  "postUrls": [
    "https://www.reddit.com/r/AskMec/comments/14990m6/les_applications_de_rencontres_fonctionnent_telles/"
  ],
  "maxComments": 500,
  "includeNestedReplies": true,
  "sortBy": "top",
  "maxConcurrentPosts": 2,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Input Fields

Field	Required	Description
`postUrls`	Yes	One or more Reddit post URLs to scrape comments from.
`maxComments`	No	Maximum number of comments to extract per post (counts nested replies too). Default is `500`. Must be at least `1`.
`includeNestedReplies`	No	When enabled, replies to comments are also extracted (the full thread tree). When disabled, only top-level comments are returned. Default is `true`.
`sortBy`	No	How Reddit should order the comments before they are collected. Options: `top`, `best`, `new`, `controversial`, `old`, `qa`. Default is `top`.
`maxConcurrentPosts`	No	How many posts to scrape in parallel. Each post runs its own browser, so higher values need more memory. Default is `2` (min `1`, max `10`).
`proxyConfiguration`	No	Proxy settings for the scraper. If provided, the actor uses your configuration; otherwise it creates a default proxy configuration with residential groups.
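The defaults and limits in the table above can be checked locally before a run is started. The sketch below is a hypothetical helper (`build_input` is not part of the actor) that fills in the documented defaults and rejects out-of-range values; the actor's own validation may behave differently.

```python
ALLOWED_SORTS = {"top", "best", "new", "controversial", "old", "qa"}


def build_input(post_urls, max_comments=500, include_nested_replies=True,
                sort_by="top", max_concurrent_posts=2, proxy_configuration=None):
    """Hypothetical helper: assemble an input dict using the documented defaults."""
    if not post_urls:
        raise ValueError("postUrls must contain at least one Reddit post URL")
    if max_comments < 1:
        raise ValueError("maxComments must be at least 1")
    if sort_by not in ALLOWED_SORTS:
        raise ValueError(f"sortBy must be one of {sorted(ALLOWED_SORTS)}")
    if not 1 <= max_concurrent_posts <= 10:
        raise ValueError("maxConcurrentPosts must be between 1 and 10")

    run_input = {
        "postUrls": list(post_urls),
        "maxComments": max_comments,
        "includeNestedReplies": include_nested_replies,
        "sortBy": sort_by,
        "maxConcurrentPosts": max_concurrent_posts,
    }
    # Leave proxyConfiguration out entirely so the actor applies its own default.
    if proxy_configuration is not None:
        run_input["proxyConfiguration"] = proxy_configuration
    return run_input
```

The returned dict has the same shape as the input.json example, so it can be serialized with `json.dumps` and pasted into the INPUT panel.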

Output

The actor saves extracted comments in JSON format by pushing a list of comment records to the Apify dataset (charged_event_name="result") for each successfully processed post.

Example output record:

[
  {
    "postUrl": "https://www.reddit.com/r/.../comments/.../",
    "postTitle": "Example post title",
    "postAuthor": "example_author",
    "postScore": 12345,
    "subreddit": "examplesubreddit",
    "commentDepth": 0,
    "commentAuthor": "comment_author",
    "commentText": "This is a comment body.",
    "commentTimestamp": "2024-01-15T10:22:33.000Z",
    "commentPath": "0",
    "parentPath": null,
    "isTopLevel": true,
    "replyCount": 2,
    "scrapedAt": "2024-01-15T10:30:00.000Z"
  }
]

Output Fields

Field	Type	Description
`postUrl`	string	The normalized Reddit post URL for which the comment was scraped.
`postTitle`	string	The post title.
`postAuthor`	string	The post author username.
`postScore`	number	The post score at the time of collection.
`subreddit`	string	The subreddit name.
`commentDepth`	number	Depth of the comment in the thread tree (top-level is `0`).
`commentAuthor`	string	The comment author username.
`commentText`	string	The comment body text.
`commentTimestamp`	string	UTC timestamp (ISO-8601 with milliseconds and trailing `Z`) for when the comment was created.
`commentPath`	string	Encoded position of the comment within the tree (e.g., `"0"`, `"0/1"`, `"0/1/0"`).
`parentPath`	string \| null	The parent comment’s `commentPath` (or `null` for top-level comments).
`isTopLevel`	boolean	`true` when `commentDepth` is `0`; otherwise `false`.
`replyCount`	number	Count of direct replies to this comment.
`scrapedAt`	string	UTC timestamp (ISO-8601 with milliseconds and trailing `Z`) indicating when the scraping happened.
`error_message`	string	Not emitted by this actor and not part of its dataset schema. Failures are written to the run log; records are pushed only for posts that succeed.
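Because each record carries `commentPath` and `parentPath`, the flat output can be nested back into a thread tree. The following is a minimal sketch, assuming all records come from a single post and that path segments are integers separated by `/`, as in the examples above; `build_tree` and the `replies` key are illustrative names, not part of the actor's output.

```python
def build_tree(records):
    """Nest flat comment records under their parents using commentPath/parentPath."""
    # Copy each record and give it an empty list for its children.
    nodes = {r["commentPath"]: {**r, "replies": []} for r in records}
    roots = []
    # Sort by numeric path segments so "0" < "0/1" < "0/1/0" < "1".
    for path in sorted(nodes, key=lambda p: [int(s) for s in p.split("/")]):
        node = nodes[path]
        parent = nodes.get(node["parentPath"]) if node["parentPath"] else None
        if parent is None:
            # Top-level comment, or a reply whose parent is missing from the records.
            roots.append(node)
        else:
            parent["replies"].append(node)
    return roots
```

For a multi-post dataset, group the records by `postUrl` first and call the function once per group, since paths restart at `"0"` for every post.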

You can export the resulting dataset from Apify as JSON or CSV (depending on your chosen export settings in the Apify UI).

How to use Reddit Comments Scraper (via Apify Console)

Open Apify Console: Go to console.apify.com and log in.
Find the actor: Search for Reddit Comments Scraper in the Actors marketplace and open the actor page.
Open the INPUT panel: In the actor run screen, locate the INPUT section.
Add your post URLs: Paste one or more Reddit post URLs into postUrls.
Choose your comment limits and structure: Set maxComments (per post), enable/disable includeNestedReplies, and pick sortBy if you need a specific ordering.
Set concurrency for your budget: Adjust maxConcurrentPosts (each parallel post uses its own browser, so higher values use more memory).
Configure proxy (optional): If you have proxyConfiguration, add it; otherwise the actor creates a default residential proxy configuration.
Run & monitor: Click Run. Watch logs for progress, extraction counts, and any retry attempts.
Open the OUTPUT dataset: After completion, go to the dataset tab to preview the extracted Reddit comments and export them to JSON/CSV.

No coding required: you can have extracted Reddit comments in minutes.

Advanced features & SEO optimization

🔁 Built for comments-to-CSV pipelines: The actor returns a clean, flat structure that suits analysis and BI workflows where Reddit comments end up in CSV.
🧩 Thread-aware output for conversation mining: Each comment includes commentPath, parentPath, commentDepth, replyCount, and isTopLevel, which makes it much easier to mine comments and analyze whole threads.
🕒 Consistent UTC timestamps: Uses ISO-8601 scrapedAt and commentTimestamp values for reliable time-based analysis.
🧰 Input-friendly sorting: With sortBy, you can align collection with your research question (for example, focusing on most upvoted or most recent discussions).
🔍 Resilience for public web data: Includes retries and supports configurable proxy settings for stable scraping of publicly available data.
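The UTC timestamps can be turned into timezone-aware `datetime` objects for time-based analysis. This is a small sketch using only the standard library; `parse_utc` and `age_at_scrape_seconds` are hypothetical helper names, and the input format is assumed to match the example record (ISO-8601 with milliseconds and a trailing `Z`).

```python
from datetime import datetime


def parse_utc(ts):
    """Parse an ISO-8601 timestamp with a trailing 'Z' into an aware datetime."""
    # datetime.fromisoformat accepts 'Z' only from Python 3.11 onward, so swap
    # it for an explicit offset to keep the sketch working on older versions.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def age_at_scrape_seconds(record):
    """Seconds between comment creation and the moment it was scraped."""
    delta = parse_utc(record["scrapedAt"]) - parse_utc(record["commentTimestamp"])
    return delta.total_seconds()
```

With the example record shown earlier (created at 10:22:33, scraped at 10:30:00 on the same day), `age_at_scrape_seconds` returns 447.0.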

Best use cases

📈 Marketing teams: Collect Reddit comments from multiple posts to find recurring themes and messaging angles for outreach campaigns.
🧠 Researchers: Gather structured Reddit comments for qualitative coding and for quantifying sentiment or discussion depth.
💬 Community managers: Monitor how conversations evolve by scraping threads with sortBy and analyzing commentDepth distributions.
🏗️ Data analysts: Build a conversation graph using commentPath, parentPath, and replyCount from a bulk scraping run.
🧪 Product teams: Compare feedback across communities by scraping Reddit comments from posts in relevant subreddits and exporting to CSV.
💻 Developer pipelines: Feed structured results into downstream systems (ETL, dashboards, or CRM enrichment steps) with predictable fields per comment.
🎯 Content strategists: Scrape comments from posts to identify what users actually respond to—then iterate your content based on real discussion threads.
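Two of the analyses mentioned above, commentDepth distributions and a conversation graph, need only a few lines once the dataset is loaded as a list of dicts. The sketch below assumes records shaped like the example output; the function names are illustrative and not provided by the actor.

```python
from collections import Counter


def depth_distribution(records):
    """Count how many comments sit at each commentDepth, ordered by depth."""
    counts = Counter(r["commentDepth"] for r in records)
    return dict(sorted(counts.items()))


def reply_edges(records):
    """Graph edges as (postUrl, parentPath, commentPath) for every reply."""
    # postUrl is kept in each edge because paths restart at "0" for each post.
    return [
        (r["postUrl"], r["parentPath"], r["commentPath"])
        for r in records
        if not r["isTopLevel"]
    ]
```

The edge list can be loaded into any graph library, or joined back to the records by `commentPath` to attach authors and text to each node.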

Technical specifications

Supported Input Formats
- ✅ postUrls: array of Reddit post URLs
- ✅ maxComments: integer (default 500, minimum 1)
- ✅ includeNestedReplies: boolean (default true)
- ✅ sortBy: string enum (top, best, new, controversial, old, qa)
- ✅ maxConcurrentPosts: integer (default 2, range 1 to 10)
- ✅ Optional proxyConfiguration

Proxy Support
- ✅ Configurable proxy support via proxyConfiguration
- ✅ Default residential proxy configuration when proxyConfiguration is not provided

Retry Mechanism
- ✅ Retries are built in for each post (multiple attempts per post)

Dataset Structure
- ✅ JSON records pushed to the dataset with one flat record per comment
- ✅ Includes commentPath/parentPath/commentDepth for thread reconstruction

Rate Limits & Performance
- ✅ Designed for batch processing with configurable concurrency using maxConcurrentPosts
- ⚠️ Each concurrent post uses its own browser session, so higher concurrency can increase memory usage

Limitations
- ❌ Mod/bot-pinned comments are skipped (stickied items are not included)
- ❌ Only publicly accessible comment data from the provided posts is collected
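To illustrate how a concurrency cap and per-post retries interact, here is a generic sketch of the pattern using `asyncio`. It is not the actor's implementation: `scrape_post` is a hypothetical coroutine supplied by the caller, and the default of 3 attempts is an assumption, since the page only states that multiple attempts are made.

```python
import asyncio


async def scrape_all(post_urls, scrape_post, max_concurrent_posts=2, attempts=3):
    """Run scrape_post(url) for every URL, at most max_concurrent_posts at a time."""
    semaphore = asyncio.Semaphore(max_concurrent_posts)

    async def worker(url):
        # The semaphore is held across retries, so a retrying post keeps its slot.
        async with semaphore:
            for attempt in range(1, attempts + 1):
                try:
                    return url, await scrape_post(url)
                except Exception as exc:
                    if attempt == attempts:
                        # Give up on this post only; other posts are unaffected.
                        print(f"failed after {attempts} attempts: {url} ({exc})")
                        return url, None

    results = await asyncio.gather(*(worker(u) for u in post_urls))
    return dict(results)
```

A failed post maps to `None` here, which mirrors the documented behavior that failures are logged and only successful posts produce records.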

FAQ

What does Reddit Comments Scraper return?

✅ It returns a flat list of JSON records—one record per comment—with thread metadata like commentPath, parentPath, and commentDepth, plus comment content (commentText) and timestamps (commentTimestamp).

Can it scrape nested replies?

✅ Yes. With includeNestedReplies enabled, replies to comments are also extracted so you get the full thread tree. If you disable it, only top-level comments are returned.

How many comments can I extract from each post?

You control it with maxComments. It sets the maximum number of comments extracted per post and counts nested replies too.

Can I control the order of comments?

✅ Yes. Use sortBy to choose how comments are ordered before they are collected: top, best, new, controversial, old, or qa.

Does it support scraping multiple Reddit posts at once?

✅ Yes. Provide multiple links in postUrls. You can also control parallelism with maxConcurrentPosts to balance speed and resource usage.

Is there a dataset export format other than JSON?

Yes. The actor pushes JSON-formatted records to the dataset, and after the run you can export the dataset to CSV from the Apify UI, depending on your export settings.

Do I need to use a proxy?

❌ You don’t have to, but you can. If you provide proxyConfiguration, the actor will use it; otherwise it creates a default residential proxy configuration to improve scraping reliability.

Is this compliant with privacy rules?

✅ The actor only collects data from publicly accessible sources. You’re responsible for using the results in accordance with applicable laws (including privacy and platform rules) for your specific use case.

Support & feature requests

If you’re using Reddit Comments Scraper for Reddit comment scraping or data extraction workflows, we’d love to hear how it’s working for you.

💡 Feature Requests: Examples include additional export controls, more post-level metadata fields, or enhancements tailored for bulk comment-mining pipelines.
📧 Contact: For questions, support, or feedback, reach out at dataforleads@gmail.com.

Your feedback helps shape the roadmap for this tool.

Use Reddit Comments Scraper to collect Reddit comments as structured, thread-aware output, so you can scale analysis without the manual grind.

Disclaimer

This tool only accesses publicly accessible sources. It does not access private profiles, authenticated data, or password-protected content.

You are responsible for ensuring your use complies with applicable laws (for example, GDPR/CCPA), spam regulations, and the relevant platform terms of service. For data removal requests, contact dataforleads@gmail.com. Always use Reddit Comments Scraper responsibly, ethically, and for legitimate purposes.
