Reddit Comment Scraper
Pricing: $19.99/month + usage
🧰 Reddit Comment Scraper (reddit-comment-scraper) collects Reddit comments & threads across subreddits — with author, score, timestamps, permalinks & nesting. 📊 Export CSV/JSON for research, sentiment, brand monitoring & SEO. ⚡ Ideal for analysts, marketers & community teams.
Developer: ScrapeMesh
Actor stats: 2 total users · 1 monthly active user · 0 bookmarks · last modified 17 days ago
Reddit Comment Scraper
The Reddit Comment Scraper is a production-ready Apify actor that collects structured comments from Reddit post URLs — fast, reliable, and built for scale. It solves the hassle of manually navigating threads by turning any Reddit discussion into clean, analyzable records with authors, scores, permalinks, parent/child relationships, and nested replies. Whether you’re a marketer, developer, data analyst, or researcher, this reddit comments scraper tool helps you scrape reddit comments and export them to a usable reddit comment dataset for insights, NLP, and reporting at scale. Think of it as a Reddit thread comment scraper and reddit comment extractor optimized for workflow automation and data accuracy.
What data / output can you get?
Below are the exact fields pushed to the Apify dataset for each comment. You can export results as JSON or CSV from the Apify dataset UI.
| Data field | Description | Example value |
|---|---|---|
| url | The original Reddit post URL the comment belongs to | https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/ |
| comment_id | Unique comment identifier | lhk1f7n |
| post_id | Reddit post thing ID (t3_…) | t3_1epeshq |
| author | Comment author username (or “[deleted]”) | AutoModerator |
| permalink | Direct link to the specific comment | https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/lhk1f7n/ |
| upvotes | Number of upvotes (score) | 42 |
| content_type | Content type label | text |
| parent_id | Parent comment ID without prefix (null for top-level) | lhk1f7n |
| author_avatar | Author avatar URL (if available; empty string otherwise) | |
| userUrl | Link to the user’s Reddit profile (empty if deleted) | https://www.reddit.com/user/AutoModerator/ |
| contentText | The comment text content, line breaks normalized | This is a comment… |
| created_time | Created timestamp placeholder (empty string if unavailable) | |
| replies | Array of nested reply objects (same schema), trimmed by replyLimit | [ … ] |
Notes:
- Each dataset item represents one comment and includes a nested replies array (which can be limited by replyLimit). All discovered comments are also emitted as individual flat records.
- You can export reddit comments to CSV or JSON directly from the Apify dataset.
- The actor also stores a grouped “by URL” structure in the key-value store under the OUTPUT key for convenience.
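Because each dataset item nests its replies, a small post-processing step can flatten a thread into one row per comment (useful for CSV or dataframe work). A minimal sketch in Python — the `flatten` helper and the sample record are illustrative, not part of the actor; field names follow the table above:

```python
def flatten(comment):
    """Yield a comment and all of its nested replies as flat dicts (replies stripped)."""
    yield {k: v for k, v in comment.items() if k != "replies"}
    for reply in comment.get("replies", []):
        yield from flatten(reply)

# Sample record shaped like the dataset schema above (values are made up).
thread = {
    "comment_id": "lhk1f7n",
    "author": "AutoModerator",
    "upvotes": 1,
    "replies": [
        {"comment_id": "aaa111", "author": "someuser", "upvotes": 3, "replies": []},
    ],
}

rows = list(flatten(thread))
print([r["comment_id"] for r in rows])  # ['lhk1f7n', 'aaa111']
```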
Key features
- ⚡️ Robust proxy fallback: automatically tries a direct connection, then a datacenter proxy, then a residential proxy with retries to keep your reddit comments crawler running even under blocks.
- 🧵 Nested conversation structure: captures parent/child relationships with a nested replies array per comment. Control how many replies are stored via replyLimit while still collecting all comments in the flat output.
- 📦 Bulk URL processing: process multiple Reddit post URLs in one run to build a larger reddit comment dataset efficiently.
- 💾 Clean, structured output: pushes consistent JSON records to the Apify dataset with author, score, permalinks, parent IDs, and more, perfect for analysis, NLP, and reporting.
- 🚫 No login or cookies required: works against public Reddit JSON endpoints; no authentication needed for scraping public threads.
- 🔁 Production-ready reliability: async HTTP requests, progress logging (e.g., “Collected N comments so far”), and defensive deduplication ensure dependable runs at scale.
- 🧰 Developer-friendly: built as a Python reddit comment scraper on Apify. Access results via the Apify API and integrate into pipelines for automated reddit comment downloader workflows.
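The proxy fallback described above can be pictured as a tiered retry loop: direct first, then datacenter, then residential. A simplified sketch, not the actor’s actual code — the tier names, `fetch_fn` hook, and retry count are assumptions for illustration:

```python
PROXY_TIERS = [None, "datacenter", "residential"]  # direct connection first

def fetch_with_fallback(url, fetch_fn, retries_per_tier=2):
    """Try each proxy tier in order, retrying a few times per tier before escalating."""
    last_error = None
    for tier in PROXY_TIERS:
        for _ in range(retries_per_tier):
            try:
                return fetch_fn(url, proxy=tier)
            except Exception as exc:  # e.g. a block or timeout
                last_error = exc
    raise last_error

# Demo with a fake fetcher that only succeeds on the residential tier.
def fake_fetch(url, proxy=None):
    if proxy != "residential":
        raise ConnectionError(f"blocked on tier {proxy}")
    return {"status": 200, "proxy": proxy}

result = fetch_with_fallback("https://www.reddit.com/r/ChatGPT/.json", fake_fetch)
print(result)  # {'status': 200, 'proxy': 'residential'}
```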
How to use Reddit Comment Scraper - step by step
- Create or log in to your Apify account.
- Open the “reddit-comment-scraper” actor in the Apify Console.
- Add your Reddit post URLs in startUrls (e.g., https://www.reddit.com/r/subreddit/comments/post_id/title/). The input accepts a string list; each item can be a full post URL.
- Configure limits:
- Set maxComments (1–10,000) to control how many comments to collect per URL.
- Set replyLimit (0–100) to control how many nested replies are stored per comment (0 means unlimited).
- (Optional) Configure proxyConfiguration. By default, no proxy is used. If Reddit rejects requests, the actor will automatically fall back to datacenter and then residential proxies with retries.
- Click Run. Watch progress logs as comments are collected and expanded via Reddit’s morechildren endpoint.
- Download results. Go to the Dataset tab to export JSON or CSV. A grouped-by-URL JSON is also saved under the Key-Value Store as OUTPUT.
Pro tip: Use the Apify API to pull dataset items programmatically and feed them into your analytics or enrichment workflow.
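For example, with the official apify-client Python package a run-and-fetch helper might look like this. This is a sketch, not guaranteed code: the actor ID string and input fields are assumptions — copy the exact identifiers from the actor’s API tab in the Apify Console.

```python
def fetch_comments(api_token, run_input):
    """Run the actor and return its dataset items (requires the apify-client package)."""
    from apify_client import ApifyClient  # imported here so the sketch stays self-contained

    client = ApifyClient(api_token)
    # Actor ID is illustrative; use the real one from the actor's API tab.
    run = client.actor("scrapemesh/reddit-comment-scraper").call(run_input=run_input)
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())

# Example usage (needs a valid Apify API token and network access):
# items = fetch_comments("YOUR_APIFY_TOKEN", {
#     "startUrls": ["https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/"],
#     "maxComments": 100,
# })
```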
Use cases
| Use case | Description |
|---|---|
| Market research + topic analysis | Aggregate and analyze discussion threads to quantify sentiment and themes across public posts. |
| Brand monitoring + community insights | Track brand mentions and extract replies to understand user feedback within specific threads. |
| Content research + editorial | Compile user perspectives from targeted discussions to inform articles and summaries. |
| Data science + NLP training | Build a structured reddit comment dataset with parent/child context for modeling and classification. |
| Academic research + social analysis | Study public discourse patterns using thread-level structures and upvote signals. |
| Developer pipelines (API) | Use the Apify API to automate scrape reddit comments workflows and feed data into ETL/ELT pipelines. |
Why choose Reddit Comment Scraper?
This actor prioritizes precision, automation, and reliability over brittle browser extensions or ad-hoc scripts.
- ✅ Accurate, structured fields for every comment, including parent/child links
- 🌐 No login required — collects publicly available thread data
- 📈 Scales across many URLs with async requests and intelligent deduplication
- 🧪 Developer-first design — Python-based, API-friendly, automation-ready
- 🛡️ Resilient proxy fallback (direct → datacenter → residential) to reduce blocks
- 💾 Easy exports (CSV/JSON) and grouped output for downstream processing
- 🧭 Better than unstable alternatives — production-ready infrastructure on Apify
In short, it’s a reliable reddit API comments scraper that turns threads into analytics-ready data, fast.
Is it legal / ethical to use Reddit Comment Scraper?
Yes — when done responsibly. This actor scrapes only publicly available Reddit content and does not access private or password-protected data.
Guidelines for compliant use:
- Collect only public data and respect Reddit’s Terms of Service.
- Avoid scraping private communities or content behind authentication.
- Ensure your use complies with data protection laws (e.g., GDPR, CCPA).
- Use the data responsibly (e.g., analysis, research) and avoid spam or misuse.
- Consult your legal team if you have edge cases or questions.
Input parameters & output format
Example JSON input
```json
{
  "startUrls": ["https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/"],
  "maxComments": 1000,
  "replyLimit": 0,
  "proxyConfiguration": { "useApifyProxy": false }
}
```
Input fields
- startUrls (array, required): List one or more Reddit post URLs (e.g., https://www.reddit.com/r/subreddit/comments/post_id/title/).
- Default: none
- maxComments (integer, optional): Maximum number of comments to fetch per URL.
- Range: 1–10,000
- Default: 1000
- replyLimit (integer, optional): Maximum number of replies to store per comment in the nested replies field. Set to 0 for unlimited. (All replies are still collected in the flattened output.)
- Range: 0–100
- Default: 0
- proxyConfiguration (object, optional): Choose which proxies to use. By default, no proxy is used. If Reddit rejects or blocks the request, it will fall back to datacenter proxy, then residential proxy with retries.
- Default: { "useApifyProxy": false }
Example dataset record (single comment)
```json
{
  "url": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/",
  "comment_id": "lhk1f7n",
  "post_id": "t3_1epeshq",
  "author": "AutoModerator",
  "permalink": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/lhk1f7n/",
  "upvotes": 1,
  "content_type": "text",
  "parent_id": null,
  "author_avatar": "",
  "userUrl": "https://www.reddit.com/user/AutoModerator/",
  "contentText": "Comment text here...",
  "created_time": "",
  "replies": []
}
```
Grouped output saved to Key-Value Store (key: OUTPUT)
```json
{
  "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/": [
    {
      "comment_id": "lhk1f7n",
      "post_id": "t3_1epeshq",
      "author": "AutoModerator",
      "permalink": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/lhk1f7n/",
      "upvotes": 1,
      "content_type": "text",
      "parent_id": "1epeshq",
      "author_avatar": "",
      "userUrl": "https://www.reddit.com/user/AutoModerator/",
      "contentText": "Comment text here...",
      "created_time": "",
      "replies": []
    }
  ]
}
```
Notes:
- created_time and author_avatar may be empty strings when not present in Reddit’s JSON.
- parent_id is null for top-level comments and contains the normalized parent ID for replies (prefixes like t1_/t3_ are stripped).
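If you mix the actor’s output with raw Reddit “thing” IDs downstream, the prefix stripping described above is easy to reproduce. A small sketch — `normalize_parent_id` is a hypothetical helper mirroring the described behavior, not the actor’s code:

```python
def normalize_parent_id(raw_parent):
    """Strip Reddit 'thing' prefixes like t1_ (comment) / t3_ (post) from a parent ID."""
    if not raw_parent:
        return None
    for prefix in ("t1_", "t3_"):
        if raw_parent.startswith(prefix):
            return raw_parent[len(prefix):]
    return raw_parent

print(normalize_parent_id("t1_lhk1f7n"))  # lhk1f7n
print(normalize_parent_id("t3_1epeshq"))  # 1epeshq
print(normalize_parent_id(None))          # None
```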
FAQ
Is there a free trial?
Yes. The actor offers trial minutes on Apify so you can test before subscribing. You’ll see current trial availability and pricing on the actor’s Apify Store page.
Do I need to log in or provide cookies?
No. The scraper works with public Reddit JSON endpoints and does not require authentication. It fetches publicly available comments only.
How many comments can I scrape per URL?
You can set maxComments from 1 to 10,000 per URL. The actor will collect up to this limit, expanding “more” placeholders via the Reddit API.
Can it scrape nested replies?
Yes. Nested replies are traversed and included. The replyLimit parameter controls how many replies are stored in the replies array per comment (0 means unlimited). All discovered replies are still emitted as individual flat records.
What happens if Reddit blocks the requests?
The actor automatically falls back from a direct connection to a datacenter proxy and then to a residential proxy with retries. This increases resilience during scraping.
Can I export results to CSV?
Yes. All comments are stored in the Apify dataset, which supports exports to JSON, CSV, and more. You can also access records via the Apify API to build pipelines.
Does it work for private subreddits or deleted comments?
No. It collects only publicly accessible content. Deleted or removed comments will appear as “[deleted]” where applicable.
Can I integrate this with Python or APIs?
Yes. This is a Python-based actor on Apify. You can pull dataset items via the Apify API or integrate into your own python reddit comment scraper workflows and automation stacks.
Closing CTA / Final thoughts
The Reddit Comment Scraper is built for teams that need accurate, scalable extraction of Reddit thread comments. It delivers structured records with authors, scores, permalinks, and nested replies — ideal for market research, analytics, and NLP.
Marketers, developers, data analysts, and researchers can export reddit comments to CSV/JSON, automate runs via the Apify API, and build a reliable reddit comments crawler into their pipelines. Start turning public Reddit discussions into actionable datasets — quickly, safely, and at scale.
