Reddit Comment Scraper
Developer: Scrapium · Pricing: $19.99/month + usage

Scrape Reddit comments with ease 💬👽 Extract comment text, usernames, scores, timestamps, replies, and thread details from Reddit posts. Perfect for sentiment analysis, audience research, trend tracking, and community insights. Turn Reddit conversations into actionable data fast 🚀
Reddit Comment Scraper
Reddit Comment Scraper is a Python-based tool that extracts structured comment data from Reddit post threads — a fast, reliable Reddit comment extractor for marketers, developers, analysts, and researchers. It solves the tedious task of trying to scrape Reddit comments by turning sprawling discussions into a clean Reddit comments dataset you can analyze, export, and integrate. Built as a Reddit API comment scraper using public JSON endpoints, it helps you download Reddit comments at scale for sentiment analysis, trend tracking, and audience insights.
What data / output can you get?
Below are the exact fields this Reddit thread comment scraper collects and stores. Each record represents a single comment associated with a post URL.
| Data field | Description | Example value |
|---|---|---|
| url | The Reddit post URL the comment belongs to (dataset only) | https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/ |
| comment_id | Unique comment identifier | lhk1f7n |
| post_id | Post identifier (Reddit thing ID, prefixed with t3_) | t3_1epeshq |
| author | Comment author username; “[deleted]” if removed | AutoModerator |
| permalink | Direct link to the specific comment | https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/lhk1f7n/ |
| upvotes | Number of upvotes (score) on the comment | 1 |
| content_type | Type of content; set to “text” | text |
| parent_id | Parent ID (the post's t3_ ID for top-level comments, the parent comment's ID for replies); may be normalized without its prefix, or null | t3_1epeshq |
| author_avatar | Author avatar URL if available (empty string otherwise) | "" |
| userUrl | Link to user’s Reddit profile; empty if “[deleted]” | https://www.reddit.com/user/AutoModerator/ |
| contentText | The plain-text comment (newlines normalized) | Comment text here... |
| created_time | Created time if available (empty string otherwise) | "" |
| replies | Nested replies captured under each comment (array; see notes) | [] |
Notes:
- Results are available as a structured dataset (one row per comment) and as a grouped JSON (comments array per URL) saved to the Key-Value Store.
- You can export to JSON, CSV, or Excel directly from the Apify dataset.
- The “replies” array is stored per comment to reflect thread structure; all comments are also emitted as flattened records.
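If you work from the flattened dataset but want the grouped shape described above, the regrouping is straightforward. The sketch below is an illustrative helper (the `group_by_url` name is not part of the actor) that rebuilds the per-URL structure from flattened records:

```python
from collections import defaultdict

def group_by_url(records):
    """Group flattened comment records into the per-URL shape
    the actor stores in the Key-Value Store (one list per post URL)."""
    grouped = defaultdict(list)
    for rec in records:
        rec = dict(rec)            # copy so we don't mutate the input
        url = rec.pop("url", "")   # "url" is only present in dataset rows
        grouped[url].append(rec)
    return dict(grouped)
```

This mirrors the grouped JSON output, so either representation can be derived from the other.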
Key features
- 🔁 Automatic proxy fallback — Robust access strategy that tries direct connection first, then falls back to datacenter and finally residential proxies with retries for reliability.
- 📚 Bulk URL processing — Add multiple Reddit post URLs to scrape comments across many threads in a single run.
- 🧵 Nested replies support — Replies are traversed and emitted as individual records; the per-comment “replies” array is controlled by a configurable replyLimit.
- 🧱 Structured JSON output — Clean, ready-to-analyze fields including author, text, scores, permalinks, parent IDs, and nested replies.
- 📦 Dual output formats — Get individual comment records in the Dataset and grouped comments-per-URL in the Key-Value Store (under the OUTPUT key).
- 🐍 Python-based reliability — Built with aiohttp and the Apify SDK for stability, clear logging, and scalable runs.
- 🧹 Smart deduplication — Comment IDs are deduplicated defensively to ensure tidy datasets.
- 📤 Easy exporting — Download Reddit comments as JSON/CSV/Excel from Apify, or fetch programmatically via API to power your Reddit comment scraper tool workflows.
How to use Reddit Comment Scraper - step by step
- Create or log in to your Apify account.
- Open the Apify Console and navigate to Actors, then find “reddit-comment-scraper”.
- Paste one or more Reddit post URLs into the startUrls field (e.g., https://www.reddit.com/r/subreddit/comments/post_id/title/).
- Set maxComments to control how many comments you collect per URL (1–10,000).
- Set replyLimit to control how many replies are kept in each comment’s nested replies array (0 = unlimited).
- (Optional) Configure proxyConfiguration if you want to force proxy use. The actor will automatically attempt fallback if requests are blocked.
- Click Run and monitor real-time logs (you’ll see progress updates like “Collected X comments so far”).
- Access your results in the OUTPUT tab: download the Dataset as CSV/JSON/Excel or fetch the grouped JSON from the Key-Value Store.
Pro tip: Use the Apify API to trigger runs and stream results into your analysis stack or data pipeline — ideal for building a repeatable Reddit comment exporter.
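As a starting point for that kind of automation, here is a minimal sketch using only the Python standard library and Apify's public REST API (`run-sync-get-dataset-items` runs an actor and returns its dataset in one call). The actor ID `scrapium~reddit-comment-scraper` and the helper names are assumptions for illustration:

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"

def build_run_input(start_urls, max_comments=1000, reply_limit=0):
    """Assemble the actor input described in this README."""
    return {
        "startUrls": list(start_urls),
        "maxComments": max_comments,
        "replyLimit": reply_limit,
        "proxyConfiguration": {"useApifyProxy": False},
    }

def run_url(actor_id, token):
    """Endpoint that runs the actor and returns its dataset items.
    Actor IDs use a tilde in API paths, e.g. 'scrapium~reddit-comment-scraper'."""
    return f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?token={token}"

def fetch_comments(actor_id, token, start_urls, **kwargs):
    """Trigger a synchronous run and return the flattened comment records."""
    req = urllib.request.Request(
        run_url(actor_id, token),
        data=json.dumps(build_run_input(start_urls, **kwargs)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For long runs, prefer the asynchronous run endpoints or the official `apify-client` package, which handles polling and pagination for you.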
Use cases
| Use case | Description |
|---|---|
| Market research – discussion mining | Analyze community opinions and themes across threads to inform positioning, messaging, and product strategy. |
| Sentiment analysis – NLP-ready data | Collect contentText at scale to train or evaluate models and dashboards that gauge public sentiment. |
| Trend tracking – topic monitoring | Track engagement and comment patterns on specific posts to surface emerging topics and narratives. |
| Community monitoring – moderation intel | Export comment-level data for oversight, reporting, and community health insights. |
| Academic research – social datasets | Build reproducible Reddit comments dataset samples for qualitative/quantitative studies. |
| Data engineering – API pipeline | Orchestrate automated runs, export JSON/CSV, and feed downstream data warehouses and BI tools. |
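To make the sentiment-analysis use case concrete, here is a toy sketch that scores `contentText` fields with a keyword lexicon. The word lists and helper names are illustrative placeholders; in practice you would plug the records into a real NLP library:

```python
# Tiny illustrative lexicons -- replace with a proper sentiment model.
POSITIVE = {"great", "love", "amazing", "helpful"}
NEGATIVE = {"bad", "hate", "broken", "useless"}

def score_comment(text):
    """Naive keyword score: positive hits minus negative hits."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

def summarize(records):
    """Aggregate scores over a list of flattened comment records."""
    scores = [score_comment(r.get("contentText", "")) for r in records]
    return {"n": len(scores), "mean": sum(scores) / len(scores) if scores else 0.0}
```

Because every record carries `contentText`, `author`, and `upvotes`, the same loop can weight sentiment by score or group it by author.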
Why choose Reddit Comment Scraper?
- 🎯 Precision-first extraction focused on clean comment fields, parent/child relationships, and identifiers.
- ⚡ Built for scale: process multiple post URLs and collect up to 10,000 comments per URL.
- 🧪 Developer-friendly Python actor with structured outputs for easy ETL into analytics stacks.
- 🛡️ Public data only: designed to collect from publicly available Reddit content.
- 🌐 Reliable infrastructure with direct → datacenter → residential proxy fallback and retry logic.
- 💾 Dual outputs (Dataset + grouped JSON) make it a flexible Reddit comment exporter for varied workflows.
- 💸 Try before you buy with available trial minutes on Apify, then scale to production with a simple monthly plan.
Is it legal / ethical to use Reddit Comment Scraper?
Yes — when used responsibly. This actor is designed to collect publicly available Reddit content only and does not access private or authenticated data.
Guidelines for compliant use:
- Only scrape publicly accessible Reddit posts and comments.
- Do not attempt to access private communities or protected content.
- Respect Reddit’s terms and applicable laws (e.g., GDPR, CCPA) for your use case.
- Use the data for analysis, research, and insights — avoid spam or misuse.
- Consult your legal team for edge cases and jurisdiction-specific requirements.
Input parameters & output format
Example JSON input
{
  "startUrls": ["https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/"],
  "maxComments": 1000,
  "replyLimit": 0,
  "proxyConfiguration": { "useApifyProxy": false }
}
Input fields
- startUrls
- Type: array
- Required: Yes
- Default: —
- Description: List one or more Reddit post URLs (e.g., https://www.reddit.com/r/subreddit/comments/post_id/title/). Accepts an array of strings; the actor also handles objects with a url property.
- maxComments
- Type: integer
- Required: No
- Default: 1000
- Description: Maximum number of comments to fetch per URL. Minimum 1, maximum 10000.
- replyLimit
- Type: integer
- Required: No
- Default: 0
- Description: Maximum number of replies to store per comment in the nested replies field. Set to 0 for unlimited. All replies are still traversed and emitted as flattened output records.
- proxyConfiguration
- Type: object
- Required: No
- Default: { "useApifyProxy": false }
- Description: Choose which proxies to use. By default, no proxy is used. If Reddit rejects or blocks a request, the actor automatically falls back to a datacenter proxy and then a residential proxy, with retries.
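The constraints above are easy to check before launching a run. This is a hypothetical client-side helper (not part of the actor) that validates an input object against the documented limits:

```python
def validate_input(actor_input):
    """Check an input object against the constraints in this README:
    non-empty startUrls of Reddit post URLs, maxComments in 1..10000,
    replyLimit >= 0 (0 = unlimited)."""
    urls = actor_input.get("startUrls")
    if not urls:
        raise ValueError("startUrls is required and must be non-empty")
    for u in urls:
        # Accept plain strings or {"url": ...} objects, as the actor does
        url = u["url"] if isinstance(u, dict) else u
        if "/comments/" not in url:
            raise ValueError(f"not a Reddit post URL: {url}")
    max_comments = actor_input.get("maxComments", 1000)
    if not 1 <= max_comments <= 10000:
        raise ValueError("maxComments must be between 1 and 10000")
    if actor_input.get("replyLimit", 0) < 0:
        raise ValueError("replyLimit must be >= 0 (0 = unlimited)")
    return True
```

Validating locally gives you an immediate error message instead of a failed run.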
Example dataset item (one comment per row)
{
  "url": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/",
  "comment_id": "lhk1f7n",
  "post_id": "t3_1epeshq",
  "author": "AutoModerator",
  "permalink": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/lhk1f7n/",
  "upvotes": 1,
  "content_type": "text",
  "parent_id": "t3_1epeshq",
  "author_avatar": "",
  "userUrl": "https://www.reddit.com/user/AutoModerator/",
  "contentText": "Comment text here...",
  "created_time": "",
  "replies": []
}
Example grouped output (Key-Value Store, key: OUTPUT)
{
  "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/": [
    {
      "comment_id": "lhk1f7n",
      "post_id": "t3_1epeshq",
      "author": "AutoModerator",
      "permalink": "https://www.reddit.com/r/ChatGPT/comments/1epeshq/these_are_all_ai/lhk1f7n/",
      "upvotes": 1,
      "content_type": "text",
      "parent_id": "t3_1epeshq",
      "author_avatar": "",
      "userUrl": "https://www.reddit.com/user/AutoModerator/",
      "contentText": "Comment text here...",
      "created_time": "",
      "replies": []
    }
  ]
}
Notes on fields that may be empty:
- author may be “[deleted]” for removed accounts.
- userUrl is empty for “[deleted]” authors.
- created_time and author_avatar are empty strings when not available.
- parent_id is null for top-level comments.
- The replies array may be large; replyLimit controls how many are stored per comment, but all replies are still emitted as individual records in the dataset.
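Since every reply is also emitted as a flattened record with a `parent_id`, you can rebuild full thread trees yourself even with a small `replyLimit`. The sketch below is an illustrative helper (the `nest_comments` name is not part of the actor) that handles both prefixed and prefix-stripped parent IDs:

```python
def nest_comments(records):
    """Rebuild thread structure from flattened comment records.

    A record is treated as top-level when its parent_id is null/missing
    or points at the post (t3_ prefix); otherwise it is attached to the
    parent comment's "children" list.
    """
    by_id = {r["comment_id"]: dict(r, children=[]) for r in records}
    roots = []
    for rec in by_id.values():
        parent = rec.get("parent_id") or ""
        pid = parent.split("_", 1)[-1] if parent else ""
        if pid in by_id and not parent.startswith("t3_"):
            by_id[pid]["children"].append(rec)
        else:
            roots.append(rec)
    return roots
```

This lets you keep `replyLimit` small (cheaper, smaller records) while still recovering the full conversation shape from the dataset.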
FAQ
Is there a free tier or trial?
Yes. The actor offers trial minutes on Apify (e.g., 120 trial minutes) so you can test before subscribing. For ongoing use, a simple monthly plan is available.
Do I need to log in to Reddit or provide cookies?
No. The actor uses publicly available JSON endpoints and does not require Reddit login or cookies to scrape Reddit comments.
How many comments can I collect per URL?
You can set maxComments from 1 up to 10,000 per URL. The actor trims the output to your specified limit.
Does it capture nested replies?
Yes. All replies are traversed and emitted in the flattened output. The replyLimit parameter controls how many replies are stored per comment in the nested replies array.
What formats can I export?
You can export the Dataset to JSON, CSV, or Excel from the Apify Console, or access results programmatically via the Apify API. A grouped JSON is also saved in the Key-Value Store under the OUTPUT key.
Does this use the official Reddit API or PRAW?
No. It fetches public JSON endpoints on reddit.com directly (a Reddit API comment scraper approach without PRAW). It does not require OAuth.
What happens if Reddit blocks my requests?
The actor automatically falls back from direct connection to datacenter proxies and then to residential proxies with retries, maximizing the chance of success.
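In simplified pseudocode terms, that fallback chain looks like the sketch below. This is not the actor's actual implementation, just an illustration of the direct → datacenter → residential strategy with an injected `fetch` function:

```python
def fetch_with_fallback(fetch, url,
                        proxy_tiers=(None, "datacenter", "residential"),
                        retries=2):
    """Try each access tier in order, retrying within a tier before
    moving on. `fetch(url, proxy)` is caller-supplied; `None` means a
    direct connection with no proxy."""
    last_error = None
    for proxy in proxy_tiers:
        for _ in range(retries):
            try:
                return fetch(url, proxy)
            except Exception as exc:  # blocked or failed request
                last_error = exc
    raise RuntimeError(f"all access tiers failed for {url}") from last_error
```

Escalating to residential proxies only after cheaper tiers fail keeps typical runs inexpensive while still handling blocks.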
Can I scrape subreddit comments across multiple threads?
Provide the specific Reddit post URLs you want to process. You can list multiple post URLs to crawl many threads in one run.
Closing CTA / Final thoughts
Reddit Comment Scraper is built to turn Reddit conversations into clean, structured data for analysis. With automatic proxy fallback, bulk post URL processing, and dual-format outputs, it’s a dependable Reddit comment scraper Python tool for marketers, developers, analysts, and researchers. Export to CSV/JSON for dashboards or connect via the Apify API to automate your pipeline. Start extracting smarter Reddit insights and build your next Reddit comments dataset with confidence.
