Pricing

from $2.00 / 1,000 scraped posts

Reddit Post & Comment Scraper

Scrape unlimited comments from any post with 99% accuracy (the highest on the Apify Store). Input any Reddit post URL and get complete, rich JSON data, including deeply nested comment threads, scores, author details, and awards. The comment tree is already built for you.

Developer

Ion Belei

Last modified: 3 months ago

Reddit Post & Comment Scraper — Scrape Reddit with 99% Accuracy

Scrape unlimited comments from any post with the highest comment accuracy on the Apify Store. Input any Reddit post URL and get complete, rich JSON data, including deeply nested comment threads, scores, author details, and awards.

Unlike other Reddit scrapers, this Actor does not use a browser, which means lower compute costs and faster extraction. You get the full data Reddit has on a post, not just what's visible on the web page, with comments arranged in a tree just as they are on Reddit.

Why This Reddit Scraper?

Feature	This Actor	Most Actors
Comment scraping accuracy	99% (100% for posts under 500 comments)	40-80% (miss nested replies)
Data depth	Full Reddit JSON (80+ fields per post)	Surface-level web scrape
Browser required	No (lightweight HTTP)	Yes (Playwright/Puppeteer)
Compute cost	Low	3-5x higher
Comment tree	Prebuilt	Not provided; you have to build it yourself

What Data Can You Extract from Reddit?

Each scraped post includes 80+ data fields, far more than what you see on the Reddit website.

Post Data

Content: title, selftext, URL, domain, permalink
Metrics: score, upvotes, downvotes, upvote_ratio, num_comments, num_crossposts
Author: username, author_fullname, author_premium, author_flair
Subreddit: subreddit name, subscriber count, subreddit type
Metadata: created_utc, edited, archived, locked, spoiler, over_18 (NSFW)
Awards: all_awardings, total_awards_received, gilded
Media: media, media_embed, is_video, thumbnail

Comment Data (nested with full reply threads)

Content: body, body_html
Metrics: score, ups, downs, controversiality
Author: author, author_fullname, author_premium, author_flair
Structure: parent_id, depth, link_id, is_submitter
Metadata: created_utc, edited, stickied, distinguished, collapsed
Nested replies: Full recursive reply threads preserved in tree structure
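
Because replies are nested recursively, most processing starts with a walk over the tree. The following is a minimal sketch, assuming each comment is a dict with a `replies` list as described above; the helper name `walk_comments` and the sample data are illustrative and not part of the Actor.

```python
from typing import Any, Dict, Iterator, List

Comment = Dict[str, Any]


def walk_comments(comments: List[Comment]) -> Iterator[Comment]:
    """Yield every comment in the tree, each parent before its replies."""
    # An explicit stack avoids Python's recursion limit on very deep threads.
    stack = list(reversed(comments))
    while stack:
        comment = stack.pop()
        yield comment
        # `replies` is assumed to be a list; treat a missing or null value as empty.
        stack.extend(reversed(comment.get("replies") or []))


# Hypothetical sample shaped like the comment tree described above.
sample = [
    {"author": "a", "depth": 0, "replies": [
        {"author": "b", "depth": 1, "replies": [
            {"author": "c", "depth": 2, "replies": []},
        ]},
        {"author": "d", "depth": 1, "replies": []},
    ]},
    {"author": "e", "depth": 0, "replies": []},
]

authors = [c["author"] for c in walk_comments(sample)]
max_depth = max(c["depth"] for c in walk_comments(sample))
print(authors, max_depth)
```

The generator keeps memory use low on large threads, since comments are yielded one at a time rather than collected into a second list.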

Use Cases of the Reddit Scraper

Sentiment analysis — Extract thousands of comments with scores and metadata to analyze public opinion on any topic, product, or brand.
AI/ML training data — Build training datasets from Reddit's rich comment threads for NLP models, chatbots, and language research.
Brand monitoring — Track what people say about your brand or product by scraping comment threads from relevant subreddits.
Market research — Analyze discussions in niche subreddits to understand customer pain points, feature requests, and competitor perception.
Content research — Find high-engagement discussions and trending topics to fuel your content strategy.
Academic research — Collect structured Reddit data for social science, linguistics, and behavioral research.

How to Scrape Reddit Posts and Comments

Step 1: Prepare Your Input

Add Reddit post URLs to the input. You can scrape one post or many at once.

{
  "urls": [
    {
      "url": "https://www.reddit.com/r/AskReddit/comments/1rcxhjq/people_40_what_actually_mattered_in_the_long_run/"
    },
    {
      "url": "https://www.reddit.com/r/technology/comments/example_post/"
    }
  ],
  "sort_type": "top",
  "max_comments": null,
  "lite_mode": false
}

Step 2: Configure Options

Parameter	Type	Default	Description
`urls`	Array	Required	List of Reddit post URLs to scrape.
`sort_type`	String	`"top"`	How to sort comments: `top`, `best`, `new`, `controversial`, `old`, `qa`.
`max_comments`	Number	`null`	Maximum number of comments to extract per post. Set to `null` for all comments.
`lite_mode`	Boolean	`false`	If `true`, returns lightweight comment objects with only core visible fields.
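
If you build the input programmatically, it can help to validate it before starting a run. This is a sketch only: `build_run_input` is a hypothetical helper, and the rule that `max_comments` must be positive is an assumption rather than documented Actor behavior.

```python
from typing import Any, Dict, List, Optional

# Values taken from the `sort_type` row of the parameter table above.
ALLOWED_SORTS = {"top", "best", "new", "controversial", "old", "qa"}


def build_run_input(
    post_urls: List[str],
    sort_type: str = "top",
    max_comments: Optional[int] = None,
    lite_mode: bool = False,
) -> Dict[str, Any]:
    """Return an input dict in the documented shape, failing early on bad values."""
    if not post_urls:
        raise ValueError("urls is required and must not be empty")
    if sort_type not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort_type: {sort_type!r}")
    # Assumption: a limit below 1 is treated as a mistake, not as "no comments".
    if max_comments is not None and max_comments < 1:
        raise ValueError("max_comments must be a positive integer or None")
    return {
        "urls": [{"url": u} for u in post_urls],
        "sort_type": sort_type,
        "max_comments": max_comments,  # None serializes to JSON null (all comments)
        "lite_mode": lite_mode,
    }


run_input = build_run_input(
    ["https://www.reddit.com/r/technology/comments/example_post/"],
    sort_type="new",
    max_comments=200,
)
print(run_input)
```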

Step 3: Run and Export

Click Start to begin scraping. Once finished, go to the Dataset tab to view results. Export options:

JSON — Full structured data, ideal for programmatic use
CSV — Flattened data for spreadsheets (note: nested comments are best consumed as JSON)
Excel — Direct download for quick analysis
XML — For XML-based workflows
API — Access results programmatically via the Apify REST API
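
Since nested comments do not map cleanly onto CSV columns, one option is to download JSON and flatten the tree yourself, one row per comment. The sketch below assumes the field names from the output schema; `flatten_post`, `to_csv`, and the column selection are illustrative choices, not something the Actor provides.

```python
import csv
import io
from typing import Any, Dict, Iterable, Iterator

COLUMNS = ["post_id", "id", "parent_id", "depth", "author", "score", "body"]


def flatten_post(post: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per comment, parents before replies."""
    stack = list(reversed(post.get("comments") or []))
    while stack:
        comment = stack.pop()
        row = {col: comment.get(col) for col in COLUMNS}
        row["post_id"] = post.get("id")  # link each row back to its post
        yield row
        stack.extend(reversed(comment.get("replies") or []))


def to_csv(posts: Iterable[Dict[str, Any]]) -> str:
    """Serialize all comments of all posts into a single CSV string."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for post in posts:
        writer.writerows(flatten_post(post))
    return buf.getvalue()


# Hypothetical sample shaped like one dataset item.
posts = [{
    "id": "p1",
    "comments": [{
        "id": "c1", "parent_id": "t3_p1", "depth": 0,
        "author": "a", "score": 5, "body": "hi, there",
        "replies": [{
            "id": "c2", "parent_id": "t1_c1", "depth": 1,
            "author": "b", "score": 2, "body": "reply", "replies": [],
        }],
    }],
}]

csv_text = to_csv(posts)
print(csv_text)
```

The `parent_id` and `depth` columns preserve enough structure to rebuild the tree later if needed.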

Output Schema

Each dataset item represents one scraped Reddit post with its complete comment tree.

Default mode (lite_mode: false): returns full Reddit JSON fields for post + comments.
Lightweight mode (lite_mode: true): keeps full post fields but returns reduced comment objects.

Additional fields may appear depending on the post type.

Post Fields

Field	Type	Description
`title`	String	The title of the Reddit post
`author`	String	Username of the post author
`author_fullname`	String	Reddit's internal fullname (e.g., `t2_abc123`)
`author_premium`	Boolean	Whether the author has Reddit Premium
`subreddit`	String	Subreddit name (without `r/` prefix)
`subreddit_id`	String	Reddit's internal ID for the subreddit
`subreddit_subscribers`	Integer	Number of subscribers in the subreddit
`id`	String	The post's unique ID
`name`	String	The post's fullname (e.g., `t3_abc123`)
`selftext`	String	Post text content (empty string for link posts)
`selftext_html`	String/Null	HTML-rendered version of the post text
`score`	Integer	Net score (upvotes minus downvotes)
`ups`	Integer	Number of upvotes
`upvote_ratio`	Number	Ratio of upvotes to total votes (0.0–1.0)
`num_comments`	Integer	Total comment count (as reported by Reddit)
`created_utc`	Number	Unix timestamp of creation
`edited`	Boolean/Number	`false` if not edited, or Unix timestamp of last edit
`permalink`	String	Relative URL path to the post
`url`	String	The original post URL or the submitted link URL
`domain`	String	Domain of the URL (e.g., `self.AskReddit`, `i.imgur.com`)
`over_18`	Boolean	Whether the post is NSFW
`spoiler`	Boolean	Whether the post is a spoiler
`locked`	Boolean	Whether the post is locked
`archived`	Boolean	Whether the post is archived
`stickied`	Boolean	Whether the post is pinned
`is_video`	Boolean	Whether the post contains a Reddit-hosted video
`thumbnail`	String	Thumbnail URL or keyword (`self`, `default`, `nsfw`)
`total_awards_received`	Integer	Total number of awards received
`all_awardings`	Array	Detailed list of all awards
`gilded`	Integer	Number of times gilded
`link_flair_text`	String/Null	Post flair tag text
`num_crossposts`	Integer	Number of crossposts
`comments`	Array	Full nested comment tree (see below)
`success`	Boolean	Whether the post was successfully scraped
`error`	String/Null	Error message if scraping failed, `null` otherwise
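
The `success` and `error` fields make it straightforward to separate usable items from failed ones before further processing. A minimal sketch follows; `split_results` is a hypothetical helper, and which fields a failed item carries besides `success` and `error` is not documented here, so the sample keeps failed items minimal.

```python
from typing import Any, Dict, Iterable, List, Tuple

Item = Dict[str, Any]


def split_results(items: Iterable[Item]) -> Tuple[List[Item], List[Item]]:
    """Partition dataset items into (succeeded, failed) using the `success` flag."""
    succeeded: List[Item] = []
    failed: List[Item] = []
    for item in items:
        # A missing `success` flag is treated as a failure, to be safe.
        (succeeded if item.get("success") else failed).append(item)
    return succeeded, failed


# Hypothetical dataset items; the error text is made up for illustration.
items = [
    {"title": "Post A", "comments": [], "success": True, "error": None},
    {"success": False, "error": "example error message"},
]

ok, bad = split_results(items)
for item in bad:
    print("failed:", item.get("error"))
print(len(ok), "post(s) ready for processing")
```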

Conditional Post Fields

These fields appear depending on the post type:

Field	Type	When present
`media`	Object/Null	Video/embed posts — structure varies by media type (Reddit video, YouTube embed, etc.)
`media_embed`	Object	Posts with embedded media
`preview`	Object	Image/link posts — contains `images` array with `source` and `resolutions`
`post_hint`	String	Certain post types: `image`, `link`, `hosted:video`, `rich:video`, `self`
`url_overridden_by_dest`	String	Link posts — the destination URL
`crosspost_parent_list`	Array	Crossposted content — includes the original post data
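
Because these fields are only present for some post types, read them defensively. The sketch below pulls the source image URL out of `preview`. It assumes that `source` is an object with a `url` key, and that the URL may arrive HTML-escaped (`&amp;`), as raw Reddit JSON often is; neither detail is confirmed for this Actor's output. `preview_image_url` is a hypothetical helper name.

```python
import html
from typing import Any, Dict, Optional


def preview_image_url(post: Dict[str, Any]) -> Optional[str]:
    """Return the full-size preview image URL, or None if the post has no preview."""
    images = (post.get("preview") or {}).get("images") or []
    if not images:
        return None
    # Assumption: `source` holds a `url` key for the original-size image.
    url = (images[0].get("source") or {}).get("url")
    # Unescaping is harmless when the URL is already clean.
    return html.unescape(url) if url else None


# Hypothetical posts: one image post and one text post without `preview`.
image_post = {
    "post_hint": "image",
    "preview": {"images": [{
        "source": {"url": "https://preview.redd.it/x.jpg?width=640&amp;s=abc"},
        "resolutions": [],
    }]},
}
text_post = {"post_hint": "self", "selftext": "hello"}

print(preview_image_url(image_post))
print(preview_image_url(text_post))
```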

Comment Fields (Default Mode)

Each comment in the comments array contains these fields, plus a replies array with the same structure (recursively for the full thread):

Field	Type	Description
`id`	String	The comment's unique ID
`author`	String	Username of the comment author
`author_fullname`	String	Reddit's internal fullname for the author
`body`	String	Comment text in Markdown
`body_html`	String	HTML-rendered comment body
`score`	Integer	Net score
`ups`	Integer	Number of upvotes
`downs`	Integer	Number of downvotes (usually 0 due to vote fuzzing)
`created_utc`	Number	Unix timestamp of creation
`edited`	Boolean/Number	`false` if not edited, or Unix timestamp of last edit
`parent_id`	String	Parent fullname (`t3_postid` for top-level, `t1_commentid` for replies)
`link_id`	String	Parent post fullname (`t3_postid`)
`depth`	Integer	Nesting depth (0 = top-level)
`is_submitter`	Boolean	Whether the commenter is the OP
`permalink`	String	Relative URL path to the comment
`controversiality`	Integer	Whether the comment is controversial (0 or 1)
`distinguished`	String/Null	`moderator` or `admin` if distinguished, `null` otherwise
`stickied`	Boolean	Whether the comment is pinned
`collapsed`	Boolean	Whether the comment is collapsed by default
`author_premium`	Boolean	Whether the author has Reddit Premium
`subreddit`	String	Subreddit name
`total_awards_received`	Integer	Total awards on this comment
`all_awardings`	Array	Detailed list of all awards
`replies`	Array	Nested reply comments (same structure, recursively)
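
A few of these fields need light normalization before analysis: `created_utc` is a Unix timestamp, `edited` mixes booleans and numbers, and `parent_id` encodes the parent type in its prefix. The helpers below are a sketch with hypothetical names, based only on the types listed in the table.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional, Union


def created_at(comment: Dict[str, Any]) -> datetime:
    """Convert `created_utc` to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(comment["created_utc"], tz=timezone.utc)


def edited_at(edited: Union[bool, int, float, None]) -> Optional[datetime]:
    """Return the edit time, or None when no edit timestamp is available."""
    # bool is a subclass of int in Python, so rule out True/False first.
    if edited is None or isinstance(edited, bool):
        return None
    return datetime.fromtimestamp(edited, tz=timezone.utc)


def parent_kind(parent_id: str) -> str:
    """Classify the parent by fullname prefix: t3_ is a post, t1_ is a comment."""
    if parent_id.startswith("t3_"):
        return "post"
    if parent_id.startswith("t1_"):
        return "comment"
    return "unknown"


# Values borrowed from the output example in this document.
comment = {"created_utc": 1771938473, "edited": False, "parent_id": "t1_o74gb87"}

print(created_at(comment).isoformat())
print(edited_at(comment["edited"]))
print(parent_kind(comment["parent_id"]))
```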

Comment Fields (Lightweight Mode)

When lite_mode is enabled, each comment (including nested replies) contains only:

Field	Type	Description
`id`	String	The comment's unique ID
`parent_id`	String	Parent fullname (`t3_postid` for top-level, `t1_commentid` for replies)
`author`	String	Username of the comment author
`body`	String	Comment text in Markdown
`created_utc`	Number	Unix timestamp of creation
`ups`	Integer	Number of upvotes
`replies`	Array	Nested reply comments in the same lightweight structure
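
If you already have data from a default-mode run, the same reduced shape can be produced locally. This sketch projects each comment onto the seven lightweight fields listed above; `to_lite` is a hypothetical helper and is not how the Actor implements `lite_mode`.

```python
from typing import Any, Dict

# The non-recursive fields from the lightweight-mode table.
LITE_FIELDS = ("id", "parent_id", "author", "body", "created_utc", "ups")


def to_lite(comment: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the comment with only lightweight fields, recursively."""
    lite = {key: comment.get(key) for key in LITE_FIELDS}
    # Plain recursion is used for brevity; extremely deep threads could hit
    # Python's recursion limit.
    lite["replies"] = [to_lite(reply) for reply in comment.get("replies") or []]
    return lite


# Hypothetical default-mode comment with extra fields.
full_comment = {
    "id": "c1", "parent_id": "t3_p1", "author": "a", "body": "hello",
    "created_utc": 1771938473, "ups": 3, "score": 3, "depth": 0,
    "replies": [{
        "id": "c2", "parent_id": "t1_c1", "author": "b", "body": "hi",
        "created_utc": 1771938500, "ups": 1, "score": 1, "depth": 1,
        "replies": [],
    }],
}

lite_comment = to_lite(full_comment)
print(sorted(lite_comment))
```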

Output Example

{
  "title": "People 40+, what actually mattered in the long run and what didn't?",
  "author": "Psychological_Sky_58",
  "subreddit": "AskReddit",
  "score": 9965,
  "upvote_ratio": 0.95,
  "num_comments": 4987,
  "created_utc": 1771888793,
  "url": "https://www.reddit.com/r/AskReddit/comments/1rcxhjq/...",
  "selftext": "",
  "permalink": "/r/AskReddit/comments/1rcxhjq/people_40_what_actually_mattered_in_the_long_run/",
  "over_18": false,
  "is_video": false,
  "comments": [
    {
      "author": "Right-Breakfast444",
      "body": "Jesus Christ...that looks a little bit like Jason Bourne!",
      "score": 63,
      "depth": 6,
      "created_utc": 1771938473,
      "parent_id": "t1_o74gb87",
      "permalink": "/r/AskReddit/comments/1rcxhjq/.../o74ohw9/",
      "replies": [
        {
          "author": "Alarming-Research-42",
          "body": "lol.\nWe went from the Bourne Supremacy to Bourne Approximation.",
          "score": 7,
          "depth": 7,
          "replies": []
        }
      ]
    }
  ],
  "success": true,
  "error": null
}

Note: The example above shows key fields for readability. The actual output contains 80+ fields per post and per comment, as documented in the tables above.
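
To check how complete a scrape is, you can compare the number of comments in the tree with the post's `num_comments`. This is a rough sketch, not the method behind the accuracy figures in this document: `num_comments` is the count reported by Reddit and may include comments that are no longer retrievable, so the ratio is only an approximation. The helper names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def count_comments(comments: List[Dict[str, Any]]) -> int:
    """Count every comment in the tree, including nested replies."""
    total = 0
    stack = list(comments)
    while stack:
        comment = stack.pop()
        total += 1
        stack.extend(comment.get("replies") or [])
    return total


def coverage(post: Dict[str, Any]) -> Optional[float]:
    """Return extracted / reported comments, or None if Reddit reports zero."""
    reported = post.get("num_comments") or 0
    if reported == 0:
        return None
    return count_comments(post.get("comments") or []) / reported


# Hypothetical post: Reddit reports 4 comments, the tree contains 3.
post = {
    "num_comments": 4,
    "comments": [
        {"id": "c1", "replies": [
            {"id": "c2", "replies": []},
            {"id": "c3", "replies": []},
        ]},
    ],
}

ratio = coverage(post)
print(f"coverage: {ratio:.0%}")
```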

Roadmap

Planned updates for this Actor:

100% comment accuracy — Capture every comment, with none missed.
Subreddit scraping — Input a subreddit URL to get all posts
User profile scraping — Extract post and comment history from Reddit user profiles
Speed improvements — Batch processing for faster runs

Integrations

JavaScript / TypeScript

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor("YOUR_ACTOR_ID").call({
  urls: [
    { url: "https://www.reddit.com/r/AskReddit/comments/1rcxhjq/people_40_what_actually_mattered_in_the_long_run/" }
  ],
  sort_type: "top",
  max_comments: null
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(post => {
  console.log(`${post.title} — ${post.num_comments} comments, score: ${post.score}`);
  // comments.length counts only top-level comments; replies are nested under each one.
  console.log(`Top-level comments extracted: ${post.comments.length}`);
});

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("YOUR_ACTOR_ID").call(run_input={
    "urls": [
        {"url": "https://www.reddit.com/r/AskReddit/comments/1rcxhjq/people_40_what_actually_mattered_in_the_long_run/"}
    ],
    "sort_type": "top",
    "max_comments": None
})

for post in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{post['title']} — {post['num_comments']} comments")
    for comment in post.get("comments", []):
        print(f"  [{comment['score']}] {comment['author']}: {comment['body'][:80]}...")

FAQs for Reddit Scraper

How is this different from other Reddit scrapers?

Three key differences:

Comment accuracy — Most scrapers miss deeply nested replies. This Actor follows the full comment tree.
Rich data — You get 80+ fields per post/comment (the full data Reddit stores), not just what's visible on the webpage.
No browser — Runs without Playwright or Puppeteer, which means lower costs and faster runs.

Does it scrape entire subreddits?

Currently, this Actor scrapes individual Reddit posts with their full comment threads. Subreddit-level scraping (listing all posts from a subreddit) is on the roadmap and coming soon.

Can it scrape user profiles?

User profile scraping is planned for a future update. Currently, the Actor extracts author information (username, flair, premium status) from within the posts and comments it scrapes.

Do I need Reddit API keys?

No. This Actor does not require any Reddit API keys, OAuth tokens, or authentication. It works out of the box.

Can I export Reddit data to CSV or Excel?

Yes. Export directly from the Apify console as JSON, CSV, Excel, or XML.

What does the sort_type parameter do?

It controls how comments are sorted before extraction, matching Reddit's own sort options:

top — Highest scored comments first
best — Reddit's "best" ranking (a confidence-based sort that weighs the upvote ratio against the number of votes)
new — Most recent comments first
controversial — Most debated comments first
old — Oldest comments first
qa — Q&A format (OP replies prioritized)
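
Once a tree is downloaded, some orderings can also be reproduced locally without a new run. The sketch below re-sorts every level of the tree for `top`, `new`, and `old`, which depend only on `score` and `created_utc`. The `best`, `controversial`, and `qa` orderings rely on Reddit's own ranking and are not reproduced here. `sort_tree` is a hypothetical helper.

```python
from typing import Any, Callable, Dict, List, Tuple

Comment = Dict[str, Any]

# sort_type -> (key function, descending?)
SORT_KEYS: Dict[str, Tuple[Callable[[Comment], Any], bool]] = {
    "top": (lambda c: c.get("score", 0), True),
    "new": (lambda c: c.get("created_utc", 0), True),
    "old": (lambda c: c.get("created_utc", 0), False),
}


def sort_tree(comments: List[Comment], sort_type: str = "top") -> List[Comment]:
    """Return a re-sorted copy of the tree; the input is left unchanged."""
    key, descending = SORT_KEYS[sort_type]
    ordered = sorted(comments, key=key, reverse=descending)
    return [
        {**c, "replies": sort_tree(c.get("replies") or [], sort_type)}
        for c in ordered
    ]


# Hypothetical tree with two top-level comments.
sample = [
    {"id": "a", "score": 1, "created_utc": 30, "replies": [
        {"id": "a1", "score": 2, "created_utc": 40, "replies": []},
        {"id": "a2", "score": 9, "created_utc": 50, "replies": []},
    ]},
    {"id": "b", "score": 5, "created_utc": 10, "replies": []},
]

top_sorted = sort_tree(sample, "top")
print([c["id"] for c in top_sorted])
print([r["id"] for r in top_sorted[1]["replies"]])
```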

Is Reddit scraping legal?

This Actor accesses only publicly available Reddit data. Users are responsible for complying with applicable laws, Reddit's Terms of Service, and their own jurisdiction's data regulations.

Legal and Compliance

This Actor extracts only publicly available data from Reddit. Users are responsible for:

Complying with Reddit's Terms of Service and API Terms
Following applicable data protection laws (GDPR, CCPA, etc.)
Using extracted data ethically and in accordance with their jurisdiction's regulations
Not using the data to harass, dox, or target individual Reddit users

This Actor does not bypass authentication, access private subreddits, or extract data that requires a logged-in session.
