Reddit Scraper — Posts & Comments

Pricing

from $1.00 / 1,000 posts

Scrape posts and comments from any subreddit — no Reddit API key, no login, no proxy. A fast, free Reddit API alternative for public data, exported to JSON, CSV or Excel.

Developer

James Taylor

Maintained by Community

This Reddit scraper pulls posts and their top-level comments from any subreddit — fast, login-free, no API key, no proxy. Pick your subreddits and sort order, and export clean, structured Reddit data to JSON, CSV, or Excel in one run.

It's the lightweight Reddit RSS scraper built for researchers, marketers, data teams, and builders who need Reddit posts and comments without accounts, OAuth, or rate-limit headaches.

Lightweight Reddit scraper vs. the Comment Tree Scraper

This actor reads Reddit's public RSS feeds. That's what makes it fast and cheap — but RSS only exposes text content (title, author, body, link, timestamp), not engagement counts or nested reply threads. So score and numComments come back null, and comments are a flat list of top-level replies.
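To make the RSS trade-off concrete, here is a minimal sketch of mapping one feed entry to a post record. It is not the actor's source code. It assumes the feed is Atom-formatted XML, that post IDs carry a `t3_` prefix, and that author names carry a `/u/` prefix; the sample feed is hand-written for illustration, and `parse_feed` and `strip_prefix` are hypothetical helpers. The point is that nothing in the entry can populate score or numComments, so they stay null.

```python
import re
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hand-written sample shaped like an Atom entry (assumed layout, not real Reddit output).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <author><name>/u/example_user</name></author>
    <content type="html">&lt;p&gt;We just launched and...&lt;/p&gt;</content>
    <id>t3_abc123</id>
    <link href="https://www.reddit.com/r/SaaS/comments/abc123/example_post/"/>
    <published>2026-06-02T09:12:00+00:00</published>
    <title>Example post</title>
  </entry>
</feed>"""


def strip_prefix(text, prefix):
    """Drop a leading prefix if present (works on any Python 3 version)."""
    return text[len(prefix):] if text.startswith(prefix) else text


def parse_feed(xml_text, subreddit):
    """Map each Atom <entry> to a post record; engagement counts are absent."""
    root = ET.fromstring(xml_text)
    posts = []
    for entry in root.iter(ATOM + "entry"):
        link = entry.find(ATOM + "link")
        html_body = entry.findtext(ATOM + "content", default="")
        author = entry.findtext(f"{ATOM}author/{ATOM}name", default="")
        posts.append({
            "type": "post",
            "id": strip_prefix(entry.findtext(ATOM + "id", default=""), "t3_"),
            "subreddit": subreddit,
            "author": strip_prefix(author, "/u/"),
            "title": entry.findtext(ATOM + "title", default=""),
            # Crude tag stripping for illustration; a real parser should use an HTML library.
            "body": re.sub(r"<[^>]+>", "", html_body).strip(),
            "postUrl": link.get("href") if link is not None else None,
            "createdAt": entry.findtext(ATOM + "published"),
            # The feed carries no vote or comment totals, so these are never guessed.
            "score": None,
            "numComments": None,
        })
    return posts


posts = parse_feed(SAMPLE_FEED, "SaaS")
print(posts[0]["title"], posts[0]["score"])
```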

Use this lightweight subreddit scraper for a quick, cheap pass over recent posts across many subreddits — titles, bodies, authors, permalinks, and top-level comments as text, with no proxy and no Reddit account.

Use the premium Reddit Comment Tree Scraper (actor reddit-deep-comments) instead when you need upvote scores, full nested comment trees (replies-to-replies, at depth), and richer engagement metrics that only its heavier, residential-proxy-backed fetch can reach.

Same data shape, different depth — pick the one that matches the job.

What it does

  • Scrapes posts from one or more subreddits, sorted by hot, new, rising, or top.
  • Optionally attaches each post's top-level comments (one extra request per post).
  • Returns clean records — title, author, body, permalink, timestamp, and a nested comments array — ready to export to JSON/CSV/Excel or pull via the Apify API.
  • Caps your spend with a hard maxPosts limit and stays polite with low default concurrency.

Who it's for

  • Researchers & data teams collecting subreddit text for analysis or datasets.
  • Marketers & community managers tracking what people post and discuss in their niches.
  • Builders & founders who want raw Reddit posts feeding their own pipeline — no login, proxy, or API key required.

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| subreddits | array | ["SaaS"] | Subreddits to scrape (with or without the r/ prefix), e.g. "SaaS", "technology". Required. |
| sort | string | hot | Which listing to scrape: hot, new, rising, or top. |
| maxPosts | integer | 100 | Total posts to scrape across all subreddits (caps your spend). |
| includeComments | boolean | true | Fetch each post's top-level comments (one extra request per post). |
| maxCommentsPerPost | integer | 20 | Cap comments captured per post (0–100). |
| maxConcurrency | integer | 4 | Parallel requests (1–20, kept low to stay polite). |
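If you build the input programmatically, a client-side check along these lines can catch mistakes before a run starts. This is a sketch under stated assumptions: `normalize_input` is a hypothetical helper, and clamping out-of-range values (rather than rejecting them) is a choice made here, since the page does not say how the actor itself treats them. The defaults and ranges are taken from the table above.

```python
ALLOWED_SORTS = {"hot", "new", "rising", "top"}


def clamp(value, low, high):
    """Force an integer into the inclusive range [low, high]."""
    return max(low, min(high, int(value)))


def normalize_input(raw):
    """Apply the documented defaults and ranges to a raw input dict."""
    subreddits = []
    for name in raw.get("subreddits", []):
        name = name.strip()
        if name.lower().startswith("r/"):
            name = name[2:]  # the r/ prefix is optional
        if name and name not in subreddits:
            subreddits.append(name)
    if not subreddits:
        raise ValueError("subreddits is required")

    sort = raw.get("sort", "hot")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORTS)}")

    return {
        "subreddits": subreddits,
        "sort": sort,
        # No minimum is documented for maxPosts, so it is only cast to int here.
        "maxPosts": int(raw.get("maxPosts", 100)),
        "includeComments": bool(raw.get("includeComments", True)),
        "maxCommentsPerPost": clamp(raw.get("maxCommentsPerPost", 20), 0, 100),
        "maxConcurrency": clamp(raw.get("maxConcurrency", 4), 1, 20),
    }


config = normalize_input({"subreddits": ["r/SaaS", "technology"], "maxConcurrency": 50})
print(config)
```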

Example input

{
  "subreddits": ["SaaS", "Entrepreneur"],
  "sort": "hot",
  "maxPosts": 50,
  "includeComments": true,
  "maxCommentsPerPost": 25,
  "maxConcurrency": 4
}

How to run

  1. Click Try for free (or open the actor in your Apify Console).
  2. Enter the subreddits you want to scrape and pick a sort (new surfaces the freshest posts; top and hot surface the most-seen).
  3. Set maxPosts to cap your spend.
  4. (Optional) Toggle includeComments and set maxCommentsPerPost to control how many top-level comments come back with each post.
  5. Click Start. When the run finishes, open the Dataset tab and export to JSON/CSV/Excel, or pull it via the API (below).

Run it on a schedule (Apify Schedules) for a fresh pull every morning, or call it from Make / Zapier / n8n via the Apify integrations.

Output

Each dataset item is a post with its top-level comments:

{
  "type": "post",
  "id": "1tuxy4e",
  "subreddit": "SaaS",
  "author": "Significant-Honey204",
  "title": "Can anyone help me get reviews on G2?",
  "body": "We just launched and...",
  "postUrl": "https://www.reddit.com/r/SaaS/comments/1tuxy4e/can_anyone_help_me_get_reviews_on_g2",
  "createdAt": "2026-06-02T09:12:00.000Z",
  "score": null,
  "numComments": null,
  "commentCount": 5,
  "comments": [
    {
      "author": "Impossible-Ebb-2446",
      "body": "Try asking in your onboarding emails.",
      "commentUrl": "https://www.reddit.com/r/SaaS/comments/1tuxy4e/_/opcuxfu",
      "createdAt": "2026-06-02T09:40:00.000Z"
    }
  ]
}

Field notes:

  • title / body / author / postUrl / createdAt come straight from the post's RSS entry — exactly what Reddit publishes, never fabricated.
  • commentCount is the number of top-level comments captured (capped by maxCommentsPerPost), and comments is a flat array of top-level replies — each with author, body, commentUrl, and createdAt.
  • score and numComments are always null. RSS doesn't expose upvote counts or comment totals, so this actor returns null rather than guess. If you need engagement scores or full nested comment trees, use the Reddit Comment Tree Scraper (reddit-deep-comments) instead.
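Because comments arrive nested inside each post, a spreadsheet-style analysis often needs one row per comment. The sketch below flattens dataset items that way. The input field names follow the sample item above, while `comment_rows`, `comments_to_csv`, and the output column names are hypothetical choices made for illustration.

```python
import csv
import io

FIELDS = ["postId", "subreddit", "postTitle", "commentAuthor",
          "commentBody", "commentUrl", "commentCreatedAt"]


def comment_rows(item):
    """Yield one flat row per top-level comment on a post."""
    for comment in item.get("comments") or []:
        yield {
            "postId": item["id"],
            "subreddit": item["subreddit"],
            "postTitle": item["title"],
            "commentAuthor": comment["author"],
            "commentBody": comment["body"],
            "commentUrl": comment["commentUrl"],
            "commentCreatedAt": comment["createdAt"],
        }


def comments_to_csv(items):
    """Render every comment across all posts as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for item in items:
        for row in comment_rows(item):
            writer.writerow(row)
    return buffer.getvalue()


sample_item = {
    "id": "1tuxy4e",
    "subreddit": "SaaS",
    "title": "Can anyone help me get reviews on G2?",
    "comments": [{
        "author": "Impossible-Ebb-2446",
        "body": "Try asking in your onboarding emails.",
        "commentUrl": "https://www.reddit.com/r/SaaS/comments/1tuxy4e/_/opcuxfu",
        "createdAt": "2026-06-02T09:40:00.000Z",
    }],
}
print(comments_to_csv([sample_item]))
```

Posts without comments produce no rows, so a posts-only run (includeComments off) yields just the header line.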

Export & API

# Fetch a dataset's items as JSON (replace <DATASET_ID> with the ID of your run's default dataset)
curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?format=json&token=<APIFY_TOKEN>"

Or use the run-sync-get-dataset-items endpoint to run-and-wait in a single call — handy for embedding the actor in your own backend.
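A run-and-wait call could look roughly like the sketch below, using only the Python standard library. The actor ID is not given on this page, so `<ACTOR_ID>` is a placeholder you must fill in from your Apify Console, and `build_run_sync_request` and `run_and_fetch` are hypothetical helper names. The request is a POST whose JSON body is the actor input, with the token passed as a query parameter as in the curl example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_sync_request(actor_id, token, actor_input):
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    actor_path = urllib.parse.quote(actor_id, safe="")
    url = f"{API_BASE}/acts/{actor_path}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(actor_id, token, actor_input, timeout=300):
    """Start a run, wait for it to finish, and return the dataset items."""
    request = build_run_sync_request(actor_id, token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholders: substitute your own actor ID and API token before running.
    items = run_and_fetch(
        "<ACTOR_ID>",
        "<APIFY_TOKEN>",
        {"subreddits": ["SaaS"], "sort": "new", "maxPosts": 25},
    )
    print(len(items), "posts returned")
```

The 300-second timeout is an arbitrary choice for this sketch; for long runs, starting the run asynchronously and polling may be the safer pattern.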

Pricing

Apify Pay-Per-Event — you're charged per post returned (comments are included at no extra charge). Set maxPosts to cap your spend.
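As a quick worked example of how maxPosts bounds the bill, the sketch below multiplies the cap by the listed rate. It assumes the "from $1.00 / 1,000 posts" price shown on this page; your actual rate may differ, and `max_spend_usd` is a hypothetical helper.

```python
def max_spend_usd(max_posts, price_per_1000_posts=1.00):
    """Upper bound on per-post charges for one run, in US dollars."""
    if max_posts < 0:
        raise ValueError("max_posts must be non-negative")
    return max_posts * price_per_1000_posts / 1000


for cap in (50, 100, 1000):
    print(f"maxPosts={cap}: at most ${max_spend_usd(cap):.2f}")
# maxPosts=50: at most $0.05
# maxPosts=100: at most $0.10
# maxPosts=1000: at most $1.00
```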

Limitations

  • No engagement counts. RSS doesn't expose upvote score or numComments, so those are null — we never fabricate them.
  • Top-level comments, flat. RSS returns top-level comments without nested reply trees or per-comment scores. For scores and full threaded trees, use the Reddit Comment Tree Scraper.
  • RSS depth. Each subreddit feed returns its most recent ~25 posts; scan more subreddits or schedule runs for broader coverage rather than a full historical export.
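Since each run only sees a small window of posts, broader coverage comes from combining scheduled runs. One way to do that, sketched below, is to merge the exported items from several runs and keep a single copy of each post by its id. `merge_runs` is a hypothetical helper, and keeping the latest snapshot of a repeated post is a design choice made here.

```python
def merge_runs(*runs):
    """Combine items from several runs, keeping one record per post id."""
    merged = {}
    for items in runs:
        for item in items:
            # A post seen again in a later run replaces the earlier snapshot,
            # so newly captured comments are kept.
            merged[item["id"]] = item
    # Timestamps in one consistent ISO 8601 format sort correctly as plain strings.
    return sorted(merged.values(), key=lambda post: post["createdAt"], reverse=True)


monday = [
    {"id": "a1", "createdAt": "2026-06-01T08:00:00.000Z", "commentCount": 1},
    {"id": "b2", "createdAt": "2026-06-01T09:30:00.000Z", "commentCount": 0},
]
tuesday = [
    {"id": "b2", "createdAt": "2026-06-01T09:30:00.000Z", "commentCount": 4},
    {"id": "c3", "createdAt": "2026-06-02T07:15:00.000Z", "commentCount": 2},
]

combined = merge_runs(monday, tuesday)
print([post["id"] for post in combined])
# ['c3', 'b2', 'a1']
```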

Compliance

This actor reads public Reddit RSS only, identifies itself with a descriptive User-Agent, runs at modest concurrency, and never logs in, posts, votes, or messages. You are responsible for using the exported Reddit data in line with Reddit's terms and any laws that apply to you.

FAQ

Do I need a Reddit account, API key, or proxy to use this Reddit scraper? No. It reads public Reddit RSS feeds with plain HTTP requests — no login, no OAuth, no API key, and no proxy.

Why are score and numComments null? Reddit exposes upvote and comment counts only on endpoints that it blocks scrapers from reaching. This actor reads RSS, which doesn't include them, so it returns null rather than guess.

How do I get upvote scores or full comment trees? Use the premium Reddit Comment Tree Scraper (actor reddit-deep-comments). It uses a heavier, residential-proxy-backed fetch to return upvote scores and complete nested comment trees — the depth this lightweight RSS scraper can't reach.

How is it priced and how do I control cost? Apify Pay-Per-Event — you're charged per post returned, with comments included free. Set maxPosts to cap your spend before each run.

How many comments per post does it return? As many top-level comments as you set in maxCommentsPerPost (default 20, up to 100). Set it to 0, or turn off includeComments, to scrape posts only.

Which sort order should I pick? Use new for the freshest posts, hot or top for the most-engaged threads, and rising to catch posts gaining traction. Whichever sort you pick, each subreddit feed returns roughly 25 posts.

Can I export Reddit data to CSV or Excel, and how fresh is it? Yes — every run's dataset exports to JSON, CSV, or Excel from the Apify Console, or via the API. The actor reads the live RSS feed on each run, so results reflect the subreddit at run time. Pair sort: "new" with an Apify Schedule to catch posts as they appear.


Looking for leads, not raw data?

If you want buyer-intent posts turned into a lead list, see our Reddit Lead Finder. And if you'd like the whole outbound loop automated — intent discovery, enrichment, AI-personalised outreach, and reply handling — that's what we build at SignalEngine.