Pricing

$1.50 / 1,000 results

Reddit Subreddit Scraper

Extract Reddit posts from any subreddit without an API key. Get titles, scores, authors, comment counts, flairs, and URLs from old.reddit.com.

Developer

Casey Marsh

Actor stats

Last modified

3 days ago

Summary

The Reddit Subreddit Scraper is a production-grade Apify actor that extracts posts from any public subreddit without requiring a Reddit API key, app registration, or OAuth token. It uses old.reddit.com — Reddit's lightweight, server-rendered interface — which is far more scraper-friendly than the modern React-based Reddit frontend.

Built on Crawlee's CheerioCrawler with Apify residential proxy rotation, this actor handles Reddit's rate limiting gracefully, retries failed requests automatically, and extracts rich metadata including upvote ratios, NSFW flags, sticky post detection, award counts, and post flairs. Whether you need 10 posts or 500, the actor paginates automatically to collect your requested volume.

How It Works

Input: You provide a subreddit name, sort order, and maximum post count.
URL Construction: The actor builds the correct old.reddit.com/r/{subreddit}/{sort}/ URL — old Reddit renders complete HTML without JavaScript, making it ideal for Cheerio-based scraping.
Rate Limit Detection: On each page load, the actor checks for Reddit's rate-limit or "blocked" pages and, if it finds one, throws an error so that the request is retried with a fresh residential proxy IP.
Post Extraction: Each post (.thing.link) is parsed for title, URL, author, score, comment count, flair, domain, timestamp, and more using multiple CSS selector fallbacks.
Session Management: A Crawlee session pool rotates user agents and cookies to reduce fingerprinting.
Pagination: The actor keeps requesting further listing pages until it has collected maxPosts posts.
Output: Clean, structured JSON saved to your Apify dataset with ISO 8601 timestamps.
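
The URL construction and pagination steps above can be sketched roughly as follows. This is an illustration only, not the actor's source: `buildListingUrl` is a hypothetical helper, and the `after` query parameter (carrying the ID of the last post on the previous page) is an assumption about how old Reddit listings are paged.

```javascript
// Sort orders listed in the Input Parameters table.
const SUPPORTED_SORTS = new Set(['hot', 'new', 'top', 'rising']);

// Hypothetical helper: builds the old.reddit.com listing URL for one page.
// `after` is assumed to be the ID of the last post seen on the previous page.
function buildListingUrl(subreddit, sort = 'hot', after = null) {
  if (!SUPPORTED_SORTS.has(sort)) {
    throw new Error(`Unsupported sort order: ${sort}`);
  }
  const url = new URL(
    `https://old.reddit.com/r/${encodeURIComponent(subreddit)}/${sort}/`,
  );
  if (after) {
    url.searchParams.set('after', after);
  }
  return url.toString();
}

console.log(buildListingUrl('programming', 'new'));
console.log(buildListingUrl('programming', 'new', 't3_abc123'));
```

Rejecting unknown sort orders up front avoids spending a proxied request on a URL that cannot resolve to a listing.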

Input Parameters

Field	Type	Required	Default	Description
`subreddit`	string	No	`popular`	Subreddit name without the `r/` prefix (e.g. `australia`, `programming`, `worldnews`)
`sort`	string	No	`hot`	Sort order: `hot`, `new`, `top`, or `rising`
`maxPosts`	integer	No	`50`	Maximum number of posts to extract (1–500)
`includeComments`	boolean	No	`false`	Whether to also scrape comments from each post (adds significant runtime)
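
As a rough sketch of how the defaults and limits in the table could be applied, the function below fills in missing fields and clamps `maxPosts` to the documented 1–500 range. `normalizeInput` is a hypothetical name, and stripping a leading `r/` is a convenience added for illustration; the actor itself may reject or handle such input differently.

```javascript
// Defaults and limits taken from the Input Parameters table.
const DEFAULTS = { subreddit: 'popular', sort: 'hot', maxPosts: 50, includeComments: false };
const ALLOWED_SORTS = ['hot', 'new', 'top', 'rising'];

// Hypothetical helper: returns a complete, validated input object.
function normalizeInput(input = {}) {
  // Tolerate an accidental "r/" or "/r/" prefix (illustrative assumption).
  const rawName = String(input.subreddit ?? DEFAULTS.subreddit).trim();
  const subreddit = rawName.replace(/^\/?r\//i, '') || DEFAULTS.subreddit;

  const sort = String(input.sort ?? DEFAULTS.sort).toLowerCase();
  if (!ALLOWED_SORTS.includes(sort)) {
    throw new Error(`sort must be one of: ${ALLOWED_SORTS.join(', ')}`);
  }

  // Clamp to the documented range; fall back to the default if not numeric.
  const requested = Number(input.maxPosts ?? DEFAULTS.maxPosts);
  const maxPosts = Number.isFinite(requested)
    ? Math.min(500, Math.max(1, Math.trunc(requested)))
    : DEFAULTS.maxPosts;

  return { subreddit, sort, maxPosts, includeComments: input.includeComments === true };
}

console.log(normalizeInput({ subreddit: 'r/australia', sort: 'TOP', maxPosts: 9999 }));
```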

Output Example

{
  "title": "What's the most interesting fact you learned this week?",
  "url": "https://example.com/article",
  "author": "curious_user42",
  "subreddit": "popular",
  "score": 15420,
  "commentCount": 2301,
  "flair": "Discussion",
  "postedAt": "2026-07-04T08:00:00.000Z",
  "domain": "self.popular",
  "redditUrl": "https://old.reddit.com/r/popular/comments/abc123/",
  "isNSFW": false,
  "isSticky": false,
  "awards": 3,
  "upvoteRatio": "0.92",
  "sort": "hot",
  "scrapedAt": "2026-07-04T10:30:00.000Z"
}
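
Note that in the example above `upvoteRatio` is a string and the timestamps are ISO 8601 strings, so downstream code will usually want to convert them before doing arithmetic. A minimal sketch, assuming records shaped like the example (`toTypedPost` is a hypothetical helper, and the sample object copies only a few of the fields shown):

```javascript
// Hypothetical helper: converts string fields of one dataset record into
// numbers and Date objects so they can be compared and sorted.
function toTypedPost(record) {
  return {
    ...record,
    upvoteRatio: record.upvoteRatio == null ? null : Number.parseFloat(record.upvoteRatio),
    postedAt: new Date(record.postedAt),
    scrapedAt: new Date(record.scrapedAt),
  };
}

// A subset of the fields from the output example.
const sample = {
  title: "What's the most interesting fact you learned this week?",
  score: 15420,
  upvoteRatio: '0.92',
  postedAt: '2026-07-04T08:00:00.000Z',
  scrapedAt: '2026-07-04T10:30:00.000Z',
};

const post = toTypedPost(sample);
// Age of the post at scrape time, in minutes.
const ageMinutes = (post.scrapedAt.getTime() - post.postedAt.getTime()) / 60000;
console.log(post.upvoteRatio, ageMinutes);
```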

Pricing

This actor uses Apify's pay-per-result model: you pay only for the posts you successfully extract, with no monthly subscription, no Reddit API costs, and no minimums. At the listed rate of $1.50 per 1,000 results, a typical run of 50 posts from a single subreddit costs about $0.075.
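
For other volumes, the per-result cost can be estimated with simple arithmetic. The sketch below assumes the listed rate of $1.50 per 1,000 results is the only charge; `estimateCostUsd` is a hypothetical helper, not part of the actor.

```javascript
// Listed rate: $1.50 per 1,000 results.
const PRICE_PER_1000_RESULTS_USD = 1.5;

// Hypothetical helper: estimated cost in USD for a given number of results.
function estimateCostUsd(resultCount) {
  return (resultCount * PRICE_PER_1000_RESULTS_USD) / 1000;
}

for (const count of [10, 50, 500]) {
  console.log(`${count} posts: $${estimateCostUsd(count).toFixed(3)}`);
}
```

At this rate, the maximum single-run volume of 500 posts comes to $0.75.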

Because old.reddit.com serves complete HTML without any JavaScript rendering, the actor needs no headless browser and runs efficiently. Residential proxies make runs more reliable under rate limiting, but you can switch to datacenter proxies to lower costs if your use case tolerates occasional blocks.

Use Cases

Social Media Monitoring: Track trending topics, brand mentions, and community discussions across multiple subreddits. Monitor sentiment around products, companies, or public figures in real time.
Content Research and Curation: Discover viral content, identify trending formats, and understand what resonates with specific communities. Source content ideas for blogs, newsletters, and social media channels.
Community Sentiment Analysis: Analyze post titles, flairs, and scores to gauge community sentiment on specific topics. Feed scraped data into NLP pipelines for large-scale sentiment tracking.
Data Collection for Machine Learning: Build training datasets for text classification, toxicity detection, or recommendation systems using Reddit's diverse, community-labeled content.
Competitor Research: Monitor competitor subreddits, track product announcement threads, and analyze community engagement patterns.
Market Research: Understand consumer pain points, feature requests, and product discussions within niche communities relevant to your industry.

FAQ

Q: Do I need a Reddit API key or OAuth token? A: No. This actor scrapes publicly available pages on old.reddit.com. No Reddit account, API key, or authentication is required. This is one of its key advantages over the official Reddit API, which requires app registration and has stricter rate limits on the free tier.

Q: Why use old.reddit.com instead of the new Reddit? A: Old Reddit renders complete HTML server-side with predictable CSS classes (.thing.link, .score.unvoted, .linkflairlabel). The new Reddit is a React SPA that requires JavaScript execution, making it much slower and more expensive to scrape. Old Reddit is also less aggressively rate-limited.

Q: What if a subreddit is private or banned? A: The actor can only scrape public subreddits. If a subreddit is private, banned, or quarantined, the request will fail, and an error record will be saved to the dataset.

Q: Can I scrape comments too? A: Comment counts are always extracted. For full comment scraping (comment bodies and nested threads), set includeComments to true. Note that this significantly increases runtime and data volume, because each post spawns additional requests.

Q: How does the actor handle Reddit's rate limiting? A: The actor detects a rate limit (HTTP 429) or block page by checking the page title, then throws an error so that Crawlee's retry mechanism rotates to a fresh residential proxy IP and retries the request automatically. Up to 4 retries are attempted.
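
The detect-and-retry behaviour described in this answer can be illustrated with a small standalone sketch. In the actor, Crawlee performs the retries and proxy rotation; the hand-written loop below only stands in for that mechanism. The title patterns, `isBlockedResponse`, `fetchWithRetries`, and the `fetchPage` callback are all hypothetical.

```javascript
const MAX_RETRIES = 4; // "Up to 4 retries are attempted."

// Assumed title patterns for block pages; the actor's real checks may differ.
const BLOCK_TITLE_PATTERNS = [/too many requests/i, /blocked/i];

// True when the response looks rate limited or blocked.
function isBlockedResponse(statusCode, pageTitle) {
  if (statusCode === 429) return true;
  return BLOCK_TITLE_PATTERNS.some((pattern) => pattern.test(pageTitle ?? ''));
}

// Stand-in for the framework's retry mechanism: one initial attempt plus up
// to MAX_RETRIES retries. `fetchPage(url, attempt)` is a caller-supplied
// async function that would use a fresh proxy on each attempt.
async function fetchWithRetries(fetchPage, url) {
  let lastError;
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt += 1) {
    try {
      const { statusCode, title, body } = await fetchPage(url, attempt);
      if (isBlockedResponse(statusCode, title)) {
        throw new Error(`Blocked or rate limited (status ${statusCode})`);
      }
      return body;
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```

A caller would pass its own `fetchPage` implementation; throwing inside the loop mirrors how a thrown error in a request handler marks the request as failed and eligible for retry.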

Q: Is it legal to scrape Reddit? A: This actor scrapes publicly accessible pages. You are responsible for complying with Reddit's terms of service, robots.txt, and applicable laws. For production use at scale, review Reddit's API terms and consider using the official API for sensitive data.

Actor ID: reddit-subreddit-scraper · Runtime: Node.js 20 · Type: CheerioCrawler
