Reddit Post Scraper
Scrape Reddit posts and comments without an API key. Get scores, upvote ratios, and timestamps. Ideal for community research and brand sentiment analysis.
Pricing: from $3.00 / 1,000 posts
Rating: 0.0 (0 reviews)
Developer: Vhub Systems
Actor stats: 0 bookmarked · 9 total users · 7 monthly active users
Last modified: 5 hours ago
Reddit Post Scraper - Extract Posts, Comments, and Metadata from Subreddits
Extract Reddit posts and comments at scale without API limits. Scrape multiple subreddits, search results, and full comment threads in minutes.
What is Reddit Post Scraper?
Reddit Post Scraper is a powerful Apify actor that extracts structured data from Reddit subreddits and search results. Unlike the official Reddit API, which imposes strict rate limits and requires OAuth authentication, this scraper uses old.reddit.com's HTML interface to gather post metadata, content, and comments without authentication.
This tool is essential for market research, sentiment analysis, content monitoring, and competitive intelligence. Track trending topics across multiple subreddits, analyze community reactions to product launches, or monitor brand mentions across Reddit's vast ecosystem. The scraper handles pagination automatically and supports flexible sorting options (hot, new, top, rising) to capture exactly the content you need.
Whether you're a data analyst tracking industry discussions, a marketer monitoring brand sentiment, or a researcher studying online communities, Reddit Post Scraper delivers clean, structured data ready for analysis. Extract thousands of posts with metadata including scores, upvote ratios, comment counts, author information, timestamps, and post flairs. Optional comment scraping provides full conversation threads with nested reply depth tracking.
Output Data Fields
| Field | Type | Description |
|---|---|---|
| title | String | Post title text |
| author | String | Username of post author (null if deleted) |
| score | Integer | Net upvotes (upvotes minus downvotes) |
| upvoteRatio | Float | Ratio of upvotes to total votes (0.0-1.0; null if unavailable) |
| commentCount | Integer | Total number of comments on the post |
| subreddit | String | Subreddit name where the post was published |
| text | String | Post body text (null for link posts without text) |
| url | String | Permalink URL to the post on Reddit |
| createdAt | ISO 8601 DateTime | Post creation timestamp |
| flair | String | Post flair label (null if no flair) |
| comments | Array | Array of comment objects (only when includeComments is enabled) |
Comment Object Fields
| Field | Type | Description |
|---|---|---|
| author | String | Username of comment author (null if deleted) |
| body | String | Comment text content |
| score | Integer | Comment score (upvotes minus downvotes) |
| createdAt | ISO 8601 DateTime | Comment creation timestamp |
| depth | Integer | Nesting level of the comment (0 for top-level) |
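Because comments are delivered as a flat list annotated with depth, a nested thread can be reconstructed client-side. Here is a minimal sketch; the field names come from the table above, while the `replies` key and thread-order assumption (comments appear in the order they were scraped from the page) are my own conventions:

```python
def nest_comments(comments):
    """Rebuild a nested tree from a flat, depth-annotated comment list.

    Assumes comments appear in thread order, as scraped from the page,
    with depth 0 marking top-level comments.
    """
    roots, stack = [], []  # stack[i] = most recent comment seen at depth i
    for c in comments:
        node = {**c, "replies": []}
        if c["depth"] == 0:
            roots.append(node)
        else:
            stack[c["depth"] - 1]["replies"].append(node)
        del stack[c["depth"]:]  # drop deeper branches, then push this node
        stack.append(node)
    return roots
```

Deleting the stack tail on every step keeps the parent lookup O(1) per comment without recursion.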
Tutorial: How to Scrape Reddit Posts in 7 Steps
1. Create a Free Apify Account
Sign up at apify.com to get free monthly credits for running actors.
2. Open Reddit Post Scraper
Navigate to the Reddit Post Scraper actor page and click "Try for free".
3. Configure Subreddits
Enter one or more subreddit names (without the "r/" prefix) in the "Subreddits" field. For example: python, machinelearning, datascience.
4. Set Search Query (Optional)
To search across subreddits instead of listing recent posts, enter a search query like "web scraping" or "API alternatives". Leave blank to scrape regular subreddit listings.
5. Choose Sort Order and Limit
Select how posts should be sorted: hot (trending), new (recent), top (highest score), or rising (gaining traction). Set maxPosts to control how many posts to scrape (default: 25).
6. Enable Comments (Optional)
Check "Include comments" to scrape full comment threads for each post. This significantly increases run time but provides complete conversation data.
7. Run and Export Data
Click "Start" to begin scraping. Once complete, download results as JSON, CSV, Excel, or connect directly to your database via Apify API.
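The same steps can be driven programmatically with the Apify Python client (`pip install apify-client`). This is an illustrative sketch, not the actor's official SDK snippet; the actor ID is a placeholder you would replace with the real ID from the actor page, and the input keys mirror the parameter table below:

```python
from typing import Iterator

# Input mirroring steps 3-6 of the tutorial above.
RUN_INPUT = {
    "subreddits": ["python", "machinelearning"],
    "searchQuery": "",        # leave blank to scrape regular listings
    "maxPosts": 50,
    "sortBy": "new",
    "includeComments": False,
}

def fetch_posts(api_token: str, actor_id: str = "<ACTOR_ID>") -> Iterator[dict]:
    """Run the actor and stream dataset items (step 7, via API)."""
    # Imported lazily so this sketch loads without the package installed.
    from apify_client import ApifyClient
    client = ApifyClient(api_token)
    run = client.actor(actor_id).call(run_input=RUN_INPUT)
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```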
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| subreddits | Array of Strings | Conditional* | [] | List of subreddit names to scrape (without "r/" prefix). Example: ["javascript", "webdev"] |
| searchQuery | String | Conditional* | "" | Search query to execute. If provided with subreddits, the search is restricted to those subreddits. Example: "best practices" |
| maxPosts | Integer | No | 25 | Maximum number of posts to scrape. Minimum: 1 |
| sortBy | Enum | No | "hot" | Sort order for posts. Options: hot, new, top, rising |
| includeComments | Boolean | No | false | Whether to scrape comments for each post. Increases runtime significantly. |
*Note: At least one of subreddits or searchQuery must be provided.
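The constraints in the table above can be checked client-side before starting a run. A small sketch (the actor performs its own validation server-side; this helper is hypothetical):

```python
def validate_input(run_input: dict) -> dict:
    """Mirror the input table's constraints before launching a run."""
    subs = run_input.get("subreddits") or []
    query = (run_input.get("searchQuery") or "").strip()
    if not subs and not query:
        raise ValueError("Provide at least one of subreddits or searchQuery")
    if run_input.get("maxPosts", 25) < 1:
        raise ValueError("maxPosts must be at least 1")
    if run_input.get("sortBy", "hot") not in {"hot", "new", "top", "rising"}:
        raise ValueError("sortBy must be one of: hot, new, top, rising")
    return run_input
```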
Example Input Configuration
```json
{
  "subreddits": ["python", "learnpython", "datascience"],
  "searchQuery": "pandas dataframe",
  "maxPosts": 100,
  "sortBy": "top",
  "includeComments": true
}
```
This configuration searches for "pandas dataframe" across three Python-related subreddits, retrieves up to 100 of the highest-scoring matching posts, and scrapes the full comment thread for each.
Example Output Data
```json
{
  "title": "Best way to merge multiple DataFrames in pandas?",
  "author": "data_enthusiast_42",
  "score": 847,
  "upvoteRatio": 0.94,
  "commentCount": 63,
  "subreddit": "datascience",
  "text": "I have 5 DataFrames with different columns but the same index. What's the most efficient way to merge them? I've tried pd.concat() but wondering if there's a better approach for memory efficiency.",
  "url": "https://old.reddit.com/r/datascience/comments/1b2c3d4/best_way_to_merge_multiple_dataframes_in_pandas/",
  "createdAt": "2026-02-12T14:23:17.000Z",
  "flair": "Discussion",
  "comments": [
    {
      "author": "pandas_expert",
      "body": "Use pd.concat() with axis=1 for column-wise concatenation. If you have millions of rows, consider using pd.merge() sequentially with join='outer' to preserve all data.",
      "score": 124,
      "createdAt": "2026-02-12T14:45:33.000Z",
      "depth": 0
    },
    {
      "author": "data_scientist_pro",
      "body": "For large datasets, look into Dask or Polars. They handle out-of-memory operations much better than pandas.",
      "score": 89,
      "createdAt": "2026-02-12T15:12:08.000Z",
      "depth": 0
    },
    {
      "author": "data_enthusiast_42",
      "body": "Thanks! I'll try pd.concat with axis=1. My datasets are around 2M rows each so memory is definitely a concern.",
      "score": 34,
      "createdAt": "2026-02-12T15:34:22.000Z",
      "depth": 1
    }
  ]
}
```
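Records in this shape are easy to post-process for analysis. A quick sketch that ranks comments by score and flags contested posts; the `contested` threshold of 0.6 is an arbitrary illustration, not a value the actor produces, and note the null-safe handling of `upvoteRatio`:

```python
def summarize(post: dict) -> dict:
    """Derive a few analysis-ready signals from one scraped record."""
    comments = post.get("comments") or []
    top = max(comments, key=lambda c: c["score"], default=None)
    return {
        "title": post["title"],
        # comments per net upvote; max() guards against zero/negative scores
        "engagement": post["commentCount"] / max(post["score"], 1),
        # upvoteRatio may be null (see FAQ below); treat missing as uncontested
        "contested": (post.get("upvoteRatio") or 1.0) < 0.6,
        "topCommentAuthor": top["author"] if top else None,
    }
```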
Legal and Ethical Considerations
This scraper extracts publicly available data from Reddit using the old.reddit.com interface. All scraped content is accessible without authentication to any internet user. You are responsible for ensuring your use of scraped data complies with Reddit's Terms of Service, applicable laws, and regulations including GDPR, CCPA, and other data protection frameworks. The tool is designed for research, market analysis, and personal projects.
Always respect rate limits and implement reasonable delays between requests to avoid overloading Reddit's servers. Do not use scraped data for harassment, spam, or violating individual privacy rights. When publishing research or analysis based on Reddit data, consider anonymizing usernames and following academic ethical guidelines. The actor developer assumes no liability for misuse of this tool or violation of third-party terms of service.
Pricing and Performance
This actor uses Apify's residential proxies (US) to avoid IP blocking and ensure reliable scraping. Typical costs:
- Without comments: 100 posts cost approximately 0.01-0.02 USD (10-20 compute units)
- With comments: 100 posts with comments cost approximately 0.05-0.10 USD (50-100 compute units), depending on comment volume
Apify provides free monthly credits for new users. Runtime scales linearly with the number of posts and comments. Scraping 1,000 posts without comments typically completes in 5-10 minutes. With comments enabled, expect 15-30 minutes depending on thread sizes.
The actor uses conservative concurrency (1 concurrent request) to minimize detection risk and ensure stable operation. Residential proxies are recommended for consistent access to Reddit without triggering anti-bot measures.
Frequently Asked Questions
Can I scrape private or restricted subreddits?
No. This scraper only accesses publicly available content visible on old.reddit.com without authentication. Private subreddits, age-restricted content, and posts requiring login cannot be scraped.
Why is upvoteRatio sometimes null?
The upvote ratio is not always present in old.reddit.com's HTML, particularly for older posts or certain subreddit configurations. When the ratio is unavailable in the source HTML, the field returns null.
How do I avoid getting blocked by Reddit?
The actor uses residential proxies by default, which significantly reduces blocking risk. Additionally, keeping maxPosts under 500 per run and spacing out multiple runs by several hours helps maintain reliability. Avoid running dozens of concurrent actor instances targeting the same subreddits.
Can I scrape historical posts from specific dates?
This scraper works with Reddit's native sort options (hot, new, top, rising). To target specific date ranges, use the top sort option combined with search queries, or run the scraper periodically to build a historical dataset over time. Reddit's search and sort features do not support precise date filtering via old.reddit.com.
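Since precise date filtering is not available at scrape time, date ranges have to be applied client-side after export. A minimal sketch using the createdAt field from the output table:

```python
from datetime import datetime, timezone

def in_range(post: dict, start: datetime, end: datetime) -> bool:
    """Keep posts whose createdAt timestamp falls inside [start, end]."""
    # fromisoformat() before Python 3.11 rejects a trailing "Z",
    # so normalize it to an explicit UTC offset first.
    ts = datetime.fromisoformat(post["createdAt"].replace("Z", "+00:00"))
    return start <= ts <= end
```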
What happens if a post or comment is deleted during scraping?
Deleted or removed content typically shows null for the author field and may have placeholder text like [deleted] or [removed] in the body. The scraper captures whatever is visible in the HTML at the time of extraction.
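When cleaning a dataset, it can help to drop these tombstoned entries before analysis. A small helper, assuming the placeholder strings described above (the marker set is an assumption, not an exhaustive list):

```python
# Markers Reddit substitutes for deleted/removed content, plus the
# null/empty values the scraper may emit for them.
TOMBSTONES = {"[deleted]", "[removed]", None, ""}

def is_tombstone(comment: dict) -> bool:
    """True if the author or body was deleted/removed before scraping."""
    return (comment.get("author") in TOMBSTONES
            or comment.get("body") in TOMBSTONES)
```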
Related Reddit Scrapers by Vhub Systems
Explore our other specialized Reddit scrapers for comprehensive social media intelligence:
- Reddit User Profile Scraper - Extract user post history, karma, and account details
- Reddit Search Scraper - Advanced search across all of Reddit with filtering options
- Reddit Trending Topics Monitor - Track rising posts and trending discussions in real-time
- Subreddit Analytics Scraper - Gather subreddit statistics, subscriber counts, and activity metrics
- Twitter Thread Scraper - Extract Twitter/X threads and conversations similar to Reddit discussions
Need help? Contact support at vhubsystems@gmail.com or visit Apify documentation for integration guides.
Proxy Recommendations
For best results, use residential or datacenter proxies:
- 🔗 Smartproxy — Residential proxies with 195+ locations. 50% revenue share for referrals. Best for social media and SERP scraping.
- 🔗 Bright Data — Enterprise-grade proxies used by Fortune 500. Up to 25% commission. Best for large-scale extraction.
💡 These actors work out-of-the-box with Apify's built-in proxy pool too.
🛒 Get More Tools
Looking for ready-made automation kits?
- 📦 Apify Scrapers Bundle ($29) — 10+ actors in one package
- ⚡ n8n Automation Pack ($39) — Pre-built workflows for scraping pipelines