Reddit Post Scraper

Scrape Reddit posts and comments without an API key. Get scores, upvote ratios, and timestamps. Ideal for community research and brand sentiment analysis.

Pricing: from $3.00 / 1,000 posts
Rating: 0.0 (0)
Developer: Vhub Systems (Maintained by Community)

Actor stats:

  • Bookmarked: 0
  • Total users: 9
  • Monthly active users: 7
  • Last modified: 5 hours ago

Reddit Post Scraper - Extract Posts, Comments, and Metadata from Subreddits

Extract Reddit posts and comments at scale without API limits. Scrape multiple subreddits, search results, and full comment threads in minutes.

What is Reddit Post Scraper?

Reddit Post Scraper is a powerful Apify actor that extracts structured data from Reddit subreddits and search results. Unlike the official Reddit API, which imposes strict rate limits and requires OAuth authentication, this scraper uses old.reddit.com's HTML interface to gather post metadata, content, and comments without authentication.

This tool is essential for market research, sentiment analysis, content monitoring, and competitive intelligence. Track trending topics across multiple subreddits, analyze community reactions to product launches, or monitor brand mentions across Reddit's vast ecosystem. The scraper handles pagination automatically and supports flexible sorting options (hot, new, top, rising) to capture exactly the content you need.

Whether you're a data analyst tracking industry discussions, a marketer monitoring brand sentiment, or a researcher studying online communities, Reddit Post Scraper delivers clean, structured data ready for analysis. Extract thousands of posts with metadata including scores, upvote ratios, comment counts, author information, timestamps, and post flairs. Optional comment scraping provides full conversation threads with nested reply depth tracking.

Output Data Fields

| Field | Type | Description |
|-------|------|-------------|
| title | String | Post title text |
| author | String | Username of post author (null if deleted) |
| score | Integer | Net upvotes (upvotes minus downvotes) |
| upvoteRatio | Float | Ratio of upvotes to total votes (0.0-1.0, null if unavailable) |
| commentCount | Integer | Total number of comments on the post |
| subreddit | String | Subreddit name where the post was published |
| text | String | Post body text (null for link posts without text) |
| url | String | Permalink URL to the post on Reddit |
| createdAt | ISO 8601 DateTime | Post creation timestamp |
| flair | String | Post flair label (null if no flair) |
| comments | Array | Array of comment objects (only when includeComments is enabled) |

Comment Object Fields

| Field | Type | Description |
|-------|------|-------------|
| author | String | Username of comment author (null if deleted) |
| body | String | Comment text content |
| score | Integer | Comment score (upvotes minus downvotes) |
| createdAt | ISO 8601 DateTime | Comment creation timestamp |
| depth | Integer | Nesting level of the comment (0 for top-level) |
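Because comments arrive as a flat, document-ordered array, the depth field alone is enough to rebuild the reply tree. Here is a minimal sketch; the field subset and helper name are illustrative, and it assumes depth never jumps by more than one between consecutive comments, as in a normally rendered Reddit thread:

```python
def build_tree(comments):
    """Nest a flat, document-ordered list of comments using their depth values."""
    roots, stack = [], []  # stack[i] = most recent comment seen at depth i
    for c in comments:
        node = {**c, "replies": []}
        depth = c["depth"]
        if depth == 0:
            roots.append(node)
        else:
            # Attach to the most recent comment one level up
            stack[depth - 1]["replies"].append(node)
        # Drop deeper stack entries; this node is now the newest at its depth
        del stack[depth:]
        stack.append(node)
    return roots

flat = [
    {"author": "a", "body": "top-level", "depth": 0},
    {"author": "b", "body": "reply", "depth": 1},
    {"author": "c", "body": "nested reply", "depth": 2},
    {"author": "d", "body": "another top-level", "depth": 0},
]
tree = build_tree(flat)
```

After this runs, `tree` holds two root comments, with "reply" nested under "top-level" and "nested reply" nested one level further.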

Tutorial: How to Scrape Reddit Posts in 7 Steps

1. Create a Free Apify Account

Sign up at apify.com to get free monthly credits for running actors.

2. Open Reddit Post Scraper

Navigate to the Reddit Post Scraper actor page and click "Try for free".

3. Configure Subreddits

Enter one or more subreddit names (without the "r/" prefix) in the "Subreddits" field. For example: python, machinelearning, datascience.

4. Set Search Query (Optional)

To search across subreddits instead of listing recent posts, enter a search query like "web scraping" or "API alternatives". Leave blank to scrape regular subreddit listings.

5. Choose Sort Order and Limit

Select how posts should be sorted: hot (trending), new (recent), top (highest score), or rising (gaining traction). Set maxPosts to control how many posts to scrape (default: 25).

6. Enable Comments (Optional)

Check "Include comments" to scrape full comment threads for each post. This significantly increases run time but provides complete conversation data.

7. Run and Export Data

Click "Start" to begin scraping. Once complete, download results as JSON, CSV, Excel, or connect directly to your database via Apify API.
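The same run can be started programmatically instead of through the UI. The sketch below builds and validates a run input locally (the `build_run_input` helper and the `ACTOR_ID` value are illustrative assumptions, not part of the actor itself); the commented-out lines show how it would be passed to the official apify-client package:

```python
import json

# Hypothetical actor ID; copy the real one from the actor page.
ACTOR_ID = "vhub-systems~reddit-post-scraper"

def build_run_input(subreddits=None, search_query="", max_posts=25,
                    sort_by="hot", include_comments=False):
    """Assemble the run input JSON, enforcing the documented constraint."""
    if not subreddits and not search_query:
        raise ValueError("Provide at least one of subreddits or searchQuery")
    return {
        "subreddits": subreddits or [],
        "searchQuery": search_query,
        "maxPosts": max_posts,
        "sortBy": sort_by,
        "includeComments": include_comments,
    }

run_input = build_run_input(subreddits=["python"], max_posts=50)
print(json.dumps(run_input))

# With the official client (pip install apify-client), running it would look like:
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_APIFY_TOKEN>")
# run = client.actor(ACTOR_ID).call(run_input=run_input)
# items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating the input locally surfaces the "at least one of subreddits or searchQuery" rule before you spend compute units on a run that would fail.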

Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| subreddits | Array of Strings | Conditional* | [] | List of subreddit names to scrape (without "r/" prefix). Example: ["javascript", "webdev"] |
| searchQuery | String | Conditional* | "" | Search query to execute. If provided with subreddits, search is restricted to those subreddits. Example: "best practices" |
| maxPosts | Integer | No | 25 | Maximum number of posts to scrape. Minimum: 1 |
| sortBy | Enum | No | "hot" | Sort order for posts. Options: hot, new, top, rising |
| includeComments | Boolean | No | false | Whether to scrape comments for each post. Increases runtime significantly. |

*Note: At least one of subreddits or searchQuery must be provided.

Example Input Configuration

```json
{
  "subreddits": ["python", "learnpython", "datascience"],
  "searchQuery": "pandas dataframe",
  "maxPosts": 100,
  "sortBy": "top",
  "includeComments": true
}
```

This configuration searches for "pandas dataframe" across three Python-related subreddits, retrieves the top 100 posts by score, and includes all comments for each post.

Example Output Data

```json
{
  "title": "Best way to merge multiple DataFrames in pandas?",
  "author": "data_enthusiast_42",
  "score": 847,
  "upvoteRatio": 0.94,
  "commentCount": 63,
  "subreddit": "datascience",
  "text": "I have 5 DataFrames with different columns but the same index. What's the most efficient way to merge them? I've tried pd.concat() but wondering if there's a better approach for memory efficiency.",
  "url": "https://old.reddit.com/r/datascience/comments/1b2c3d4/best_way_to_merge_multiple_dataframes_in_pandas/",
  "createdAt": "2026-02-12T14:23:17.000Z",
  "flair": "Discussion",
  "comments": [
    {
      "author": "pandas_expert",
      "body": "Use pd.concat() with axis=1 for column-wise concatenation. If you have millions of rows, consider using pd.merge() sequentially with join='outer' to preserve all data.",
      "score": 124,
      "createdAt": "2026-02-12T14:45:33.000Z",
      "depth": 0
    },
    {
      "author": "data_scientist_pro",
      "body": "For large datasets, look into Dask or Polars. They handle out-of-memory operations much better than pandas.",
      "score": 89,
      "createdAt": "2026-02-12T15:12:08.000Z",
      "depth": 0
    },
    {
      "author": "data_enthusiast_42",
      "body": "Thanks! I'll try pd.concat with axis=1. My datasets are around 2M rows each so memory is definitely a concern.",
      "score": 34,
      "createdAt": "2026-02-12T15:34:22.000Z",
      "depth": 1
    }
  ]
}
```
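Records in this shape are easy to aggregate once downloaded. A standard-library sketch that averages post scores per subreddit (the records here are abbreviated stand-ins for real actor output):

```python
from collections import defaultdict

# Abbreviated records in the output shape documented above
posts = [
    {"subreddit": "datascience", "score": 847, "commentCount": 63},
    {"subreddit": "datascience", "score": 120, "commentCount": 10},
    {"subreddit": "python", "score": 300, "commentCount": 25},
]

# Accumulate post counts and total score per subreddit
totals = defaultdict(lambda: {"posts": 0, "score": 0})
for p in posts:
    t = totals[p["subreddit"]]
    t["posts"] += 1
    t["score"] += p["score"]

avg_score = {sub: t["score"] / t["posts"] for sub, t in totals.items()}
```

With the sample data, `avg_score` maps "datascience" to 483.5 and "python" to 300.0; the same pattern extends to comment counts or upvote ratios.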

Legal and Ethical Considerations

This scraper extracts publicly available data from Reddit using the old.reddit.com interface. All scraped content is accessible without authentication to any internet user. You are responsible for ensuring your use of scraped data complies with Reddit's Terms of Service, applicable laws, and regulations including GDPR, CCPA, and other data protection frameworks. The tool is designed for research, market analysis, and personal projects.

Always respect rate limits and implement reasonable delays between requests to avoid overloading Reddit's servers. Do not use scraped data for harassment, spam, or violating individual privacy rights. When publishing research or analysis based on Reddit data, consider anonymizing usernames and following academic ethical guidelines. The actor developer assumes no liability for misuse of this tool or violation of third-party terms of service.
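One way to follow the anonymization advice above is to replace usernames with salted hashes before publishing, so identities stay stable within your dataset but are not reversible by readers. A sketch (the salt string is a placeholder you must replace and keep private):

```python
import hashlib

SALT = "replace-with-a-private-salt"  # keep secret so hashes resist dictionary lookup

def anonymize(username):
    """Map a username to a stable pseudonym; None (deleted authors) passes through."""
    if username is None:
        return None
    digest = hashlib.sha256((SALT + username).encode()).hexdigest()
    return "user_" + digest[:12]

alias = anonymize("data_enthusiast_42")
```

The same input always yields the same pseudonym, so cross-post activity by one author remains analyzable without exposing the account name.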

Pricing and Performance

This actor uses Apify's residential proxies (US) to avoid IP blocking and ensure reliable scraping. Typical costs:

  • Without comments: 100 posts cost approximately 0.01-0.02 USD (10-20 compute units)
  • With comments: 100 posts with comments cost approximately 0.05-0.10 USD (50-100 compute units), depending on comment volume

Apify provides free monthly credits for new users. Runtime scales linearly with the number of posts and comments. Scraping 1,000 posts without comments typically completes in 5-10 minutes. With comments enabled, expect 15-30 minutes depending on thread sizes.

The actor uses conservative concurrency (1 concurrent request) to minimize detection risk and ensure stable operation. Residential proxies are recommended for consistent access to Reddit without triggering anti-bot measures.

Frequently Asked Questions

Can I scrape private or restricted subreddits?

No. This scraper only accesses publicly available content visible on old.reddit.com without authentication. Private subreddits, age-restricted content, and posts requiring login cannot be scraped.

Why is upvoteRatio sometimes null?

The upvote ratio is not always present in old.reddit.com's HTML, particularly for older posts or certain subreddit configurations. When the ratio is unavailable in the source HTML, the field returns null.

How do I avoid getting blocked by Reddit?

The actor uses residential proxies by default, which significantly reduces blocking risk. Additionally, keeping maxPosts under 500 per run and spacing out multiple runs by several hours helps maintain reliability. Avoid running dozens of concurrent actor instances targeting the same subreddits.

Can I scrape historical posts from specific dates?

This scraper works with Reddit's native sort options (hot, new, top, rising). To target specific date ranges, use the top sort option combined with search queries, or run the scraper periodically to build a historical dataset over time. Reddit's search and sort features do not support precise date filtering via old.reddit.com.
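Since the scraper cannot filter by date itself, a common workaround is to over-fetch and filter client-side on the createdAt field. A sketch assuming the ISO 8601 timestamp format shown in the example output (the helper name is illustrative):

```python
from datetime import datetime, timezone

def within(post, start, end):
    """True if the post's createdAt timestamp falls in the half-open range [start, end)."""
    # createdAt looks like "2026-02-12T14:23:17.000Z" in the example output
    ts = datetime.strptime(post["createdAt"], "%Y-%m-%dT%H:%M:%S.%fZ")
    ts = ts.replace(tzinfo=timezone.utc)
    return start <= ts < end

posts = [
    {"title": "old post", "createdAt": "2026-01-01T00:00:00.000Z"},
    {"title": "new post", "createdAt": "2026-02-12T14:23:17.000Z"},
]
start = datetime(2026, 2, 1, tzinfo=timezone.utc)
end = datetime(2026, 3, 1, tzinfo=timezone.utc)
recent = [p for p in posts if within(p, start, end)]
```

Pair this with the top sort and a generous maxPosts so the window you care about is actually present in the fetched data.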

What happens if a post or comment is deleted during scraping?

Deleted or removed content typically shows null for the author field and may have placeholder text like [deleted] or [removed] in the body. The scraper captures whatever is visible in the HTML at the time of extraction.
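If deleted content would skew your analysis, it can be dropped after scraping using the markers described above. A small heuristic sketch (the function name is illustrative):

```python
def is_deleted(comment):
    """Heuristic check for removed or deleted comments in scraped output."""
    return (
        comment.get("author") is None
        or comment.get("body") in ("[deleted]", "[removed]")
    )

comments = [
    {"author": "alice", "body": "useful answer"},
    {"author": None, "body": "[deleted]"},
    {"author": "mod_team", "body": "[removed]"},
]
kept = [c for c in comments if not is_deleted(c)]
```

Keep the raw records around, though: the ratio of deleted to surviving comments can itself be a useful moderation signal.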

Explore our other specialized Reddit scrapers for comprehensive social media intelligence:


Need help? Contact support at vhubsystems@gmail.com or visit Apify documentation for integration guides.

Proxy Recommendations

For best results, use residential or datacenter proxies:

  • 🔗 Smartproxy — Residential proxies with 195+ locations. 50% revenue share for referrals. Best for social media and SERP scraping.
  • 🔗 Bright Data — Enterprise-grade proxies used by Fortune 500. Up to 25% commission. Best for large-scale extraction.

💡 These actors work out-of-the-box with Apify's built-in proxy pool too.

🛒 Get More Tools

Looking for ready-made automation kits?