Reddit Post Scraper

Pricing

$20.00/month + usage

Developed by

datawizards

Maintained by Community

The Reddit Post Scraper Apify Actor extracts detailed Reddit post and comment data in JSON, ideal for social media analysis, market research, and SEO insights. Supports customizable limits, residential proxies, and scalable scraping. Built by DataWizards for fast, reliable data collection.

Last modified

4 months ago

📦 Reddit Post Scraper · Apify Actor

Effortlessly extract detailed Reddit post and comment data from specific subreddits or post URLs in clean, structured JSON format. Perfect for social media analysis, sentiment tracking, or market research.
Built and maintained by DataWizards.


📌 What Is Reddit Post Scraper?

The Reddit Post Scraper Apify Actor enables you to scrape posts and their associated comments from Reddit subreddits or individual post URLs. It captures key details like post titles, descriptions, authors, upvotes, comment counts, and comment threads, making it ideal for researchers, marketers, and developers seeking insights from Reddit communities.

No login or manual browsing is required. The actor routes its requests through Apify Proxy, which helps reduce blocks when scraping at scale.


🧠 Key Features

  • Comprehensive Post Data – Extracts post title, description, author, URL, upvotes, and comment count.
  • Comment Extraction – Captures comment authors and their text for in-depth analysis.
  • Customizable Limits – Cap the number of posts scraped per run with the itemLimit parameter.
  • Proxy Support – Uses Apify Proxy (including RESIDENTIAL IPs) for robust scraping.
  • Structured JSON Output – Clean, easy-to-use data for analytics or integration.
  • Lightweight & Scalable – Optimized for fast performance and large-scale data collection.

🛠️ Input Schema

To use the Reddit Post Scraper, provide the following input in JSON format:

{
  "urls": [
    "https://www.reddit.com/r/SEO"
  ],
  "itemLimit": 10,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}

Input Parameters

  • urls: Array of Reddit subreddit URLs (e.g., https://www.reddit.com/r/SEO) or specific post URLs.
  • itemLimit: Maximum number of posts to scrape (e.g., 10 for 10 posts).
  • proxyConfiguration:
    • useApifyProxy: Set to true to enable Apify Proxy (required for reliable scraping).
    • apifyProxyGroups: Set to ["RESIDENTIAL"] to route requests through residential IPs, which are less likely to be blocked.
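
The parameters above can also be assembled programmatically. The following is a minimal Python sketch, not part of the actor itself: the helper name build_input and its checks are assumptions made for illustration, and only the JSON field names come from the input schema shown above.

```python
import json


def build_input(urls, item_limit=10, use_residential=True):
    """Assemble an input dict using the documented field names.

    build_input is a hypothetical helper; the actor only sees the resulting JSON.
    """
    if not urls:
        raise ValueError("urls must contain at least one Reddit URL")
    if not isinstance(item_limit, int) or item_limit < 1:
        raise ValueError("item_limit must be a positive integer")

    proxy = {"useApifyProxy": True}
    if use_residential:
        # Matches the RESIDENTIAL group used in the example input.
        proxy["apifyProxyGroups"] = ["RESIDENTIAL"]

    return {
        "urls": list(urls),
        "itemLimit": item_limit,
        "proxyConfiguration": proxy,
    }


actor_input = build_input(["https://www.reddit.com/r/SEO"], item_limit=10)
print(json.dumps(actor_input, indent=2))
```

The printed JSON has the same shape as the example input and can be pasted into the actor's input editor or sent as the run input through whichever Apify client you use.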

📤 Output Example

The actor returns structured JSON data, including post details and associated comments:

[
  {
    "TITLE": "What's even the point in trying anymore?",
    "DESCRIPTION": "SEO is just one part of a much wider role for me and I'm by no means an expert, so I rarely look on this sub - apologies if the same thing has been said 1,000 times, but I need to vent. When doing some year-on-year analysis of web analytics I noticed our site had taken a fairly decent hit in organic search traffic, so I pulled a load of Search Console data and started looking at terms where the click have dropped significantly compared to the same period last year. Time and time again it showed that average position had improved and impressions had improved. So I put these terms into Google in an incognito window and my god, I hadn't realised just how insane the results pages have got. One term (a very big one for us) ranks on average in the top 4 and has had a 66% increase in impressions year-on-year, but clicks are down 50%. I looked at the SERP and this was the order of the page: AI Overview Videos People Also Ask Find Results On Businesses map/listings Images Fucking hell, the first 10 items feature 6 SERPs features and only 4 organic results. I'm by no means saying it's the entire reason organic traffic is down, but it just feels like even when we do everything right that still isn't enough. It's just so demotivating.",
    "AUTHOR": "jpeach17",
    "URL": "https://www.reddit.com/r/SEO/comments/1l73oxl/whats_even_the_point_in_trying_anymore/",
    "UPVOTES": "168",
    "COMMENT_COUNT": "120",
    "comments": [
      {
        "AUTHOR": "[deleted]",
        "TEXT": "I don’t know if you’ve ever seen ready player one but there’s this scene where Ben mendelsohn’s character (evil ceo) says that they estimate they can fill up 86.3% of the viewers area of vision with ads before the viewer goes into a seizure. The viewer is in a virtual reality world and so they’re referring to stuffing the above space with ads and just before reaching the threshold that causes seizures. That’s what Google search is turning into. Now I need to watch this movie. All we're waiting for now is for the AI overview itself to start showing sponsored results (ads)."
      },
      {
        "AUTHOR": "ManagedNerds",
        "TEXT": "Now I need to watch this movie. All we're waiting for now is for the AI overview itself to start showing sponsored results (ads)."
      },
      {
        "AUTHOR": "mite189",
        "TEXT": "Good SEO practices actually land your brand into the AI overview. Unless you’re a purely informational site getting revenue through ads or affiliates. In that case unfortunately your time is up. But for real businesses selling actual goods or services, SEO is just as important as before. My brands rank in the AI overview. It’s not as bleak as you may think :) And how are your CTR's looking from those teeny, weeny little AI overview links?"
      }
    ]
  }
]
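
Note that UPVOTES and COMMENT_COUNT arrive as strings in the example above, so downstream code will usually want to convert them. Here is a minimal Python sketch of that step. The function normalize_post and the lowercase output keys are assumptions made for illustration, and the sample record is an abbreviated copy of the example output.

```python
def normalize_post(raw):
    """Convert one scraped post into a dict with numeric counts.

    normalize_post is a hypothetical helper, not part of the actor.
    """
    def to_int(value):
        # Counts are strings in the example ("168"). Anything that is not a
        # plain integer becomes None rather than raising.
        try:
            return int(str(value).replace(",", ""))
        except ValueError:
            return None

    return {
        "title": raw.get("TITLE", ""),
        "author": raw.get("AUTHOR", ""),
        "url": raw.get("URL", ""),
        "upvotes": to_int(raw.get("UPVOTES")),
        "comment_count": to_int(raw.get("COMMENT_COUNT")),
        "comments": [
            {"author": c.get("AUTHOR", ""), "text": c.get("TEXT", "")}
            for c in raw.get("comments", [])
        ],
    }


# Abbreviated copy of the example record (comment text truncated).
sample = {
    "TITLE": "What's even the point in trying anymore?",
    "AUTHOR": "jpeach17",
    "URL": "https://www.reddit.com/r/SEO/comments/1l73oxl/whats_even_the_point_in_trying_anymore/",
    "UPVOTES": "168",
    "COMMENT_COUNT": "120",
    "comments": [
        {"AUTHOR": "ManagedNerds", "TEXT": "Now I need to watch this movie."},
    ],
}

post = normalize_post(sample)
print(post["upvotes"], post["comment_count"], len(post["comments"]))
```

As in the example output, COMMENT_COUNT (120) can be larger than the number of comments actually returned, so it is safer to count the entries in comments than to rely on the reported total.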

🚀 Use Cases

  • 📊 Social Media Analysis – Track trends, sentiments, and discussions in specific Reddit communities.
  • 🕵️ Market Research – Understand user opinions on products, services, or industries.
  • 🤖 Content Monitoring – Monitor subreddit activity for brand mentions or competitor insights.
  • 📈 SEO Insights – Analyze discussions in SEO-related subreddits to inform strategies.
  • 🧠 Sentiment Analysis – Extract comments for natural language processing or sentiment modeling.
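
For the content-monitoring use case, one simple approach is to count keyword mentions across post titles, descriptions, and comments. The Python sketch below shows one way to do that with the output format shown earlier. The function count_mentions is an assumption made for illustration, and the sample texts are shortened excerpts of the example output.

```python
import re
from collections import Counter


def count_mentions(posts, keywords):
    """Count case-insensitive whole-phrase matches per keyword.

    count_mentions is a hypothetical helper, not part of the actor.
    """
    patterns = {
        kw: re.compile(rf"\b{re.escape(kw)}\b", re.IGNORECASE) for kw in keywords
    }
    counts = Counter()
    for post in posts:
        # Search the post title and body plus every comment text.
        texts = [post.get("TITLE", ""), post.get("DESCRIPTION", "")]
        texts += [c.get("TEXT", "") for c in post.get("comments", [])]
        for text in texts:
            for kw, pattern in patterns.items():
                counts[kw] += len(pattern.findall(text))
    return counts


# Shortened excerpts of the example output, for illustration only.
posts = [
    {
        "TITLE": "What's even the point in trying anymore?",
        "DESCRIPTION": "SEO is just one part of a much wider role for me",
        "comments": [
            {
                "AUTHOR": "mite189",
                "TEXT": "Good SEO practices actually land your brand into the AI overview.",
            }
        ],
    }
]

mentions = count_mentions(posts, ["SEO", "AI overview"])
print(dict(mentions))
```

The same loop can feed comment texts into a sentiment model instead of a counter; only the inner step changes.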

✅ Best Practices

  • Use Residential Proxies: Enable RESIDENTIAL proxy groups to minimize blocks and ensure reliable scraping.
  • Set Reasonable Limits: Use itemLimit to control the number of posts scraped and avoid overloading Reddit’s servers.
  • Validate URLs: Ensure subreddit or post URLs are correct to prevent errors.
  • Batch Requests: For large-scale scraping, break requests into smaller batches to maintain performance.
  • Respect Rate Limits: Avoid excessive requests in a short period to prevent temporary bans.
  • Monitor Outputs: Regularly check JSON outputs for consistency and completeness.
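
The URL validation and batching advice above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the regular expression only covers the two URL shapes that appear in this document (a subreddit URL and a post URL), and the helper names split_valid and batches are hypothetical.

```python
import re

# Assumed URL shapes, based on the examples in this document:
#   subreddit: https://www.reddit.com/r/<name>
#   post:      https://www.reddit.com/r/<name>/comments/<id>/<slug>/
REDDIT_URL = re.compile(
    r"^https?://(?:www\.)?reddit\.com/r/[A-Za-z0-9_]+"
    r"(?:/comments/[a-z0-9]+(?:/[^/\s]*)?)?/?$"
)


def split_valid(urls):
    """Return (valid, invalid) lists so bad URLs can be reviewed before a run."""
    valid, invalid = [], []
    for url in urls:
        (valid if REDDIT_URL.match(url) else invalid).append(url)
    return valid, invalid


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


candidates = [
    "https://www.reddit.com/r/SEO",
    "https://www.reddit.com/r/SEO/comments/1l73oxl/whats_even_the_point_in_trying_anymore/",
    "https://example.com/r/SEO",
]
valid, invalid = split_valid(candidates)
for batch in batches(valid, 1):
    print(batch)  # each batch would become the "urls" array of one run
```

Running one actor run per batch keeps each run small; the batch size is a tuning choice rather than something the actor prescribes.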

🛠️ Advanced Configuration

  • Filtering Posts: Specify individual post URLs in the urls array for targeted scraping.
  • Scaling Up: Increase itemLimit for broader subreddit scraping, but monitor proxy usage to avoid rate-limiting.
  • Custom Parsing: Contact DataWizards for custom fields (e.g., post timestamps, subreddit metadata) or tailored output formats.

🤝 Support

Need help with setup, custom outputs, or integration?

DataWizards is ready to assist!

📩 Email: hello.datawizard@gmail.com
✉️ Subject: Reddit Post Scraper – Support Request
🔗 Connect: linkedin.com/in/data-wizards-aa8080342


🧰 Need Something More?

Looking for custom scraping logic, API integrations, or simplified data formats? DataWizards specializes in tailored data solutions to meet your needs.

Contact us for personalized support.
💡 DataWizards = Fast, scalable, and reliable data extraction.


💬 Feedback & Bugs

Found a bug or have a feature request?

📧 Email: hello.datawizard@gmail.com
🛠️ Subject: Bug Report – Reddit Post Scraper


🏁 Start scraping Reddit smarter with Reddit Post Scraper — your go-to tool for unlocking insights from Reddit’s vibrant communities.