Reddit Posts & Comments Scraper

Extract Reddit posts and comments from any subreddit, search query, or user profile. Collect titles, scores, comments, media URLs, and 40+ fields per post. Supports multiple subreddits; advanced filtering by score, flair, domain, and post type; and optional comment enrichment.

Pricing: $19.99/month + usage
Rating: 0.0 (0 reviews)
Developer: ParseForge
Actor stats: 2 bookmarked · 42 total users · 5 monthly active users · last modified 15 hours ago
📱 Reddit Posts and Comments Scraper
🚀 Pull Reddit posts and full comment trees in minutes. Subreddits, search, user profiles, multi-subreddit feeds, and post comments. 40+ fields per post. No login.
🕒 Last updated: 2026-05-08 · 📊 40+ fields per post · 🔍 5 modes · 🚫 No auth required
Pull live Reddit posts and comments from any subreddit, search query, user profile, multi-subreddit feed, or specific post. The actor walks Reddit's public API surface for the mode you pick and returns one structured record per post or comment ready for trend analysis, content research, brand listening, or community studies.
Every run fetches data live so you get the current state of Reddit at run time, not a stale dump. Records include title, full text, author, score, upvote ratio, comment count, timestamps, awards, flair, domain, and (with comment mode or includeComments) the full nested comment tree.
| 👥 Built for | 🎯 Primary use cases |
|---|---|
| Brand and social listening | Track Reddit mentions of your brand |
| Content research teams | Mine Reddit for content ideas and angles |
| Market researchers | Study sentiment and discussion topics |
| Crisis monitoring | Watch for reputation events in real time |
| Marketing and growth | Identify trending topics for content marketing |
| Researchers | Study community dynamics and discussion patterns |
📋 What the Reddit Posts and Comments Scraper does
- 🎯 Five scraping modes. Subreddit, search, user profile, multi-subreddit, or specific post comments.
- 📊 Rich metadata. Title, body, score, upvote ratio, comment count, awards, flair, domain.
- 💬 Comment trees. Optional nested comment tree (with `includeComments`) or comments-only mode.
- 👤 Author info. Username, post karma, account age (where Reddit exposes them).
- 🔍 Filter by metadata. Min score, post type (text / link / image / video), flair, domain.
- 🗓️ Time filters. All time, year, month, week, day, hour.
The scraper accepts a mode plus the matching inputs (subreddit name, search query, username, list of subreddits, or post URL). It walks Reddit's public surface and pushes structured records to the dataset as posts are processed.
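Each mode requires its matching input field, so a quick pre-flight check saves failed runs. Below is a minimal sketch of that mode-to-field mapping, derived from the input table in this document; it is illustrative only, not the Actor's actual validation logic.

```python
# Which input fields each mode requires (per the input table below).
# Hypothetical helper for pre-flight checks, not the Actor's own validator.
REQUIRED_BY_MODE = {
    "subreddit": ["subreddit"],
    "search": ["searchQuery"],
    "user": ["username"],
    "multi": ["subreddits"],
    "comments": ["postUrl"],
}

def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks runnable."""
    problems = []
    mode = run_input.get("mode")
    if mode not in REQUIRED_BY_MODE:
        problems.append(f"unknown mode: {mode!r}")
        return problems
    for field in REQUIRED_BY_MODE[mode]:
        if not run_input.get(field):
            problems.append(f"mode {mode!r} requires {field!r}")
    return problems
```

Running this against your input object before starting the Actor catches the most common misconfiguration (a mode without its matching source field).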
💡 Why it matters: Reddit is the largest public discussion forum on the web but its UI lacks bulk export. A live, structured pull beats manual scraping for brand listening, content research, and trend analysis.
🎬 Full Demo
🚧 Coming soon: a 3-minute walkthrough showing setup, a live run, and how to pipe results into Slack via Apify integrations.
⚙️ Input
| Field | Type | Name | Description |
|---|---|---|---|
| `mode` | enum | Mode | `subreddit`, `search`, `user`, `multi`, or `comments`. |
| `maxItems` | integer | Max Items | Free users: limited to 10 items (preview). Paid users: optional, up to 1,000,000. |
| `subreddit` | string | Subreddit | Subreddit name without the r/ prefix (e.g. technology, programming). |
| `subreddits` | string | Subreddits List | Comma-separated subreddit list for `multi` mode. |
| `searchQuery` | string | Search Query | Free-text query for `search` mode. |
| `searchInSubreddit` | string | Search Within Subreddit | Restrict the search to a specific subreddit. |
| `username` | string | Username | Reddit username (without u/) for `user` mode. |
| `postUrl` | string | Post URL | Direct Reddit post URL for `comments` mode. |
| `sort` | enum | Sort | `hot`, `new`, `top`, `rising`, `controversial`, or `relevance`. |
| `timeFilter` | enum | Time Filter | `all`, `year`, `month`, `week`, `day`, or `hour`. |
| `minScore` | integer | Min Score | Lower bound on post score. |
| `includeComments` | boolean | Include Comments | When true, fetch top comments for each post. |
Example 1. Top posts from r/technology this week.

```json
{
  "mode": "subreddit",
  "subreddit": "technology",
  "sort": "top",
  "timeFilter": "week",
  "maxItems": 50
}
```
Example 2. All comments from a specific post.

```json
{
  "mode": "comments",
  "postUrl": "https://www.reddit.com/r/programming/comments/abc123/example_post/",
  "maxItems": 200
}
```
⚠️ Good to Know: in `comments` mode, the dataset returns one record per comment (not per post). Switch to `subreddit` mode with `includeComments: true` if you want both posts and their top comments in the same run.
📊 Output
The dataset returns one structured record per post or comment. Each record carries identifiers, content, author info, engagement metrics, timestamps, and a back-reference URL. Consume the dataset as JSON, CSV, Excel, XML, or RSS via the Apify console or API.
🧾 Schema
| Field | Type | Example |
|---|---|---|
| 🆔 `id` | string | t3_abc123 |
| 📝 `title` | string | Show HN: Open-source alternative to Notion |
| 📃 `selfText` | string | Hey everyone, I just launched... |
| 👤 `author` | string | developer_jane |
| 📱 `subreddit` | string | technology |
| 📊 `score` | number | 4520 |
| 💬 `numComments` | number | 312 |
| 📈 `upvoteRatio` | number | 0.96 |
| 🏷️ `linkFlairText` | string | Discussion |
| 🌐 `domain` | string | self.technology |
| 🔞 `over18` | boolean | false |
| 🎯 `spoiler` | boolean | false |
| 📌 `stickied` | boolean | false |
| 📝 `isSelf` | boolean | true |
| 🏆 `awards` | number | 12 |
| 🔗 `url` | string (URL) | https://example.com/launch |
| 🔗 `permalink` | string (URL) | https://www.reddit.com/r/technology/comments/abc123/... |
| 📅 `createdAt` | ISO datetime | 2026-04-12T14:30:00.000Z |
| 📅 `scrapedAt` | ISO datetime | 2026-05-08T12:00:00.000Z |
📦 Sample records
1. Typical text post (with discussion)
```json
{
  "id": "t3_abc123",
  "title": "Show HN: Open-source alternative to Notion",
  "selfText": "Hey everyone, I just launched a new open-source note-taking app...",
  "author": "developer_jane",
  "subreddit": "technology",
  "score": 4520,
  "numComments": 312,
  "upvoteRatio": 0.96,
  "linkFlairText": "Discussion",
  "domain": "self.technology",
  "over18": false,
  "isSelf": true,
  "awards": 12,
  "permalink": "https://www.reddit.com/r/technology/comments/abc123/show_hn_open_source_alternative_to_notion/",
  "createdAt": "2026-04-12T14:30:00.000Z",
  "scrapedAt": "2026-05-08T12:00:00.000Z"
}
```
2. External link post
```json
{
  "id": "t3_def456",
  "title": "OpenAI announces GPT-7",
  "selfText": "",
  "author": "ai_news_bot",
  "subreddit": "technology",
  "score": 12450,
  "numComments": 1820,
  "upvoteRatio": 0.93,
  "linkFlairText": "Software",
  "domain": "openai.com",
  "isSelf": false,
  "url": "https://openai.com/blog/gpt-7-launch",
  "permalink": "https://www.reddit.com/r/technology/comments/def456/openai_announces_gpt7/",
  "createdAt": "2026-05-01T09:00:00.000Z",
  "scrapedAt": "2026-05-08T12:00:00.000Z"
}
```
3. Comment record (sparse)
```json
{
  "id": "t1_xyz789",
  "author": "anon_commenter",
  "subreddit": "programming",
  "selfText": "Great write-up. I had similar issues with the v3 API.",
  "score": 47,
  "permalink": "https://www.reddit.com/r/programming/comments/abc123/example_post/xyz789/",
  "createdAt": "2026-05-07T18:00:00.000Z",
  "scrapedAt": "2026-05-08T12:00:00.000Z"
}
```
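As the sample records show, post ids carry Reddit's `t3_` fullname prefix and comment ids carry `t1_`, so a mixed dataset splits cleanly in post-processing. A minimal sketch:

```python
# Split a mixed dataset into posts (t3_ prefix) and comments (t1_ prefix),
# following Reddit's fullname convention visible in the sample records.
def split_records(records: list[dict]) -> tuple[list[dict], list[dict]]:
    posts = [r for r in records if r.get("id", "").startswith("t3_")]
    comments = [r for r in records if r.get("id", "").startswith("t1_")]
    return posts, comments
```

This is handy after a `subreddit` run with `includeComments: true`, where both record types land in the same dataset.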
✨ Why choose this Actor
| | Capability |
|---|---|
| 🎯 | Built for the job. Scoped specifically to Reddit so you skip the parser engineering entirely. |
| 🔖 | Structured output. Clean, typed fields ready for analysis, dashboards, or downstream pipelines. |
| ⚡ | Fast. Optimized request patterns return results in seconds, not minutes. |
| 🔁 | Always fresh. Every run pulls live data, so the dataset reflects Reddit as of run time. |
| 🌐 | No infra to manage. Apify handles proxies, retries, scaling, scheduling, and storage. |
| 🛡️ | Reliable. Battle-tested across many runs and edge cases, with graceful error handling. |
| 🚫 | No code required. Configure in the UI, run from CLI, schedule via cron, or call from any language with the Apify SDK. |
📊 Production-grade structured Reddit data without the engineering overhead of building and maintaining your own scraper.
📈 How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ Reddit Posts and Comments Scraper (this Actor) | $5 free credit, then pay-per-use | All public Reddit content | Live per run | 5 modes + sort + score filters | ⚡ 2 min |
| Build your own scraper | Engineering hours | Full once built | Whenever you maintain it | Custom code | 🐢 Days to weeks |
| Reddit official API | Free with limits | Full | Live | Limited | ⏳ Hours of integration |
| Manual Reddit search | Hours per check | Limited | Stale | Manual | 🕒 Variable |
Pick this Actor when you want broad coverage, source-native filtering, and no pipeline maintenance.
🚀 How to use
- 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
- 🌐 Open the Actor. Go to the Reddit Posts and Comments Scraper page on the Apify Store.
- 🎯 Pick mode. Choose subreddit, search, user, multi, or comments mode and set the matching inputs.
- 🚀 Run it. Click Start and let the Actor collect your data.
- 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.
⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.
💼 Business use cases
🌟 Beyond business use cases
Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.
🔌 Automating Reddit Posts and Comments Scraper
This Actor exposes a REST endpoint, so you can drive it from any language or workflow tool.
- Node.js - call it via the Apify JS SDK.
- Python - call it via the Apify Python SDK.
- REST - hit it directly through the Apify v2 API.
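For the REST route, the Apify v2 API's `run-sync-get-dataset-items` endpoint starts a run and returns the dataset items in one call. A minimal sketch using only Python's standard library; the actor ID and token below are placeholders for the values shown on your Apify account:

```python
import json
import urllib.request

def build_run_sync_url(actor_id: str, token: str) -> str:
    # run-sync-get-dataset-items starts a run and returns its dataset items.
    return (
        "https://api.apify.com/v2/acts/"
        f"{actor_id}/run-sync-get-dataset-items?token={token}"
    )

def run_actor(actor_id: str, token: str, run_input: dict) -> list[dict]:
    # POST the run input as JSON and parse the returned dataset items.
    req = urllib.request.Request(
        build_run_sync_url(actor_id, token),
        data=json.dumps(run_input).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The official JS and Python SDKs wrap this same endpoint with retries and pagination, so prefer them for production code.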
Schedules. Use Apify Scheduler to capture daily snapshots of trending posts. Combine with the Apify dataset diff tools to track new posts and engagement velocity between runs.
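The snapshot-diff idea above can be sketched in a few lines: key two scheduled runs by post `id` and subtract scores to get per-post engagement velocity. This is an illustrative post-processing helper, not a built-in Actor feature:

```python
# Compare two scheduled snapshots of this Actor's dataset by post id and
# report score growth between runs ("engagement velocity").
def score_velocity(previous: list[dict], current: list[dict]) -> dict[str, int]:
    prev_scores = {r["id"]: r.get("score", 0) for r in previous}
    return {
        r["id"]: r.get("score", 0) - prev_scores[r["id"]]
        for r in current
        if r["id"] in prev_scores  # only posts present in both snapshots
    }
```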
❓ Frequently Asked Questions
💳 Do I need a paid Apify plan to run this actor?
No. You can start right now on the free Apify plan, which includes $5 in monthly credit. That is enough to run the scraper several times and explore the output. Paid plans unlock higher item caps, more concurrent runs, and larger datasets. Create a free Apify account here.
🚨 What happens if my run fails or returns no results?
Failed runs are not charged. If Reddit changes its API surface, proxies get rate-limited, or your filters match nothing, re-run the actor or open our contact form and we will look into it.
📏 How many items can I scrape per run?
Free users are limited to 10 items per run so you can preview the output. Paid users can raise maxItems up to 1,000,000 per run.
🕒 How fresh is the data?
Every run fetches live data at the moment of execution. There is no cache or delay: records reflect what Reddit returned at run time. Schedule the actor to maintain a rolling snapshot.
🧑‍💻 Can I call this Actor from my own code?
Yes. Apify exposes every actor as a REST endpoint and ships first-class SDKs for Node.js and Python. You can start a run, read the dataset, and handle webhooks from your own app in a few lines.
📤 How do I export the data?
Every Apify dataset can be downloaded in one click as CSV, JSON, JSONL, Excel, HTML, XML, or RSS. You can also pull results programmatically via the Apify API or stream into BigQuery, S3, and other destinations through built-in integrations.
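Programmatic export uses the Apify v2 dataset-items endpoint with a `format` query parameter. A small sketch that builds the export URL for a given dataset ID (the ID itself comes from a finished run):

```python
# Build the Apify v2 export URL for a dataset in one of the supported formats.
def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    allowed = {"json", "jsonl", "csv", "xlsx", "html", "xml", "rss"}
    if fmt not in allowed:
        raise ValueError(f"unsupported format: {fmt}")
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"
```

Fetching that URL (with your API token for private datasets) streams the dataset in the chosen format, ready for a warehouse loader or spreadsheet.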
📅 Can I schedule the actor to run automatically?
Yes. Use the Apify scheduler to run the actor on any cadence, from hourly to monthly. Results are saved to your dataset and can be delivered to webhooks, email, Slack, cloud storage, or automation tools such as Zapier and Make.
🏪 Can I use the data commercially?
Yes. The scraped data is yours to use in your own internal pipelines, products, and reports, subject to Reddit's API terms of use and applicable privacy laws.
💼 Which plan should I pick for production use?
Apify's Starter and Scale plans are designed for production workloads. They give you faster instances, more concurrent runs, and higher proxy quotas. Pick the plan that matches your dataset size and refresh cadence.
🛠️ The data I need is not in the output. Can you add it?
Most likely yes. Open the contact form and tell us which field you need. We add fields all the time when there is a clear use case and the source page exposes the data.
⚖️ Is scraping Reddit legal?
This Actor only collects data from publicly accessible Reddit pages, the same content any visitor can read. Public web scraping is generally legal in most jurisdictions for non-personal data, but laws vary by country and use case. You are responsible for compliance with Reddit's API Terms of Use and applicable law.
🔌 Integrate with any app
Reddit Posts and Comments Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get run notifications in your channels
- Airbyte - Pipe results into your warehouse
- GitHub - Trigger runs from commits and releases
- Google Drive - Export datasets straight to Sheets
You can also use webhooks to trigger downstream actions when a run finishes.
🔗 Recommended Actors
- 📱 Reddit Posts Scraper - Lightweight Reddit posts scraper (no comments)
- 🐦 X (Twitter) Scraper - Tweets with engagement and author info
- 💼 LinkedIn Posts Scraper - LinkedIn posts from profiles and companies
- 📸 Instagram Posts Scraper - Instagram posts and reels with metadata
- 📱 TikTok Video Scraper - TikTok videos with engagement metrics
💡 Pro Tip: browse the complete ParseForge collection for more scrapers.
🆘 Need Help? Open our contact form to request a new scraper, propose a custom project, or report an issue.
⚠️ Disclaimer. This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Reddit. All trademarks mentioned are the property of their respective owners. The scraper accesses only publicly available pages and is intended for legitimate research, analytics, and brand-listening use. Users are responsible for compliance with Reddit's API Terms of Use, applicable privacy laws, and any data-protection rules that apply.