Reddit Posts & Comments Scraper

Pricing

Pay per event

Extract Reddit posts and comments from any subreddit, search query, or user profile. Collect titles, scores, comments, media URLs, and 40+ fields per post. Supports multiple subreddits, advanced filtering by score, flair, domain, and post type, plus optional comment enrichment.

Rating: 0.0 (0)

Developer: ParseForge (Maintained by Community)

Actor stats: 2 bookmarked · 72 total users · 18 monthly active users · last modified 4 days ago

📱 Reddit Posts and Comments Scraper

🚀 Pull Reddit posts and full comment trees in minutes. Subreddits, search, user profiles, multi-subreddits, post comments. 40+ fields per post. No login.

🕒 Last updated: 2026-05-08 · 📊 40+ fields per post · 🔍 5 modes · 🚫 No auth required

Pull live Reddit posts and comments from any subreddit, search query, user profile, multi-subreddit feed, or specific post. The actor walks Reddit's public API surface for the mode you pick and returns one structured record per post or comment ready for trend analysis, content research, brand listening, or community studies.

Every run fetches data live so you get the current state of Reddit at run time, not a stale dump. Records include title, full text, author, score, upvote ratio, comment count, timestamps, awards, flair, domain, and (with comment mode or includeComments) the full nested comment tree.

| 👥 Built for | 🎯 Primary use cases |
|---|---|
| Brand and social listening | Track Reddit mentions of your brand |
| Content research teams | Mine Reddit for content ideas and angles |
| Market researchers | Study sentiment and discussion topics |
| Crisis monitoring | Watch for reputation events in real time |
| Marketing and growth | Identify trending topics for content marketing |
| Researchers | Study community dynamics and discussion patterns |

📋 What the Reddit Posts and Comments Scraper does

  • 🎯 Five scraping modes. Subreddit, search, user profile, multi-subreddit, or specific post comments.
  • 📊 Rich metadata. Title, body, score, upvote ratio, comment count, awards, flair, domain.
  • 💬 Comment trees. Optional nested comment tree (with includeComments) or comments-only mode.
  • 👤 Author info. Username, post karma, account age (where Reddit exposes them).
  • 🔍 Filter by metadata. Min score, post type (text / link / image / video), flair, domain.
  • 🗓️ Time filters. All time, year, month, week, day, hour.

The scraper accepts a mode plus the matching inputs (subreddit name, search query, username, list of subreddits, or post URL). It walks Reddit's public surface and pushes structured records to the dataset as posts are processed.
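
As a rough illustration of "a mode plus the matching inputs", the sketch below checks an input object before a run is started. The mode-to-field mapping is read off the Input table in this README; the `check_input` helper itself is hypothetical and is not part of the Actor, whose own validation may differ.

```python
# Which input field each mode relies on (assumption: derived from the Input table).
REQUIRED_FIELD = {
    "subreddit": "subreddit",
    "search": "searchQuery",
    "user": "username",
    "multi": "subreddits",
    "comments": "postUrl",
}

# Prefixes the Input table says to leave out.
FORBIDDEN_PREFIX = {"subreddit": "r/", "username": "u/"}


def check_input(actor_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks usable."""
    mode = actor_input.get("mode")
    if mode not in REQUIRED_FIELD:
        return [f"unknown mode: {mode!r}"]

    field = REQUIRED_FIELD[mode]
    value = actor_input.get(field)
    if not isinstance(value, str) or not value.strip():
        return [f"mode {mode!r} needs a non-empty {field!r}"]

    prefix = FORBIDDEN_PREFIX.get(field)
    if prefix and value.startswith(prefix):
        return [f"{field!r} should be given without the {prefix!r} prefix"]
    return []
```

Running a check like this locally catches a mismatched mode and input before any paid events are triggered.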

💡 Why it matters: Reddit is one of the largest public discussion forums on the web, but its UI offers no bulk export. A live, structured pull beats manual copy-and-paste for brand listening, content research, and trend analysis.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing setup, a live run, and how to pipe results into Slack via Apify integrations.


⚙️ Input

| Field | Type | Name | Description |
|---|---|---|---|
| mode | enum | Mode | subreddit, search, user, multi, or comments. |
| maxItems | integer | Max Items | Free users: limited to 10 items (preview). Paid users: optional, max 1,000,000. |
| subreddit | string | Subreddit | Subreddit name without r/ prefix (e.g. technology, programming). |
| subreddits | string | Subreddits List | Comma-separated subreddit list for multi mode. |
| searchQuery | string | Search Query | Free-text search query for search mode. |
| searchInSubreddit | string | Search Within Subreddit | Restrict search to a specific subreddit. |
| username | string | Username | Reddit username (without u/) for user mode. |
| postUrl | string | Post URL | Direct Reddit post URL for comments mode. |
| sort | enum | Sort | hot, new, top, rising, controversial, relevance. |
| timeFilter | enum | Time Filter | all, year, month, week, day, hour. |
| minScore | integer | Min Score | Lower bound on post score. |
| includeComments | boolean | Include Comments | When true, fetch top comments per post. |

Example 1. Top-scored posts from r/technology over the past week.

{
  "mode": "subreddit",
  "subreddit": "technology",
  "sort": "top",
  "timeFilter": "week",
  "maxItems": 50
}

Example 2. All comments from a specific post.

{
  "mode": "comments",
  "postUrl": "https://www.reddit.com/r/programming/comments/abc123/example_post/",
  "maxItems": 200
}

⚠️ Good to Know: in comments mode, the dataset returns one record per comment (not per post). Switch to subreddit mode with includeComments: true if you want both posts and their top comments in the same run.
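
When a pipeline mixes post and comment records, they can be told apart by the `id` prefix. The sample records in this README use Reddit's usual "fullname" prefixes (`t3_` for posts, `t1_` for comments); the sketch below assumes every record carries such an id, which is an assumption and not something the Actor documents as a guarantee.

```python
def split_records(records):
    """Separate post records (t3_) from comment records (t1_) by id prefix."""
    posts, comments = [], []
    for record in records:
        # "t3_abc123" -> "t3"; a missing id yields an empty prefix.
        kind = str(record.get("id", "")).split("_", 1)[0]
        if kind == "t3":
            posts.append(record)
        elif kind == "t1":
            comments.append(record)
        # Anything else is left out of both lists instead of being guessed at.
    return posts, comments
```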


📊 Output

The dataset returns one structured record per post or comment. Each record carries identifiers, content, author info, engagement metrics, timestamps, and a back-reference URL. Consume the dataset as JSON, CSV, Excel, XML, or RSS via the Apify console or API.
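
For API access, the export format is chosen with a `format` query parameter on the Apify dataset items endpoint. The helper below is a minimal sketch that only builds the URL; `dataset_items_url` is a hypothetical name, the dataset ID comes from your own run, and a non-public dataset additionally needs your API token.

```python
from urllib.parse import urlencode

# Formats named in this README, using the API's spelling (Excel is "xlsx").
EXPORT_FORMATS = {"json", "csv", "xlsx", "xml", "rss"}


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the download URL for a dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit  # cap the number of returned items
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"
```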

🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🆔 id | string | t3_abc123 |
| 📝 title | string | Show HN: Open-source alternative to Notion |
| 📃 selfText | string | Hey everyone, I just launched... |
| 👤 author | string | developer_jane |
| 📱 subreddit | string | technology |
| 📊 score | number | 4520 |
| 💬 numComments | number | 312 |
| 📈 upvoteRatio | number | 0.96 |
| 🏷️ linkFlairText | string | Discussion |
| 🌐 domain | string | self.technology |
| 🔞 over18 | boolean | false |
| 🎯 spoiler | boolean | false |
| 📌 stickied | boolean | false |
| 📝 isSelf | boolean | true |
| 🏆 awards | number | 12 |
| 🔗 url | string (url) | https://example.com/launch |
| 🔗 permalink | string (url) | https://www.reddit.com/r/technology/comments/abc123/... |
| 📅 createdAt | ISO datetime | 2026-04-12T14:30:00.000Z |
| 📅 scrapedAt | ISO datetime | 2026-05-08T12:00:00.000Z |
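
The two timestamp fields are ISO 8601 strings in UTC with a trailing `Z`. A small sketch of turning them into `datetime` objects and measuring how old a post was when it was scraped is shown below; `parse_ts` and `age_at_scrape` are illustrative helper names, not part of the Actor.

```python
from datetime import datetime, timedelta


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-04-12T14:30:00.000Z."""
    # Older Python versions reject the "Z" suffix, so spell out the UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def age_at_scrape(record: dict) -> timedelta:
    """How long the post or comment had existed when the run collected it."""
    return parse_ts(record["scrapedAt"]) - parse_ts(record["createdAt"])
```

Dividing `score` by this age in hours gives a simple engagement-rate figure for comparing posts of different ages.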

📦 Sample records

1. Typical text post (with discussion)

{
  "id": "t3_abc123",
  "title": "Show HN: Open-source alternative to Notion",
  "selfText": "Hey everyone, I just launched a new open-source note-taking app...",
  "author": "developer_jane",
  "subreddit": "technology",
  "score": 4520,
  "numComments": 312,
  "upvoteRatio": 0.96,
  "linkFlairText": "Discussion",
  "domain": "self.technology",
  "over18": false,
  "isSelf": true,
  "awards": 12,
  "permalink": "https://www.reddit.com/r/technology/comments/abc123/show_hn_open_source_alternative_to_notion/",
  "createdAt": "2026-04-12T14:30:00.000Z",
  "scrapedAt": "2026-05-08T12:00:00.000Z"
}

2. External link post

{
  "id": "t3_def456",
  "title": "OpenAI announces GPT-7",
  "selfText": "",
  "author": "ai_news_bot",
  "subreddit": "technology",
  "score": 12450,
  "numComments": 1820,
  "upvoteRatio": 0.93,
  "linkFlairText": "Software",
  "domain": "openai.com",
  "isSelf": false,
  "url": "https://openai.com/blog/gpt-7-launch",
  "permalink": "https://www.reddit.com/r/technology/comments/def456/openai_announces_gpt7/",
  "createdAt": "2026-05-01T09:00:00.000Z",
  "scrapedAt": "2026-05-08T12:00:00.000Z"
}

3. Comment record (sparse)

{
  "id": "t1_xyz789",
  "author": "anon_commenter",
  "subreddit": "programming",
  "selfText": "Great write-up. I had similar issues with the v3 API.",
  "score": 47,
  "permalink": "https://www.reddit.com/r/programming/comments/abc123/example_post/xyz789/",
  "createdAt": "2026-05-07T18:00:00.000Z",
  "scrapedAt": "2026-05-08T12:00:00.000Z"
}
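
Records shaped like the samples above can also be narrowed after download. The sketch below is a client-side filter over the documented fields (`score`, `linkFlairText`, `domain`, `isSelf`); the `matches` helper and its parameter names are hypothetical and do not correspond to Actor inputs.

```python
def matches(record, *, min_score=0, flair=None, domain=None, self_posts_only=False):
    """Return True when a record passes every filter that was set."""
    if record.get("score", 0) < min_score:
        return False
    if flair is not None and record.get("linkFlairText") != flair:
        return False
    if domain is not None and record.get("domain") != domain:
        return False
    if self_posts_only and not record.get("isSelf", False):
        return False
    return True


# Usage: keep link posts from one domain with a score of at least 1,000.
# kept = [r for r in records if matches(r, min_score=1000, domain="openai.com")]
```

Sparse records, such as comments, simply lack some fields, so the helper reads everything with `.get()` and treats a missing value as a non-match for that filter.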

✨ Why choose this Actor

  • 🎯 Built for the job. Scoped specifically to Reddit so you skip the parser engineering entirely.
  • 🔖 Structured output. Clean, typed fields ready for analysis, dashboards, or downstream pipelines.
  • ⚡ Fast. Optimized request patterns return results in seconds, not minutes.
  • 🔁 Always fresh. Every run pulls live data, so the dataset reflects Reddit as of run time.
  • 🌐 No infra to manage. Apify handles proxies, retries, scaling, scheduling, and storage.
  • 🛡️ Reliable. Battle-tested across many runs and edge cases, with graceful error handling.
  • 🚫 No code required. Configure in the UI, run from the CLI, schedule via cron, or call from any language through the Apify API.

📊 Production-grade structured Reddit data without the engineering overhead of building and maintaining your own scraper.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ Reddit Posts and Comments Scraper (this Actor) | $5 free credit, then pay-per-use | All public Reddit content | Live per run | 5 modes + sort + score filters | ⚡ 2 min |
| Build your own scraper | Engineering hours | Full once built | Whenever you maintain it | Custom code | 🐢 Days to weeks |
| Reddit official API | Free with limits | Full | Live | Limited | ⏳ Hours of integration |
| Manual Reddit search | Hours per check | Limited | Stale | Manual | 🕒 Variable |

Pick this Actor when you want broad coverage, source-native filtering, and no pipeline maintenance.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the Reddit Posts and Comments Scraper page on the Apify Store.
  3. 🎯 Pick mode. Choose subreddit, search, user, multi, or comments mode and set the matching inputs.
  4. 🚀 Run it. Click Start and let the Actor collect your data.
  5. 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.


💼 Business use cases

📊 Brand and social listening

  • Monitor brand mentions across subreddits
  • Track competitor discussions and sentiment
  • Build crisis-response dashboards
  • Map share-of-voice in your industry

🏢 Content and marketing

  • Mine Reddit for content ideas and angles
  • Identify high-engagement post formats by subreddit
  • Build content libraries from top-performing posts
  • Track trending topics for marketing campaigns

🎯 Market research

  • Study customer sentiment around products
  • Surface real-world use cases and pain points
  • Build qualitative research from authentic discussions
  • Track community-driven product feedback

🛠️ Engineering and product

  • Prototype social-listening products without owning a crawler
  • Replace fragile in-house Reddit scrapers
  • Wire datasets into your apps via the Apify API or webhooks
  • Skip the proxy, retry, and parsing maintenance entirely

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Empirical datasets for papers, thesis work, and coursework
  • Longitudinal studies tracking changes across snapshots
  • Reproducible research with cited, versioned data pulls
  • Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

  • Side projects, portfolio demos, and indie app launches
  • Data visualizations, dashboards, and infographics
  • Content research for bloggers, YouTubers, and podcasters
  • Hobbyist collections and personal trackers

🤝 Non-profit and civic

  • Transparency reporting and accountability projects
  • Advocacy campaigns backed by public-interest data
  • Community-run databases for local issues
  • Investigative journalism on public records

🧪 Experimentation

  • Prototype AI and machine-learning pipelines with real data
  • Validate product-market hypotheses before engineering spend
  • Train small domain-specific models on niche corpora
  • Test dashboard concepts with live input

🔌 Automating Reddit Posts and Comments Scraper

This Actor exposes a REST endpoint, so you can drive it from any language or workflow tool.
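
As a hedged sketch of what a call could look like with only the Python standard library, the helper below prepares a POST to Apify's `run-sync-get-dataset-items` endpoint, which starts a run and returns the dataset items in one response. The Actor ID and token are placeholders you must supply from the Apify Console; `build_run_request` is an illustrative name, and the synchronous endpoint is best suited to short runs.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, actor_input):
    """Prepare (but do not send) a synchronous run request for an Actor."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Usage (placeholders, not real values):
# req = build_run_request("<username>~<actor-name>", "<APIFY_TOKEN>",
#                         {"mode": "subreddit", "subreddit": "technology", "maxItems": 10})
# with urllib.request.urlopen(req) as resp:
#     items = json.load(resp)
```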

Schedules. Use Apify Scheduler to capture daily snapshots of trending posts. Combine with the Apify dataset diff tools to track new posts and engagement velocity between runs.
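
If you prefer to compare snapshots yourself, the idea of "new posts and engagement velocity between runs" can be sketched in a few lines. This is a hand-rolled illustration over two downloaded datasets keyed by `id`, not an Apify feature; `diff_snapshots` is a hypothetical helper.

```python
def diff_snapshots(previous, current):
    """Compare two runs: which posts are new, and how did scores move?"""
    prev_by_id = {record["id"]: record for record in previous}

    # Posts seen in the current run but not in the previous one.
    new_posts = [r for r in current if r["id"] not in prev_by_id]

    # Score change for posts present in both runs.
    score_delta = {
        r["id"]: r["score"] - prev_by_id[r["id"]]["score"]
        for r in current
        if r["id"] in prev_by_id
    }
    return new_posts, score_delta
```

Dividing each score change by the time between the two runs' `scrapedAt` values gives a per-hour velocity.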


🔌 Integrate with any app

Reddit Posts and Comments Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications in your channels
  • Airbyte - Pipe results into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes.
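
A receiving service typically needs just the dataset ID of the finished run. The sketch below assumes the webhook uses Apify's default payload template, in which `eventType` names the event and `resource` holds the run object with a `defaultDatasetId`; a custom payload template would change these field names, and `dataset_id_from_webhook` is a hypothetical helper.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: bytes) -> Optional[str]:
    """Return the run's dataset ID for a successful run, else None."""
    payload = json.loads(body)
    # Ignore failed, aborted, or timed-out runs.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```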


💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom project, or report an issue.


⚠️ Disclaimer. This Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Reddit. All trademarks mentioned are the property of their respective owners. The scraper accesses only publicly available pages and is intended for legitimate research, analytics, and brand-listening use. Users are responsible for compliance with Reddit's API Terms of Use, applicable privacy laws, and any data-protection rules that apply.