Reddit Scraper Markdown n8n ready | Posts, Comments, Images
Pricing
Pay per event
Extract Reddit posts and comments as LLM-ready Markdown. No API key needed. Direct n8n/Make integration: connect output to AI nodes instantly. 20x faster than browser scrapers. Perfect for lead gen, product validation, and market research workflows.
Developer

ClearPath
Reddit Scraper for LLM & RAG | Posts, Comments & Images (2025)
The fastest, most cost-effective way to extract Reddit data for AI workflows. No browser overhead, no rate limit headaches, just clean JSON + LLM-ready Markdown output that plugs directly into n8n, Make, or any automation platform.
- Blazing fast - Pure HTTP requests, no browser simulation
- Incredibly cheap - $0.30 for 100 posts with comments (flat rate, any comment count)
- LLM-optimized - Markdown output ready for GPT, Claude, Gemini
- n8n native - Designed for workflow automation
Why This Actor?
Most Reddit scrapers give you raw JSON that needs heavy transformation before LLMs can use it. This Actor outputs pre-formatted Markdown alongside structured JSON, so you can feed it directly to an AI node without writing a single line of code.
Perfect for n8n Workflows
Lead Generation from Pain Points
reddit-to-llm (search: "looking for", "need help with") → LLM qualify leads → CRM/Email sequence
Find people actively seeking solutions you provide.
Product Validation Pipeline
Webhook (new idea) → reddit-to-llm (search related subreddits) → LLM analyze: demand signals, objections, existing solutions → Structured report
Before building, validate if people actually want it. The markdown format lets LLMs deeply analyze threaded discussions.
Key Features
Lightning Fast Extraction
- No browser overhead - Direct data extraction, not Puppeteer/Playwright
- 20 concurrent requests - Process multiple posts simultaneously
- Automatic deduplication - No duplicate posts across modes
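The Actor's deduplication logic is not documented here. As a rough illustration only, merging results from several modes while keeping one copy of each post could look like the sketch below. Keying on the `id` field is an assumption, and `merge_unique_posts` is a hypothetical helper, not part of the Actor.

```python
from typing import Iterable


def merge_unique_posts(*batches: Iterable[dict]) -> list[dict]:
    """Merge post batches, keeping the first occurrence of each post id."""
    seen: set[str] = set()
    merged: list[dict] = []
    for batch in batches:
        for post in batch:
            post_id = post["id"]
            if post_id in seen:
                continue  # already collected by an earlier batch
            seen.add(post_id)
            merged.append(post)
    return merged


# Hypothetical results from two modes that overlap on one post.
search_hits = [{"id": "1prkwnx", "title": "A"}, {"id": "abc123", "title": "B"}]
feed_hits = [{"id": "abc123", "title": "B"}, {"id": "xyz789", "title": "C"}]

unique_posts = merge_unique_posts(search_hits, feed_hits)
print([post["id"] for post in unique_posts])
```

The same helper can be reused downstream if you combine datasets from several separate runs.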
Three Collection Modes
- Search - Global or subreddit-restricted keyword search
- Subreddit Feeds - Hot, new, top, rising posts
- Direct URLs - Scrape specific posts by URL
LLM-Ready Output
- Markdown field - Formatted for direct AI consumption
- Flat comments with depth - Easy to process, depth signals conviction
- OP markers - Know when the author replies
Optional Image Extraction
- Preview images, galleries, direct links (i.redd.it, imgur)
- Stored to Apify Key-Value Store with public URLs
- Ready for vision models (GPT-4V, Claude)
Pricing (Pay Per Event)
Transparent, predictable pricing. Only pay for what you extract.
| Event | Price |
|---|---|
| Post scraped (without comments) | $0.001 |
| Post scraped (with comments) | $0.003 |
| Image scraped | $0.0005 |
Flat rate per post - Whether a post has 1 comment or 500 comments, the price is the same ($0.003).
Cost Examples
| Scenario | Posts | Comments? | Images | Total Cost |
|---|---|---|---|---|
| Posts only | 100 | No | 0 | $0.10 |
| Posts + comments | 100 | Yes (any count) | 0 | $0.30 |
| Deep dive | 500 | Yes (any count) | 0 | $1.50 |
| With images | 100 | Yes | 500 | $0.55 |
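The arithmetic behind the table above can be reproduced with a small estimator. This is a minimal sketch using the per-event prices listed in this section. The function and constant names are illustrative, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Per-event prices taken from the pricing table above (USD).
PRICE_POST = Decimal("0.001")
PRICE_POST_WITH_COMMENTS = Decimal("0.003")
PRICE_IMAGE = Decimal("0.0005")


def estimate_cost(posts: int, include_comments: bool = True, images: int = 0) -> Decimal:
    """Estimate the run cost in USD; comment count does not change the price."""
    per_post = PRICE_POST_WITH_COMMENTS if include_comments else PRICE_POST
    return posts * per_post + images * PRICE_IMAGE


scenarios = {
    "Posts only": estimate_cost(100, include_comments=False),
    "Posts + comments": estimate_cost(100),
    "Deep dive": estimate_cost(500),
    "With images": estimate_cost(100, images=500),
}
for name, cost in scenarios.items():
    print(f"{name}: ${cost:.2f}")
```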
Cost optimization tips:
- Set `includeComments: false` if you only need post titles/content (3x cheaper)
- Comment count doesn't affect price - get as many as you need!
- Filter by subreddit to avoid irrelevant posts
Input Configuration
Search Mode
| Parameter | Type | Default | Description |
|---|---|---|---|
| searchKeywords | string[] | [] | Keywords to search (joined with spaces) |
| searchInSubreddits | string[] | [] | Limit search to specific subreddits |
| searchSort | enum | relevance | relevance, new, top, comments |
| searchLimit | integer | 25 | Max posts (1-1000) |
Subreddit Feed Mode
| Parameter | Type | Default | Description |
|---|---|---|---|
| subreddits | string[] | ["indiehackers"] | Subreddits to scrape |
| subredditSort | enum | hot | hot, new, top, rising |
| subredditTimeFilter | enum | - | For top: hour, day, week, month, year, all |
| subredditLimit | integer | 25 | Max posts per subreddit (1-1000) |
Direct URLs Mode
| Parameter | Type | Default | Description |
|---|---|---|---|
| postUrls | string[] | [] | Reddit post URLs or redd.it short links |
Output Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| includeComments | boolean | true | Fetch comments for each post |
| commentsLimit | integer | 100 | Max comments per post (0 = all, max 1000) |
| scrapeImages | boolean | false | Extract and store images |
| proxyConfiguration | object | Residential | Apify Proxy settings |
Output Schema
Each dataset item represents one Reddit post (real example):
{
  "id": "1prkwnx",
  "title": "Product Developer (15y SaaS/Apps) seeking Marketing/Sales co-builder",
  "author": "ManuelWenner",
  "created_utc": "2025-12-20T18:18:21+00:00",
  "permalink": "/r/indiehackers/comments/1prkwnx/product_developer_15y_saasapps_seeking/",
  "url": "https://www.reddit.com/r/indiehackers/comments/1prkwnx/...",
  "selftext": "Hey folks,\n\nI've been building digital products for ~15 years...",
  "score": 2,
  "upvote_ratio": 0.75,
  "num_comments": 8,
  "subreddit": "indiehackers",
  "subreddit_details": {
    "name": "indiehackers",
    "title": "Independent developers building their own way",
    "description": "IndieHackers is a subreddit focused on people who bootstrap their way to success by building products.",
    "subscribers": 140617,
    "active_users": null,
    "created_utc": "2016-09-26T12:05:56+00:00",
    "over_18": false,
    "subreddit_type": "public",
    "url": "https://www.reddit.com/r/indiehackers/"
  },
  "is_nsfw": false,
  "is_spoiler": false,
  "link_flair_text": "General Question",
  "comments": [
    {
      "id": "nv30qrn",
      "body": "How do people find Matchplan?",
      "author": "scarfwizard",
      "score": 1,
      "depth": 0,
      "created_utc": "2025-12-20T20:02:18+00:00",
      "parent_id": "t3_1prkwnx",
      "is_submitter": false
    },
    {
      "id": "nv33n4g",
      "body": "Currently I'm in such an early stage...",
      "author": "ManuelWenner",
      "score": 1,
      "depth": 1,
      "parent_id": "t1_nv30qrn",
      "is_submitter": true
    }
  ],
  "images": [],
  "markdown": "# Product Developer seeking Marketing/Sales co-builder\n\n**2 upvotes** | 8 comments | u/ManuelWenner | 2025-12-20\n\nHey folks...\n\n---\n\n## Comments\n\n**[1] u/scarfwizard** How do people find Matchplan?\n> **[1] u/ManuelWenner (OP)** Currently I'm in such an early stage...\n"
}
Output Fields
Post data: id, title, author, created_utc, permalink, url, selftext, score, upvote_ratio, num_comments, subreddit, subreddit_details (subscribers, description, etc.), is_nsfw, is_spoiler, link_flair_text
Comments: Flat list with depth (0 = top-level), is_submitter marks OP replies
Markdown: Pre-formatted for LLM consumption with nested blockquotes for replies
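If a nested structure is needed, the flat comment list can be regrouped by `parent_id`. The sketch below assumes, as in the example item above, that a `t1_` prefix points to a parent comment and a `t3_` prefix points to the post itself. `build_comment_tree` and the `replies` key are hypothetical names introduced for illustration.

```python
def build_comment_tree(comments: list[dict]) -> list[dict]:
    """Turn the flat comment list into nested dicts with a 'replies' list."""
    # Copy each comment so the original dataset item is left untouched.
    nodes = {c["id"]: {**c, "replies": []} for c in comments}
    roots: list[dict] = []
    for comment in comments:
        node = nodes[comment["id"]]
        parent_id = comment.get("parent_id", "")
        # "t1_<id>" refers to a comment; anything else is treated as top-level.
        parent = nodes.get(parent_id[3:]) if parent_id.startswith("t1_") else None
        if parent is not None:
            parent["replies"].append(node)
        else:
            # Top-level comment, or the parent was not included in this item.
            roots.append(node)
    return roots


# The two comments from the example item above, trimmed to the relevant fields.
flat_comments = [
    {"id": "nv30qrn", "author": "scarfwizard", "depth": 0,
     "parent_id": "t3_1prkwnx", "is_submitter": False},
    {"id": "nv33n4g", "author": "ManuelWenner", "depth": 1,
     "parent_id": "t1_nv30qrn", "is_submitter": True},
]

tree = build_comment_tree(flat_comments)
print(tree[0]["author"], "->", tree[0]["replies"][0]["author"])
```

Replies whose parent is missing from the list are kept as roots rather than dropped, so no comment is lost when the list is truncated.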
Example Inputs
Search Mode (Global)
{
  "searchKeywords": ["indiehacker", "pain points"],
  "searchSort": "relevance",
  "searchLimit": 50,
  "includeComments": true,
  "commentsLimit": 100
}
Subreddit Feed
{
  "subreddits": ["indiehackers", "SaaS", "startups"],
  "subredditSort": "top",
  "subredditTimeFilter": "week",
  "subredditLimit": 100
}
Direct URLs
{
  "postUrls": [
    "https://www.reddit.com/r/indiehackers/comments/abc123/...",
    "https://redd.it/xyz789"
  ],
  "includeComments": true,
  "commentsLimit": 500
}
Combined (All Modes)
{
  "searchKeywords": ["product feedback"],
  "subreddits": ["SaaS"],
  "postUrls": ["https://redd.it/abc123"],
  "includeComments": true,
  "commentsLimit": 100
}
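When inputs are generated programmatically, a small builder can check the documented ranges before a run is started. This is a sketch only: `build_run_input` is a hypothetical helper, and the range checks mirror the parameter tables above rather than the Actor's own validation.

```python
import json
from typing import Optional


def build_run_input(
    search_keywords: Optional[list[str]] = None,
    subreddits: Optional[list[str]] = None,
    post_urls: Optional[list[str]] = None,
    limit: int = 25,
    include_comments: bool = True,
    comments_limit: int = 100,
) -> dict:
    """Assemble an input object using the parameter names documented above."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    if not 0 <= comments_limit <= 1000:
        raise ValueError("comments_limit must be between 0 and 1000 (0 = all)")

    run_input: dict = {
        "includeComments": include_comments,
        "commentsLimit": comments_limit,
    }
    if search_keywords:
        run_input["searchKeywords"] = list(search_keywords)
        run_input["searchLimit"] = limit
    if subreddits:
        run_input["subreddits"] = list(subreddits)
        run_input["subredditLimit"] = limit
    if post_urls:
        run_input["postUrls"] = list(post_urls)
    return run_input


run_input = build_run_input(
    search_keywords=["product feedback"],
    subreddits=["SaaS"],
    post_urls=["https://redd.it/abc123"],
)
print(json.dumps(run_input, indent=2))
```

The resulting dictionary can then be passed as the run input through whichever client you use to start the Actor, for example the Apify API client or an n8n node.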
Use Cases
For Product Teams
- Voice of Customer - Extract feature requests and complaints from product subreddits
- Competitor Intelligence - Monitor what users say about alternatives
- Product Validation - Search for demand signals before building
For Marketers
- Content Research - Find top-performing topics in your niche
- Lead Generation - Identify users seeking solutions you provide
- Brand Monitoring - Track mentions and sentiment
For Researchers
- Qualitative Analysis - Reddit comments as interview transcripts
- Trend Detection - Early signals from rising posts
- Sentiment Analysis - Community reactions with depth context
Limitations
- Pagination cap: Max 1,000 posts per mode, 1,000 comments per post
- Comment structure: Flat list with `depth` field (not nested JSON tree)
- Images: Common sources supported (preview, i.redd.it, imgur, galleries). Videos not downloaded.
- Rate limits: Handled automatically with exponential backoff
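The Actor's retry internals are not documented here. For readers unfamiliar with the term, exponential backoff generally means doubling the wait between retries up to a cap. The sketch below is a generic illustration: every name, the 1-second base, and the 30-second cap are assumptions, not the Actor's actual settings.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int = 5, base_seconds: float = 1.0,
                   cap_seconds: float = 30.0) -> list[float]:
    """Delays that double on each retry and never exceed the cap."""
    return [min(cap_seconds, base_seconds * 2 ** attempt)
            for attempt in range(max_retries)]


def call_with_backoff(fetch: Callable[[], T], max_retries: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fetch(), waiting longer after each failure before retrying."""
    for delay in backoff_delays(max_retries):
        try:
            return fetch()
        except RuntimeError:
            sleep(delay)
    return fetch()  # final attempt; any error now propagates to the caller


# Simulated endpoint that is rate limited twice, then succeeds.
attempts = {"count": 0}


def flaky_fetch() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


recorded_delays = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky_fetch, sleep=recorded_delays.append)
print(result, recorded_delays)
```

Production implementations often add random jitter to each delay so that parallel workers do not retry at the same moment.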
FAQ
Q: Do I need a Reddit account or API key?
A: No. The Actor extracts publicly available data without authentication.
Q: How fast is it?
A: Very fast. No browser overhead means 100 posts with comments typically complete in under 60 seconds.
Q: Why is the markdown field useful?
A: LLMs process it directly without transformation. Perfect for n8n/Make workflows where you connect Actor output → AI node.
Q: Can I scrape private subreddits?
A: No. Only public subreddits and posts are accessible.
Q: What if a post is deleted?
A: The Actor returns null and continues with other posts.
Q: How do I reduce costs?
A: Set `includeComments: false` if you only need posts (3x cheaper: $0.001 vs $0.003). Comment count doesn't affect price, so there is no need to lower `commentsLimit` for cost reasons; use it only to reduce processing time.
Support
- Email: max@mapa.slmail.me
- Bugs: Use the Issues tab
- Feature requests: Email or Issues
Legal
This Actor extracts publicly available Reddit data. Users are responsible for compliance with Reddit's Terms of Service and applicable data protection regulations (GDPR, CCPA).
Start Extracting Reddit Data Now
Turn Reddit discussions into AI-ready insights in minutes, not hours.
