🔥 Reddit Scraper Markdown n8n ready | Posts, Comments, Images

Extract Reddit posts and comments as LLM-ready Markdown. No API key needed. Direct n8n/Make integration—connect output to AI nodes instantly. 20x faster than browser scrapers. Perfect for lead gen, product validation, and market research workflows.

Developer: ClearPath (maintained by Community)

🔥 Reddit Scraper for LLM & RAG | Posts, Comments & Images (2025)

The fastest, most cost-effective way to extract Reddit data for AI workflows. No browser overhead, no rate limit headaches—just clean JSON + LLM-ready Markdown output that plugs directly into n8n, Make, or any automation platform.

  • ⚡ Blazing fast - Pure HTTP requests, no browser simulation
  • 💰 Incredibly cheap - $0.30 for 100 posts with comments (flat rate, any comment count)
  • 🤖 LLM-optimized - Markdown output ready for GPT, Claude, Gemini
  • 🔄 n8n native - Designed for workflow automation


Why This Actor?

Most Reddit scrapers give you raw JSON that needs heavy transformation before LLMs can use it. This Actor outputs pre-formatted Markdown alongside structured JSON—feed it directly to an AI node without writing a single line of code.

Perfect for n8n Workflows

Lead Generation from Pain Points

reddit-to-llm (search: "looking for", "need help with")
→ LLM qualify leads
→ CRM/Email sequence

Find people actively seeking solutions you provide.

Product Validation Pipeline

Webhook (new idea)
→ reddit-to-llm (search related subreddits)
→ LLM analyze: demand signals, objections, existing solutions
→ Structured report

Before building, validate if people actually want it. The markdown format lets LLMs deeply analyze threaded discussions.


⚡ Key Features

Lightning Fast Extraction

  • No browser overhead - Direct data extraction, not Puppeteer/Playwright
  • 20 concurrent requests - Process multiple posts simultaneously
  • Automatic deduplication - No duplicate posts across modes
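
How the Actor deduplicates internally is not documented here. As a rough illustration only, merging results from several modes and keeping the first occurrence of each post `id` could look like the sketch below. `merge_unique` is a hypothetical helper, not part of the Actor, and the sample lists are made up.

```python
from typing import Iterable


def merge_unique(*batches: Iterable[dict]) -> list[dict]:
    """Merge post lists from several modes, keeping the first item seen per post id."""
    seen: set[str] = set()
    merged: list[dict] = []
    for batch in batches:
        for post in batch:
            if post["id"] in seen:
                continue  # already collected by an earlier mode
            seen.add(post["id"])
            merged.append(post)
    return merged


# Hypothetical results from search mode and a subreddit feed.
search_hits = [{"id": "1prkwnx"}, {"id": "abc123"}]
feed_hits = [{"id": "abc123"}, {"id": "xyz789"}]

unique_ids = [post["id"] for post in merge_unique(search_hits, feed_hits)]
print(unique_ids)  # ['1prkwnx', 'abc123', 'xyz789']
```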

🎯 Three Collection Modes

  • Search - Global or subreddit-restricted keyword search
  • Subreddit Feeds - Hot, new, top, rising posts
  • Direct URLs - Scrape specific posts by URL

🤖 LLM-Ready Output

  • Markdown field - Formatted for direct AI consumption
  • Flat comments with depth - Easy to process, depth signals conviction
  • OP markers - Know when the author replies

📸 Optional Image Extraction

  • Preview images, galleries, direct links (i.redd.it, imgur)
  • Stored to Apify Key-Value Store with public URLs
  • Ready for vision models (GPT-4V, Claude)

💰 Pricing (Pay Per Event)

Transparent, predictable pricing. Only pay for what you extract.

| Event | Price |
| --- | --- |
| Post scraped (without comments) | $0.001 |
| Post scraped (with comments) | $0.003 |
| Image scraped | $0.0005 |

Flat rate per post - Whether a post has 1 comment or 500 comments, the price is the same ($0.003).

Cost Examples

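| Scenario | Posts | Comments? | Images | Total Cost |
| --- | --- | --- | --- | --- |
| Posts only | 100 | No | 0 | $0.10 |
| Posts + comments | 100 | Yes (any count) | 0 | $0.30 |
| Deep dive | 500 | Yes (any count) | 0 | $1.50 |
| With images | 100 | Yes | 500 | $0.55 |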

Cost optimization tips:

  • Set includeComments: false if you only need post titles/content (3x cheaper)
  • Comment count doesn't affect price - get as many as you need!
  • Filter by subreddit to avoid irrelevant posts
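
To budget a run in advance, the per-event prices above can be turned into a small estimator. This is a minimal sketch using only the three listed prices; `estimate_cost` is a hypothetical helper name, and `Decimal` is used so the totals match the table exactly.

```python
from decimal import Decimal

# Per-event prices taken from the pricing table.
PRICE_POST = Decimal("0.001")
PRICE_POST_WITH_COMMENTS = Decimal("0.003")
PRICE_IMAGE = Decimal("0.0005")


def estimate_cost(posts: int, include_comments: bool = True, images: int = 0) -> Decimal:
    """Estimate the cost in USD of one run; comment count does not affect the price."""
    per_post = PRICE_POST_WITH_COMMENTS if include_comments else PRICE_POST
    return posts * per_post + images * PRICE_IMAGE


print(estimate_cost(100, include_comments=False))  # posts only
print(estimate_cost(100, images=500))              # posts + comments + images
```

The second call reproduces the "With images" row: 100 × $0.003 + 500 × $0.0005 = $0.55.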

Input Configuration

Search Mode

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| searchKeywords | string[] | [] | Keywords to search (joined with spaces) |
| searchInSubreddits | string[] | [] | Limit search to specific subreddits |
| searchSort | enum | relevance | relevance, new, top, comments |
| searchLimit | integer | 25 | Max posts (1-1000) |

Subreddit Feed Mode

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| subreddits | string[] | ["indiehackers"] | Subreddits to scrape |
| subredditSort | enum | hot | hot, new, top, rising |
| subredditTimeFilter | enum | - | For top: hour, day, week, month, year, all |
| subredditLimit | integer | 25 | Max posts per subreddit (1-1000) |

Direct URLs Mode

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| postUrls | string[] | [] | Reddit post URLs or redd.it short links |

Output Settings

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| includeComments | boolean | true | Fetch comments for each post |
| commentsLimit | integer | 100 | Max comments per post (0 = all, max 1000) |
| scrapeImages | boolean | false | Extract and store images |
| proxyConfiguration | object | Residential | Apify Proxy settings |
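
Before starting a run from a workflow, it can help to check an input object against the ranges and enum values documented above. The sketch below is a client-side check only; `validate_input` is a hypothetical helper, and how the Actor itself reacts to out-of-range values is not covered here.

```python
# Allowed values and ranges copied from the parameter tables above.
SEARCH_SORTS = {"relevance", "new", "top", "comments"}
SUBREDDIT_SORTS = {"hot", "new", "top", "rising"}
TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input matches the documented limits."""
    problems: list[str] = []
    for key in ("searchLimit", "subredditLimit"):
        value = actor_input.get(key, 25)
        if not 1 <= value <= 1000:
            problems.append(f"{key} must be between 1 and 1000, got {value}")
    comments_limit = actor_input.get("commentsLimit", 100)
    if not 0 <= comments_limit <= 1000:
        problems.append(f"commentsLimit must be between 0 and 1000, got {comments_limit}")
    if actor_input.get("searchSort", "relevance") not in SEARCH_SORTS:
        problems.append("searchSort must be one of: " + ", ".join(sorted(SEARCH_SORTS)))
    if actor_input.get("subredditSort", "hot") not in SUBREDDIT_SORTS:
        problems.append("subredditSort must be one of: " + ", ".join(sorted(SUBREDDIT_SORTS)))
    time_filter = actor_input.get("subredditTimeFilter")
    if time_filter is not None and time_filter not in TIME_FILTERS:
        problems.append("subredditTimeFilter must be one of: " + ", ".join(sorted(TIME_FILTERS)))
    return problems


print(validate_input({"searchKeywords": ["pain points"], "searchLimit": 50}))  # []
print(validate_input({"searchLimit": 0, "searchSort": "best"}))
```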

Output Schema

Each dataset item represents one Reddit post (real example):

{
  "id": "1prkwnx",
  "title": "Product Developer (15y SaaS/Apps) seeking Marketing/Sales co-builder",
  "author": "ManuelWenner",
  "created_utc": "2025-12-20T18:18:21+00:00",
  "permalink": "/r/indiehackers/comments/1prkwnx/product_developer_15y_saasapps_seeking/",
  "url": "https://www.reddit.com/r/indiehackers/comments/1prkwnx/...",
  "selftext": "Hey folks,\n\nI've been building digital products for ~15 years...",
  "score": 2,
  "upvote_ratio": 0.75,
  "num_comments": 8,
  "subreddit": "indiehackers",
  "subreddit_details": {
    "name": "indiehackers",
    "title": "Independent developers building their own way",
    "description": "IndieHackers is a subreddit focused on people who bootstrap their way to success by building products.",
    "subscribers": 140617,
    "active_users": null,
    "created_utc": "2016-09-26T12:05:56+00:00",
    "over_18": false,
    "subreddit_type": "public",
    "url": "https://www.reddit.com/r/indiehackers/"
  },
  "is_nsfw": false,
  "is_spoiler": false,
  "link_flair_text": "General Question",
  "comments": [
    {
      "id": "nv30qrn",
      "body": "How do people find Matchplan?",
      "author": "scarfwizard",
      "score": 1,
      "depth": 0,
      "created_utc": "2025-12-20T20:02:18+00:00",
      "parent_id": "t3_1prkwnx",
      "is_submitter": false
    },
    {
      "id": "nv33n4g",
      "body": "Currently I'm in such an early stage...",
      "author": "ManuelWenner",
      "score": 1,
      "depth": 1,
      "parent_id": "t1_nv30qrn",
      "is_submitter": true
    }
  ],
  "images": [],
  "markdown": "# Product Developer seeking Marketing/Sales co-builder\n\n**2 upvotes** | 8 comments | u/ManuelWenner | 2025-12-20\n\nHey folks...\n\n---\n\n## Comments\n\n**[1] u/scarfwizard** How do people find Matchplan?\n> **[1] u/ManuelWenner (OP)** Currently I'm in such an early stage...\n"
}

Output Fields

Post data: id, title, author, created_utc, permalink, url, selftext, score, upvote_ratio, num_comments, subreddit, subreddit_details (subscribers, description, etc.), is_nsfw, is_spoiler, link_flair_text

Comments: Flat list with depth (0 = top-level), is_submitter marks OP replies

Markdown: Pre-formatted for LLM consumption with nested blockquotes for replies
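
To make the relationship between `depth` and the nested blockquotes concrete, here is a minimal sketch that renders one comment the way the example `markdown` field appears to. The line format is inferred from the single example above, so treat it as an assumption; in particular the `> > ` prefix for depth 2 and deeper is a guess, and `render_comment` is a hypothetical helper, not the Actor's actual renderer.

```python
def render_comment(comment: dict) -> str:
    """Render one flat comment as a Markdown line, one blockquote level per depth step."""
    prefix = "> " * comment["depth"]  # depth 0 = top-level, no blockquote
    op_marker = " (OP)" if comment.get("is_submitter") else ""
    header = f'**[{comment["score"]}] u/{comment["author"]}{op_marker}**'
    return f'{prefix}{header} {comment["body"]}'


# The two comments from the example output above (fields trimmed).
comments = [
    {"body": "How do people find Matchplan?", "author": "scarfwizard",
     "score": 1, "depth": 0, "is_submitter": False},
    {"body": "Currently I'm in such an early stage...", "author": "ManuelWenner",
     "score": 1, "depth": 1, "is_submitter": True},
]

rendered = "\n".join(render_comment(c) for c in comments)
print(rendered)
```

For these two comments the result matches the comment lines at the end of the example `markdown` value.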


Example Inputs

Search Mode (Global)

{
  "searchKeywords": ["indiehacker", "pain points"],
  "searchSort": "relevance",
  "searchLimit": 50,
  "includeComments": true,
  "commentsLimit": 100
}

Subreddit Feed

{
  "subreddits": ["indiehackers", "SaaS", "startups"],
  "subredditSort": "top",
  "subredditTimeFilter": "week",
  "subredditLimit": 100
}

Direct URLs

{
  "postUrls": [
    "https://www.reddit.com/r/indiehackers/comments/abc123/...",
    "https://redd.it/xyz789"
  ],
  "includeComments": true,
  "commentsLimit": 500
}

Combined (All Modes)

{
  "searchKeywords": ["product feedback"],
  "subreddits": ["SaaS"],
  "postUrls": ["https://redd.it/abc123"],
  "includeComments": true,
  "commentsLimit": 100
}
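
Outside n8n or Make, any of these inputs can be sent through the Apify API. The sketch below only builds the HTTP request and does not send it. It assumes the Apify API v2 `run-sync-get-dataset-items` endpoint and Bearer-token authentication. The Actor ID `USERNAME~ACTOR-NAME` and the token are placeholders, since this page does not state the Actor's ID, and `build_run_request` is a hypothetical helper.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs an Actor and returns its dataset items in one call."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "USERNAME~ACTOR-NAME",  # placeholder: replace with the real Actor ID
    "YOUR_APIFY_TOKEN",     # placeholder: replace with your API token
    {"searchKeywords": ["product feedback"], "includeComments": True, "commentsLimit": 100},
)
print(request.get_method(), request.full_url)

# To actually run the Actor (network call, incurs pay-per-event charges):
# with urllib.request.urlopen(request) as response:
#     posts = [item for item in json.load(response) if item is not None]
```

The commented-out list comprehension skips `null` items defensively, since deleted posts are described as returning null.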

Use Cases

For Product Teams

  • Voice of Customer - Extract feature requests and complaints from product subreddits
  • Competitor Intelligence - Monitor what users say about alternatives
  • Product Validation - Search for demand signals before building

For Marketers

  • Content Research - Find top-performing topics in your niche
  • Lead Generation - Identify users seeking solutions you provide
  • Brand Monitoring - Track mentions and sentiment

For Researchers

  • Qualitative Analysis - Reddit comments as interview transcripts
  • Trend Detection - Early signals from rising posts
  • Sentiment Analysis - Community reactions with depth context

Limitations

  • Pagination cap: Max 1,000 posts per mode, 1,000 comments per post
  • Comment structure: Flat list with depth field (not nested JSON tree)
  • Images: Common sources supported (preview, i.redd.it, imgur, galleries). Videos not downloaded.
  • Rate limits: Handled automatically with exponential backoff
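
If a nested tree is needed, it can be rebuilt from the flat list, because each comment carries a `parent_id`. The sketch below relies on Reddit's fullname prefixes as seen in the example output (`t1_` for a comment parent, `t3_` for the post). `build_tree` and the `replies` key are hypothetical names, not part of the Actor's output. Comments whose parent is missing from the list (for example, cut off by `commentsLimit`) are kept as top-level nodes.

```python
def build_tree(comments: list[dict]) -> list[dict]:
    """Turn the flat comment list into nested nodes with a 'replies' list on each."""
    nodes = {c["id"]: {**c, "replies": []} for c in comments}
    roots: list[dict] = []
    for comment in comments:
        node = nodes[comment["id"]]
        kind, _, parent_id = comment["parent_id"].partition("_")
        if kind == "t1" and parent_id in nodes:
            nodes[parent_id]["replies"].append(node)  # reply to another comment
        else:
            roots.append(node)  # reply to the post itself, or parent not in the list
    return roots


# The two comments from the example output (fields trimmed).
flat = [
    {"id": "nv30qrn", "parent_id": "t3_1prkwnx", "depth": 0, "author": "scarfwizard"},
    {"id": "nv33n4g", "parent_id": "t1_nv30qrn", "depth": 1, "author": "ManuelWenner"},
]

tree = build_tree(flat)
print(tree[0]["id"], "->", [reply["id"] for reply in tree[0]["replies"]])
```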

FAQ

Q: Do I need a Reddit account or API key?
A: No. The Actor extracts publicly available data without authentication.

Q: How fast is it?
A: Very fast. No browser overhead means 100 posts with comments typically complete in under 60 seconds.

Q: Why is the markdown field useful?
A: LLMs can process it directly without transformation. It is well suited to n8n/Make workflows where you connect Actor output → AI node.

Q: Can I scrape private subreddits?
A: No. Only public subreddits and posts are accessible.

Q: What if a post is deleted?
A: The Actor returns null for that post and continues with the others.

Q: How do I reduce costs?
A: Set includeComments: false if you only need posts (3x cheaper: $0.001 vs $0.003). Comment count doesn't affect price, so there is no cost reason to lower commentsLimit; use it only to reduce processing time.


Support

  • 📧 Email: max@mapa.slmail.me
  • 🐛 Bugs: Use the Issues tab
  • 💡 Feature requests: Email or Issues

This Actor extracts publicly available Reddit data. Users are responsible for compliance with Reddit's Terms of Service and applicable data protection regulations (GDPR, CCPA).


🚀 Start Extracting Reddit Data Now

Turn Reddit discussions into AI-ready insights in minutes, not hours.