®️ Reddit Posts Intelligence Scraper
Pricing
from $5.00 / 1,000 results
Posts-only Reddit scraper using public .json endpoints. Extracts posts and adds lead intent, sentiment, virality, quality, keyword matches, and RAG markdown.
Developer
Ian Dikhtiar
Maintained by Community
Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 13 days ago
Reddit Lead Intel Scraper
Reddit is where buyers complain before they book demos.
People ask for alternatives, rant about broken tools, compare competitors, describe painful workflows, and reveal exactly what they want next. The problem is that Reddit is noisy as hell.
This actor finds the useful posts and ranks them.
It scrapes public Reddit posts through Reddit's .json endpoints, then enriches every post with lead intent, urgency, sentiment, virality, quality, keyword matches, and RAG-ready markdown.
No Reddit OAuth. No official Reddit API key. No browser crawling.
What this is really for
Use it when you want to find posts like:
- “What’s the best alternative to Apollo?”
- “I need a tool that can enrich leads without getting blocked.”
- “Has anyone found a cheaper way to monitor brand mentions?”
- “Our CRM is a mess — what should we switch to?”
- “Looking for software that handles outbound and compliance.”
Those are not just posts. They are demand signals.
Who uses it
- Founders looking for early customers, competitor gaps, and raw market pain
- Growth teams monitoring Reddit for high-intent conversations
- Sales teams finding people actively asking for recommendations
- Market researchers collecting voice-of-customer data without manually scrolling Reddit
- Content teams discovering topics people actually care about
- AI teams building clean Reddit datasets for RAG, classification, and analysis
What you get in each row
| Category | Fields | Why it matters |
|---|---|---|
| Reddit post | title, text, author, subreddit, permalink, timestamp | The actual post and source context |
| Engagement | score, upvote ratio, comments count | Shows whether the post has traction |
| Lead signals | lead intent score, urgency, matched keywords, signal explanations | Tells you which posts deserve attention first |
| Text signals | sentiment, quality score, spam/noise penalties | Helps separate useful pain from garbage |
| AI-ready text | RAG markdown with metadata | Ready for LLMs, embeddings, alerts, or CRM enrichment |
The intelligence layer
Lead intent score
Every post gets a lead_intent_score from 0 to 100.
The score rises when a post looks commercially useful: recommendation requests, alternative searches, buying language, pain/problem language, strong keyword matches, and meaningful engagement.
High scores usually mean: “someone should look at this.”
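The actor's actual scoring formula is not published here, so the following is only a minimal sketch of how an additive heuristic of this kind could work. The phrase lists, weights, and the `lead_intent_score` function signature are all hypothetical and chosen for illustration.

```python
# Hypothetical phrase lists and weights; not the actor's real configuration.
BUYING_PHRASES = ("alternative to", "looking for", "recommend", "what should i use")
PAIN_PHRASES = ("struggling with", "is a mess", "not working", "getting blocked")


def lead_intent_score(title, text, keywords, num_comments):
    """Return (score, matched_keywords) using simple additive heuristics."""
    body = f"{title} {text}".lower()
    score = 0
    if any(phrase in body for phrase in BUYING_PHRASES):
        score += 40  # recommendation or buying language
    if any(phrase in body for phrase in PAIN_PHRASES):
        score += 25  # pain/problem language
    matched = [k for k in keywords if k.lower() in body]
    score += min(len(matched) * 10, 20)  # keyword matches, capped at 20
    score += min(num_comments, 15)  # engagement, capped at 15
    return max(0, min(score, 100)), matched


score, matched = lead_intent_score(
    "What's the best alternative to Apollo?",
    "Our CRM is a mess.",
    ["alternative", "tool"],
    21,
)
print(score, matched)
```

Capping each component keeps a single strong signal, such as a very long comment thread, from dominating the score on its own.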
Lead urgency
Each post is labeled low, medium, or high urgency.
This makes it easy to route the best posts into Slack, a spreadsheet, a CRM, a lead review queue, or an LLM workflow.
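As a rough sketch of that routing step, the function below reads `lead_urgency` and `lead_intent_score` from a row's `intelligence` object and returns a destination label. The destination names and the default threshold of 60 are assumptions made for this example; wiring them to Slack, a CRM, or a queue is left to your own integration.

```python
def route(row, min_score=60):
    """Pick a destination label for one output row (labels are hypothetical)."""
    intel = row.get("intelligence", {})
    score = intel.get("lead_intent_score", 0)
    urgency = intel.get("lead_urgency", "low")
    if score < min_score:
        return "skip"
    if urgency == "high":
        return "slack_alert"
    if urgency == "medium":
        return "review_queue"
    return "spreadsheet"


row = {
    "title": "Evaluating B2B lead generation tool",
    "intelligence": {"lead_intent_score": 100, "lead_urgency": "high"},
}
print(route(row))
```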
Signal explanations
The actor does not just score posts silently. It tells you why a post was interesting.
Example signals:
- buying/recommendation intent
- pain/problem language
- fast engagement velocity
- negative sentiment risk
- possible spam/low-quality content
Sentiment and quality
Reddit contains gold and garbage in the same thread.
The actor scores sentiment and quality so you can find useful complaints, product feedback, and competitor frustration without drowning in memes, spam, and low-effort posts.
Virality velocity
A post with five comments in ten minutes can matter more than a post with fifty comments from last year.
Virality velocity helps surface discussions that are moving now.
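One simple way to express this idea is comments per hour since the post was created. The sketch below assumes the post timestamp is available as epoch seconds (as in Reddit's `created_utc` field); the actor's own velocity formula may weigh score or other factors differently.

```python
from datetime import datetime, timezone


def comments_per_hour(num_comments, created_utc, now=None):
    """Comments divided by post age in hours (an assumed definition)."""
    now = now or datetime.now(timezone.utc)
    created = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    # Floor the age at one minute so brand-new posts do not divide by zero.
    age_hours = max((now - created).total_seconds() / 3600, 1 / 60)
    return num_comments / age_hours


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = comments_per_hour(5, now.timestamp() - 10 * 60, now)  # ten minutes old
stale = comments_per_hour(50, now.timestamp() - 365 * 24 * 3600, now)  # a year old
print(fresh > stale)
```

Under this definition the ten-minute-old post with five comments outranks the year-old post with fifty, which matches the intuition above.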
RAG-ready markdown
Every row includes rag_markdown, a clean markdown document containing the post title, body, subreddit, author, and source URL.
Use it for:
- vector databases
- LLM summarization
- lead qualification
- category research
- alerts and dashboards
- downstream enrichment
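To give a feel for what such a document might look like, here is a small sketch that assembles markdown from the fields named above. The exact layout of the actor's `rag_markdown` is not shown on this page, so the heading and metadata format below are assumptions.

```python
def to_rag_markdown(post):
    """Build a markdown document from a post dict (layout is an assumption)."""
    body = post.get("text", "").strip() or "(no body text)"
    lines = [
        f"# {post['title']}",
        "",
        f"- Subreddit: r/{post['subreddit']}",
        f"- Author: u/{post['author']}",
        f"- Source: {post['permalink']}",
        "",
        body,
    ]
    return "\n".join(lines)


post = {
    "title": "Our CRM is a mess, what should we switch to?",
    "text": "We have outgrown spreadsheets and need something better.",
    "author": "example_user",  # hypothetical author
    "subreddit": "sales",
    "permalink": "https://www.reddit.com/r/sales/comments/...",
}
print(to_rag_markdown(post))
```

Keeping the metadata in the document body means the source URL and subreddit survive chunking and embedding, so retrieved passages can still be traced back to the original post.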
Example: find competitor alternatives
{
  "queries": [
    "Apollo alternative",
    "best lead generation tool",
    "need sales intelligence software"
  ],
  "subreddits": ["SaaS", "sales", "Entrepreneur"],
  "sort": "relevance",
  "time": "year",
  "maxResults": 100,
  "keywords": ["recommend", "alternative", "looking for", "need", "tool", "software"],
  "negativeKeywords": ["crypto", "casino", "airdrop", "giveaway"],
  "dropNegativeKeywordMatches": true,
  "excludeOver18": true
}
Example output
{
  "type": "post",
  "title": "Evaluating B2B lead generation tool - compliance friendly, enterprise ready",
  "author": "Additional-Pop8840",
  "subreddit": "Entrepreneur",
  "score": 2,
  "num_comments": 21,
  "permalink": "https://www.reddit.com/r/Entrepreneur/comments/...",
  "intelligence": {
    "lead_intent_score": 100,
    "lead_urgency": "high",
    "sentiment_label": "neutral",
    "virality_velocity_per_hour": 0.049,
    "quality_score": 100,
    "matched_keywords": ["tool"],
    "signals": ["buying/recommendation intent", "pain/problem language"]
  }
}
Good search ideas
Try phrases that sound like real Reddit posts. In the examples, replace competitor and product with the actual names you care about:
| Goal | Search examples |
|---|---|
| Find alternatives | competitor alternative, switching from competitor, best alternative to competitor |
| Find pain | struggling with CRM, outbound is not working, lead data problem |
| Find buyers | need a tool for, looking for software, what should I use for |
| Find feedback | is product worth it, has anyone tried product, product review |
| Find trends | AI tool for sales, automated prospecting, brand monitoring reddit |
Input guide
| Input | Best use |
|---|---|
| queries | Search Reddit by buyer phrases, competitor names, pain points, or product categories |
| subreddits | Focus on communities where your buyers hang out |
| startUrls | Scrape specific Reddit URLs directly |
| keywords | Boost lead scoring for your preferred intent phrases |
| negativeKeywords | Penalize or remove noisy topics |
| minLeadIntentScore | Save only stronger leads; try 60+ or 80+ |
| maxResults | Control output size |
Use at least one of queries, subreddits, or startUrls.
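If you build inputs programmatically, a small pre-flight check can catch an empty source list before you start a run. This is a sketch under stated assumptions: `validate_input` is a hypothetical helper, not part of the actor, and the 0 to 100 range check on `minLeadIntentScore` is inferred from the range of the lead intent score.

```python
def validate_input(run_input):
    """Raise ValueError if the input cannot produce any results."""
    sources = ("queries", "subreddits", "startUrls")
    # An empty list counts as missing, so check truthiness rather than presence.
    if not any(run_input.get(name) for name in sources):
        raise ValueError("Provide at least one of: queries, subreddits, startUrls")
    min_score = run_input.get("minLeadIntentScore", 0)
    if not 0 <= min_score <= 100:
        raise ValueError("minLeadIntentScore must be between 0 and 100")
    return run_input


validate_input({"queries": ["Apollo alternative"], "minLeadIntentScore": 60})

try:
    validate_input({"keywords": ["tool"], "subreddits": []})
except ValueError as error:
    print(f"Rejected: {error}")
```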
Data source
The actor uses public Reddit .json endpoints. It does not require Reddit OAuth or an official Reddit API key.
Reddit may throttle or block some cloud traffic, so Apify Proxy is enabled by default. The actor also uses delays, retries, pagination controls, and graceful partial results.
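For readers curious what a request against a public .json endpoint looks like, the sketch below builds a subreddit search URL and fetches it with retries and growing delays. It is not the actor's implementation: the helper names, the User-Agent string, and the retry settings are assumptions, and it does not route through Apify Proxy. The query parameters (`q`, `restrict_sr`, `sort`, `t`, `limit`, `after`) follow Reddit's public listing conventions.

```python
import json
import time
import urllib.error
import urllib.request
from urllib.parse import urlencode


def search_url(subreddit, query, sort="relevance", time_filter="year", limit=100, after=None):
    """Build a subreddit search URL for Reddit's public .json endpoint."""
    params = {"q": query, "restrict_sr": 1, "sort": sort, "t": time_filter, "limit": limit}
    if after:
        params["after"] = after  # pagination cursor from the previous page
    return f"https://www.reddit.com/r/{subreddit}/search.json?{urlencode(params)}"


def fetch_json(url, retries=3, delay=2.0):
    """Fetch and parse JSON, returning None if every attempt fails."""
    request = urllib.request.Request(url, headers={"User-Agent": "example-script/0.1"})
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                return json.load(response)
        except (urllib.error.URLError, TimeoutError):
            time.sleep(delay * (attempt + 1))  # wait longer after each failure
    return None  # caller can keep whatever pages were already collected


print(search_url("SaaS", "Apollo alternative"))
```

Returning `None` instead of raising lets a caller keep the pages it already has, which is one way to get the partial-results behavior described above.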
What this actor does not do
- It does not crawl full comment trees.
- It does not access private Reddit content.
- It does not recover deleted, gated, quarantined, or unavailable posts.
- It does not pretend heuristic scoring is perfect qualification.
This is intentionally posts-only. A dedicated comments actor is a cleaner, separate product.