Reddit Niche Subreddit Scraper | Auto-Tagged | Free
Scrape posts from any list of niche subreddits with automatic keyword tagging. Filter by date, score, comments. Output: clean JSON ready for LLM training, social listening, or brand monitoring. FREE during launch preview.
Reddit Niche Subreddit Scraper (Auto-Tagged)
Scrape posts from a curated list of niche subreddits, with optional keyword search and automatic content tagging. Built for ML/LLM training pipelines, social listening, brand monitoring, and trend detection on niche communities that generic scrapers miss.
What it does
- Pulls posts from any list of subreddits (no auth, no API key)
- Filters by sort order (hot/new/top/rising), time window, min upvotes, min comments
- Optional within-subreddit keyword search
- Auto-tags every post with your custom keyword list: search the body+title for terms you care about, output them as a `tags` array
- Returns clean structured JSON, ready to drop into ML pipelines or Slack/Notion automations
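The auto-tagging step can be pictured with a minimal Python sketch. The actor's real matching rules are not documented here, so the function name `tag_post`, the case-insensitive comparison, and the whole-word matching are all assumptions made for illustration.

```python
import re


def tag_post(title: str, body: str, tag_keywords: list[str]) -> list[str]:
    """Return the keywords that occur in the post title or body (hypothetical helper)."""
    text = f"{title}\n{body}"
    tags = []
    for keyword in tag_keywords:
        # Assumption: match case-insensitively and only as a whole term,
        # so "RAG" does not fire on a word like "storage".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            tags.append(keyword)
    return tags


example_tags = tag_post(
    "[D] Best practices for evaluating RAG systems in production",
    "We compare evaluation metrics for retrieval quality.",
    ["RAG", "fine-tuning", "Llama", "evaluation", "agent", "embedding"],
)
print(example_tags)  # ['RAG', 'evaluation']
```

Keywords are returned in the order they appear in the keyword list, which keeps the `tags` array stable across posts.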
Use cases
- LLM training data: curate subreddit-specific corpora for fine-tuning domain models (e.g. r/MachineLearning + r/LocalLLaMA + r/datascience for AI dev models).
- Social listening (niche): track brand mentions or competitor names across vertical subreddits without paying for enterprise tools.
- Trend detection: auto-tag posts in r/startups, r/SaaS, r/Entrepreneur to surface emerging product categories or pain points.
- Content discovery: find high-engagement posts (score above 100, more than 50 comments) in your niche for content marketing inspiration.
Input
{
  "subreddits": ["MachineLearning", "datascience", "LocalLLaMA"],
  "sort": "hot",
  "searchQuery": "RAG",
  "tagKeywords": ["RAG", "fine-tuning", "Llama", "evaluation", "agent", "embedding"],
  "maxPostsPerSubreddit": 25,
  "minScore": 5,
  "minComments": 0,
  "includeBody": true
}
| Field | Type | Default | Description |
|---|---|---|---|
| subreddits | array | required | Subreddit names (without /r/) |
| sort | enum | hot | hot / new / top / rising |
| timeFilter | enum | week | hour / day / week / month / year / all (only for sort=top) |
| searchQuery | string | (none) | Optional keyword search inside each subreddit |
| tagKeywords | array | [] | Auto-tag keywords applied to title+body |
| maxPostsPerSubreddit | int (1-500) | 25 | Cap per subreddit |
| minScore | int | 5 | Skip posts below this upvote count |
| minComments | int | 0 | Skip posts below this comment count |
| includeBody | bool | true | Include selftext body in output |
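The defaults and filters in the table can be sketched in Python as follows. This is an illustration under assumptions, not the actor's code: `normalize_input` and `passes_filters` are hypothetical names, and the validation rules are inferred from the table alone.

```python
import copy

# Defaults copied from the input table above.
DEFAULTS = {
    "sort": "hot",
    "timeFilter": "week",
    "searchQuery": None,
    "tagKeywords": [],
    "maxPostsPerSubreddit": 25,
    "minScore": 5,
    "minComments": 0,
    "includeBody": True,
}
ALLOWED_SORTS = {"hot", "new", "top", "rising"}
ALLOWED_TIME_FILTERS = {"hour", "day", "week", "month", "year", "all"}


def normalize_input(raw: dict) -> dict:
    """Fill in defaults and reject values outside the documented ranges."""
    if not raw.get("subreddits"):
        raise ValueError("subreddits is required and must be a non-empty list")
    cfg = {**copy.deepcopy(DEFAULTS), **raw}
    if cfg["sort"] not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort: {cfg['sort']}")
    if cfg["timeFilter"] not in ALLOWED_TIME_FILTERS:
        raise ValueError(f"unsupported timeFilter: {cfg['timeFilter']}")
    if not 1 <= cfg["maxPostsPerSubreddit"] <= 500:
        raise ValueError("maxPostsPerSubreddit must be between 1 and 500")
    return cfg


def passes_filters(post: dict, cfg: dict) -> bool:
    """Keep a post only if it meets both the minScore and minComments thresholds."""
    return post["score"] >= cfg["minScore"] and post["numComments"] >= cfg["minComments"]


cfg = normalize_input({"subreddits": ["MachineLearning"], "sort": "top"})
print(cfg["timeFilter"], cfg["maxPostsPerSubreddit"])  # week 25
print(passes_filters({"score": 4, "numComments": 10}, cfg))  # False
```

With the default `minScore` of 5, a post with a score of exactly 5 is kept and a post with a score of 4 is skipped.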
Output
One dataset item per post:
{
  "id": "1abcxyz",
  "subreddit": "MachineLearning",
  "title": "[D] Best practices for evaluating RAG systems in production",
  "body": "...",
  "author": "user123",
  "url": "https://www.reddit.com/r/MachineLearning/comments/1abcxyz/...",
  "linkUrl": "https://arxiv.org/abs/...",
  "score": 234,
  "upvoteRatio": 0.97,
  "numComments": 56,
  "createdUtc": 1730000000,
  "createdAt": "2024-10-27T03:33:20Z",
  "isSelf": true,
  "flair": "Discussion",
  "domain": "self.MachineLearning",
  "tags": ["RAG", "evaluation"]
}
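As a rough sketch of how downstream code might consume these items, the Python below derives an ISO 8601 timestamp from `createdUtc` and counts tag frequencies. The helper names and the second item (`hypothetical2`) are invented for illustration; only the field names come from the output format above.

```python
from collections import Counter
from datetime import datetime, timezone


def created_at_iso(created_utc: int) -> str:
    """Convert a Unix timestamp in seconds to an ISO 8601 UTC string."""
    moment = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def tag_counts(items: list[dict]) -> Counter:
    """Count how often each tag appears across dataset items."""
    return Counter(tag for item in items for tag in item.get("tags", []))


# Trimmed-down items; the second one is made up for this example.
items = [
    {"id": "1abcxyz", "createdUtc": 1730000000, "tags": ["RAG", "evaluation"]},
    {"id": "hypothetical2", "createdUtc": 1730003600, "tags": ["RAG"]},
]
print(created_at_iso(items[0]["createdUtc"]))  # 2024-10-27T03:33:20Z
print(tag_counts(items).most_common(1))  # [('RAG', 2)]
```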
Pricing
Currently FREE during the launch preview — no per-result charges, no monthly cap.
When paid pricing rolls out (notice will be posted at least 14 days in advance):
| Event | Price |
|---|---|
| Actor start | $0.01 (one-time per run) |
| Result item | $0.001 (per post) |
Cost examples (once paid pricing applies):
- 100 posts: ~$0.11
- 1,000 posts: ~$1.01
- 10,000 posts: ~$10.01
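The examples follow from one start fee per run plus a per-post fee. A small Python sketch of that arithmetic, assuming the announced prices stay as listed in the table (`estimate_cost` is a hypothetical helper, not part of the actor):

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.01")   # charged once per run
RESULT_ITEM_USD = Decimal("0.001")  # charged per post returned


def estimate_cost(num_posts: int, runs: int = 1) -> Decimal:
    """Estimated cost in USD: one start fee per run plus a fee per post."""
    if num_posts < 0 or runs < 1:
        raise ValueError("num_posts must be >= 0 and runs must be >= 1")
    return ACTOR_START_USD * runs + RESULT_ITEM_USD * num_posts


for n in (100, 1_000, 10_000):
    print(f"{n} posts: ${estimate_cost(n):.2f}")
```

Splitting the same 1,000 posts across ten runs would add nine extra start fees, so `estimate_cost(1_000, runs=10)` comes to $1.10 instead of $1.01.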
Limits
- Source: Reddit public JSON API (no auth required, no API key)
- Rate limit: about 1 request per second (the actor sleeps 0.6 s between requests)
- Max posts per subreddit: 500 per run, collected by paginating through successive listing pages
- No private subreddits, no NSFW filtering bypass
- No comment scraping in v1 (planned for v2)
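The pacing and per-subreddit cap described above could look roughly like the Python sketch below. It is not the actor's implementation: `collect_posts` and `fetch_page` are hypothetical names, and the page fetcher is injected so the loop can be run without touching the network.

```python
import time
from typing import Callable, Optional

# A page is (posts, cursor for the next page, or None when exhausted).
Page = tuple[list[dict], Optional[str]]


def collect_posts(
    fetch_page: Callable[[Optional[str]], Page],
    max_posts: int = 25,
    pause_seconds: float = 0.6,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Accumulate posts page by page, pausing between requests."""
    max_posts = min(max_posts, 500)  # documented per-subreddit cap
    posts: list[dict] = []
    cursor: Optional[str] = None
    while True:
        page, cursor = fetch_page(cursor)
        posts.extend(page)
        if len(posts) >= max_posts or cursor is None or not page:
            break
        sleep(pause_seconds)  # polite pacing before the next request
    return posts[:max_posts]


def fake_fetch(cursor: Optional[str]) -> Page:
    """Stand-in fetcher that serves 230 fake posts in pages of 100."""
    start = int(cursor or 0)
    batch = [{"id": str(i)} for i in range(start, min(start + 100, 230))]
    next_cursor = str(start + 100) if start + 100 < 230 else None
    return batch, next_cursor


posts = collect_posts(fake_fetch, max_posts=150, sleep=lambda seconds: None)
print(len(posts))  # 150
```

Passing `sleep` as a parameter keeps the 0.6 s default for real runs while letting tests skip the wait.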
Source attribution
Data comes from Reddit's public JSON endpoint (/r/{sub}/.json), which does not require authentication. Use of the data is subject to Reddit's Public Content Policy.
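For orientation, here is a Python sketch of how a listing URL for that endpoint can be assembled. The exact requests the actor sends are not documented here; `listing_url` is a hypothetical helper, and the `limit`, `t`, and `after` query parameters are the ones Reddit's listing endpoints accept.

```python
from typing import Optional
from urllib.parse import urlencode


def listing_url(
    subreddit: str,
    sort: str = "hot",
    time_filter: str = "week",
    limit: int = 100,
    after: Optional[str] = None,
) -> str:
    """Build a public JSON listing URL for one subreddit page."""
    params = {"limit": min(limit, 100)}  # Reddit returns at most 100 items per page
    if sort == "top":
        params["t"] = time_filter  # the time window only applies to sort=top
    if after:
        params["after"] = after  # pagination cursor returned by the previous page
    return f"https://www.reddit.com/r/{subreddit}/{sort}.json?{urlencode(params)}"


print(listing_url("MachineLearning", sort="top", time_filter="week"))
# https://www.reddit.com/r/MachineLearning/top.json?limit=100&t=week
```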
Author
Polara Data — niche scrapers for Italy, EU & global markets.