Reddit Post Harvester
Pricing
from $1.00 / 1,000 results
Scrape posts from any subreddit without authentication. Fetches titles, scores, authors, flairs, thumbnails and URLs via RSS + JSON API. Supports hot/new/top/rising sorting, time filters, and proxy rotation to bypass Reddit blocks.
Developer: Saregaa (Maintained by Community)
Reddit RSS Scraper | Extract Subreddit Posts, Scores & Metadata Without an API Key
Extract posts from any public subreddit without an API key, OAuth tokens, or a browser. Built for marketers, researchers, data engineers, and developers who need fresh Reddit post data on a schedule — without managing API credentials.
The actor pulls post titles, URLs, scores, authors, flairs, comment counts, and timestamps from any combination of subreddits using Reddit's public RSS and JSON feeds, with Chrome TLS fingerprint impersonation to avoid blocks.
✅ No Reddit API key or OAuth required
✅ Scrape multiple subreddits in a single run
✅ 4 sort modes: hot, new, top, rising
✅ Pagination beyond the 25-post RSS limit via JSON cursor
✅ Export results as JSON, CSV, or Excel
✅ Full API and scheduling support via Apify
What Data Can Be Extracted?
| Field | Type | Description |
|---|---|---|
| `id` | string | Reddit post ID (e.g. `1dxyz42`) |
| `title` | string | Full post title |
| `permalink` | string | Full Reddit URL to the post |
| `url` | string | External URL for link posts; Reddit URL for self posts |
| `subreddit` | string | Subreddit name |
| `author` | string | Author username |
| `score` | integer | Net upvotes at scrape time |
| `upvote_ratio` | float | Upvote ratio 0.0–1.0 (JSON pages only) |
| `num_comments` | integer | Total comment count (JSON pages only) |
| `flair` | string | Post flair label, or `null` |
| `post_type` | string | `self` (text post) or `link` |
| `thumbnail` | string | Thumbnail URL, or `null` |
| `created_at` | string | Post creation time (ISO 8601 UTC) |
| `scraped_at` | string | Scrape time (ISO 8601 UTC) |
Note: `upvote_ratio` and `num_comments` are only available from the JSON API (page 2+). The first 25 posts fetched via RSS will have `null` for these fields. Set `maxPostsPerSubreddit` > 25 to backfill them for all posts.
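Because these two fields can be `null`, downstream code should not assume they are numeric. The following is a minimal sketch of one way to handle that, assuming records shaped like the field table above; `split_by_completeness` and `mean_upvote_ratio` are hypothetical helper names, not part of the actor.

```python
from typing import Optional


def split_by_completeness(posts: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate records that carry the JSON-only fields from those that do not."""
    complete, partial = [], []
    for post in posts:
        # RSS-sourced records are expected to hold None for these two fields.
        if post.get("upvote_ratio") is None or post.get("num_comments") is None:
            partial.append(post)
        else:
            complete.append(post)
    return complete, partial


def mean_upvote_ratio(posts: list[dict]) -> Optional[float]:
    """Average upvote_ratio over the records that have one; None if none do."""
    ratios = [p["upvote_ratio"] for p in posts if p.get("upvote_ratio") is not None]
    return sum(ratios) / len(ratios) if ratios else None
```

Skipping the `null` values rather than treating them as zero keeps the RSS-only records from dragging an average down.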
Features
- No credentials needed: accesses only public, anonymous Reddit feeds
- Multi-subreddit: scrape dozens of subreddits in one run
- 4 sort modes: `hot`, `new`, `top`, `rising`
- Pagination: goes beyond the RSS 25-post limit via JSON cursor (up to ~1,000 posts per subreddit)
- Time filter: restrict `top` posts to `hour`, `day`, `week`, `month`, `year`, or `all`
- TLS fingerprint spoofing: `curl_cffi` impersonates Chrome 120, bypassing Reddit's fingerprint-based blocks
- Residential proxy support: plug in Apify Proxy for high-volume runs
- Export to JSON, CSV, or Excel: download directly from the Apify Output tab
- Schedule runs: automate hourly, daily, or weekly collection
- API access: integrate with Zapier, Make, n8n, or your own pipeline
How to Scrape Reddit Data — Step by Step
- Open the actor in Apify Console
- Enter one or more subreddit names (e.g. `MachineLearning`, `LocalLLaMA`). The `r/` prefix is optional.
- Choose a sort order (`hot`, `new`, `top`, `rising`), set the max posts per subreddit, and optionally set a time filter for `top`
- Click Start
- Download results as JSON, CSV, or Excel from the Output tab, or access them via the Apify API
Input Example
    {
      "subreddits": ["technology", "MachineLearning", "LocalLLaMA"],
      "sort": "top",
      "maxPostsPerSubreddit": 50,
      "timeFilter": "week",
      "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
      }
    }
Input Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `subreddits` | string[] | `["technology"]` | Subreddit names, with or without the `r/` prefix |
| `sort` | string | `hot` | Sort order: `hot`, `new`, `top`, `rising` |
| `maxPostsPerSubreddit` | integer | `25` | Posts to collect per subreddit (1–100) |
| `timeFilter` | string | `day` | Time range for `top` sort: `hour`, `day`, `week`, `month`, `year`, `all` |
| `proxyConfiguration` | object | — | Apify Proxy config. Residential recommended for high-volume runs |
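When run inputs are built programmatically, it can help to validate them before starting a run. The sketch below is illustrative only: `normalize_subreddit` and `build_run_input` are hypothetical helpers, the allowed values are copied from the parameter table above, and only the lower bound of `maxPostsPerSubreddit` is checked, leaving the upper bound to the actor.

```python
import re

VALID_SORTS = ("hot", "new", "top", "rising")
VALID_TIME_FILTERS = ("hour", "day", "week", "month", "year", "all")


def normalize_subreddit(name: str) -> str:
    """Strip whitespace, an optional leading 'r/' or '/r/', and stray slashes."""
    name = re.sub(r"^/?r/", "", name.strip(), flags=re.IGNORECASE)
    return name.strip("/")


def build_run_input(subreddits, sort="hot", max_posts=25, time_filter="day"):
    """Return a run-input dict, raising ValueError on values the table does not list."""
    if sort not in VALID_SORTS:
        raise ValueError(f"sort must be one of {VALID_SORTS}")
    if time_filter not in VALID_TIME_FILTERS:
        raise ValueError(f"timeFilter must be one of {VALID_TIME_FILTERS}")
    if max_posts < 1:
        raise ValueError("maxPostsPerSubreddit must be at least 1")
    names = [n for n in (normalize_subreddit(s) for s in subreddits) if n]
    if not names:
        raise ValueError("at least one subreddit name is required")
    run_input = {"subreddits": names, "sort": sort, "maxPostsPerSubreddit": max_posts}
    if sort == "top":
        # The time filter is documented as applying to the top sort only.
        run_input["timeFilter"] = time_filter
    return run_input
```

For example, `build_run_input(["r/python"], sort="top", time_filter="week")` produces a dict that can be passed as `run_input` to the Apify client.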
Output Example
Each record in the dataset represents one Reddit post:
    {
      "id": "1dxyz42",
      "title": "New open-source model beats GPT-4 on coding benchmarks",
      "permalink": "https://www.reddit.com/r/MachineLearning/comments/1dxyz42/new_open_source_model/",
      "url": "https://arxiv.org/abs/2406.12345",
      "subreddit": "MachineLearning",
      "author": "ml_researcher",
      "score": 4821,
      "upvote_ratio": 0.96,
      "num_comments": 312,
      "flair": "Research",
      "post_type": "link",
      "thumbnail": "https://b.thumbs.redditmedia.com/abc123.jpg",
      "created_at": "2026-06-09T10:34:21+00:00",
      "scraped_at": "2026-06-09T11:00:03+00:00"
    }
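Since `created_at` and `scraped_at` are ISO 8601 strings with a UTC offset, they can be parsed with the standard library. The sketch below shows one possible use, computing how old a post was when it was scraped and a rough score-per-hour figure. `post_age_at_scrape` and `score_per_hour` are hypothetical helper names, and the sample values are taken from the record above.

```python
from datetime import datetime, timedelta
from typing import Optional


def post_age_at_scrape(record: dict) -> timedelta:
    """Time between post creation and the moment it was scraped."""
    created = datetime.fromisoformat(record["created_at"])
    scraped = datetime.fromisoformat(record["scraped_at"])
    return scraped - created


def score_per_hour(record: dict) -> Optional[float]:
    """Score divided by post age in hours; None when the age is not positive."""
    hours = post_age_at_scrape(record).total_seconds() / 3600
    return record["score"] / hours if hours > 0 else None


record = {
    "score": 4821,
    "created_at": "2026-06-09T10:34:21+00:00",
    "scraped_at": "2026-06-09T11:00:03+00:00",
}
print(post_age_at_scrape(record))  # 0:25:42
```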
Use Cases
Trend Monitoring
Track what's gaining traction in your niche in real time. Schedule the actor to run hourly on subreddits like entrepreneur, startups, or SaaS to power a trend dashboard or Slack alert.
    {
      "subreddits": ["entrepreneur", "startups", "SaaS"],
      "sort": "hot",
      "maxPostsPerSubreddit": 25
    }
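For an alert or dashboard, consecutive hourly runs usually need to be compared. A minimal sketch of that comparison follows, assuming each run yields a list of records with `id` and `score` as in the output example; `diff_runs` is a hypothetical helper, and how the previous run's records are stored is left to the caller.

```python
def diff_runs(previous: list[dict], current: list[dict]) -> tuple[list[dict], dict[str, int]]:
    """Return posts that are new in the current run and the score gain of posts seen before."""
    prev_scores = {p["id"]: p["score"] for p in previous}
    # Posts whose ID did not appear in the previous run.
    new_posts = [p for p in current if p["id"] not in prev_scores]
    # Score change since the previous run, keyed by post ID.
    gains = {
        p["id"]: p["score"] - prev_scores[p["id"]]
        for p in current
        if p["id"] in prev_scores
    }
    return new_posts, gains
```

The `new_posts` list could feed a Slack message, while `gains` gives a simple signal for which posts are rising fastest between runs.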
Weekly Top Posts Digest
Pull the best content from multiple communities for a newsletter or internal report.
    {
      "subreddits": ["MachineLearning", "LocalLLaMA", "datascience"],
      "sort": "top",
      "timeFilter": "week",
      "maxPostsPerSubreddit": 100
    }
NLP Training Data Collection
Collect high-quality community text at scale. Filter for `score >= 500` in post-processing to keep only community-validated content.
    {
      "subreddits": ["AskReddit", "explainlikeimfive", "changemyview"],
      "sort": "top",
      "timeFilter": "year",
      "maxPostsPerSubreddit": 100
    }
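The score filter mentioned above is applied after the run, on the downloaded records. One way it might look is sketched here; `filter_by_score` and `to_jsonl` are hypothetical helpers, and the 500 threshold is the one suggested in the text.

```python
import json


def filter_by_score(posts: list[dict], min_score: int = 500) -> list[dict]:
    """Keep posts whose score meets the threshold; a missing or null score counts as 0."""
    return [p for p in posts if (p.get("score") or 0) >= min_score]


def to_jsonl(posts: list[dict], fields=("id", "subreddit", "title", "score")) -> str:
    """Serialize the selected fields of each post as one JSON object per line."""
    lines = [
        json.dumps({f: p.get(f) for f in fields}, ensure_ascii=False)
        for p in posts
    ]
    return "\n".join(lines)
```

Writing the result of `to_jsonl(filter_by_score(posts))` to a file gives a compact corpus that most data-loading tools can read line by line.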
Competitor & Community Research
Monitor conversations in competitor product subreddits. Track sentiment, common complaints, and feature requests over time without manual browsing.
Content Ideation
Identify top-performing post titles and topics in your niche. Use the data to inform blog posts, video ideas, or social media content calendars.
Academic & Social Research
Gather timestamped post data for studying online community behavior, topic evolution, or information spread over time.
API Access & Automation
All results are accessible via the Apify API. Trigger runs, poll for results, and stream dataset items into your own pipeline.
    curl -X POST \
      "https://api.apify.com/v2/acts/YOUR_USERNAME~reddit-rss-scraper/runs?token=<YOUR_TOKEN>" \
      -H "Content-Type: application/json" \
      -d '{"subreddits": ["python"], "sort": "hot", "maxPostsPerSubreddit": 25}'
Or use the Python SDK:
    from apify_client import ApifyClient

    client = ApifyClient("<YOUR_APIFY_TOKEN>")

    # Start the actor and wait for the run to finish.
    run = client.actor("YOUR_USERNAME/reddit-rss-scraper").call(
        run_input={
            "subreddits": ["MachineLearning", "LocalLLaMA"],
            "sort": "top",
            "maxPostsPerSubreddit": 50,
            "timeFilter": "week",
        }
    )

    # Iterate over the posts stored in the run's default dataset.
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item["title"], "|", item["score"])
Results integrate natively with Zapier, Make, and n8n for no-code automation.
Pricing
This actor runs on Apify's pay-per-use infrastructure. Costs depend on compute time and the number of requests made.
| Volume | Subreddits | Posts Each | Estimated Cost |
|---|---|---|---|
| Small run | 3 | 25 | < $0.05 |
| Medium run | 10 | 50 | ~$0.10–$0.20 |
| Large run | 20 | 100 | ~$0.30–$0.60 |
| With residential proxy | Any | Any | Add proxy usage costs |
For most runs under 200 posts across a few subreddits, no proxy is needed — the TLS fingerprint spoofing handles standard loads without additional cost.
Why Use This Instead of the Reddit Official API?
| Feature | Reddit RSS Scraper | Reddit Official API |
|---|---|---|
| API key required | ❌ No | ✅ Yes — OAuth app registration |
| Setup time | ~2 minutes | 15–30 minutes |
| Rate limits | Generous (curl_cffi) | 60 req/min (free tier) |
| Pagination | Up to ~1,000 posts | Up to ~1,000 posts |
| Export formats | JSON, CSV, Excel | JSON only |
| Scheduling | Built-in via Apify | Manual implementation |
| Proxy support | Apify Residential built-in | Not applicable |
FAQ
Is scraping Reddit legal?
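This actor accesses only publicly available Reddit content visible to any anonymous visitor. It does not bypass authentication, CAPTCHAs, or access private data. In hiQ Labs v. LinkedIn, the U.S. Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act; other claims, such as breach of a site's terms of service, can still apply, so check the rules that govern your use case. This tool is not affiliated with, endorsed by, or sponsored by Reddit Inc.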
Does this actor require proxies?
For most runs (200 posts or fewer across a few subreddits), the actor works without any proxy thanks to curl_cffi Chrome TLS impersonation. For high-volume or scheduled runs, Apify Residential proxies are recommended to avoid 403 blocks.
Can I schedule runs?
Yes. Apify has built-in scheduling. You can set the actor to run hourly, daily, weekly, or on any cron schedule from the Apify Console.
Can I export results to CSV or Excel?
Yes. Once a run completes, download results as JSON, CSV, or Excel directly from the Output tab in Apify Console.
How many posts can I scrape per subreddit?
Up to approximately 1,000 posts. Reddit's unauthenticated JSON API stops paginating after around 1,000 items. The first 25 posts are fetched via RSS; subsequent pages use the JSON API with cursor pagination.
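The cursor pagination described in this answer can be sketched as follows. This is an illustration under stated assumptions, not the actor's actual code: `listing_url` and `iter_posts` are hypothetical names, `fetch_json` is a caller-supplied function that returns the parsed JSON for a URL, and the listing shape (`data.children` plus a `data.after` cursor) is that of Reddit's public `.json` listings.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def listing_url(subreddit: str, sort: str = "hot", after: Optional[str] = None,
                time_filter: Optional[str] = None, limit: int = 100) -> str:
    """Build the URL of one page of a public subreddit listing."""
    params = {"limit": limit}
    if after:
        params["after"] = after  # cursor returned by the previous page
    if sort == "top" and time_filter:
        params["t"] = time_filter
    return f"https://www.reddit.com/r/{subreddit}/{sort}.json?{urlencode(params)}"


def iter_posts(fetch_json: Callable[[str], dict], subreddit: str, sort: str = "hot",
               max_posts: int = 100, time_filter: Optional[str] = None) -> Iterator[dict]:
    """Yield post dicts page by page until max_posts is reached or the cursor runs out."""
    after = None
    yielded = 0
    while yielded < max_posts:
        page = fetch_json(listing_url(subreddit, sort, after, time_filter))
        children = page["data"]["children"]
        if not children:
            return
        for child in children:
            yield child["data"]
            yielded += 1
            if yielded >= max_posts:
                return
        after = page["data"]["after"]
        if after is None:
            # No further cursor: the listing is exhausted.
            return
```

Passing the fetch function in keeps the pagination logic independent of the HTTP client, so it can be exercised with canned pages before pointing it at live feeds.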
Does it work on all subreddits?
It works on any public subreddit. Private or restricted subreddits that require a logged-in account are not accessible.
What sort modes are supported?
hot, new, top, and rising. For top, you can also set a time filter: hour, day, week, month, year, or all.
Do upvote_ratio and num_comments get populated for all posts?
These fields are only available from the JSON API, not from RSS. The first 25 posts will have null for these fields unless maxPostsPerSubreddit is set above 25, which triggers JSON pagination and backfills them.
What happens if Reddit changes its feed format?
Open an issue on the Issues tab. The actor is maintained and will be updated to reflect structural changes.
Can I use this through the API without Apify Console?
Yes. The actor exposes a full REST API. You can trigger runs, poll status, and fetch dataset items programmatically using the Apify REST API or Python SDK.
How to Scrape Reddit Data Without an API Key
Reddit's official API requires OAuth registration, credential management, and enforces strict rate limits. This actor uses Reddit's public RSS and JSON feeds instead — available to any anonymous visitor. By combining curl_cffi Chrome TLS impersonation with cursor-based JSON pagination, it reliably collects up to 1,000 posts per subreddit without any API key setup.
Reddit API Alternative for Bulk Data Collection
If you need Reddit post data for research, monitoring, or data pipelines, the official Reddit API is often overkill. This actor provides a simpler alternative: paste in subreddit names, click Start, and get structured data ready to download or query via API.
How to Export Reddit Data to CSV
After a run completes in Apify, open the Output tab and click Download as CSV or Excel. No additional tooling required. For API-driven workflows, you can stream results as JSONL or paginate through the dataset endpoint.
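For the API-driven route, the dataset items endpoint accepts a `format` query parameter. The sketch below only builds the download URL; `dataset_items_url` is a hypothetical helper, the dataset ID is a placeholder, and authentication (for example an `Authorization: Bearer` header with your Apify token) is left to whichever HTTP client you use.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

APIFY_API_BASE = "https://api.apify.com/v2"
EXPORT_FORMATS = ("json", "jsonl", "csv", "xlsx", "xml", "html", "rss")


def dataset_items_url(dataset_id: str, fmt: str = "csv",
                      fields: Optional[Sequence[str]] = None,
                      limit: Optional[int] = None,
                      offset: Optional[int] = None) -> str:
    """Build a dataset items URL for the given export format and optional paging."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {EXPORT_FORMATS}")
    params = {"format": fmt}
    if fields:
        params["fields"] = ",".join(fields)  # restrict the exported columns
    if limit is not None:
        params["limit"] = limit
    if offset is not None:
        params["offset"] = offset
    return f"{APIFY_API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


print(dataset_items_url("<DATASET_ID>", fmt="jsonl", limit=100, offset=0))
```

Stepping `offset` by `limit` pages through a large dataset, while `fmt="csv"` or `fmt="xlsx"` returns the same files offered in the Output tab.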
Automate Reddit Data Collection
Use Apify's built-in scheduler to run this actor on any cron schedule — hourly trend monitoring, daily digests, or weekly research pulls. Results can be forwarded automatically to Google Sheets, Slack, Airtable, or any webhook via Apify's integrations with Zapier, Make, and n8n.
Support
Found a bug or need a feature? Open an issue on the Issues tab in Apify Console. Feedback and pull requests are welcome.