Reddit Posts Scraper
Pricing
$19.99/month + usage
Scrape Reddit posts with ease. Extract titles, post text, subreddits, usernames, upvotes, comments, timestamps, and links from Reddit threads. Perfect for trend tracking, sentiment analysis, audience research, and content discovery. Turn Reddit data into actionable insights fast.
Developer: Scrapium
Last modified: 14 days ago
Reddit Posts Scraper
Reddit Posts Scraper is a production-ready Apify actor that lets you scrape Reddit posts and comment threads by subreddit, full URL, or search keyword. It replaces manual copy-paste and unreliable tools with clean, structured JSON ready for analysis. Marketers, developers, data analysts, and researchers use it to scrape Reddit posts at scale for trend tracking, sentiment analysis, and content discovery. With proxy fallback, batching, and structured exports, it supports large-scale Reddit data pipelines and automation.
What data / output can you get?
Below are the exact fields the actor saves to the dataset (one row per post). You can export results to JSON or CSV, or fetch them via the Apify API.
| Data field | Description | Example value |
|---|---|---|
| subreddit | Community name the post belongs to | "news" |
| title | Post title | "Example post title" |
| author | Reddit username of the poster | "username" |
| score | Upvotes/score of the post | 156 |
| num_comments | Total number of comments | 42 |
| created_utc | Unix timestamp (UTC) | 1703123456.789 |
| permalink | Direct link to the Reddit thread | "https://www.reddit.com/r/news/comments/abc123/..." |
| body | Selftext/body of the post | "Post content..." |
| thumbnail_url | Thumbnail image URL | "https://..." |
| image_url | Main image/media URL (if any) | "https://..." |
| comments | Nested array of comments with replies | [{"author":"commenter1","body":"Comment text...","score":23,"created_utc":1703123456.789,"replies":[]}] |
| post_id | Reddit post ID | "abc123" |
| success | Whether the post was processed successfully | true |
| error_message | Error detail if processing failed | null |
Note: The actor returns structured data for both posts and comments (including nested replies). Fields like author or title may occasionally be "Unknown" or "No Title" if Reddit does not provide them for the post.
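Because the comments field is recursive, downstream analysis usually needs it flattened first. Here is a minimal Python sketch using the field names from the table above (the helper name is illustrative, not part of the actor):

```python
def flatten_comments(comments, depth=0):
    """Recursively flatten a nested comment tree into a flat list.

    Each comment dict carries 'author', 'body', 'score', and a
    'replies' list with the same shape, per the output table.
    """
    flat = []
    for comment in comments:
        flat.append({
            "author": comment.get("author", "Unknown"),
            "body": comment.get("body", ""),
            "score": comment.get("score", 0),
            "depth": depth,  # 0 = top-level comment, 1 = reply, and so on
        })
        flat.extend(flatten_comments(comment.get("replies", []), depth + 1))
    return flat
```

This makes nested threads easy to load into a DataFrame or CSV, one row per comment.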
Key features
- **Parallel comment processing**: Fetches comments in parallel for high-throughput scraping, ideal for capturing full discussion threads.
- **Flexible targeting**: Accepts subreddit names (e.g., news or r/technology), full Reddit URLs, or search keywords, so you can scrape subreddit posts or Reddit search results.
- **Sort and time filter**: Supports sortOrder (hot, new, top, rising) and timeFilter (hour, day, week, month, year, all) for precise post selection.
- **Scalable limits**: Control maxPosts per source and maxComments per post to tune depth, from quick samples to bulk downloads.
- **Proxy fallback and retries**: Automatic fallback from no proxy → datacenter → residential, with robust retries for blocks (403/429), 5xx errors, timeouts, and connection/SSL issues.
- **Live dataset saving**: Pushes each item as it is processed to avoid data loss, supporting incremental pipelines and monitoring.
- **Developer-friendly outputs**: Structured JSON ready for analytics, dashboards, and integrations (use the Apify API from Python or apps like Make, n8n, Zapier).
How to use Reddit Posts Scraper - step by step
1. Sign in to your Apify account at console.apify.com.
2. Open the actor: search for "Reddit Posts Scraper" in the Apify Store.
3. Enter your sources in startUrls:
   - Subreddit names (e.g., news or r/technology)
   - Full URLs (e.g., https://www.reddit.com/r/news/)
   - Search keywords (e.g., artificial intelligence)
4. Configure sorting and time range:
   - sortOrder: hot, new, top, rising
   - timeFilter: hour, day, week, month, year, all (applies to top/rising)
5. Set limits and comment depth:
   - maxPosts: number of posts per source (1–1000)
   - maxComments: number of comments per post (0–1000; 0 skips comments)
6. Set proxyConfiguration as needed (e.g., enable Apify Proxy). The actor automatically falls back to residential proxies if blocked.
7. Click Start to run. Watch the logs for progress: the actor crawls sources, then processes comments in parallel.
8. Open the Output tab to view the "Reddit Posts Data" dataset. Export to JSON or CSV, or connect via the Apify API.
Pro Tip: Trigger runs programmatically with the Apify API and pipe results into your analytics stack or automation workflows, a robust alternative to ad-hoc Reddit scraping scripts or a PRAW-based setup.
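As a sketch of that programmatic workflow, the run below uses the official apify-client Python package (`pip install apify-client`). The token and actor ID are placeholders you would substitute, and the input dict mirrors this actor's documented parameters:

```python
def build_run_input(sources, max_posts=50, max_comments=100):
    """Assemble a run input matching this actor's documented parameters."""
    return {
        "startUrls": sources,
        "sortOrder": "top",
        "timeFilter": "week",
        "maxPosts": max_posts,
        "maxComments": max_comments,
        "proxyConfiguration": {"useApifyProxy": True},
    }

def run_scraper(token, actor_id, run_input):
    """Start the actor via the Apify API and collect the dataset items."""
    # Requires: pip install apify-client
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

if __name__ == "__main__":
    # Placeholders: substitute your own Apify token and this actor's ID.
    items = run_scraper("<YOUR_APIFY_TOKEN>", "<ACTOR_ID>",
                        build_run_input(["r/technology", "artificial intelligence"]))
    print(f"Fetched {len(items)} items")
```

The same run input works when scheduling runs from the Apify Console or triggering them from tools like Make, n8n, or Zapier.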
Use cases
| Use case | Description |
|---|---|
| Market & trend research | Aggregate top posts by keyword or subreddit to quantify discussion volume and surface emerging topics. |
| NLP / ML datasets | Collect titles, bodies, and nested comment threads to build labeled corpora for sentiment analysis and topic modeling. |
| Content & SEO | Identify what resonates in your niche, extract quotes, and plan content around high-engagement threads. |
| Brand monitoring | Track mentions across communities, measure sentiment shifts, and flag high-velocity threads in real time. |
| Journalism & research | Compile public Reddit discussions and quotes with timestamps and permalinks for verifiable sourcing. |
| Automation & pipelines | Schedule runs via the Apify API, export JSON/CSV, and sync to BI tools or data lakes as a Reddit API scraper alternative. |
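For the market and trend research row above, a stdlib-only sketch can quantify discussion volume per subreddit from exported dataset items (keys taken from the output table; the function name is illustrative):

```python
from collections import Counter

def discussion_volume(items):
    """Aggregate post count and total comment count per subreddit from
    a list of dataset items (dicts with 'subreddit' and 'num_comments')."""
    posts, comments = Counter(), Counter()
    for item in items:
        sub = item.get("subreddit", "Unknown")
        posts[sub] += 1
        comments[sub] += item.get("num_comments", 0)
    # Sort by post count, descending, to surface the busiest communities.
    return sorted(
        ((sub, posts[sub], comments[sub]) for sub in posts),
        key=lambda row: row[1],
        reverse=True,
    )
```

Run this over a JSON export of several runs to compare discussion volume across keywords or communities over time.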
Why choose Reddit Posts Scraper?
Built for reliability and scale, this Reddit data scraper balances speed with resilience, without requiring a browser.
- Structured, consistent outputs that are analytics-ready (JSON/CSV).
- High-throughput comment fetching with parallel processing for full-thread scraping.
- Automatic proxy fallback and robust retries for blocks, 5xx errors, and timeouts: production-ready reliability.
- Developer access via the Apify API for integration with Python, ETL tools, and workflow automation.
- No browser overhead; efficient HTTP-based collection of public endpoints.
- Cost-effective and scalable, suitable for small experiments and larger pipelines alike.
- More stable than flaky extensions or manual scripts: managed infrastructure, monitoring, and dataset storage.
In short, itβs a dependable Reddit scraper tool for teams that need consistent, structured Reddit post extraction at scale.
Is it legal / ethical to use Reddit Posts Scraper?
Yes, when done responsibly. This actor is designed for public Reddit content only and does not access private subreddits or authenticated data.
Guidelines for compliant use:
- Scrape only publicly available Reddit content and respect community norms.
- Review Redditβs terms and apply reasonable rate limits using proxyConfiguration as needed.
- Avoid misuse of personal data in line with applicable regulations (e.g., GDPR, CCPA).
- Do not attempt to bypass authentication to access private resources.
- Consult your legal team for edge cases or jurisdiction-specific requirements.
Input parameters & output format
Example JSON input
```json
{
  "startUrls": [
    "https://www.reddit.com/r/news/",
    "news",
    "artificial intelligence"
  ],
  "sortOrder": "top",
  "timeFilter": "week",
  "maxPosts": 50,
  "maxComments": 100,
  "proxyConfiguration": { "useApifyProxy": false }
}
```
Parameter details:

- **startUrls** (array, required)
  Enter one item per line. Mix full URLs (e.g., https://www.reddit.com/r/news/), subreddit names (e.g., news or r/news), or search keywords (e.g., artificial intelligence). Duplicate subreddits are merged.
  Default: none (required)
- **maxPosts** (integer)
  Maximum number of posts to scrape per subreddit or keyword (1–1000).
  Default: 50
- **maxComments** (integer)
  Maximum comments to fetch for each post (0–1000). Set to 0 to skip comments and return only post metadata.
  Default: 100
- **sortOrder** (string)
  How Reddit should sort posts: hot (trending), new (latest), top (most upvoted), rising (gaining traction).
  Allowed values: "hot", "new", "top", "rising"
  Default: "top"
- **timeFilter** (string)
  Time range for results. Only applies when sortOrder is top or rising; ignored for hot and new.
  Allowed values: "hour", "day", "week", "month", "year", "all"
  Default: "week"
- **proxyConfiguration** (object)
  Choose which proxies to use. If Reddit blocks a request, the actor automatically falls back: no proxy → datacenter → residential. Recommended for large runs or when you hit blocks.
  Default: { "useApifyProxy": false }
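For large runs you would typically enable Apify Proxy explicitly rather than wait for fallback. One possible configuration, assuming the standard Apify apifyProxyGroups option (group availability depends on your Apify plan):

```json
{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```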
Example JSON output item
```json
{
  "subreddit": "news",
  "title": "Example post title",
  "author": "username",
  "score": 156,
  "num_comments": 42,
  "created_utc": 1703123456.789,
  "permalink": "https://www.reddit.com/r/news/comments/abc123/...",
  "body": "Post content...",
  "thumbnail_url": "https://...",
  "image_url": "https://...",
  "comments": [
    {
      "author": "commenter1",
      "body": "Comment text...",
      "score": 23,
      "created_utc": 1703123456.789,
      "replies": []
    }
  ],
  "post_id": "abc123",
  "success": true,
  "error_message": null
}
```
Notes:
- The comments field contains nested replies (recursive structure).
- Some fields may be "Unknown" or null if not provided by Reddit for a given post.
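Since created_utc arrives as a Unix timestamp, a small stdlib helper (field name from the output above; the function itself is illustrative) can convert it to ISO-8601 while tolerating missing values:

```python
from datetime import datetime, timezone

def created_at_iso(item):
    """Convert a dataset item's created_utc (Unix seconds, UTC) to an
    ISO-8601 string, returning None when Reddit did not provide it."""
    ts = item.get("created_utc")
    if ts is None:
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
```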
FAQ
Is there a free tier to try it?
Yes. You can run small jobs on Apifyβs free plan to evaluate the actor before scaling up. Larger workloads may require enabling proxies for reliability.
Does it include comments and replies?
Yes. Set maxComments > 0 to fetch nested comment threads. If you set maxComments to 0, the actor returns only post metadata without comments.
Can I target multiple subreddits or keywords in one run?
Yes. Add as many as you need to startUrls; you can mix subreddit names, full Reddit URLs, and search keywords in the same list.
How does it handle blocks or rate limits?
The actor automatically falls back from no proxy → datacenter → residential and retries on common errors (403/429, 5xx, timeouts, connection/SSL issues).
Which formats can I export to?
You can export the dataset to JSON or CSV from Apify, or access results programmatically via the Apify API.
Can developers integrate this with Python or other workflows?
Yes. Fetch the dataset via the Apify API and plug it into your Python Reddit scraper, data pipelines, or automation tools like Make, n8n, and Zapier.
What types of sources can I input?
You can input subreddits (e.g., news or r/technology), full Reddit URLs, or search keywords to scrape Reddit search results directly.
Does it require a browser or login?
No. This actor collects public Reddit data efficiently without a browser or login, focusing on structured, reliable output.
Closing CTA / Final thoughts
Reddit Posts Scraper is built for structured extraction of public Reddit posts and comment threads at scale. With flexible inputs, sorting and time filters, scalable limits, proxy fallback, and robust retries, it delivers reliable datasets for analysis and automation. It is ideal for marketers, developers, data analysts, and researchers who need a dependable Reddit post extractor with JSON/CSV exports and API access. Use the Apify API to wire it into your pipelines or trigger runs from your Python workflows, and start extracting smarter Reddit insights today.