Reddit Scraper
Pricing
$19.99/month + usage
🔎 Reddit Scraper (reddit-scraper) extracts posts, comments & metadata from subreddits, users and threads — keywords, timestamps, scores & links. 📤 Export JSON/CSV. 🚀 Ideal for market research, social listening, academic studies & content discovery.
Rating
0.0 (0 reviews)
Developer
Scrapium
Actor stats
- Bookmarked: 0
- Total users: 2
- Monthly active users: 1
- Last modified: 6 days ago
Reddit Scraper
Reddit Scraper is an Apify actor that extracts Reddit posts, comment threads, subreddit listings/metadata, and user profiles using Reddit’s public JSON API — no login required. It solves the pain of manually collecting Reddit data by automating discovery and extraction at scale. As a Reddit API scraper and Reddit web scraper built in Python, it’s ideal for marketers, developers, data analysts, and researchers who need to scrape Reddit posts, scrape subreddit posts, or scrape Reddit comments for social listening, market research, and academic studies. Run it once or schedule it to power always-on Reddit data extraction pipelines. 🚀
What data / output can you get?
Here are the key fields the actor returns in the dataset. Values below reflect real field names and structures used by the actor.
| Field | Description | Example value |
|---|---|---|
| metadata.timestamp | ISO UTC timestamp of the run summary item | 2026-04-12T13:07:24Z |
| metadata.config.maxItemsToSave | Global limit applied to certain queries | 10 |
| data.search. | Count of posts matching a search term | 25 |
| data.search. | Post title from search results | “Best Python tips for data viz” |
| data.subreddit. | Whether a subreddit listing item is a self post | true |
| data.posts. | Post upvote ratio | 0.92 |
| data.posts. | Nested replies array for each comment | [ { … }, … ] |
| data.users. | Sum of link and comment karma | 15342 |
| data.users. | Link to a user’s submitted post | https://reddit.com/r/python/comments/abc123/... |
| data.users. | Truncated comment body (200 chars) | “I’d recommend using aiohttp with backoff…” |
| data.users. | Count of items in overview (if fetched) | 40 |
| data.subreddit_info.popular[].subscribers | Subscriber count for popular subreddits | 845321 |
Notes:
- The actor pushes streaming items for each section (e.g., per search term, per subreddit, per post, per user, subreddit info batches), and finally pushes a single aggregated “summary” object with metadata and a nested data object. You can export results to JSON or CSV directly from the Apify dataset.
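Because each streamed item carries a `type` field while the final summary does not, downstream code can split a dataset into per-section buckets with a few lines. A minimal sketch in Python, using item shapes taken from this README (the summary is identified by the absence of `type` — an assumption based on the examples below, not a documented contract):

```python
from collections import defaultdict

def group_items(items):
    """Split dataset items into per-type buckets plus the final summary.

    Items with a "type" key (e.g. "search", "subreddit", "post", "user",
    "subreddit_info") are bucketed; the single item without "type" is
    treated as the aggregated summary object.
    """
    buckets = defaultdict(list)
    summary = None
    for item in items:
        if "type" in item:
            buckets[item["type"]].append(item)
        else:
            summary = item  # final aggregated object with metadata + data
    return dict(buckets), summary

# Example with shapes mirroring this README's output section:
items = [
    {"type": "search", "query": "data visualization", "total": 3},
    {"type": "post", "postId": "abc123", "total_comments_parsed": 1},
    {"metadata": {"timestamp": "2026-04-12T13:07:24Z"}, "data": {}},
]
buckets, summary = group_items(items)
print(sorted(buckets))  # ['post', 'search']
```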
Key features
- 🌐 No-login Reddit data extraction
- Uses Reddit’s public JSON API to collect public data; perfect for a Reddit scraper Python workflow without cookies or auth.
- 🔗 Bulk Start URLs (posts, subreddits, users)
- Paste multiple Reddit URLs to scrape Reddit posts, scrape subreddit posts by sort, and collect user profiles with submitted posts and comments.
- 🔎 Flexible search with sort & time filters
- Run keyword searches across Reddit or within a specific community, with sort (relevance/new/hot/top/comments) and time filter (hour/day/week/month/year/all).
- 🧵 Post + full comment tree (depth control)
- Capture a post and its nested comments, with configurable max depth to tailor “Reddit comment extractor” needs.
- 👤 User profiles, submissions, and comments
- Fetch profile karma and history with per-user item limits; optionally add a concise overview summary.
- 🏘️ Subreddit listings & metadata
- Collect hot/new/top/rising/controversial listings and enrich with subreddit_info: popular, new, and specific communities’ details.
- 🧰 Limits & performance controls
- Cap items per run, per page, and comments per page; set delays between requests; constrain comment tree depth for predictable output sizes.
- 🔄 Smart proxy fallback for reliability
- Starts direct; on blocks automatically falls back to datacenter, then residential proxies (with retries) to keep your Reddit crawler running smoothly.
- 💾 Developer friendly outputs
- Clean JSON structures ready for pipelines, dashboards, or ML — ideal for Reddit data extraction and Reddit sentiment analysis scraper use cases.
- 📤 Easy export
- Export datasets to JSON/CSV via the Apify platform for downstream analytics and automation.
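The proxy fallback described above follows a simple escalation pattern: try direct, then datacenter, then residential, with retries at each tier. A hedged sketch of that pattern (the `fetch_via` callable and tier names are illustrative, not the actor's actual internals):

```python
def fetch_with_fallback(url, fetch_via,
                        tiers=("direct", "datacenter", "residential"),
                        retries=2):
    """Try each connection tier in order; escalate after `retries` failures.

    `fetch_via(url, tier)` stands in for whatever HTTP call is used; it
    should raise on a block or rate limit and return a response otherwise.
    """
    last_error = None
    for tier in tiers:
        for _ in range(retries):
            try:
                return tier, fetch_via(url, tier)
            except Exception as exc:  # blocked or rate-limited at this tier
                last_error = exc
    raise RuntimeError(f"all tiers failed for {url}") from last_error
```

A real implementation would also remember the first working tier and reuse it for the rest of the run, which is the behavior this actor documents.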
How to use Reddit Scraper - step by step
1. Sign in to Apify Console.
2. Open the Reddit Scraper actor (reddit-scraper) from your Actors.
3. Add input:
   - Start URLs: paste Reddit post URLs (/comments/…), subreddit URLs (/r/…), or user URLs (/user/…).
   - Or provide Search Term(s) and enable “Ignore start URLs” to run search-only jobs.
4. Configure key settings:
   - Sort and time filters for search.
   - Limits: maximum items to save, posts/comments per page, max comment depth, and request delay.
   - Toggles: enable/disable subreddit posts, post + comments, user scraper, and subreddit info.
   - Proxy settings: leave off to start direct; the actor automatically falls back to datacenter → residential proxies if blocked.
5. Start the run and watch progress in the Log. The actor streams items to the dataset as it completes each section.
6. Review results in the Dataset. You’ll see per-section items (e.g., type: “search”, “subreddit”, “post”, “user”, “subreddit_info”) and a final aggregated object with metadata and nested “data”.
7. Export your data. Download JSON/CSV from the Dataset tab, or use the Apify API in your Reddit API scraper pipeline.
Pro tip: Schedule recurring runs for ongoing monitoring, or connect the Apify Dataset API to your Reddit scraper Python workflows and data warehouses.
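The same steps can be driven from Python via the Apify API. A sketch, assuming the `apify-client` package and an illustrative actor ID (`scrapium/reddit-scraper` — check your Apify Console for the real ID and your API token); the `build_search_input` helper is hypothetical, but its field names come from this README:

```python
def build_search_input(terms, community=None, max_items=10):
    """Assemble a search-only run input using field names from this README."""
    run_input = {
        "searchTerms": list(terms),
        "ignoreStartUrls": True,   # search-only mode
        "enableSearch": True,
        "sortSearch": "new",
        "timeFilter": "all",
        "maxItemsToSave": max_items,
    }
    if community:
        run_input["searchCommunity"] = community  # restrict to one subreddit
    return run_input

# Uncomment to actually start a run (requires an Apify token):
# from apify_client import ApifyClient
# client = ApifyClient("<APIFY_TOKEN>")
# run = client.actor("scrapium/reddit-scraper").call(
#     run_input=build_search_input(["asyncio"], community="python"))
# items = client.dataset(run["defaultDatasetId"]).list_items().items
```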
Use cases
| Use case name | Description |
|---|---|
| Market research & trend tracking | Analyze topics, upvotes, and comment volume to quantify interest and sentiment in your niche. |
| Social listening for brands | Monitor subreddit posts and comment threads to surface complaints, ideas, and feature requests. |
| Academic & policy research | Collect reproducible datasets of posts and comments for studies, leveraging a structured Reddit data extractor. |
| Competitive intelligence | Track competitor communities and product feedback across relevant subreddits. |
| Content strategy & SEO | Discover high-performing threads and questions to inform content calendars and keyword targeting. |
| Customer support insights | Harvest user pain points from comments for faster triage and product improvements. |
| Data engineering pipelines | Build automated Reddit crawler pipelines using Apify’s API and export JSON/CSV into your analytics stack. |
Why choose Reddit Scraper?
This production-ready Reddit scraping tool emphasizes precision, automation, and reliability.
- ✅ Accurate JSON from public endpoints: Clean, structured fields straight from Reddit’s public JSON API.
- 🌍 Scalable and flexible: Bulk Start URLs and keyword searches with configurable limits and depths.
- 🧑💻 Developer-ready: Structured outputs fit data pipelines; perfect for Reddit scraper Python integration and API-driven workflows.
- 🔐 Safe by design: No login required; collects public data only.
- 🔄 Resilient infrastructure: Automatic proxy fallback (direct → datacenter → residential) keeps runs stable under rate limiting.
- 💸 Cost-effective alternative: Avoid brittle browser extensions and manual copy-paste with a robust, server-side Reddit web scraper.
Bottom line: It’s a reliable Reddit post scraper and Reddit data extraction tool that balances control, scale, and clean outputs.
Is it legal / ethical to use Reddit Scraper?
Yes — when used responsibly. This actor accesses publicly available Reddit data without authentication and does not target private content. You should:
- Scrape only public data.
- Respect Reddit’s terms and rate limits.
- Observe applicable data protection laws (e.g., GDPR, CCPA) and internal policies.
- Avoid misuse (e.g., unsolicited outreach/spam). Always verify compliance with your legal team for your specific use case.
Input parameters & output format
Example JSON input
```json
{
  "startUrls": [
    "https://www.reddit.com/r/python/",
    "https://www.reddit.com/r/dataisbeautiful/",
    "https://www.reddit.com/r/Python/comments/abc123/example_post_title/",
    "https://www.reddit.com/user/spez/"
  ],
  "searchTerms": ["data visualization", "asyncio"],
  "searchCommunity": "python",
  "ignoreStartUrls": false,
  "sortSearch": "new",
  "timeFilter": "all",
  "enableSearch": true,
  "enableSubreddit": true,
  "sortSubreddit": "hot",
  "enablePost": true,
  "enableUser": true,
  "enableSubredditInfo": true,
  "skipComments": false,
  "skipUserPosts": false,
  "skipCommunity": false,
  "maxItemsToSave": 10,
  "limitPostsPerPage": 10,
  "limitCommentsPerPage": 10,
  "limitCommunityPages": 2,
  "limitUserPages": 2,
  "pageScrollTimeout": 40,
  "maxCommentDepth": 5,
  "maxItemsPerUser": 20,
  "fetchUserProfile": true,
  "fetchUserSubmitted": true,
  "fetchUserComments": true,
  "fetchUserOverview": false,
  "fetchPopularSubreddits": false,
  "fetchNewSubreddits": false,
  "maxSubredditsInfo": 25,
  "requestDelaySeconds": 1,
  "proxyConfiguration": { "useApifyProxy": false }
}
```
All input fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| startUrls | array | Yes | — | One or more Reddit URLs to scrape. Supports bulk. Accepts post (/comments/…), subreddit (/r/…), and user (/user/…) URLs. |
| skipComments | boolean | No | false | Do not fetch comments for any post URLs; only post-level data is collected. |
| skipUserPosts | boolean | No | false | Ignore user profile URLs; no user data is scraped. |
| skipCommunity | boolean | No | false | Do not fetch subreddit/community metadata. |
| searchTerms | array | No | — | One or more search phrases. Used in search mode (when no Start URLs are provided, or “Ignore start URLs” is enabled). |
| searchCommunity | string | No | — | Optional subreddit name (without r/). Restricts search to a specific community. |
| ignoreStartUrls | boolean | No | false | Skip all Start URLs and run search-only mode (requires at least one Search Term). |
| searchForPosts | boolean | No | true | Include posts in search results. |
| searchForComments | boolean | No | false | Reserved for future use: include comments in the search scope. |
| searchForCommunities | boolean | No | false | Reserved for future use: include communities in the search scope. |
| searchForUsers | boolean | No | false | Reserved for future use: include users in the search scope. |
| sortSearch | string | No | new | Order search results: relevance, new, hot, top, comments. |
| timeFilter | string | No | all | Search time window: hour, day, week, month, year, all. |
| filterByDate | string | No | — | Optional absolute or relative filter for posts (e.g., 2024-01-15 or “2 weeks”). |
| enableSearch | boolean | No | true | Run Reddit search when in search mode and terms are provided. |
| enableSubreddit | boolean | No | true | Process subreddit URLs to fetch listings by sort. |
| sortSubreddit | string | No | hot | Subreddit listing sort: hot, new, top, rising, controversial. |
| enablePost | boolean | No | true | Process post URLs to fetch full post and comment tree (unless Skip comments is on). |
| enableUser | boolean | No | true | Process user URLs to fetch profile, submitted posts, and comments. |
| enableSubredditInfo | boolean | No | true | Fetch subreddit metadata for popular/new/specific communities. |
| maxItemsToSave | integer | No | 10 | Global maximum number of items to collect across sources. |
| limitPostsPerPage | integer | No | 10 | Maximum posts to fetch from a single subreddit listing page. |
| postDateLimit | string | No | — | Only include posts created on or after this date (YYYY-MM-DD). |
| limitCommentsPerPage | integer | No | 10 | Maximum number of comments retrieved per post. |
| limitCommunityPages | integer | No | 2 | Maximum listing pages to paginate for each subreddit URL. |
| limitUserPages | integer | No | 2 | Maximum pages to fetch for user submissions and comments. |
| pageScrollTimeout | integer | No | 40 | Timeout in seconds for page/scroll operations. |
| maxCommentDepth | integer | No | 5 | Maximum nested comment depth to parse (1–20). |
| maxItemsPerUser | integer | No | 20 | Max submitted posts and comments per user profile. |
| fetchUserProfile | boolean | No | true | Fetch user profile details (name, karma, created). |
| fetchUserSubmitted | boolean | No | true | Fetch posts submitted by each user. |
| fetchUserComments | boolean | No | true | Fetch comments made by each user. |
| fetchUserOverview | boolean | No | false | Fetch combined user overview and add summary counts. |
| fetchPopularSubreddits | boolean | No | false | Fetch Reddit’s popular subreddit list. |
| fetchNewSubreddits | boolean | No | false | Fetch recently created subreddits list. |
| maxSubredditsInfo | integer | No | 25 | Maximum number of subreddits for popular/new lists. |
| requestDelaySeconds | integer | No | 1 | Delay between HTTP requests to Reddit in seconds. |
| proxyConfiguration | object | No | { "useApifyProxy": false } | Configure Apify Proxy. Actor auto-falls back to datacenter → residential on blocks if not set. |
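The `filterByDate` field accepts either an absolute date (`2024-01-15`) or a relative span (`"2 weeks"`). The actor's exact parsing is internal; as an illustration only, here is how such a value might be interpreted as a UTC cutoff (unit day-counts are assumptions):

```python
from datetime import datetime, timedelta, timezone

def parse_filter_by_date(value, now=None):
    """Interpret a filterByDate-style value as a UTC cutoff datetime.

    Accepts an absolute date ("2024-01-15") or a relative span such as
    "2 weeks" / "3 days". Mirrors the documented examples; the actor's
    internal parsing may differ.
    """
    now = now or datetime.now(timezone.utc)
    units = {"day": 1, "week": 7, "month": 30, "year": 365}  # assumed
    parts = value.split()
    if len(parts) == 2 and parts[0].isdigit():
        unit = parts[1].rstrip("s")  # accept "week" or "weeks"
        if unit in units:
            return now - timedelta(days=int(parts[0]) * units[unit])
    return datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
```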
Output format
During the run, the actor streams items to the dataset by section and then pushes a final aggregated object. Examples:
- Search item
```json
{
  "type": "search",
  "query": "data visualization",
  "community": "python",
  "total": 3,
  "posts": [
    { "title": "Matplotlib vs Plotly for dashboards", "subreddit": "Python", "author": "chart_ninja", "score": 128, "num_comments": 42, "url": "https://example.com/post-url", "permalink": "https://reddit.com/r/Python/comments/abc123/...", "created_utc": 1712875200 }
  ]
}
```
- Subreddit listing item
```json
{
  "type": "subreddit",
  "source": "python",
  "total": 2,
  "posts": [
    { "title": "Asyncio tips for web scraping", "author": "py_async", "score": 210, "num_comments": 33, "url": "https://example.com/post-url", "permalink": "https://reddit.com/r/Python/comments/def456/...", "created_utc": 1712878800, "is_self": true, "selftext": "Here are some tips..." }
  ]
}
```
- Post with comments item
```json
{
  "type": "post",
  "postId": "abc123",
  "post": { "id": "abc123", "title": "Show HN: My Python scraper", "author": "dev_user", "subreddit": "Python", "score": 512, "upvote_ratio": 0.95, "num_comments": 87, "created_utc": 1712871000, "url": "https://example.com/post-url", "permalink": "https://reddit.com/r/Python/comments/abc123/...", "is_self": true, "selftext": "I built a scraper...", "link_flair_text": "Project", "over_18": false },
  "comments": [
    { "id": "c1", "author": "commenter1", "body": "Nice work!", "score": 15, "created_utc": 1712874600, "depth": 0, "replies": [] }
  ],
  "total_comments_parsed": 1
}
```
- User item
```json
{
  "type": "user",
  "username": "spez",
  "profile": { "name": "spez", "link_karma": 1000, "comment_karma": 500, "total_karma": 1500, "created_utc": 1133212800 },
  "submitted": [
    { "title": "Announcement", "subreddit": "announcements", "score": 999, "created_utc": 1712870000, "permalink": "https://reddit.com/r/announcements/comments/ghi789/..." }
  ],
  "comments": [
    { "body": "Thanks for the feedback.", "subreddit": "redditdev", "score": 42, "created_utc": 1712873600, "permalink": "https://reddit.com/r/redditdev/comments/jkl012/..." }
  ],
  "overview": null
}
```
- Subreddit info items
```json
{
  "type": "subreddit_info",
  "kind": "popular",
  "count": 2,
  "subreddits": [
    { "name": "python", "title": "Python", "description": "News about the programming language Python", "subscribers": 1000000, "active_users": 8500, "created_utc": 1200000000, "over_18": false, "subreddit_type": "public", "url": "https://reddit.com/r/python/", "icon_img": "", "banner_img": "", "community_icon": "" }
  ]
}
```
- Final aggregated object (summary)
```json
{
  "metadata": {
    "timestamp": "2026-04-12T13:07:24Z",
    "config": { "maxItemsToSave": 10, "limitPostsPerPage": 10, "maxCommentDepth": 5 }
  },
  "data": {
    "search": {
      "data visualization": {
        "total": 3,
        "posts": [
          { "title": "Matplotlib vs Plotly for dashboards", "subreddit": "Python", "author": "chart_ninja", "score": 128, "num_comments": 42, "url": "https://example.com/post-url", "permalink": "https://reddit.com/r/Python/comments/abc123/...", "created_utc": 1712875200 }
        ]
      }
    },
    "subreddit": {
      "python": {
        "total": 2,
        "posts": [
          { "title": "Asyncio tips for web scraping", "author": "py_async", "score": 210, "num_comments": 33, "url": "https://example.com/post-url", "permalink": "https://reddit.com/r/Python/comments/def456/...", "created_utc": 1712878800, "is_self": true, "selftext": "Here are some tips..." }
        ]
      }
    },
    "posts": {
      "abc123": {
        "post": { "id": "abc123", "title": "Show HN: My Python scraper", "author": "dev_user", "subreddit": "Python", "score": 512, "upvote_ratio": 0.95, "num_comments": 87, "created_utc": 1712871000, "url": "https://example.com/post-url", "permalink": "https://reddit.com/r/Python/comments/abc123/...", "is_self": true, "selftext": "I built a scraper...", "link_flair_text": "Project", "over_18": false },
        "comments": [
          { "id": "c1", "author": "commenter1", "body": "Nice work!", "score": 15, "created_utc": 1712874600, "depth": 0, "replies": [] }
        ],
        "total_comments_parsed": 1
      }
    },
    "users": {
      "spez": {
        "profile": { "name": "spez", "link_karma": 1000, "comment_karma": 500, "total_karma": 1500, "created_utc": 1133212800 },
        "submitted": [
          { "title": "Announcement", "subreddit": "announcements", "score": 999, "created_utc": 1712870000, "permalink": "https://reddit.com/r/announcements/comments/ghi789/..." }
        ],
        "comments": [
          { "body": "Thanks for the feedback.", "subreddit": "redditdev", "score": 42, "created_utc": 1712873600, "permalink": "https://reddit.com/r/redditdev/comments/jkl012/..." }
        ],
        "overview": { "total_items": 0, "posts_count": 0, "comments_count": 0 }
      }
    },
    "subreddit_info": {
      "popular": [
        { "name": "python", "title": "Python", "description": "News about the programming language Python", "subscribers": 1000000, "active_users": 8500, "created_utc": 1200000000, "over_18": false, "subreddit_type": "public", "url": "https://reddit.com/r/python/", "icon_img": "", "banner_img": "", "community_icon": "" }
      ],
      "new": [],
      "specific": {
        "python": { "name": "python", "title": "Python", "description": "News about the programming language Python", "subscribers": 1000000, "active_users": 8500, "created_utc": 1200000000, "over_18": false, "subreddit_type": "public", "url": "https://reddit.com/r/python/", "icon_img": "", "banner_img": "", "community_icon": "" }
      }
    }
  }
}
```
Notes:
- Depending on your toggles, some optional objects may be missing or null (e.g., user.profile when disabled; overview when off).
- Export JSON or CSV from the Dataset UI or via the Apify API.
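The nested `replies` arrays in post items form a tree; for analytics you often want a flat list. A minimal sketch using the comment shape shown in the examples above (the helper itself is illustrative, not part of the actor):

```python
def flatten_comments(comments, max_depth=None):
    """Depth-first flatten of a nested comment tree into a flat list.

    Each flat copy keeps the comment's own fields; the "replies" key is
    recursed into (and dropped from the copies), stopping at `max_depth`
    levels when one is given.
    """
    flat = []

    def walk(nodes, depth):
        for node in nodes:
            if max_depth is not None and depth >= max_depth:
                return
            flat.append({k: v for k, v in node.items() if k != "replies"})
            walk(node.get("replies", []), depth + 1)

    walk(comments, 0)
    return flat

tree = [{"id": "c1", "body": "Nice work!", "replies": [
    {"id": "c2", "body": "Agreed", "replies": []},
]}]
print([c["id"] for c in flatten_comments(tree)])  # ['c1', 'c2']
```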
FAQ
Do I need to log in or use API keys to run this Reddit Scraper?
No. The actor uses Reddit’s public JSON API and does not require login or API keys. It’s a Reddit scraper Python solution that works on publicly available data only.
Can it scrape Reddit comments as well as posts?
Yes. When “Enable post + comments” is on and “Skip comments” is off, it fetches full posts and parses comment trees up to the “Max comment tree depth” you set.
How do I restrict search to a specific subreddit?
Provide your Search Term(s) and set “Community (optional)” to the subreddit name (without r/). Enable “Ignore start URLs” if you want search-only mode.
What limits control the volume of data?
Use “Maximum number of items to be saved,” “Limit of posts scraped inside a single page,” “Limit of comments scraped inside a single page,” “Max items per user,” and “Max comment tree depth.” You can also adjust “Delay between requests (seconds)” to manage pacing and reduce rate limiting.
How reliable is it when Reddit rate-limits or blocks requests?
The actor starts with direct connections and automatically falls back to Apify datacenter proxy, then residential proxy (with retries) if blocked. Once a working proxy is found, it’s used for all remaining requests.
What output formats are supported?
Data is stored in the Apify Dataset. You can export to JSON or CSV and consume via the Apify API for integration into your Reddit API scraper pipelines.
Can I integrate this with my own Python scripts or data pipelines?
Yes. Access the Dataset via the Apify API and process outputs in your scripts. The structured JSON is designed for easy ingestion in automation, analytics, or ML workflows.
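As one example of downstream processing, a standard-library sketch that renders a list of post dicts (as emitted in the output examples above) to CSV text; the column selection is an assumption, pick whichever fields you need:

```python
import csv
import io

def posts_to_csv(posts, fields=("title", "subreddit", "author",
                                "score", "num_comments", "permalink")):
    """Render post dicts as CSV text, ignoring any extra keys."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(posts)
    return buf.getvalue()

posts = [{"title": "Matplotlib vs Plotly for dashboards", "subreddit": "Python",
          "author": "chart_ninja", "score": 128, "num_comments": 42,
          "permalink": "https://reddit.com/r/Python/comments/abc123/...",
          "created_utc": 1712875200}]
print(posts_to_csv(posts).splitlines()[0])
# title,subreddit,author,score,num_comments,permalink
```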
Is it safe and compliant to use?
Yes — when used responsibly on public content. The actor does not access private or authenticated data. Always respect Reddit’s terms and applicable data protection laws.
Final thoughts
Reddit Scraper is built for scalable, structured Reddit data extraction without login. With bulk URL support, flexible search, full comment parsing, user activity capture, and resilient proxy fallback, it’s ideal for marketers, developers, data analysts, and researchers. Export clean JSON/CSV, connect via the Apify API, and automate your Reddit web scraper workflows with confidence. Start extracting smarter insights from Reddit today.