Reddit Scraper

Pricing

from $5.00 / 1,000 results

Reddit scraper that extracts posts, comments, communities, and user profiles from any subreddit or search query, so marketers and researchers can collect structured Reddit data without API keys or login.

Developer: Kawsar

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: a day ago

Reddit Scraper: Extract Posts, Comments, Communities, and Users

Provide Reddit URLs to scrape directly, or use the search fields to find content by keyword.

Reddit Scraper is an Apify actor that pulls data from Reddit's public JSON endpoints without needing an account or API key. Give it a list of start URLs in startUrls (subreddits, post links, user profiles, or search pages) or keywords in searches. Results are returned as structured records ready for analysis, research, or integration.

Use cases

  • Market research: track what Reddit communities say about your product, brand, or competitors by scraping keyword searches on demand
  • Content discovery: monitor top posts across subreddits to spot trending topics before they go mainstream
  • Sentiment analysis: collect post and comment data for NLP pipelines to measure audience sentiment over time
  • Academic research: gather Reddit posts and replies for social media studies without registering for official API access
  • Lead generation: find active users in niche communities and extract profile information for outreach campaigns
  • Competitive intelligence: scrape competitor mentions across subreddits to understand how the community perceives them

What data does this actor extract?

Results are stored in a dataset. Each item has a dataType field (post, comment, community, or user) so you can filter by content type after the run.
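As a minimal sketch of that filtering step, the snippet below groups already-downloaded items by dataType. The function name and the sample records are illustrative assumptions; only the dataType field and its four values come from this page.

```python
from collections import defaultdict


def group_by_data_type(items):
    """Group dataset items by their dataType field."""
    groups = defaultdict(list)
    for item in items:
        # Items without a dataType land under "unknown" instead of being dropped.
        groups[item.get("dataType", "unknown")].append(item)
    return dict(groups)


# Illustrative records with most fields omitted.
sample_items = [
    {"id": "t3_144w7sn", "dataType": "post"},
    {"id": "t1_jnhqrgg", "dataType": "comment"},
    {"id": "2fwo", "dataType": "community"},
]
grouped = group_by_data_type(sample_items)
print({kind: len(rows) for kind, rows in grouped.items()})
```

Grouping once and then reading grouped["post"] or grouped["comment"] avoids rescanning the full dataset for each content type.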

Example Reddit post

{
  "id": "t3_144w7sn",
  "parsedId": "144w7sn",
  "url": "https://www.reddit.com/r/learnprogramming/comments/144w7sn/",
  "username": "kawsarlog",
  "itemTitle": "Best resources for learning web scraping in 2025",
  "communityName": "r/learnprogramming",
  "parsedCommunityName": "learnprogramming",
  "body": "I have been learning Python for 6 months...",
  "numberOfComments": 42,
  "upVotes": 318,
  "isVideo": false,
  "isAd": false,
  "over18": false,
  "createdAt": "2024-06-09T05:23:15.000Z",
  "scrapedAt": "2025-03-14T08:00:00.000Z",
  "dataType": "post"
}

Example Reddit comment

{
  "id": "t1_jnhqrgg",
  "parsedId": "jnhqrgg",
  "url": "https://www.reddit.com/r/learnprogramming/comments/144w7sn/.../jnhqrgg/",
  "parentId": "t3_144w7sn",
  "username": "dev_user",
  "category": "learnprogramming",
  "communityName": "r/learnprogramming",
  "body": "httpx and BeautifulSoup are a great combo for static sites.",
  "upVotes": 27,
  "numberOfReplies": 3,
  "createdAt": "2024-06-09T06:14:00.000Z",
  "scrapedAt": "2025-03-14T08:00:01.000Z",
  "dataType": "comment"
}
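Because every comment carries a parentId pointing at its parent post or comment, a thread can be rebuilt after the run. The following is a rough sketch under that assumption; the helper names and the second sample comment (other_user, t1_example1) are hypothetical and not taken from a real run.

```python
from collections import defaultdict


def index_replies(comments):
    """Map each parentId to the comments that reply directly to it."""
    children = defaultdict(list)
    for comment in comments:
        children[comment["parentId"]].append(comment)
    return children


def render_thread(children, parent_id, depth=0):
    """Return indented 'username: body' lines for everything below parent_id."""
    lines = []
    for comment in children.get(parent_id, []):
        lines.append("  " * depth + f'{comment["username"]}: {comment["body"]}')
        # Recurse using this comment's own id to pick up its replies.
        lines.extend(render_thread(children, comment["id"], depth + 1))
    return lines


comments = [
    {"id": "t1_jnhqrgg", "parentId": "t3_144w7sn", "username": "dev_user",
     "body": "httpx and BeautifulSoup are a great combo for static sites."},
    # Hypothetical reply, added only to show nesting.
    {"id": "t1_example1", "parentId": "t1_jnhqrgg", "username": "other_user",
     "body": "Agreed."},
]
thread_lines = render_thread(index_replies(comments), "t3_144w7sn")
print("\n".join(thread_lines))
```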

Example Reddit community

{
  "id": "2fwo",
  "name": "t5_2fwo",
  "itemTitle": "Learn Programming",
  "url": "https://www.reddit.com/r/learnprogramming/",
  "communityName": "r/learnprogramming",
  "communityDescription": "A subreddit for all questions related to programming in any language.",
  "numberOfMembers": 5800000,
  "over18": false,
  "createdAt": "2010-01-25T00:00:00.000Z",
  "scrapedAt": "2025-03-14T08:00:02.000Z",
  "dataType": "community"
}

Example Reddit user

{
  "id": "c3h2qmv",
  "url": "https://www.reddit.com/user/kawsarlog/",
  "username": "kawsarlog",
  "postKarma": 1204,
  "commentKarma": 876,
  "userDescription": "Building things on the web.",
  "over18": false,
  "createdAt": "2020-04-10T15:13:39.000Z",
  "scrapedAt": "2025-03-14T08:00:03.000Z",
  "dataType": "user"
}

Input

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array | | Reddit URLs to scrape: subreddits, posts, user profiles, or search pages. |
| searches | array | | Keywords to search Reddit for. Returns matching posts, communities, and users. |
| scrapeType | string | posts | What data to extract from the provided URLs or keywords: posts, comments, communities, users. |
| searchCommunityName | string | | Restrict keyword search to a specific subreddit (name only, without r/). |
| sort | string | hot | Sort order: hot, new, top, relevance, comments. |
| time | string | all | Time filter for Top or search: all, hour, day, week, month, year. |
| includeNSFW | boolean | false | Include NSFW content in results. |
| skipComments | boolean | false | Skip comment extraction when scraping posts. Useful when you only need post data. |
| maxItems | integer | 100 | Maximum number of items to collect per run (hard cap: 1000). |
| timeoutSecs | integer | 300 | Overall actor timeout in seconds. |
| requestTimeoutSecs | integer | 30 | Per-request HTTP timeout in seconds. |
| proxyConfiguration | object | Datacenter (Anywhere) | Proxy settings for requests. Supports Datacenter, Residential, Special, and custom proxies. |
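To catch typos before a run is started, the allowed values from the parameter table can be checked locally. This is a sketch only: build_input and its constants are hypothetical helpers, and the rule that at least one URL or keyword is required is an assumption rather than documented actor behavior.

```python
SCRAPE_TYPES = {"posts", "comments", "communities", "users"}
SORTS = {"hot", "new", "top", "relevance", "comments"}
TIMES = {"all", "hour", "day", "week", "month", "year"}
MAX_ITEMS_HARD_CAP = 1000  # hard cap listed for maxItems


def build_input(start_urls=(), searches=(), scrape_type="posts",
                sort="hot", time="all", max_items=100):
    """Return an input dict, rejecting values the parameter table does not list."""
    # Assumption: a run needs at least one URL or keyword to do anything useful.
    if not start_urls and not searches:
        raise ValueError("provide at least one start URL or search term")
    for name, value, allowed in (
        ("scrapeType", scrape_type, SCRAPE_TYPES),
        ("sort", sort, SORTS),
        ("time", time, TIMES),
    ):
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    if max_items < 1:
        raise ValueError("maxItems must be positive")
    return {
        "startUrls": list(start_urls),
        "searches": list(searches),
        "scrapeType": scrape_type,
        "sort": sort,
        "time": time,
        # Clamp locally instead of relying on the actor to enforce the cap.
        "maxItems": min(max_items, MAX_ITEMS_HARD_CAP),
    }


run_input = build_input(searches=["web scraping python"], sort="top",
                        time="month", max_items=5000)
print(run_input["maxItems"])
```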

How to scrape Reddit by URL

Pass any Reddit URL to startUrls. The actor detects the page type automatically.

Supported URL formats:

  • Subreddit: https://www.reddit.com/r/programming/
  • Subreddit with sort: https://www.reddit.com/r/programming/top/
  • Post with comments: https://www.reddit.com/r/learnprogramming/comments/144w7sn/
  • User profile: https://www.reddit.com/user/spez/
  • Search results: https://www.reddit.com/search/?q=data+scraping
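The actor's own detection logic is not published here, but a rough sketch of how the five URL formats above could be told apart by path looks like this. classify_reddit_url is a hypothetical helper for pre-checking your own URL lists, not part of the actor.

```python
from urllib.parse import urlparse


def classify_reddit_url(url):
    """Guess the page type of a Reddit URL from its path segments."""
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if parts[:1] == ["search"]:
        return "search"
    if parts[:1] == ["user"] and len(parts) >= 2:
        return "user"
    if parts[:1] == ["r"] and len(parts) >= 2:
        # /r/<name>/comments/<id>/ is a post; anything else under /r/<name>/
        # (including a sort suffix such as /top/) is treated as a subreddit.
        if len(parts) >= 4 and parts[2] == "comments":
            return "post"
        return "subreddit"
    return "unknown"


for example in (
    "https://www.reddit.com/r/programming/",
    "https://www.reddit.com/r/programming/top/",
    "https://www.reddit.com/r/learnprogramming/comments/144w7sn/",
    "https://www.reddit.com/user/spez/",
    "https://www.reddit.com/search/?q=data+scraping",
):
    print(classify_reddit_url(example), example)
```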

How to search Reddit by keyword

Add one or more terms to searches. Each keyword triggers a Reddit search and returns posts, matching communities, and user profiles. Use searchCommunityName to restrict results to a single subreddit.

Example input

{
  "startUrls": ["https://www.reddit.com/r/datascience/"],
  "searches": ["web scraping python"],
  "sort": "top",
  "time": "month",
  "maxItems": 200,
  "skipComments": false,
  "includeNSFW": false,
  "proxyConfiguration": { "useApifyProxy": true }
}
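An input like the one above can also be submitted from code. The sketch below assumes the apify-client Python package and an APIFY_TOKEN environment variable; the actor ID is a placeholder, since this page does not state it, so copy the real ID from the actor's page in Apify Store.

```python
import os


def collect_items(client, actor_id, run_input):
    """Run the actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main():
    # Imported here so the helper above stays usable without the package.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed variable name
    run_input = {
        "startUrls": ["https://www.reddit.com/r/datascience/"],
        "searches": ["web scraping python"],
        "sort": "top",
        "time": "month",
        "maxItems": 200,
        "proxyConfiguration": {"useApifyProxy": True},
    }
    # Placeholder ID, not the actor's real identifier.
    items = collect_items(client, "<username>/reddit-scraper", run_input)
    print(f"collected {len(items)} items")
```

Call main() once the token is set. collect_items takes the client as an argument so it can be exercised with a stub in tests, without network access.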

Output fields

| Field | Type | Present on | Description |
| --- | --- | --- | --- |
| id | string | all | Full Reddit ID with type prefix (e.g. t3_abc). |
| parsedId | string | all | Short Reddit ID without prefix. |
| url | string | all | Permalink to the item on Reddit. |
| username | string | post, comment, user | Reddit username of the author. |
| dataType | string | all | Item type: post, comment, community, or user. |
| itemTitle | string | post, community | Post title or subreddit display name. |
| communityName | string | post, comment, community | Subreddit name with r/ prefix. |
| parsedCommunityName | string | post, comment, community | Subreddit name without r/ prefix. |
| body | string | post, comment | Post text or comment text. Contains URLs for media posts. |
| numberOfComments | integer | post | Number of comments on the post. |
| upVotes | integer | post, comment | Score (upvotes minus downvotes). |
| isVideo | boolean | post | True if the post has a video attachment. |
| isAd | boolean | post | True if the post is a promoted advertisement. |
| over18 | boolean | all | True if the item is marked NSFW. |
| parentId | string | comment | Reddit ID of the parent post or comment. |
| category | string | comment | Subreddit name the comment belongs to. |
| numberOfReplies | integer | comment | Number of direct replies. |
| name | string | community | Full Reddit internal name (e.g. t5_2fwo). |
| headerImage | string | community | URL of the community header or icon image. |
| communityDescription | string | community | Public subreddit description. |
| numberOfMembers | integer | community | Total subscriber count. |
| userIcon | string | user | URL of the user's profile avatar. |
| postKarma | integer | user | Total post karma. |
| commentKarma | integer | user | Total comment karma. |
| userDescription | string | user | User bio text. |
| createdAt | string | all | ISO 8601 timestamp of when the item was created on Reddit. |
| scrapedAt | string | all | ISO 8601 timestamp of when this actor collected the item. |
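The createdAt and scrapedAt strings can be turned into datetime objects for sorting or age calculations. This sketch assumes every timestamp follows the millisecond-precision UTC form shown in the examples (such as 2024-06-09T05:23:15.000Z); the helper names are illustrative.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(value):
    """Parse a timestamp such as 2024-06-09T05:23:15.000Z into an aware datetime."""
    # The trailing Z means UTC, so attach that timezone explicitly.
    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def age_at_scrape(item):
    """Return how old the item was when the actor collected it."""
    return parse_timestamp(item["scrapedAt"]) - parse_timestamp(item["createdAt"])


post = {
    "createdAt": "2024-06-09T05:23:15.000Z",
    "scrapedAt": "2025-03-14T08:00:00.000Z",
}
print(age_at_scrape(post).days, "days old when scraped")
```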

How it works

  1. The actor reads the startUrls list and the searches keywords from the input.
  2. For each URL, it detects the page type (subreddit, post, user, or search) and calls the appropriate Reddit public JSON endpoint.
  3. For each search keyword, it searches Reddit posts, communities, and users via the search API.
  4. Results are parsed into structured records and pushed to the Apify dataset in real time.
  5. The run stops when maxItems or timeoutSecs is reached, whichever comes first.
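The stopping rule in the last step can be illustrated with a small loop. This is not the actor's source code: collect, its pages argument (any iterable of item lists standing in for paginated responses), and the injectable clock are assumptions made for the sketch.

```python
import time


def collect(pages, max_items, timeout_secs, clock=time.monotonic):
    """Gather items page by page until max_items or timeout_secs is reached."""
    if max_items <= 0:
        return []
    deadline = clock() + timeout_secs
    collected = []
    for page in pages:
        for item in page:
            collected.append(item)
            if len(collected) >= max_items:
                # Item limit reached: stop without requesting another page.
                return collected
        if clock() >= deadline:
            # Time budget spent: keep what has been gathered so far.
            return collected
    return collected


fake_pages = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(collect(fake_pages, max_items=4, timeout_secs=300))
```

Checking the item limit inside the inner loop and the deadline between pages means the loop never fetches a page it cannot use, at the cost of overrunning the deadline by at most one page.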

FAQ

Do I need a Reddit account or API key to use this actor?
No. Reddit Scraper uses Reddit's public JSON API, which works without authentication, so no credentials are required.

Is it legal to scrape Reddit?
Scraping publicly available data is generally legal for personal use, research, and analysis. Review Reddit's terms of service for commercial use cases. Always respect rate limits and avoid collecting personal data beyond what is publicly visible.

Why am I getting fewer results than my maxItems setting?
Reddit's API paginates results. If a subreddit or search has fewer posts than your maxItems limit, the actor returns all available items and stops. The run also stops early if timeoutSecs is reached, and maxItems itself is capped at 1000. To gather more data, widen the time filter (for example, from week to all) or add more start URLs.

Can I export scraped Reddit data to CSV or Google Sheets?
Yes. After a run, you can download the dataset in JSON, CSV, XML, or Excel from the Apify console. You can also connect directly to Google Sheets using the Apify Google Sheets integration.

How do I scrape Reddit comments only?
Point startUrls at a specific post URL. Comments are included by default, and each comment item has dataType set to comment, so you can filter the dataset down to comments after the run. Set skipComments to true if you only want the post itself.

Do I need proxies to scrape Reddit?
For most use cases, datacenter proxies work fine. If Reddit throttles your requests, switch to Residential proxies in the proxy picker. The default proxyConfiguration uses Apify Datacenter proxies automatically.
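On the export question: the console download is the simplest route, but if you already hold the items as JSON you can also write a CSV locally. A minimal sketch, assuming items are flat dicts like the examples on this page; items_to_csv is a hypothetical helper that fills fields missing on a given item type with empty cells.

```python
import csv
import io


def items_to_csv(items):
    """Serialize mixed post/comment/community/user items into one CSV string."""
    # Collect the union of field names in first-seen order, since each
    # dataType carries a different set of fields.
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


csv_text = items_to_csv([
    {"id": "t3_144w7sn", "dataType": "post", "upVotes": 318},
    {"id": "t1_jnhqrgg", "dataType": "comment", "parentId": "t3_144w7sn"},
])
print(csv_text)
```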

Integrations

Connect Reddit Scraper with other apps and services using Apify integrations. You can push results directly to Google Sheets, trigger runs from Zapier or Make, send notifications to Slack, or sync data to Airbyte. Use webhooks to fire actions the moment results are ready.

Reddit Scraper suits anyone who wants structured Reddit data without dealing with the official API's rate limits or authentication setup.