Reddit Intelligence Scraper

Under maintenance

Pricing

from $2.00 / 1,000 results

Collect public Reddit posts, comments, communities, and user profile data from searches, subreddit pages, Reddit URLs, and usernames. Export clean datasets for monitoring, research, and AI workflows.

Rating: 0.0 (0)

Developer: Muhammad Qaseem Iqbal (Maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 12 hours ago

🚀 Reddit Intelligence Scraper

Collect public Reddit posts, comments, communities, and user profile data from searches, subreddit pages, Reddit URLs, and usernames. 🔎 Use it to monitor conversations, research customer opinions, follow trends, and export clean Reddit data into spreadsheets, dashboards, databases, AI workflows, or automation tools. 📊

This Actor is designed to be practical for both non-technical users and data teams. ✅ You can start with a keyword or Reddit URL, choose how many results you want, and download the results from the Apify dataset when the run finishes. 📥

🧠 What does this Actor do?

Reddit Intelligence Scraper turns public Reddit pages into structured data. 🧾 Instead of manually copying posts and comments from Reddit, you can run the Actor and get organized records with useful details such as:

  • 📝 post title, body, author, subreddit, score, comment count, and URL
  • 💬 comment text, author, parent post, score, depth, and timestamp
  • 🏘️ subreddit/community name, description, subscriber count, and metadata
  • 👤 public user profile information, including karma and profile URL
  • 🏷️ optional sentiment labels, content categories, engagement metrics, media links, and raw payloads

No Reddit API key, OAuth setup, or Reddit login is required for supported public pages. 🔓

🎯 Common use cases

  • 📣 Track brand, product, or competitor mentions on Reddit
  • 📅 Monitor subreddit discussions on a schedule
  • 💡 Find customer pain points, feature requests, complaints, and praise
  • 🧵 Collect comments from a specific Reddit thread
  • 🔬 Research topics, communities, trends, and market language
  • 🤖 Build datasets for AI search, RAG, clustering, dashboards, or reports
  • 📤 Export Reddit data to CSV, Excel, Google Sheets, Make, Zapier, n8n, webhooks, or your own API workflow

📦 What Reddit data can it collect?

| Data type | What you can collect |
| --- | --- |
| 📝 Posts | Search results, subreddit listings, direct post URLs, user submitted posts, r/all, and r/popular |
| 💬 Comments | Comment search results and comment threads under posts when comment collection is enabled |
| 🏘️ Communities | Subreddit metadata and community search results |
| 👤 Users | Public Reddit user profile records and optional user activity inputs |

The Actor works with several input styles, so you can start broad with keywords or stay precise with direct Reddit URLs. 🧭

⚡ How to scrape Reddit on Apify

  1. 🖥️ Open the Actor in Apify Console.
  2. ➕ Add at least one source:
    • 🔎 keywords in Search terms
    • 🔗 Reddit links in Direct Reddit URLs
    • 🏘️ subreddit names or URLs in Full subreddit scrape inputs
    • 👤 Reddit usernames or profile URLs in User profile inputs
  3. 🎚️ Set a result limit, such as maxItems.
  4. ⚙️ Choose whether to include comments, media links, sentiment, or other optional data.
  5. ▶️ Click Start.
  6. 📥 Download the results from the Dataset tab as JSON, CSV, Excel, XML, or RSS.

For a quick test, use a small limit such as maxItems: 10. 🧪 For scheduled monitoring, keep the limit modest and run the Actor repeatedly. 📅

🎛️ Input options

You only need one valid source to start. ✅ The most important fields are below.

| Field | Plain-English meaning | Typical use |
| --- | --- | --- |
| 🔎 searchTerms | Keywords or phrases to search across Reddit | Brand monitoring, topic research, competitor tracking |
| 🔗 startUrls | Direct Reddit URLs | Scrape a specific post, subreddit, user page, or Reddit search URL |
| 🏘️ subredditUrls | Subreddit names or URLs | Collect posts from communities such as r/startups |
| 👤 userUrls | Reddit usernames or profile URLs | Collect public user profile information |
| 🎚️ maxItems | Maximum total records to save | Keep tests and production runs under control |
| 💬 crawlCommentsPerPost | Also collect comments under each collected post | Thread research, sentiment, FAQ mining |
| 🧵 maxCommentsPerPost | Comment limit for each post | Prevent very large threads from growing too much |
| 🧭 sort and time | Reddit search ranking and time window | Newest posts, top posts this week, most commented posts, etc. |
| 📍 withinCommunity | Search only inside one subreddit | Search for a topic within a specific community |
| 🖼️ includeMediaLinks | Save image, video, gallery, and outbound link details | Media analysis or content discovery |
| 😊 sentimentAnalysis | Add simple sentiment labels to posts and comments | Positive, negative, neutral, mixed, or uncertain |
| 🏷️ contentAnalysis | Add topic/category labels to post records | Routing, grouping, research, and AI workflows |
| 🛡️ proxyConfiguration | Optional Apify Proxy settings | Use Residential proxy when Reddit blocks cloud traffic |

Advanced settings are available for date filters, comment depth, strict keyword matching, output style, raw data storage, and run reports. 🧰
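
As a rough sketch of how these fields combine, the Python helper below assembles an input object and checks the one rule stated above: at least one source must be set. The helper name `build_input` is hypothetical and not part of the Actor; only the field names come from the table.

```python
from typing import Any

# Source fields taken from the table above; at least one must be non-empty.
SOURCE_FIELDS = ("searchTerms", "startUrls", "subredditUrls", "userUrls")


def build_input(max_items: int = 10, **fields: Any) -> dict[str, Any]:
    """Assemble an Actor input dict (hypothetical helper, not part of the Actor)."""
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    run_input: dict[str, Any] = {"maxItems": max_items, **fields}
    if not any(run_input.get(name) for name in SOURCE_FIELDS):
        raise ValueError("Provide at least one of: " + ", ".join(SOURCE_FIELDS))
    return run_input


# A small test input: recent posts for one keyword, capped at the default 10 records.
quick_test = build_input(searchTerms=["AI video generator"], sort="new", time="week")
```

Catching a missing source locally avoids starting a run that has nothing to scrape.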

🧪 Example inputs

🔎 1. Search recent posts for a keyword

Use this when you want a small sample of recent posts for a topic. ⚡

{
  "searchTerms": ["AI video generator"],
  "sort": "new",
  "time": "week",
  "maxItems": 25,
  "maxPostsPerSearch": 25
}

📣 2. Brand and competitor monitoring

Use this to track mentions and include comments found through Reddit comment search. 📡

{
  "searchTerms": ["Acme AI", "Acme pricing", "Acme alternative"],
  "searchPosts": true,
  "searchComments": true,
  "sort": "new",
  "time": "week",
  "maxItems": 150,
  "maxPostsPerSearch": 50,
  "maxCommentsCount": 50,
  "sentimentAnalysis": true
}

🏘️ 3. Scrape a subreddit

Use this to collect posts from one or more communities. 🧭

{
  "subredditUrls": ["r/startups"],
  "subredditSort": "new",
  "subredditTime": "month",
  "maxItems": 100,
  "maxPostsPerSubreddit": 100
}

🧵 4. Collect a full post thread

Use this when you already know the Reddit post URL and want the discussion under it. 💬

{
  "startUrls": [
    {
      "url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/"
    }
  ],
  "crawlCommentsPerPost": true,
  "maxCommentsPerPost": 500,
  "commentDepthLimit": 0
}

💸 5. Low-cost test run

Use this before a larger run to confirm your input works. ✅

{
  "searchTerms": ["customer support software"],
  "maxItems": 10,
  "maxPostsPerSearch": 10,
  "crawlCommentsPerPost": false,
  "includeMediaLinks": false,
  "saveRawData": false,
  "writeHtmlReport": false
}
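
Any of the inputs above can also be sent from code instead of Apify Console. The sketch below assumes the official `apify-client` Python package and an `APIFY_TOKEN` environment variable. The Actor ID is not stated in this README, so `ACTOR_ID` is a placeholder you would need to replace, and `run_and_collect` is a hypothetical helper name.

```python
from typing import Any

ACTOR_ID = "<username>/<actor-name>"  # placeholder: copy the real ID from the Actor's page


def run_and_collect(client: Any, actor_id: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Start the Actor, wait for it to finish, and return the dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if not run:
        raise RuntimeError("The Actor run did not return any run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    import os

    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    low_cost_input = {
        "searchTerms": ["customer support software"],
        "maxItems": 10,
        "crawlCommentsPerPost": False,
    }
    items = run_and_collect(client, ACTOR_ID, low_cost_input)
    print(f"Collected {len(items)} records")
```

Calling `main()` starts a real, billable run, so it is worth keeping `maxItems` small while testing.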

📤 Output

Results are saved to the default Apify dataset. 📊 Each dataset item is one record.

Possible record types:

  • 📝 post
  • 💬 comment
  • 🏘️ community
  • 👤 user

Every record includes basic tracking fields such as: 🧾

| Field | Meaning |
| --- | --- |
| 🧩 kind | Type of record: post, comment, community, or user |
| 🆔 id | Reddit item ID |
| 🔗 url | Main Reddit URL for the item |
| canonicalUrl | Normalized Reddit URL where available |
| ⏱️ scrapedAt | When the Actor collected the record |
| 📍 source | Which input produced the record |
| 🔁 sources | Other inputs that found the same record, when duplicates are merged |

📝 Example post output

{
  "kind": "post",
  "id": "1hvoazn",
  "url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/",
  "title": "My best cheesecake so far",
  "author": "example_user",
  "subreddit": "Baking",
  "createdAt": "2025-01-07T10:09:56.000Z",
  "score": 3489,
  "numComments": 43,
  "mediaType": "gallery",
  "hasMedia": true,
  "sentimentLabel": "positive",
  "contentCategoryLabel": "Food & Drink"
}

The exact fields depend on the record type and the options you enable. ⚙️
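
Because one dataset can mix posts, comments, communities, and users, a common first step is to split the records by `kind`. The sketch below uses only the `kind` and `score` fields shown above; the function names are hypothetical, and records without a `score` are treated as 0.

```python
from collections import defaultdict
from typing import Any, Iterable


def split_by_kind(items: Iterable[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group dataset records by their "kind" field."""
    groups: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for item in items:
        groups[item.get("kind", "unknown")].append(item)
    return dict(groups)


def top_posts(items: Iterable[dict[str, Any]], limit: int = 10) -> list[dict[str, Any]]:
    """Return the highest-scoring post records, best first."""
    posts = [item for item in items if item.get("kind") == "post"]
    return sorted(posts, key=lambda post: post.get("score") or 0, reverse=True)[:limit]


sample = [
    {"kind": "post", "id": "1hvoazn", "score": 3489},
    {"kind": "comment", "id": "c1", "score": 12},
    {"kind": "post", "id": "p2", "score": 5},
]
by_kind = split_by_kind(sample)
best = top_posts(sample, limit=1)
```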

📋 Run summary

At the end of a run, the Actor writes RUN-SUMMARY.json to the key-value store. 🧾 This file is useful when you want a quick overview without opening the full dataset.

The summary includes:

  • 🔢 total records saved
  • 📦 records by type
  • 🔎 query and subreddit breakdowns
  • ⏭️ skipped items and why they were skipped
  • 📈 request statistics
  • ⚠️ warnings and errors
  • 🆔 IDs of the output dataset and key-value store

If you enable writeHtmlReport, the Actor can also create a simple HTML report called RUN-MAP.html. 🗺️
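
The summary can also be read from code. This sketch assumes the file is stored under the exact key `RUN-SUMMARY.json` in the run's default key-value store, and that `client` is an `ApifyClient` from the official `apify-client` Python package. The helper name is hypothetical, and the summary's internal field names are not documented here, so the function returns the stored value unchanged.

```python
from typing import Any, Optional


def fetch_run_summary(client: Any, key_value_store_id: str) -> Optional[Any]:
    """Return the stored run summary, or None if the record does not exist."""
    store = client.key_value_store(key_value_store_id)
    record = store.get_record("RUN-SUMMARY.json")  # assumed key name
    return record["value"] if record else None
```

With a run object returned by the client, the store ID is typically available as `run["defaultKeyValueStoreId"]`.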

💸 Cost and performance tips

This Actor is configured to keep costs low by default. ✅

  • 🛡️ Residential proxy is enabled by default because Reddit currently blocks direct Apify cloud traffic.
  • 🏠 For the cheapest successful tests, keep runs small and use direct Reddit URLs first.
  • 🎚️ Result limits are conservative by default.
  • 🔁 Request retries are disabled by default to avoid paying for repeated failed requests.
  • 📁 Raw data, media details, awards, and HTML reports are off by default.
  • 💬 Comments are only collected when you enable comment collection.

To keep runs cheap:

  • 🧪 start with maxItems between 10 and 100
  • 💬 keep crawlCommentsPerPost off unless you need thread-level discussion
  • 📦 keep saveRawData off unless you are debugging
  • 🗺️ keep writeHtmlReport off unless you need a visual report
  • 🔭 avoid maximizeCoverage unless recall matters more than speed and cost
  • 🛡️ disable proxy only if direct access works for your run environment

💳 Store pricing

This Actor is designed for simple pay-per-result pricing on Apify Store. 🧾

Recommended paid events:

| Event | What it means |
| --- | --- |
| 🚀 apify-actor-start | A very small startup event charged automatically by Apify |
| 📦 apify-default-dataset-item | One saved dataset record, such as a post, comment, community, or user |

This keeps pricing easy to predict: the more records you save, the more you pay. Apify shows the run cost before and during execution, and you can control spend by setting maxItems, comment limits, and other result caps. 🎚️
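
For a back-of-the-envelope budget, the arithmetic can be sketched as below. It assumes the listed starting price of $2.00 per 1,000 results. The actual per-result price may be higher on your plan, and the start-event fee is not stated here, so `start_fee` defaults to 0 as a placeholder. Proxy and platform usage, if billed separately, are not included.

```python
def estimate_cost_usd(records: int, price_per_1000: float = 2.00, start_fee: float = 0.0) -> float:
    """Rough upper bound for a run that saves `records` dataset items."""
    if records < 0:
        raise ValueError("records must not be negative")
    return start_fee + records * price_per_1000 / 1000


# maxItems caps the number of saved records, so it also caps this estimate.
test_run = estimate_cost_usd(10)      # about $0.02
monitoring = estimate_cost_usd(150)   # about $0.30
```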

📅 Scheduling and integrations

You can schedule this Actor in Apify Console to monitor Reddit regularly. ⏰ For example:

  • ⚡ every hour for fast-moving brand monitoring
  • 📆 once per day for subreddit tracking
  • 📊 once per week for market research exports
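
Apify schedules accept cron expressions, so the three cadences above could be written roughly as follows. The specific hour and weekday are arbitrary choices for illustration.

```python
# Standard five-field cron expressions: minute, hour, day of month, month, day of week.
SCHEDULES = {
    "hourly brand monitoring": "0 * * * *",   # at minute 0 of every hour
    "daily subreddit tracking": "0 6 * * *",  # every day at 06:00
    "weekly research export": "0 6 * * 1",    # every Monday at 06:00
}

for name, cron in SCHEDULES.items():
    print(f"{name}: {cron}")
```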

After each run, you can send the dataset to:

  • 📗 Google Sheets
  • 🧩 Make
  • ⚡ Zapier
  • 🔄 n8n
  • 🪝 webhooks
  • ☁️ cloud storage
  • 🗄️ databases and warehouses
  • 🔌 custom applications through the Apify API

⚠️ Important notes and limitations

Reddit controls how much public data is available through its pages and listings. 📌 This affects all Reddit scrapers, not only this Actor.

  • 🔒 Some private, restricted, quarantined, deleted, removed, or login-gated content cannot be collected.
  • 🪟 Reddit search and subreddit listings may expose only a limited window of results.
  • 🕰️ Very old posts may require narrower keywords, different sort options, or direct URLs.
  • 🚧 Reddit may rate limit or block traffic from cloud networks or proxies.
  • ❌ If every Reddit request is blocked, the Actor fails the run instead of silently returning an empty successful dataset.
  • ⚙️ This version is HTTP-first and does not use a browser fallback.

If a run is blocked by Reddit, try a smaller run first, reduce concurrency and request rate, try a direct post URL, use different inputs, or run again later. 🧪 Residential proxy settings are often the most reliable cloud option for Reddit, but they can increase cost and are not guaranteed to bypass every Reddit-side block. 🛡️

❓ FAQ

⚖️ Is it legal to scrape Reddit?

Scraping public Reddit data can be allowed in many cases, but you are responsible for how you collect, store, and use the data. 🛡️ Always follow Reddit's terms, applicable laws, privacy rules, and the rules of any downstream platform where you use the data.

🔑 Do I need a Reddit account or API key?

No. ✅ This Actor is built for supported public Reddit pages and does not require a Reddit login or Reddit API key.

💬 Can it scrape comments?

Yes. ✅ Enable crawlCommentsPerPost to collect comments under posts. You can control the amount with maxCommentsPerPost and commentDepthLimit.

🔗 Can I scrape a specific Reddit post?

Yes. ✅ Add the post URL to startUrls. If you also want the comments, enable crawlCommentsPerPost.

🏘️ Can I scrape a whole subreddit?

Yes. ✅ Add a subreddit name such as r/startups or a full subreddit URL to subredditUrls. You can choose sorting options such as new, hot, top, rising, or most commented.

📉 Why did I get fewer results than expected?

Common reasons include Reddit result limits, strict filters, date filters, duplicate removal, deleted or unavailable items, or Reddit blocking the request. 🔍 Check RUN-SUMMARY.json for warnings, errors, and skip counts.

Reddit lists are not unlimited. 📌 Search pages and subreddit feeds often stop after a practical result window. To find more unique posts, try narrower keywords, different time windows, different sort options, or direct Reddit URLs.

🛡️ Do I need proxies?

On Apify cloud, usually yes. 🛡️ Reddit is currently blocking direct cloud requests in our tests, while the RESIDENTIAL proxy group succeeded. Residential proxy traffic can increase cost, so keep test runs small and lower maxItems while testing.
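
If you build the input in code, the Residential group can be requested as sketched below. This assumes the Actor's `proxyConfiguration` field follows the standard Apify proxy input shape (`useApifyProxy` and `apifyProxyGroups`), which this README does not confirm. The helper name is hypothetical.

```python
from typing import Any


def with_residential_proxy(run_input: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the input that requests the RESIDENTIAL proxy group."""
    proxy = {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]}
    return {**run_input, "proxyConfiguration": proxy}


small_test = with_residential_proxy({"searchTerms": ["customer support software"], "maxItems": 10})
```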

📤 Can I export the results?

Yes. ✅ Apify datasets can be exported as JSON, CSV, Excel, XML, RSS, or accessed through the Apify API.
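
For API access, the export formats map onto the `format` query parameter of the Apify dataset items endpoint. The sketch below only builds the URL. The helper name is hypothetical, and Excel corresponds to `xlsx`.

```python
from typing import Optional
from urllib.parse import urlencode

EXPORT_FORMATS = {"json", "csv", "xlsx", "xml", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build a download URL for a dataset in the requested export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"Unsupported format: {fmt}")
    params: dict[str, object] = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


csv_url = dataset_export_url("abc123", "csv", limit=100)
```

Datasets that are not public also need your API token, for example in an `Authorization: Bearer <token>` header.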

🤖 Can I use the data with AI tools?

Yes. ✅ The output is structured JSON, which makes it suitable for AI search, summarization, clustering, dashboards, and RAG workflows. Make sure your use of the data follows applicable privacy and platform rules.

🛡️ Responsible use

Use this Actor only for public Reddit data that you are allowed to collect and process. ✅ Do not use it to collect private, login-gated, sensitive, or harmful personal data. 🔒 Avoid publishing datasets in a way that exposes individuals unfairly or outside the purpose for which the data was collected.

🧰 Support

If something does not work as expected, include:

  • 🆔 the Apify run ID
  • 📥 your input JSON
  • 📋 the RUN-SUMMARY.json file
  • 📝 a short description of what you expected and what happened

This makes it much easier to diagnose blocked requests, empty datasets, input mistakes, and result-limit questions. 🔍