Reddit Scraper | All-In-One | $1.5 / 1K

Pricing

$1.49 / 1,000 results

Extract Reddit posts and full comment threads from searches, subreddits, user pages, and direct post URLs. Built for enterprise-grade speed, richest-in-class data coverage, advanced filtering, and clean JSON for market intelligence, sentiment analysis, and analytics.

Rating

3.9

(9)

Developer

Fatih Tahta

Maintained by Community

Actor stats

  • Bookmarked: 83
  • Total users: 2.1K
  • Monthly active users: 345
  • Issues response: 13 hours
  • Last modified: 2 days ago

Reddit Scraper Enterprise Grade

Slug: fatihtahta/reddit-scraper-search-fast

Overview

Reddit Scraper collects publicly available Reddit posts and, when enabled, comment records with the metadata teams usually need for analysis, enrichment, and automation. Each record can include core content, authorship, engagement metrics, subreddit context, timestamps, canonical URLs, and media-related fields when available. Reddit is one of the web's richest sources of real-world opinions, product feedback, community discussion, and trend signals, which makes it valuable for research, monitoring, and downstream decision-making. This actor turns that information into consistent dataset records so teams can replace repetitive manual collection with repeatable runs. It is designed for production use with stable identifiers, overlap-friendly outputs for inter-seed deduplication, and reliable scheduled collection workflows.

Why Use This Actor

  • Market research and analytics teams: Track conversation volume, engagement, subreddit activity, and topic trends across keywords, communities, and time windows.
  • Product and content teams: Discover user pain points, feature requests, language patterns, and high-performing discussion themes for roadmap and editorial planning.
  • Developers and data engineering teams: Feed Reddit data into ETL pipelines, warehouses, dashboards, and APIs using structured JSON records that are easy to upsert and model.
  • Lead generation and enrichment teams: Identify relevant communities, active discussions, and context around buyer interests, brand mentions, and niche topics.
  • Monitoring and competitive intelligence teams: Watch competitor mentions, category shifts, launch reactions, and recurring discussion spikes without manually checking Reddit every day.
  • Operations and automation teams: Run recurring jobs on a schedule and use stable record keys to deduplicate overlapping results from queries, subreddit searches, and direct URLs.

Input Parameters

Provide any combination of URLs, queries, and filters to control what the actor collects and how focused the results should be.

  • queries (string[]): Search phrases to look for across Reddit. Use this when you want the actor to discover relevant posts for you. Ignored if urls is provided.
  • sort (string, default: relevance): Ranking method for search results. Allowed values: relevance, hot, top, new, comments.
  • timeframe (string, default: all): Reddit-side time window for search results. Allowed values: all, year, month, week, day, hour. Use dateFrom and dateTo for exact record-level filtering.
  • dateFrom (string): Exact lower date bound for posts. Accepts ISO-8601 datetimes or YYYY-MM-DD. Plain dates are normalized to the start of the day in UTC.
  • dateTo (string): Exact upper date bound for posts. Accepts ISO-8601 datetimes or YYYY-MM-DD. Plain dates are normalized to the end of the day in UTC.
  • subredditName (string): Name of the subreddit to scrape, without the r/ prefix. Use this to focus collection on one community.
  • subredditKeywords (string[]): Optional keywords to narrow results inside the selected subreddit. Leave empty to collect a broader subreddit feed.
  • subredditSort (string, default: relevance): Ranking method for subreddit results. Allowed values: relevance, hot, top, new, comments.
  • subredditTimeframe (string, default: all): Reddit-side time window for subreddit results. Allowed values: all, year, month, week, day, hour. Use dateFrom and dateTo for exact record-level filtering.
  • urls (string[]): Direct Reddit URLs to scrape, such as post URLs, subreddit pages, user pages, or Reddit search pages. When provided, URL input takes priority over search queries.
  • scrapeComments (boolean, default: false): When enabled, the actor also saves comments from each collected post. Useful for sentiment analysis, deeper discussion review, and thread-level context.
  • sentiment_analysis (boolean, default: false): When enabled, the actor analyzes each post's title and body and each scraped comment's body with the AFINN-165 sentiment lexicon and adds sentiment_score plus sentiment_label (positive, negative, or neutral) to those records.
  • content_analysis (boolean, default: false): When enabled, the actor classifies each post against a bundled snapshot and adds content_category_label and content_category_path to post records.
  • maxComments (integer, default: 50000): Maximum number of comments to collect per post when comment collection is enabled. Lower values help keep runs faster and datasets smaller.
  • commentDateFrom (string): Exact lower date bound for comments. Accepts ISO-8601 datetimes or YYYY-MM-DD. Use it when comments should follow their own date window.
  • commentDateTo (string): Exact upper date bound for comments. Accepts ISO-8601 datetimes or YYYY-MM-DD. Use it when you want comment collection to stop at a separate cutoff date.
  • maximize_coverage (boolean, default: false): Turns on the actor's internal high-coverage mode. The actor automatically fans out each search seed across hidden strategies such as chronological slices, search-type variants, and subreddit-focused follow-ups when useful.
  • forceSortNewForTimeFilteredRuns (boolean, default: false): When enabled and a post time filter is active, eligible search and listing requests are fetched with sort=new to improve chronological coverage and make safe early stopping possible. This intentionally changes Reddit ranking behavior for those requests.
  • includeNsfw (boolean, default: false): Include posts marked as NSFW or 18+ in the output dataset.
  • strictSearch (boolean, default: false): Makes Reddit search follow your keywords more closely, relying more on exact keywords and less on loose semantic matching. This usually returns fewer posts, but keeps results closer to your query.
  • strictTokenFilter (boolean, default: false): Makes the actor scan each saved post's title, body, and URL and keep only posts that match all of your query keywords. This reduces output size while keeping the most accurate results.
  • maxPosts (integer, default: 50000): Maximum number of posts to collect for each query or URL input. Use smaller values for quick tests and larger values for deeper research.
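
The plain-date normalization described for dateFrom and dateTo (YYYY-MM-DD expanded to the start or end of the day in UTC) can be sketched as follows. This is an illustrative helper, not part of the actor; the function name and the Z-suffix handling are assumptions that match the documented behavior.

```python
from datetime import datetime, timezone

def normalize_bound(value: str, *, is_upper: bool) -> datetime:
    """Normalize a date bound the way the actor documents it:
    full ISO-8601 datetimes pass through; plain YYYY-MM-DD dates expand
    to the start (lower bound) or end (upper bound) of that day in UTC."""
    if len(value) == 10:  # plain date, e.g. "2026-03-01"
        day = datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
        if is_upper:
            return day.replace(hour=23, minute=59, second=59, microsecond=999999)
        return day
    # "Z" suffixes are rewritten for fromisoformat compatibility on older Pythons
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```

For example, `normalize_bound("2026-03-01", is_upper=False)` yields midnight UTC on that day, which is how a plain dateFrom is interpreted.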

Example Inputs

Scenario: query-driven monitoring

{
  "queries": ["ai video generator", "synthetic media"],
  "sort": "new",
  "timeframe": "week",
  "dateFrom": "2026-03-01",
  "dateTo": "2026-03-31",
  "maximize_coverage": true,
  "strictSearch": true,
  "strictTokenFilter": true,
  "maxPosts": 500
}

Scenario: direct URL collection

{
  "urls": [
    "https://www.reddit.com/r/technology/",
    "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/"
  ],
  "scrapeComments": true,
  "sentiment_analysis": true,
  "content_analysis": true,
  "maxComments": 200,
  "forceSortNewForTimeFilteredRuns": true,
  "commentDateFrom": "2026-04-01T00:00:00Z",
  "commentDateTo": "2026-04-07T23:59:59Z",
  "maxPosts": 150
}

Scenario: targeted subreddit run

{
  "subredditName": "startups",
  "subredditKeywords": ["pricing", "customer acquisition"],
  "subredditSort": "new",
  "subredditTimeframe": "month",
  "dateFrom": "2026-02-15",
  "commentDateFrom": "2026-03-01",
  "scrapeComments": true,
  "maxComments": 100,
  "maxPosts": 300
}

Output

Output destination

The actor writes results to an Apify dataset as JSON records. The dataset is designed for direct consumption by analytics tools, ETL pipelines, and downstream APIs without post-processing.

Record envelope (all items)

Every record includes a stable category field, a Reddit identifier, and a canonical URL:

  • type (string, required): Logical record type, such as post or comment. In the JSON output examples below, this category is represented by the kind field.
  • id (string, required): Stable Reddit identifier for the entity.
  • url (string, required): Canonical Reddit URL for the record.

Recommended idempotency key: type + ":" + id

Use this key for deduplication and upserts, especially when the same Reddit entity appears in overlapping queries, subreddit runs, or direct URL inputs.
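
Applied in code, the recommended key makes overlap-safe deduplication a small dictionary pass. This is a minimal in-memory sketch (a warehouse upsert would use the same key as its primary key); note that in the dataset output the record category is carried in the kind field, as explained above.

```python
def dedupe(records: list[dict]) -> list[dict]:
    """Keep one record per Reddit entity using the documented
    idempotency key: type + ":" + id. Later occurrences overwrite
    earlier ones, which doubles as a simple upsert."""
    seen: dict[str, dict] = {}
    for record in records:
        key = f'{record["kind"]}:{record["id"]}'  # "kind" holds the record type in output
        seen[key] = record
    return list(seen.values())
```

The same post discovered by a query, a subreddit run, and a direct URL collapses to one row under this key.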

Examples

Example: Post (type = "post")

{
"kind": "post",
"query": "cheesecake",
"id": "1hvoazn",
"title": "My best cheesecake so far",
"body": "Found my new favorite recipe (no water bath). Next time I will make a thicker crust. Added a raspberry compote.",
"sentiment_score": 2,
"sentiment_label": "positive",
"content_category_label": "Desserts and Baking",
"content_category_path": ["Food & Drink", "Desserts and Baking"],
"author": "ClearlyBulky",
"score": 3489,
"upvote_ratio": 1,
"num_comments": 43,
"subreddit": "Baking",
"created_utc": "2025-01-07T10:09:56.000Z",
"url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/",
"permalink": "/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/",
"canonical_url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/",
"old_reddit_url": "https://old.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/",
"flair": "Recipe",
"post_hint": "link",
"over_18": false,
"is_self": false,
"spoiler": false,
"locked": false,
"is_video": false,
"is_gallery": true,
"hidden": false,
"edited": false,
"archived": false,
"pinned": false,
"domain": "old.reddit.com",
"thumbnail": "https://b.thumbs.redditmedia.com/j8wz80MKqfkXuGMuWng9N1DxR6vxRol8W6RAqzdE35A.jpg",
"url_overridden_by_dest": "https://www.reddit.com/gallery/1hvoazn",
"num_duplicates": 0,
"subreddit_id": "t5_2qx1h",
"subreddit_name_prefixed": "r/Baking",
"subreddit_subscribers": 4322940,
"media": null,
"media_metadata": {
"kny1nmhlqjbe1": {
"status": "valid",
"e": "Image",
"m": "image/jpg",
"p": [
{
"y": 144,
"x": 108,
"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=212ea9ba4b561f967c673570845bd9591a44fe97"
},
{
"y": 288,
"x": 216,
"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=1517c122656719b2bc8ec82e0538ebba890032e4"
},
{
"y": 426,
"x": 320,
"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=db43bab7cc24f6caf55b5a8729ddee2a457f8be5"
},
{
"y": 853,
"x": 640,
"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=262c1178fe0f5c3d902c9f413a10b8d8c0e94a12"
},
{
"y": 1280,
"x": 960,
"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=78980947a1a39dbe31b5b4c924989f1058a34d2e"
},
{
"y": 1440,
"x": 1080,
"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=4ad55e5b23d4e7d64e9b7d56bf7977357d55a642"
}
],
"s": {
"y": 4032,
"x": 3024,
"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=cdd808d85ed2306d4f55c858ba4b7bb811a3a383"
},
"id": "kny1nmhlqjbe1"
},
"wjqc6mhlqjbe1": {
"status": "valid",
"e": "Image",
"m": "image/jpg",
"p": [
{
"y": 144,
"x": 108,
"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=f36383919420db57600bf4d290eba35192246271"
},
{
"y": 288,
"x": 216,
"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=9847d0ca21ddbc05651128a208518f462aa85982"
},
{
"y": 426,
"x": 320,
"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=73a44f697e66e057ba974f6f02427e7c5b04f144"
},
{
"y": 853,
"x": 640,
"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=085e2982877df777c1104156699b7f523516fab8"
},
{
"y": 1280,
"x": 960,
"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=e9afe86434cb058cbc8775e8a9e0788a6ca23500"
},
{
"y": 1440,
"x": 1080,
"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=82a1c00a718717aee33cde17c732529d15f3be54"
}
],
"s": {
"y": 4032,
"x": 3024,
"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=284e05e3ad1593eba99c76b4a603dab5c0c28a03"
},
"id": "wjqc6mhlqjbe1"
}
},
"gallery_data": {
"items": [
{
"is_deleted": false,
"media_id": "kny1nmhlqjbe1",
"id": 581711947
},
{
"is_deleted": false,
"media_id": "wjqc6mhlqjbe1",
"id": 581711948
}
]
},
"gallery_images": [
{
"media_id": "kny1nmhlqjbe1",
"caption": "",
"width": 3024,
"height": 4032,
"url": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=cdd808d85ed2306d4f55c858ba4b7bb811a3a383",
"previews": [
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=212ea9ba4b561f967c673570845bd9591a44fe97",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=1517c122656719b2bc8ec82e0538ebba890032e4",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=db43bab7cc24f6caf55b5a8729ddee2a457f8be5",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=262c1178fe0f5c3d902c9f413a10b8d8c0e94a12",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=78980947a1a39dbe31b5b4c924989f1058a34d2e",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=4ad55e5b23d4e7d64e9b7d56bf7977357d55a642"
]
},
{
"media_id": "wjqc6mhlqjbe1",
"caption": "",
"width": 3024,
"height": 4032,
"url": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=284e05e3ad1593eba99c76b4a603dab5c0c28a03",
"previews": [
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=f36383919420db57600bf4d290eba35192246271",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=9847d0ca21ddbc05651128a208518f462aa85982",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=73a44f697e66e057ba974f6f02427e7c5b04f144",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=085e2982877df777c1104156699b7f523516fab8",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=e9afe86434cb058cbc8775e8a9e0788a6ca23500",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=82a1c00a718717aee33cde17c732529d15f3be54"
]
}
],
"media_assets": [
{
"type": "Image",
"media_id": "kny1nmhlqjbe1",
"mime_type": "image/jpg",
"original_url": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=cdd808d85ed2306d4f55c858ba4b7bb811a3a383",
"preview_urls": [
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=212ea9ba4b561f967c673570845bd9591a44fe97",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=1517c122656719b2bc8ec82e0538ebba890032e4",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=db43bab7cc24f6caf55b5a8729ddee2a457f8be5",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=262c1178fe0f5c3d902c9f413a10b8d8c0e94a12",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=78980947a1a39dbe31b5b4c924989f1058a34d2e",
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=4ad55e5b23d4e7d64e9b7d56bf7977357d55a642"
]
},
{
"type": "Image",
"media_id": "wjqc6mhlqjbe1",
"mime_type": "image/jpg",
"original_url": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=284e05e3ad1593eba99c76b4a603dab5c0c28a03",
"preview_urls": [
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=f36383919420db57600bf4d290eba35192246271",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=9847d0ca21ddbc05651128a208518f462aa85982",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=73a44f697e66e057ba974f6f02427e7c5b04f144",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=085e2982877df777c1104156699b7f523516fab8",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=e9afe86434cb058cbc8775e8a9e0788a6ca23500",
"https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=82a1c00a718717aee33cde17c732529d15f3be54"
]
}
],
"age_hours": 10916.1333,
"retrieved_at": "2026-04-07T00:00:00.000Z",
"media_type": "gallery",
"has_media": true,
"gallery_count": 2,
"outbound_url_host": "www.reddit.com",
"title_length": 26,
"body_length": 112,
"word_count": 25,
"score_per_hour": 0.3196,
"comments_per_hour": 0.0039,
"is_deleted_or_removed": false,
"engagement_total": 3532,
"comment_to_score_ratio": 0.0123,
"is_high_engagement": true,
"content_flags": [],
"stickied": false,
"distinguished": null,
"total_awards_received": 0,
"all_awardings": [],
"gilded": 0,
"num_crossposts": 0,
"is_original_content": false,
"author_fullname": "t2_dr3vyilor",
"author_flair_text": null,
"author_premium": false,
"body_html": "<!-- SC_OFF --><div class=\"md\"><p>Found my new favorite recipe (no water bath). Next time I will make a thicker crust. Added a raspberry compote.</p>\n</div><!-- SC_ON -->",
"preview": null,
"secure_media": null,
"secure_media_embed": {},
"crosspost_parent_list": null
}

Example: Comment (type = "comment")

{
"kind": "comment",
"query": "cheesecake",
"id": "m5un6bj",
"postId": "1hvoazn",
"postUrl": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/",
"parentId": "t3_1hvoazn",
"body": "\n\n* 9” Springform Pan\n\nIngredients\n\nfor the graham cracker crust-\n\n* 1 1/4 cups graham cracker crumbs\n* 4 tablespoons granulated sugar\n* 5 tablespoons melted butter\n\nfor the cheesecake filling-\n\n* 40 ounces cream cheese at room temperature (five 8 oz. packages; 2 1/2 lbs total)\n* 1 1/4 cups granulated sugar\n* 1/2 cup sour cream at room temperature\n* 2 teaspoons vanilla extract\n* 4 large eggs at room temperature\n* any desired cheesecake toppings\n\nInstructions\n\n* Place oven racks in the center of the oven. Preheat oven to 350° F.\n* In a medium sized bowl, stir graham cracker crumbs together with sugar and melted butter until well incorporated and mixture looks like damp sand. Using the bottom of a measuring cup, press crust into the bottom and half way up the sides of a 9-inch springform pan. Bake 7 minutes. Remove from oven and set aside.\n* Reduce oven temperature to 325° F.\n* In a large bowl or bowl of a stand mixer, mix cream cheese 30 seconds ‘til smooth. Scrape the sides and bottom of the bowl and add in granulated sugar, sour cream and vanilla. Mix again until incorporated. Scrape the sides and bottom of the bowl and mix again briefly.\n* Crack eggs into a liquid measuring cup and using a fork, beat until well scrambled. With the mixer on low, slowly pour in the eggs into the cream cheese mixture and stop stirring once eggs have been incorporated. Remove bowl from mixer and scrape the sides and bottom again, ensuring the entire mixture is smooth. If there are a few small lumps, try to fold in using the rubber scraper.\n* Once the batter is completely smooth and ready, tap the bowl on the counter for 30-45 seconds to remove as many air bubbles as possible. You should see them popping on the surface as you tap the bowl. Pour filling into the center of the graham cracker crust and gently smooth the top. Will be very full!\n* Bake for 30 minutes at 325° F. Reduce temperature to 250° F and continue cooking for 45 minutes more. Once this time has elapsed, turn oven off and keep cheesecake inside for another 30 minutes for some carryover cooking without opening the oven door. Crack oven door to let cheesecake cool slowly for one hour before removing. At this point, cheesecake should be slightly warm. Bring cheesecake to room temperature on the counter (3-4 hours) before covering with plastic wrap and transferring to the fridge.\n* Refrigerate until chilled completely (6 hours to overnight). To serve, open springform pan and remove collar. Decorate as desired. Dip a sharp knife into hot water, wipe off any excess water and slice. I like to dip my knife in water between each slice to get really clean-looking pieces. \n\nNotes\n\nIf you would like a thicker graham cracker crust, use 1 3/4 cups graham cracker crumbs, 5 tablespoons granulated sugar and 6 tablespoons melted butter. Press into the pan and bake for 8 minutes.",
"author": "ClearlyBulky",
"score": 76,
"created_utc": "2025-01-07T10:13:48.000Z",
"url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/",
"permalink": "/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/",
"canonical_url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/",
"old_reddit_url": "https://old.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/",
"root_comment_id": "m5un6bj",
"parent_kind": "post",
"comment_permalink": "/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/",
"author_deleted": false,
"body_deleted": false,
"stickied": false,
"distinguished": null,
"is_submitter": true,
"score_hidden": false,
"controversiality": 0,
"depth": 0
}

Field reference

Post fields (type = "post")

  • kind (string, required): Record category for post records.
  • query (string, optional): Input query or source label that produced the record.
  • id (string, required): Stable Reddit post identifier.
  • title (string, required): Post title.
  • body (string, optional): Post body text.
  • sentiment_score (number, optional): AFINN-165 sentiment score computed from the post title and body when sentiment_analysis is enabled.
  • sentiment_label (string, optional): positive, negative, or neutral label derived from sentiment_score when sentiment_analysis is enabled.
  • content_category_label (string, optional): Readable content category name chosen for the post when content_analysis is enabled.
  • content_category_path (array, optional): Topic path from the top-level content category down to the matched post category when content_analysis is enabled.
  • author (string, optional): Username shown on the post.
  • score (number, optional): Post score at collection time.
  • upvote_ratio (number, optional): Upvote ratio when available.
  • num_comments (number, optional): Comment count shown on the post.
  • subreddit (string, optional): Subreddit name.
  • created_utc (string, optional): Post creation time in ISO format. This is the timestamp used for exact post date filtering when dateFrom or dateTo is provided.
  • url (string, required): Canonical Reddit URL for the post.
  • permalink (string, optional): Relative Reddit permalink.
  • canonical_url (string, optional): Canonical full URL.
  • old_reddit_url (string, optional): Alternate legacy Reddit URL.
  • flair (string, optional): Post flair text.
  • post_hint (string, optional): Reddit post hint, useful for downstream media classification.
  • over_18 (boolean, optional): Whether the post is marked NSFW.
  • is_self (boolean, optional): Whether the post is a self post.
  • spoiler (boolean, optional): Whether the post is marked as a spoiler.
  • locked (boolean, optional): Whether the post is locked.
  • is_video (boolean, optional): Whether the post is a video post.
  • is_gallery (boolean, optional): Whether the post is a Reddit gallery.
  • hidden (boolean, optional): Whether the post is hidden for the viewing account.
  • edited (boolean | number, optional): false when untouched, otherwise Reddit's edited timestamp payload.
  • archived (boolean, optional): Whether the post is archived.
  • pinned (boolean, optional): Whether the post is pinned in the subreddit.
  • domain (string, optional): Source or linked domain.
  • thumbnail (string, optional): Thumbnail URL or Reddit thumbnail marker.
  • url_overridden_by_dest (string, optional): Final outbound destination URL when present.
  • num_duplicates (number, optional): Duplicate count reported by Reddit.
  • subreddit_id (string, optional): Internal Reddit subreddit reference.
  • subreddit_name_prefixed (string, optional): Prefixed subreddit label such as r/Baking.
  • subreddit_subscribers (number, optional): Subscriber count at collection time.
  • media (object, optional): Media object when available.
  • media_metadata (object, optional): Raw media metadata keyed by media ID.
  • media_metadata.<media_id>.status (string, optional): Media validity state.
  • media_metadata.<media_id>.e (string, optional): Media asset type label.
  • media_metadata.<media_id>.m (string, optional): Media MIME type.
  • media_metadata.<media_id>.p (array, optional): Preview image variants.
  • media_metadata.<media_id>.p[].y (number, optional): Preview height.
  • media_metadata.<media_id>.p[].x (number, optional): Preview width.
  • media_metadata.<media_id>.p[].u (string, optional): Preview URL.
  • media_metadata.<media_id>.s.y (number, optional): Original media height.
  • media_metadata.<media_id>.s.x (number, optional): Original media width.
  • media_metadata.<media_id>.s.u (string, optional): Original media URL.
  • media_metadata.<media_id>.id (string, optional): Media asset identifier.
  • gallery_data (object, optional): Reddit gallery metadata.
  • gallery_data.items (array, optional): Gallery item list.
  • gallery_data.items[].is_deleted (boolean, optional): Whether the gallery item is deleted.
  • gallery_data.items[].media_id (string, optional): Gallery media identifier.
  • gallery_data.items[].id (number, optional): Gallery item identifier.
  • gallery_images (array, optional): Normalized gallery image list.
  • gallery_images[].media_id (string, optional): Gallery media identifier.
  • gallery_images[].caption (string, optional): Image caption text.
  • gallery_images[].width (number, optional): Image width.
  • gallery_images[].height (number, optional): Image height.
  • gallery_images[].url (string, optional): Original image URL.
  • gallery_images[].previews (array, optional): Preview image URLs.
  • media_assets (array, optional): Normalized media asset list.
  • media_assets[].type (string, optional): Media type label.
  • media_assets[].media_id (string, optional): Media asset identifier.
  • media_assets[].mime_type (string, optional): Media MIME type.
  • media_assets[].original_url (string, optional): Original media URL.
  • media_assets[].preview_urls (array, optional): Preview URLs for the asset.
  • age_hours (number, optional): Post age in hours at collection time.
  • retrieved_at (string, optional): Actor capture time in ISO format.
  • media_type (string, optional): Normalized media class: text, image, gallery, video, gif, or link.
  • has_media (boolean, optional): Convenience flag for image, gallery, GIF, or video posts.
  • gallery_count (number, optional): Number of normalized gallery images.
  • outbound_url_host (string, optional): Parsed host from url_overridden_by_dest when present.
  • title_length (number, optional): Character length of the title.
  • body_length (number, optional): Character length of the body text.
  • word_count (number, optional): Lightweight whitespace-based word count across title and body.
  • score_per_hour (number, optional): Score divided by post age with a minimum age floor to avoid division by zero.
  • comments_per_hour (number, optional): Comment count divided by post age with a minimum age floor to avoid division by zero.
  • is_deleted_or_removed (boolean, optional): Conservative deletion/removal flag derived from visible placeholders and removal metadata.
  • engagement_total (number, optional): Combined engagement metric derived from score and comments.
  • comment_to_score_ratio (number, optional): Comments-to-score ratio.
  • is_high_engagement (boolean, optional): Convenience flag for high engagement.
  • content_flags (array, optional): Content classification flags when present.
  • stickied (boolean, optional): Whether the post is pinned.
  • distinguished (string, optional): Distinguishing label, such as moderator status.
  • total_awards_received (number, optional): Total awards on the post.
  • all_awardings (array, optional): Raw awards list.
  • gilded (number, optional): Gilding count.
  • num_crossposts (number, optional): Number of crossposts.
  • is_original_content (boolean, optional): Whether the post is marked original content.
  • author_fullname (string, optional): Internal Reddit author reference when available.
  • author_flair_text (string, optional): Author flair text.
  • author_premium (boolean, optional): Whether the author has premium status.
  • body_html (string, optional): HTML-formatted post body.
  • preview (object, optional): Preview object when available.
  • secure_media (object, optional): Secure media object when available.
  • secure_media_embed (object, optional): Secure media embed metadata.
  • crosspost_parent_list (array, optional): Crosspost parent data when available.
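
Several of the derived fields above (score_per_hour, comments_per_hour, engagement_total, comment_to_score_ratio) can be reproduced downstream if you recompute at a different time. This sketch is consistent with the example post record, but the actor's exact age floor and formulas are not published, so treat the constants as illustrative assumptions.

```python
def derived_metrics(score: int, num_comments: int, age_hours: float) -> dict:
    # A minimum age floor avoids division by zero for brand-new posts;
    # the specific floor value (1 hour) is an assumption, not documented.
    safe_age = max(age_hours, 1.0)
    return {
        "score_per_hour": round(score / safe_age, 4),
        "comments_per_hour": round(num_comments / safe_age, 4),
        # engagement_total is described as "derived from score and comments";
        # a plain sum matches the example record (3489 + 43 = 3532).
        "engagement_total": score + num_comments,
        "comment_to_score_ratio": round(num_comments / score, 4) if score else None,
    }
```

Running this against the example post above reproduces its published values.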

Comment fields (type = "comment")

  • kind (string, required): Record category for comment records.
  • query (string, optional): Input query or source label that produced the record.
  • id (string, required): Stable Reddit comment identifier.
  • postId (string, required): Parent post identifier.
  • postUrl (string, required): Parent post URL.
  • parentId (string, required): Parent Reddit object identifier.
  • body (string, optional): Comment body text.
  • sentiment_score (number, optional): AFINN-165 sentiment score computed from the comment body when sentiment_analysis is enabled.
  • sentiment_label (string, optional): positive, negative, or neutral label derived from comment sentiment_score when sentiment_analysis is enabled.
  • author (string, optional): Username shown on the comment.
  • score (number, optional): Comment score at collection time.
  • created_utc (string, optional): Comment creation time in ISO format. This is the timestamp used for exact comment date filtering when commentDateFrom or commentDateTo is provided.
  • url (string, required): Canonical Reddit URL for the comment.
  • permalink (string, optional): Relative Reddit permalink.
  • canonical_url (string, optional): Canonical full URL.
  • old_reddit_url (string, optional): Alternate legacy Reddit URL.
  • root_comment_id (string, optional): Root comment ID for the thread.
  • parent_kind (string, optional): Parent record type, such as post or comment.
  • comment_permalink (string, optional): Relative permalink for the comment.
  • author_deleted (boolean, optional): Whether the author is deleted.
  • body_deleted (boolean, optional): Whether the comment body is deleted or removed.
  • stickied (boolean, optional): Whether the comment is pinned.
  • distinguished (string, optional): Distinguishing label, such as moderator status.
  • is_submitter (boolean, optional): Whether the author is the original post creator.
  • score_hidden (boolean, optional): Whether the score is hidden.
  • controversiality (number, optional): Reddit controversiality indicator.
  • depth (number, optional): Nesting depth in the comment tree.
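
For readers wiring sentiment into dashboards, a plausible mapping from sentiment_score to sentiment_label looks like the sketch below. The actor's exact cutoffs are not documented; the zero thresholds here are an assumption consistent with AFINN-style scoring, where the score is a signed sum over lexicon words.

```python
def sentiment_label(score: float) -> str:
    """Map an AFINN-165 score sum to the three documented labels.
    Assumed thresholds: > 0 positive, < 0 negative, exactly 0 neutral."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumed thresholds the example post (sentiment_score 2) labels as positive.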

Data guarantees & handling

  • Best-effort extraction: fields may vary by region, session, availability, and Reddit surface changes or experiments.
  • Optional fields: always null-check in downstream code because many fields may be empty or unavailable.
  • Time filtering: timeframe and subredditTimeframe narrow the Reddit source query, while dateFrom / dateTo and commentDateFrom / commentDateTo apply exact record-level filtering in the actor output.
  • Chronological coverage: keeping the sort as top, hot, relevance, or comments does not guarantee that Reddit returns a strictly time-ordered feed for a date range. If thorough time-window traversal matters more than preserving ranking semantics, enable forceSortNewForTimeFilteredRuns.
  • Maximum coverage mode: when maximize_coverage=true, the actor expands search seeds internally to improve recall. This mode overrides forceSortNewForTimeFilteredRuns for those search seeds and may use more requests, time slices, and overlap-safe deduplication behind the scenes.
  • Deduplication: use the recommended key type + ":" + id. Stable identifiers make inter-seed deduplication and upserts straightforward when the same entity is discovered through overlapping inputs.
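
Given the guarantees above, downstream consumers should guard every optional field. A minimal sketch of defensive access (field names come from the output reference; the helper name is illustrative):

```python
def summarize(record: dict) -> dict:
    """Extract a compact row from a post record, tolerating
    missing or null optional fields as the guarantees above require."""
    return {
        "key": f'{record["kind"]}:{record["id"]}',  # required fields: safe to index
        "subreddit": record.get("subreddit"),       # optional: may be absent
        "score": record.get("score") or 0,          # optional: default for arithmetic
        "flair": record.get("flair") or "",         # optional: normalize null to ""
    }
```

The pattern is the same for comment records; only the field list changes.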

How to Run on Apify

  1. Open the Actor in Apify Console.
  2. Configure your search parameters, such as keywords, subreddit name, direct Reddit URLs, ranking options, and filters.
  3. Set the maximum number of outputs to collect using maxPosts and, if needed, maxComments.
  4. Click Start and wait for the run to finish.
  5. Download results in JSON, CSV, Excel, or other supported formats.

Scheduling & Automation

Automated Data Collection

You can schedule recurring runs to keep your Reddit dataset current without manual work. This is useful for monitoring trends, tracking brand mentions, and maintaining fresh inputs for dashboards or data pipelines.

  • Navigate to Schedules in Apify Console
  • Create a new schedule (daily, weekly, or custom cron)
  • Configure input parameters
  • Enable notifications for run completion
  • Optional: add webhooks for automated processing

Integration Options

  • Webhooks: Trigger downstream actions when a run completes
  • Zapier: Connect to 5,000+ apps without coding
  • Make (Integromat): Build multi-step automation workflows
  • Google Sheets: Export results to a spreadsheet
  • Slack/Discord: Receive notifications and summaries
  • Email: Send automated reports via email

Performance

Estimated run times:

  • Small runs (< 1,000 outputs): ~2-3 minutes
  • Medium runs (1,000-5,000 outputs): ~5-15 minutes
  • Large runs (5,000+ outputs): ~15-30 minutes

For planning purposes, many runs targeting around 1,000 outputs fall into the small-run range, but execution time varies based on filters, result volume, and how much information is returned per record.

Compliance & Ethics

Responsible Data Collection

This actor collects publicly available Reddit posts, comments, and discussion metadata from https://www.reddit.com for legitimate business purposes, including:

  • consumer research and market analysis
  • brand monitoring and competitive tracking
  • product feedback discovery and trend analysis

Users are responsible for making sure their use of the collected data complies with applicable laws, regulations, internal policies, and the target site's terms. This section is informational and not legal advice.

Best Practices

  • Use collected data in accordance with applicable laws, regulations, and the target site's terms
  • Respect individual privacy and personal information
  • Use data responsibly and avoid disruptive or excessive collection
  • Do not use this actor for spamming, harassment, or other harmful purposes
  • Follow relevant data protection requirements where applicable, such as GDPR and CCPA

Support

For help, use the Issues tab or the actor page on Apify. Include the input you used with sensitive values redacted, the run ID, the expected behavior versus the actual behavior, and, if helpful, a small output sample.