Reddit Scraper | All-In-One | $1.5 / 1K
Pricing
$1.49 / 1,000 results
Extract Reddit posts and full comment threads from searches, subreddits, user pages, and direct post URLs. Built for enterprise-grade speed, best-in-class data coverage, advanced filtering, and clean JSON output for market intelligence, sentiment analysis, and analytics.
Rating
3.9
(9)
Developer
Fatih Tahta
Actor stats
83
Bookmarked
2.1K
Total users
345
Monthly active users
13 hours
Issues response
2 days ago
Last modified
Reddit Scraper Enterprise Grade
Slug: fatihtahta/reddit-scraper-search-fast
Overview
Reddit Scraper collects publicly available Reddit posts and, when enabled, comment records with the metadata teams usually need for analysis, enrichment, and automation. Each record can include core content, authorship, engagement metrics, subreddit context, timestamps, canonical URLs, and media-related fields when available. Reddit is one of the web's richest sources of real-world opinions, product feedback, community discussion, and trend signals, which makes it valuable for research, monitoring, and downstream decision-making. This actor turns that information into consistent dataset records so teams can replace repetitive manual collection with repeatable runs. It is designed for production use with stable identifiers, overlap-friendly outputs for inter-seed deduplication, and reliable scheduled collection workflows.
Why Use This Actor
- Market research and analytics teams: Track conversation volume, engagement, subreddit activity, and topic trends across keywords, communities, and time windows.
- Product and content teams: Discover user pain points, feature requests, language patterns, and high-performing discussion themes for roadmap and editorial planning.
- Developers and data engineering teams: Feed Reddit data into ETL pipelines, warehouses, dashboards, and APIs using structured JSON records that are easy to upsert and model.
- Lead generation and enrichment teams: Identify relevant communities, active discussions, and context around buyer interests, brand mentions, and niche topics.
- Monitoring and competitive intelligence teams: Watch competitor mentions, category shifts, launch reactions, and recurring discussion spikes without manually checking Reddit every day.
- Operations and automation teams: Run recurring jobs on a schedule and use stable record keys to deduplicate overlapping results from queries, subreddit searches, and direct URLs.
Input Parameters
Provide any combination of URLs, queries, and filters to control what the actor collects and how focused the results should be.
| Parameter | Type | Description | Default |
|---|---|---|---|
| `queries` | string[] | Search phrases to look for across Reddit. Use this when you want the actor to discover relevant posts for you. Ignored if `urls` is provided. | – |
| `sort` | string | Ranking method for search results. Allowed values: `relevance`, `hot`, `top`, `new`, `comments`. | relevance |
| `timeframe` | string | Reddit-side time window for search results. Allowed values: `all`, `year`, `month`, `week`, `day`, `hour`. Use `dateFrom` and `dateTo` for exact record-level filtering. | all |
| `dateFrom` | string | Exact lower date bound for posts. Accepts ISO-8601 datetimes or YYYY-MM-DD. Plain dates are normalized to the start of the day in UTC. | – |
| `dateTo` | string | Exact upper date bound for posts. Accepts ISO-8601 datetimes or YYYY-MM-DD. Plain dates are normalized to the end of the day in UTC. | – |
| `subredditName` | string | Name of the subreddit to scrape, without the r/ prefix. Use this to focus collection on one community. | – |
| `subredditKeywords` | string[] | Optional keywords to narrow results inside the selected subreddit. Leave empty to collect a broader subreddit feed. | – |
| `subredditSort` | string | Ranking method for subreddit results. Allowed values: `relevance`, `hot`, `top`, `new`, `comments`. | relevance |
| `subredditTimeframe` | string | Reddit-side time window for subreddit results. Allowed values: `all`, `year`, `month`, `week`, `day`, `hour`. Use `dateFrom` and `dateTo` for exact record-level filtering. | all |
| `urls` | string[] | Direct Reddit URLs to scrape, such as post URLs, subreddit pages, user pages, or Reddit search pages. When provided, URL input takes priority over search queries. | – |
| `scrapeComments` | boolean | When enabled, the actor also saves comments from each collected post. Useful for sentiment analysis, deeper discussion review, and thread-level context. | false |
| `sentiment_analysis` | boolean | When enabled, the actor analyzes each post's title and body and each scraped comment's body with the AFINN-165 sentiment lexicon and adds `sentiment_score` plus `sentiment_label` (positive, negative, or neutral) to those records. | false |
| `content_analysis` | boolean | When enabled, the actor classifies each post against a bundled snapshot and adds `content_category_label` and `content_category_path` to post records. | false |
| `maxComments` | integer | Maximum number of comments to collect per post when comment collection is enabled. Lower values help keep runs faster and datasets smaller. | 50000 |
| `commentDateFrom` | string | Exact lower date bound for comments. Accepts ISO-8601 datetimes or YYYY-MM-DD. Use it when comments should follow their own date window. | – |
| `commentDateTo` | string | Exact upper date bound for comments. Accepts ISO-8601 datetimes or YYYY-MM-DD. Use it when you want comment collection to stop at a separate cutoff date. | – |
| `maximize_coverage` | boolean | Turns on the actor's internal high-coverage mode. The actor automatically fans out each search seed across hidden strategies such as chronological slices, search-type variants, and subreddit-focused follow-ups when useful. | false |
| `forceSortNewForTimeFilteredRuns` | boolean | When enabled and a post time filter is active, eligible search and listing requests are fetched with sort=new to improve chronological coverage and make safe early stopping possible. This intentionally changes Reddit ranking behavior for those requests. | false |
| `includeNsfw` | boolean | Include posts marked as NSFW or 18+ in the output dataset. | false |
| `strictSearch` | boolean | Makes Reddit search follow your keywords more closely, relying more on exact keywords and less on loose semantic matching. This usually returns fewer posts but keeps results closer to your query. | false |
| `strictTokenFilter` | boolean | Makes the actor scan each saved post's title, body, and URL and keep only posts that match all of your query keywords. This reduces output size but keeps the most accurate results. | false |
| `maxPosts` | integer | Maximum number of posts to collect for each query or URL input. Use smaller values for quick tests and larger values for deeper research. | 50000 |
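The `strictTokenFilter` behavior can also be approximated client-side if you want to re-check results downstream. The sketch below is an illustration only, not the actor's actual implementation; the tokenization and case-folding rules are assumptions.

```python
import re

def matches_all_tokens(post: dict, query: str) -> bool:
    """Keep a post only if every query token appears in its title, body, or URL.

    Tokenization and substring matching here are assumptions for illustration;
    the actor's internal matching may differ.
    """
    haystack = " ".join(
        str(post.get(field, "")) for field in ("title", "body", "url")
    ).lower()
    tokens = re.findall(r"\w+", query.lower())
    return all(token in haystack for token in tokens)

posts = [
    {"title": "Best AI video generator?", "body": "Looking for tools", "url": "https://www.reddit.com/..."},
    {"title": "Weekly thread", "body": "General chat", "url": "https://www.reddit.com/..."},
]
kept = [p for p in posts if matches_all_tokens(p, "ai video generator")]
```

Only the first post survives, because the second contains none of the query tokens.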
Example Inputs
Scenario: query-driven monitoring
{"queries": ["ai video generator", "synthetic media"],"sort": "new","timeframe": "week","dateFrom": "2026-03-01","dateTo": "2026-03-31","maximize_coverage": true,"strictSearch": true,"strictTokenFilter": true,"maxPosts": 500}
Scenario: direct URL collection
{"urls": ["https://www.reddit.com/r/technology/","https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/"],"scrapeComments": true,"sentiment_analysis": true,"content_analysis": true,"maxComments": 200,"forceSortNewForTimeFilteredRuns": true,"commentDateFrom": "2026-04-01T00:00:00Z","commentDateTo": "2026-04-07T23:59:59Z","maxPosts": 150}
Scenario: targeted subreddit run
{"subredditName": "startups","subredditKeywords": ["pricing", "customer acquisition"],"subredditSort": "new","subredditTimeframe": "month","dateFrom": "2026-02-15","commentDateFrom": "2026-03-01","scrapeComments": true,"maxComments": 100,"maxPosts": 300}
Output
Output destination
The actor writes results to an Apify dataset as JSON records. The dataset is designed for direct consumption by analytics tools, ETL pipelines, and downstream APIs without post-processing.
Record envelope (all items)
Every record includes a stable category field, a Reddit identifier, and a canonical URL:
- type (string, required): Logical record type, such as `post` or `comment`. In the JSON output examples below, this category is represented by the `kind` field.
- id (string, required): Stable Reddit identifier for the entity.
- url (string, required): Canonical Reddit URL for the record.
Recommended idempotency key: `type + ":" + id`
Use this key for deduplication and upserts, especially when the same Reddit entity appears in overlapping queries, subreddit runs, or direct URL inputs.
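A minimal sketch of deduplicating overlapping results with the recommended key. The `type`/`kind` fallback mirrors the envelope note above (the JSON examples carry `kind`); everything else is illustrative.

```python
def record_key(rec: dict) -> str:
    """Recommended idempotency key: type + ":" + id.

    The envelope documents the field as `type`; the JSON output examples
    use `kind`, so fall back to it when `type` is absent.
    """
    record_type = rec.get("type") or rec.get("kind")
    return f"{record_type}:{rec['id']}"

def deduplicate(records):
    """Keep the first occurrence of each (type, id) pair across overlapping seeds."""
    seen = {}
    for rec in records:
        seen.setdefault(record_key(rec), rec)
    return list(seen.values())

records = [
    {"kind": "post", "id": "1hvoazn"},
    {"kind": "comment", "id": "m5un6bj"},
    {"kind": "post", "id": "1hvoazn"},  # same post found by another seed
]
unique = deduplicate(records)
```

The same approach works for upserts: use the key as the primary key in your warehouse table.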
Examples
Example: Post (type = "post")
{"kind": "post","query": "cheesecake","id": "1hvoazn","title": "My best cheesecake so far","body": "Found my new favorite recipe (no water bath). Next time I will make a thicker crust. Added a raspberry compote.","sentiment_score": 2,"sentiment_label": "positive","content_category_label": "Desserts and Baking","content_category_path": ["Food & Drink", "Desserts and Baking"],"author": "ClearlyBulky","score": 3489,"upvote_ratio": 1,"num_comments": 43,"subreddit": "Baking","created_utc": "2025-01-07T10:09:56.000Z","url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/","permalink": "/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/","canonical_url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/","old_reddit_url": "https://old.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/","flair": "Recipe","post_hint": "link","over_18": false,"is_self": false,"spoiler": false,"locked": false,"is_video": false,"is_gallery": true,"hidden": false,"edited": false,"archived": false,"pinned": false,"domain": "old.reddit.com","thumbnail": "https://b.thumbs.redditmedia.com/j8wz80MKqfkXuGMuWng9N1DxR6vxRol8W6RAqzdE35A.jpg","url_overridden_by_dest": "https://www.reddit.com/gallery/1hvoazn","num_duplicates": 0,"subreddit_id": "t5_2qx1h","subreddit_name_prefixed": "r/Baking","subreddit_subscribers": 4322940,"media": null,"media_metadata": {"kny1nmhlqjbe1": {"status": "valid","e": "Image","m": "image/jpg","p": [{"y": 144,"x": 108,"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=212ea9ba4b561f967c673570845bd9591a44fe97"},{"y": 288,"x": 216,"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=1517c122656719b2bc8ec82e0538ebba890032e4"},{"y": 426,"x": 320,"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=db43bab7cc24f6caf55b5a8729ddee2a457f8be5"},{"y": 853,"x": 640,"u": 
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=262c1178fe0f5c3d902c9f413a10b8d8c0e94a12"},{"y": 1280,"x": 960,"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=78980947a1a39dbe31b5b4c924989f1058a34d2e"},{"y": 1440,"x": 1080,"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=4ad55e5b23d4e7d64e9b7d56bf7977357d55a642"}],"s": {"y": 4032,"x": 3024,"u": "https://preview.redd.it/kny1nmhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=cdd808d85ed2306d4f55c858ba4b7bb811a3a383"},"id": "kny1nmhlqjbe1"},"wjqc6mhlqjbe1": {"status": "valid","e": "Image","m": "image/jpg","p": [{"y": 144,"x": 108,"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=f36383919420db57600bf4d290eba35192246271"},{"y": 288,"x": 216,"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=9847d0ca21ddbc05651128a208518f462aa85982"},{"y": 426,"x": 320,"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=73a44f697e66e057ba974f6f02427e7c5b04f144"},{"y": 853,"x": 640,"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=085e2982877df777c1104156699b7f523516fab8"},{"y": 1280,"x": 960,"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=e9afe86434cb058cbc8775e8a9e0788a6ca23500"},{"y": 1440,"x": 1080,"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=82a1c00a718717aee33cde17c732529d15f3be54"}],"s": {"y": 4032,"x": 3024,"u": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=284e05e3ad1593eba99c76b4a603dab5c0c28a03"},"id": "wjqc6mhlqjbe1"}},"gallery_data": {"items": [{"is_deleted": false,"media_id": "kny1nmhlqjbe1","id": 581711947},{"is_deleted": false,"media_id": "wjqc6mhlqjbe1","id": 581711948}]},"gallery_images": [{"media_id": "kny1nmhlqjbe1","caption": "","width": 3024,"height": 4032,"url": 
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=cdd808d85ed2306d4f55c858ba4b7bb811a3a383","previews": ["https://preview.redd.it/kny1nmhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=212ea9ba4b561f967c673570845bd9591a44fe97","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=1517c122656719b2bc8ec82e0538ebba890032e4","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=db43bab7cc24f6caf55b5a8729ddee2a457f8be5","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=262c1178fe0f5c3d902c9f413a10b8d8c0e94a12","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=78980947a1a39dbe31b5b4c924989f1058a34d2e","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=4ad55e5b23d4e7d64e9b7d56bf7977357d55a642"]},{"media_id": "wjqc6mhlqjbe1","caption": "","width": 3024,"height": 4032,"url": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=284e05e3ad1593eba99c76b4a603dab5c0c28a03","previews": ["https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=f36383919420db57600bf4d290eba35192246271","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=9847d0ca21ddbc05651128a208518f462aa85982","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=73a44f697e66e057ba974f6f02427e7c5b04f144","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=085e2982877df777c1104156699b7f523516fab8","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=e9afe86434cb058cbc8775e8a9e0788a6ca23500","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=82a1c00a718717aee33cde17c732529d15f3be54"]}],"media_assets": [{"type": "Image","media_id": "kny1nmhlqjbe1","mime_type": "image/jpg","original_url": 
"https://preview.redd.it/kny1nmhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=cdd808d85ed2306d4f55c858ba4b7bb811a3a383","preview_urls": ["https://preview.redd.it/kny1nmhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=212ea9ba4b561f967c673570845bd9591a44fe97","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=1517c122656719b2bc8ec82e0538ebba890032e4","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=db43bab7cc24f6caf55b5a8729ddee2a457f8be5","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=262c1178fe0f5c3d902c9f413a10b8d8c0e94a12","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=78980947a1a39dbe31b5b4c924989f1058a34d2e","https://preview.redd.it/kny1nmhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=4ad55e5b23d4e7d64e9b7d56bf7977357d55a642"]},{"type": "Image","media_id": "wjqc6mhlqjbe1","mime_type": "image/jpg","original_url": "https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=3024&format=pjpg&auto=webp&s=284e05e3ad1593eba99c76b4a603dab5c0c28a03","preview_urls": ["https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=108&crop=smart&auto=webp&s=f36383919420db57600bf4d290eba35192246271","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=216&crop=smart&auto=webp&s=9847d0ca21ddbc05651128a208518f462aa85982","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=320&crop=smart&auto=webp&s=73a44f697e66e057ba974f6f02427e7c5b04f144","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=640&crop=smart&auto=webp&s=085e2982877df777c1104156699b7f523516fab8","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=960&crop=smart&auto=webp&s=e9afe86434cb058cbc8775e8a9e0788a6ca23500","https://preview.redd.it/wjqc6mhlqjbe1.jpg?width=1080&crop=smart&auto=webp&s=82a1c00a718717aee33cde17c732529d15f3be54"]}],"age_hours": 10916.1333,"retrieved_at": "2026-04-07T00:00:00.000Z","media_type": "gallery","has_media": true,"gallery_count": 2,"outbound_url_host": "www.reddit.com","title_length": 26,"body_length": 
112,"word_count": 25,"score_per_hour": 0.3196,"comments_per_hour": 0.0039,"is_deleted_or_removed": false,"engagement_total": 3532,"comment_to_score_ratio": 0.0123,"is_high_engagement": true,"content_flags": [],"stickied": false,"distinguished": null,"total_awards_received": 0,"all_awardings": [],"gilded": 0,"num_crossposts": 0,"is_original_content": false,"author_fullname": "t2_dr3vyilor","author_flair_text": null,"author_premium": false,"body_html": "<!-- SC_OFF --><div class=\"md\"><p>Found my new favorite recipe (no water bath). Next time I will make a thicker crust. Added a raspberry compote.</p>\n</div><!-- SC_ON -->","preview": null,"secure_media": null,"secure_media_embed": {},"crosspost_parent_list": null}
Example: Comment (type = "comment")
{"kind": "comment","query": "cheesecake","id": "m5un6bj","postId": "1hvoazn","postUrl": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/","parentId": "t3_1hvoazn","body": "\n\n* 9” Springform Pan\n\nIngredients\n\nfor the graham cracker crust-\n\n* 1 1/4 cups graham cracker crumbs\n* 4 tablespoons granulated sugar\n* 5 tablespoons melted butter\n\nfor the cheesecake filling-\n\n* 40 ounces cream cheese at room temperature (five 8 oz. packages; 2 1/2 lbs total)\n* 1 1/4 cups granulated sugar\n* 1/2 cup sour cream at room temperature\n* 2 teaspoons vanilla extract\n* 4 large eggs at room temperature\n* any desired cheesecake toppings\n\nInstructions\n\n* Place oven racks in the center of the oven. Preheat oven to 350° F.\n* In a medium sized bowl, stir graham cracker crumbs together with sugar and melted butter until well incorporated and mixture looks like damp sand. Using the bottom of a measuring cup, press crust into the bottom and half way up the sides of a 9-inch springform pan. Bake 7 minutes. Remove from oven and set aside.\n* Reduce oven temperature to 325° F.\n* In a large bowl or bowl of a stand mixer, mix cream cheese 30 seconds ‘til smooth. Scrape the sides and bottom of the bowl and add in granulated sugar, sour cream and vanilla. Mix again until incorporated. Scrape the sides and bottom of the bowl and mix again briefly.\n* Crack eggs into a liquid measuring cup and using a fork, beat until well scrambled. With the mixer on low, slowly pour in the eggs into the cream cheese mixture and stop stirring once eggs have been incorporated. Remove bowl from mixer and scrape the sides and bottom again, ensuring the entire mixture is smooth. If there are a few small lumps, try to fold in using the rubber scraper.\n* Once the batter is completely smooth and ready, tap the bowl on the counter for 30-45 seconds to remove as many air bubbles as possible. You should see them popping on the surface as you tap the bowl. 
Pour filling into the center of the graham cracker crust and gently smooth the top. Will be very full!\n* Bake for 30 minutes at 325° F. Reduce temperature to 250° F and continue cooking for 45 minutes more. Once this time has elapsed, turn oven off and keep cheesecake inside for another 30 minutes for some carryover cooking without opening the oven door. Crack oven door to let cheesecake cool slowly for one hour before removing. At this point, cheesecake should be slightly warm. Bring cheesecake to room temperature on the counter (3-4 hours) before covering with plastic wrap and transferring to the fridge.\n* Refrigerate until chilled completely (6 hours to overnight). To serve, open springform pan and remove collar. Decorate as desired. Dip a sharp knife into hot water, wipe off any excess water and slice. I like to dip my knife in water between each slice to get really clean-looking pieces. \n\nNotes\n\nIf you would like a thicker graham cracker crust, use 1 3/4 cups graham cracker crumbs, 5 tablespoons granulated sugar and 6 tablespoons melted butter. Press into the pan and bake for 8 minutes.","author": "ClearlyBulky","score": 76,"created_utc": "2025-01-07T10:13:48.000Z","url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/","permalink": "/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/","canonical_url": "https://www.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/","old_reddit_url": "https://old.reddit.com/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/","root_comment_id": "m5un6bj","parent_kind": "post","comment_permalink": "/r/Baking/comments/1hvoazn/my_best_cheesecake_so_far/m5un6bj/","author_deleted": false,"body_deleted": false,"stickied": false,"distinguished": null,"is_submitter": true,"score_hidden": false,"controversiality": 0,"depth": 0}
Field reference
Post fields (type = "post")
- kind (string, required): Record category for post records.
- query (string, optional): Input query or source label that produced the record.
- id (string, required): Stable Reddit post identifier.
- title (string, required): Post title.
- body (string, optional): Post body text.
- sentiment_score (number, optional): AFINN-165 sentiment score computed from the post title and body when `sentiment_analysis` is enabled.
- sentiment_label (string, optional): `positive`, `negative`, or `neutral` label derived from `sentiment_score` when `sentiment_analysis` is enabled.
- content_category_label (string, optional): Readable content category name chosen for the post when `content_analysis` is enabled.
- content_category_path (array, optional): Topic path from the top-level content category down to the matched post category when `content_analysis` is enabled.
- author (string, optional): Username shown on the post.
- score (number, optional): Post score at collection time.
- upvote_ratio (number, optional): Upvote ratio when available.
- num_comments (number, optional): Comment count shown on the post.
- subreddit (string, optional): Subreddit name.
- created_utc (string, optional): Post creation time in ISO format. This is the timestamp used for exact post date filtering when `dateFrom` or `dateTo` is provided.
- url (string, required): Canonical Reddit URL for the post.
- permalink (string, optional): Relative Reddit permalink.
- canonical_url (string, optional): Canonical full URL.
- old_reddit_url (string, optional): Alternate legacy Reddit URL.
- flair (string, optional): Post flair text.
- post_hint (string, optional): Reddit post hint, useful for downstream media classification.
- over_18 (boolean, optional): Whether the post is marked NSFW.
- is_self (boolean, optional): Whether the post is a self post.
- spoiler (boolean, optional): Whether the post is marked as a spoiler.
- locked (boolean, optional): Whether the post is locked.
- is_video (boolean, optional): Whether the post is a video post.
- is_gallery (boolean, optional): Whether the post is a Reddit gallery.
- hidden (boolean, optional): Whether the post is hidden for the viewing account.
- edited (boolean | number, optional): `false` when untouched, otherwise Reddit's edited timestamp payload.
- archived (boolean, optional): Whether the post is archived.
- pinned (boolean, optional): Whether the post is pinned in the subreddit.
- domain (string, optional): Source or linked domain.
- thumbnail (string, optional): Thumbnail URL or Reddit thumbnail marker.
- url_overridden_by_dest (string, optional): Final outbound destination URL when present.
- num_duplicates (number, optional): Duplicate count reported by Reddit.
- subreddit_id (string, optional): Internal Reddit subreddit reference.
- subreddit_name_prefixed (string, optional): Prefixed subreddit label such as `r/Baking`.
- subreddit_subscribers (number, optional): Subscriber count at collection time.
- media (object, optional): Media object when available.
- media_metadata (object, optional): Raw media metadata keyed by media ID.
- media_metadata.<media_id>.status (string, optional): Media validity state.
- media_metadata.<media_id>.e (string, optional): Media asset type label.
- media_metadata.<media_id>.m (string, optional): Media MIME type.
- media_metadata.<media_id>.p (array, optional): Preview image variants.
- media_metadata.<media_id>.p[].y (number, optional): Preview height.
- media_metadata.<media_id>.p[].x (number, optional): Preview width.
- media_metadata.<media_id>.p[].u (string, optional): Preview URL.
- media_metadata.<media_id>.s.y (number, optional): Original media height.
- media_metadata.<media_id>.s.x (number, optional): Original media width.
- media_metadata.<media_id>.s.u (string, optional): Original media URL.
- media_metadata.<media_id>.id (string, optional): Media asset identifier.
- gallery_data (object, optional): Reddit gallery metadata.
- gallery_data.items (array, optional): Gallery item list.
- gallery_data.items[].is_deleted (boolean, optional): Whether the gallery item is deleted.
- gallery_data.items[].media_id (string, optional): Gallery media identifier.
- gallery_data.items[].id (number, optional): Gallery item identifier.
- gallery_images (array, optional): Normalized gallery image list.
- gallery_images[].media_id (string, optional): Gallery media identifier.
- gallery_images[].caption (string, optional): Image caption text.
- gallery_images[].width (number, optional): Image width.
- gallery_images[].height (number, optional): Image height.
- gallery_images[].url (string, optional): Original image URL.
- gallery_images[].previews (array, optional): Preview image URLs.
- media_assets (array, optional): Normalized media asset list.
- media_assets[].type (string, optional): Media type label.
- media_assets[].media_id (string, optional): Media asset identifier.
- media_assets[].mime_type (string, optional): Media MIME type.
- media_assets[].original_url (string, optional): Original media URL.
- media_assets[].preview_urls (array, optional): Preview URLs for the asset.
- age_hours (number, optional): Post age in hours at collection time.
- retrieved_at (string, optional): Actor capture time in ISO format.
- media_type (string, optional): Normalized media class: `text`, `image`, `gallery`, `video`, `gif`, or `link`.
- has_media (boolean, optional): Convenience flag for image, gallery, GIF, or video posts.
- gallery_count (number, optional): Number of normalized gallery images.
- outbound_url_host (string, optional): Parsed host from `url_overridden_by_dest` when present.
- title_length (number, optional): Character length of the title.
- body_length (number, optional): Character length of the body text.
- word_count (number, optional): Lightweight whitespace-based word count across title and body.
- score_per_hour (number, optional): Score divided by post age with a minimum age floor to avoid division by zero.
- comments_per_hour (number, optional): Comment count divided by post age with a minimum age floor to avoid division by zero.
- is_deleted_or_removed (boolean, optional): Conservative deletion/removal flag derived from visible placeholders and removal metadata.
- engagement_total (number, optional): Combined engagement metric derived from score and comments.
- comment_to_score_ratio (number, optional): Comments-to-score ratio.
- is_high_engagement (boolean, optional): Convenience flag for high engagement.
- content_flags (array, optional): Content classification flags when present.
- stickied (boolean, optional): Whether the post is pinned.
- distinguished (string, optional): Distinguishing label, such as moderator status.
- total_awards_received (number, optional): Total awards on the post.
- all_awardings (array, optional): Raw awards list.
- gilded (number, optional): Gilding count.
- num_crossposts (number, optional): Number of crossposts.
- is_original_content (boolean, optional): Whether the post is marked original content.
- author_fullname (string, optional): Internal Reddit author reference when available.
- author_flair_text (string, optional): Author flair text.
- author_premium (boolean, optional): Whether the author has premium status.
- body_html (string, optional): HTML-formatted post body.
- preview (object, optional): Preview object when available.
- secure_media (object, optional): Secure media object when available.
- secure_media_embed (object, optional): Secure media embed metadata.
- crosspost_parent_list (array, optional): Crosspost parent data when available.
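The derived engagement metrics above can be recomputed from the raw fields. This sketch assumes a simple age floor and four-decimal rounding; under those assumptions it reproduces the values in the sample post record, but the actor's exact constants are not documented.

```python
MIN_AGE_HOURS = 1.0  # assumed age floor to avoid division by zero

def derived_metrics(score: int, num_comments: int, age_hours: float) -> dict:
    """Recompute the convenience metrics described in the field reference.

    Floor value and rounding precision are assumptions for illustration.
    """
    age = max(age_hours, MIN_AGE_HOURS)
    return {
        "score_per_hour": round(score / age, 4),
        "comments_per_hour": round(num_comments / age, 4),
        "engagement_total": score + num_comments,
        "comment_to_score_ratio": round(num_comments / score, 4) if score else None,
    }

# Values taken from the sample post record above.
m = derived_metrics(score=3489, num_comments=43, age_hours=10916.1333)
```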
Comment fields (type = "comment")
- kind (string, required): Record category for comment records.
- query (string, optional): Input query or source label that produced the record.
- id (string, required): Stable Reddit comment identifier.
- postId (string, required): Parent post identifier.
- postUrl (string, required): Parent post URL.
- parentId (string, required): Parent Reddit object identifier.
- body (string, optional): Comment body text.
- sentiment_score (number, optional): AFINN-165 sentiment score computed from the comment body when `sentiment_analysis` is enabled.
- sentiment_label (string, optional): `positive`, `negative`, or `neutral` label derived from the comment `sentiment_score` when `sentiment_analysis` is enabled.
- author (string, optional): Username shown on the comment.
- score (number, optional): Comment score at collection time.
- created_utc (string, optional): Comment creation time in ISO format. This is the timestamp used for exact comment date filtering when `commentDateFrom` or `commentDateTo` is provided.
- url (string, required): Canonical Reddit URL for the comment.
- permalink (string, optional): Relative Reddit permalink.
- canonical_url (string, optional): Canonical full URL.
- old_reddit_url (string, optional): Alternate legacy Reddit URL.
- root_comment_id (string, optional): Root comment ID for the thread.
- parent_kind (string, optional): Parent record type, such as `post` or `comment`.
- comment_permalink (string, optional): Relative permalink for the comment.
- author_deleted (boolean, optional): Whether the author is deleted.
- body_deleted (boolean, optional): Whether the comment body is deleted or removed.
- stickied (boolean, optional): Whether the comment is pinned.
- distinguished (string, optional): Distinguishing label, such as moderator status.
- is_submitter (boolean, optional): Whether the author is the original post creator.
- score_hidden (boolean, optional): Whether the score is hidden.
- controversiality (number, optional): Reddit controversiality indicator.
- depth (number, optional): Nesting depth in the comment tree.
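Flat comment records can be regrouped into threads using `parentId`. The `t3_`/`t1_` prefixes follow standard Reddit fullname conventions (post vs. comment), as seen in the sample comment record; the nested reply here is a hypothetical record for illustration.

```python
from collections import defaultdict

def group_replies(comments):
    """Index flat comment records by their parentId.

    parentId is t3_<postId> for top-level comments and t1_<commentId>
    for nested replies, per Reddit fullname conventions.
    """
    children = defaultdict(list)
    for c in comments:
        children[c["parentId"]].append(c)
    return children

comments = [
    {"id": "m5un6bj", "parentId": "t3_1hvoazn", "depth": 0, "body": "recipe..."},
    {"id": "abc1234", "parentId": "t1_m5un6bj", "depth": 1, "body": "thanks!"},  # hypothetical reply
]
thread = group_replies(comments)
top_level = thread["t3_1hvoazn"]
```

Walking `children` recursively from the post's fullname rebuilds the full tree; `depth` can serve as a sanity check.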
Data guarantees & handling
- Best-effort extraction: fields may vary by region, session, availability, and Reddit surface changes or experiments.
- Optional fields: always null-check in downstream code because many fields may be empty or unavailable.
- Time filtering: `timeframe` and `subredditTimeframe` narrow the Reddit source query, while `dateFrom`/`dateTo` and `commentDateFrom`/`commentDateTo` apply exact record-level filtering in the actor output.
- Chronological coverage: preserving `top`, `hot`, `relevance`, or `comments` sorting does not guarantee that Reddit returns a strictly time-ordered feed for a date range. If strong time-window traversal matters more than preserving ranking semantics, enable `forceSortNewForTimeFilteredRuns`.
- Maximum coverage mode: when `maximize_coverage` is `true`, the actor expands search seeds internally to improve recall. This mode overrides `forceSortNewForTimeFilteredRuns` for those search seeds and may use more requests, time slices, and overlap-safe deduplication behind the scenes.
- Deduplication: the recommended key is `type + ":" + id`. Stable identifiers make inter-seed deduplication and upserts straightforward when the same entity is discovered through overlapping inputs.
How to Run on Apify
- Open the Actor in Apify Console.
- Configure your search parameters, such as keywords, subreddit name, direct Reddit URLs, ranking options, and filters.
- Set the maximum number of outputs to collect using `maxPosts` and, if needed, `maxComments`.
- Click Start and wait for the run to finish.
- Download results in JSON, CSV, Excel, or other supported formats.
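Downloads can also be scripted against the standard Apify API v2 dataset items endpoint. The helper below only builds the URL; the dataset ID is a placeholder, and the full list of supported `format` values should be checked against the Apify API reference.

```python
from urllib.parse import urlencode

def dataset_items_url(dataset_id: str, fmt: str = "json", token: str = "") -> str:
    """Build the Apify v2 dataset items download URL.

    fmt selects the export format (e.g. json or csv); a token query
    parameter is appended only when provided.
    """
    params = {"format": fmt}
    if token:
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"

# "DATASET_ID_PLACEHOLDER" stands in for a real dataset ID from your run.
url = dataset_items_url("DATASET_ID_PLACEHOLDER", fmt="csv")
```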
Scheduling & Automation
Scheduling
Automated Data Collection
You can schedule recurring runs to keep your Reddit dataset current without manual work. This is useful for monitoring trends, tracking brand mentions, and maintaining fresh inputs for dashboards or data pipelines.
- Navigate to Schedules in Apify Console
- Create a new schedule (daily, weekly, or custom cron)
- Configure input parameters
- Enable notifications for run completion
- Optional: add webhooks for automated processing
Integration Options
- Webhooks: Trigger downstream actions when a run completes
- Zapier: Connect to 5,000+ apps without coding
- Make (Integromat): Build multi-step automation workflows
- Google Sheets: Export results to a spreadsheet
- Slack/Discord: Receive notifications and summaries
- Email: Send automated reports via email
Performance
Estimated run times:
- Small runs (< 1,000 outputs): ~2-3 minutes
- Medium runs (1,000-5,000 outputs): ~5-15 minutes
- Large runs (5,000+ outputs): ~15-30 minutes
For planning purposes, runs targeting around 1,000 outputs typically finish near the top of the small-run range, but execution time varies with filters, result volume, and how much information is returned per record.
Compliance & Ethics
Responsible Data Collection
This actor collects publicly available Reddit posts, comments, and discussion metadata from https://www.reddit.com for legitimate business purposes, including:
- consumer research and market analysis
- brand monitoring and competitive tracking
- product feedback discovery and trend analysis
Users are responsible for making sure their use of the collected data complies with applicable laws, regulations, internal policies, and the target site's terms. This section is informational and not legal advice.
Best Practices
- Use collected data in accordance with applicable laws, regulations, and the target site's terms
- Respect individual privacy and personal information
- Use data responsibly and avoid disruptive or excessive collection
- Do not use this actor for spamming, harassment, or other harmful purposes
- Follow relevant data protection requirements where applicable, such as GDPR and CCPA
Support
For help, use the Issues tab or the actor page on Apify. Include the input you used with sensitive values redacted, the run ID, the expected behavior versus the actual behavior, and, if helpful, a small output sample.