🧠 Reddit Scraper Lite

A pay-per-result, unlimited Reddit web scraper that crawls posts, comments, communities, and users without requiring login. Limit scraping by the number of posts or items, and export all extracted data as a dataset in multiple formats.

✅ Use Cases

📌 Scrape subreddits (communities) with their top posts

📋 Scrape Reddit posts with title and text, username, number of comments, votes, and media elements

💬 Get Reddit comments with timestamps, points, usernames, and post and comment URLs

👥 Scrape user details plus their most recent posts and comments

🔍 Sort scraped data by Relevance, Hot, Top, or New

🌐 Scrape data from a specific URL or by keyword

📥 Input Configuration

You can customize the actor using the following input fields:

```json
{
  "startUrls": [
    {
      "url": "https://www.reddit.com/r/pasta/comments/vwi6jx/pasta_peperoni_and_ricotta_cheese_how_to_make/"
    }
  ],
  "skipComments": false,
  "skipUserPosts": false,
  "skipCommunity": false,
  "ignoreStartUrls": false,
  "searchPosts": true,
  "searchComments": false,
  "searchCommunities": false,
  "searchUsers": false,
  "sort": "new",
  "includeNSFW": true,
  "maxItems": 10,
  "maxPostCount": 10,
  "maxComments": 10,
  "maxCommunitiesCount": 2,
  "maxUserCount": 2,
  "scrollTimeout": 40,
  "subreddit": "news",
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "debugMode": false
}
```

🧾 Fields Explained

| Field | Type | Description |
|---|---|---|
| startUrls | array | URLs of Reddit pages to scrape |
| skipComments | boolean | Skip scraping comments |
| skipUserPosts | boolean | Skip scraping user posts |
| skipCommunity | boolean | Skip scraping community info |
| searchPosts | boolean | Search for posts |
| searchComments | boolean | Search for comments |
| searchCommunities | boolean | Search for communities |
| searchUsers | boolean | Search for users |
| sort | string | Sort by Relevance, Hot, Top, New, or Comments |
| includeNSFW | boolean | Include NSFW content |
| maxItems | integer | Maximum items to save |
| maxPostCount | integer | Maximum posts per page |
| maxComments | integer | Maximum comments per page |
| maxCommunitiesCount | integer | Maximum community pages |
| maxUserCount | integer | Maximum user pages |
| scrollTimeout | integer | Page scroll timeout in seconds |
| subreddit | string | Name of subreddit to scrape (without the r/ prefix) |
| proxy | object | Proxy configuration |
| debugMode | boolean | Enable debug logs |

📤 Output

The actor returns a dataset containing Reddit posts, comments, communities, and user data.
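Runs can also be triggered programmatically. Below is a minimal sketch using the official apify-client package for Node.js; the actor ID `scrapeai/reddit-scraper-lite` is an assumption based on this listing, so copy the exact ID from the actor's Store page before use.

```typescript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

async function main(): Promise<void> {
  // Actor ID is illustrative; take the real "username/actor-name" slug
  // from the actor's page in the Apify Store.
  const run = await client.actor('scrapeai/reddit-scraper-lite').call({
    startUrls: [{ url: 'https://www.reddit.com/r/worldnews/' }],
    maxItems: 10,
    proxy: { useApifyProxy: true, apifyProxyGroups: ['RESIDENTIAL'] },
  });

  // Each run stores its results in a default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.dir(items, { depth: null });
}

main().catch(console.error);
```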

🧩 Sample Outputs

๐Ÿ“ Reddit Post

```json
{
  "id": "t3_144w7sn",
  "parsedId": "144w7sn",
  "url": "https://www.reddit.com/r/HonkaiStarRail/comments/144w7sn/my_luckiest_10x_pull_yet/",
  "username": "YourKingLives",
  "title": "My Luckiest 10x Pull Yet",
  "communityName": "r/HonkaiStarRail",
  "parsedCommunityName": "HonkaiStarRail",
  "body": "URL: https://i.redd.it/yod3okjkgx4b1.jpg\nThumbnail: https://b.thumbs.redditmedia.com/lm9KxS4laQWgx4uOoioM3N7-tBK3GLPrxb9da2hGtjs.jpg\nImages:\n\thttps://preview.redd.it/yod3okjkgx4b1.jpg?auto=webp&v=enabled&s=be5faf0250e19138b82c7bbe5e7406fa46da4e73\n",
  "html": null,
  "numberOfComments": 0,
  "upVotes": 1,
  "isVideo": false,
  "isAd": false,
  "over18": false,
  "createdAt": "2023-06-09T05:23:15.000Z",
  "scrapedAt": "2023-06-09T05:23:28.409Z",
  "dataType": "post"
}
```
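Note that for media posts the `body` field packs the post URL, thumbnail, and preview images into labeled plain-text lines, as in the sample above. A small hypothetical helper to pull those URLs back out (the exact body layout may vary by post type):

```typescript
// Extract every http(s) URL from a post's plain-text body, regardless of
// which label (URL, Thumbnail, Images) precedes it.
function extractMediaUrls(body: string): string[] {
  return body.match(/https?:\/\/\S+/g) ?? [];
}

// With the sample post's body this yields the i.redd.it image,
// the redditmedia thumbnail, and the preview.redd.it link.
```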

💬 Reddit Comment

```json
{
  "id": "t1_jnhqrgg",
  "parsedId": "jnhqrgg",
  "url": "https://www.reddit.com/r/NewsWithJingjing/comments/144v5c3/theres_no_flag_large_enough/jnhqrgg/",
  "parentId": "t3_144v5c3",
  "username": "smokecat20",
  "category": "NewsWithJingjing",
  "communityName": "r/NewsWithJingjing",
  "body": "A true patriot.",
  "createdAt": "2023-06-09T05:00:00.000Z",
  "scrapedAt": "2023-06-09T05:23:32.025Z",
  "upVotes": 3,
  "numberOfreplies": 0,
  "html": "<div class=\"md\"><p>A true patriot.</p>\n</div>",
  "dataType": "comment"
}
```

👥 Reddit Community

```json
{
  "id": "2qlhq",
  "name": "t5_2qlhq",
  "title": "Pizza",
  "headerImage": "https://b.thumbs.redditmedia.com/jq9ytPEOecwd5bmGIvNQzjTPE9hdd0kB9XGa--wq55A.png",
  "description": "The home of pizza on reddit. An educational community devoted to the art of pizza making.",
  "over18": false,
  "createdAt": "2008-08-26T00:03:48.000Z",
  "scrapedAt": "2023-06-09T05:16:55.443Z",
  "numberOfMembers": 569724,
  "url": "https://www.reddit.com/r/Pizza/",
  "dataType": "community"
}
```

👤 Reddit User

```json
{
  "id": "c3h2qmv",
  "url": "https://www.reddit.com/user/jancurn/",
  "username": "jancurn",
  "userIcon": "https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png",
  "postKarma": 4,
  "commentKarma": 10,
  "description": "",
  "over18": false,
  "createdAt": "2018-09-10T15:13:39.000Z",
  "scrapedAt": "2023-06-09T05:21:14.409Z",
  "dataType": "user"
}
```
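For typed processing, here is one possible TypeScript model of the dataset items, inferred from the four samples above. Real items may carry extra fields; `dataType` works as the discriminant.

```typescript
// Shapes inferred from the sample outputs; not an official schema.
interface RedditPost {
  id: string; parsedId: string; url: string;
  username: string; title: string;
  communityName: string; parsedCommunityName: string;
  body: string; html: string | null;
  numberOfComments: number; upVotes: number;
  isVideo: boolean; isAd: boolean; over18: boolean;
  createdAt: string; scrapedAt: string; // ISO 8601 timestamps
  dataType: 'post';
}

interface RedditComment {
  id: string; parsedId: string; url: string;
  parentId: string; // "t3_..." points at the parent post
  username: string; category: string; communityName: string;
  body: string; html: string;
  upVotes: number;
  numberOfreplies: number; // field name spelled as the actor emits it
  createdAt: string; scrapedAt: string;
  dataType: 'comment';
}

interface RedditCommunity {
  id: string; name: string; title: string;
  headerImage: string; description: string; over18: boolean;
  numberOfMembers: number; url: string;
  createdAt: string; scrapedAt: string;
  dataType: 'community';
}

interface RedditUser {
  id: string; url: string; username: string; userIcon: string;
  postKarma: number; commentKarma: number;
  description: string; over18: boolean;
  createdAt: string; scrapedAt: string;
  dataType: 'user';
}

type RedditItem = RedditPost | RedditComment | RedditCommunity | RedditUser;
```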

🔒 Proxy Configuration

This actor uses Apify Proxy automatically to:

Avoid IP-based rate limiting or bans

Access Reddit data reliably

Ensure stable scraping at scale

Default proxy settings use:

```json
{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"]
}
```
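When running the actor via the API, the same object goes into the `proxy` field of the run input, for example:

```typescript
// Run input with the default proxy block spelled out. RESIDENTIAL is the
// group this actor defaults to; other Apify Proxy groups exist, but whether
// they hold up against Reddit is not covered by this listing.
const input = {
  startUrls: [{ url: 'https://www.reddit.com/r/pizza/' }],
  maxItems: 10,
  proxy: {
    useApifyProxy: true,
    apifyProxyGroups: ['RESIDENTIAL'],
  },
};
```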

🚀 How to Use

1. Open the actor in Apify Console.
2. Click "Try actor" or create a new task.
3. Enter your desired Reddit URLs or search terms.
4. Configure scraping options.
5. Run the actor.
6. Download your Reddit data in JSON, CSV, or Excel format (a programmatic export sketch follows this list).
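For step 6, exports are also available over the API. A minimal sketch using apify-client's dataset download; formats such as 'csv', 'xlsx', and 'json' come from the Apify dataset API:

```typescript
import { writeFile } from 'node:fs/promises';
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Download a finished run's dataset as CSV; pass 'xlsx' for Excel instead.
async function exportCsv(datasetId: string): Promise<void> {
  const buffer = await client.dataset(datasetId).downloadItems('csv');
  await writeFile('reddit-data.csv', buffer);
}

// The dataset ID comes from the run details (defaultDatasetId).
exportCsv('YOUR_DATASET_ID').catch(console.error);
```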

โš™๏ธ Advanced Input Example

```json
{
  "startUrls": [
    {
      "url": "https://www.reddit.com/r/worldnews/"
    }
  ],
  "skipComments": false,
  "searchPosts": true,
  "sort": "hot",
  "includeNSFW": false,
  "maxItems": 100,
  "maxComments": 50,
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  },
  "debugMode": false
}
```

๐Ÿ› ๏ธ Tech Stack

🧩 Apify SDK for actor and data handling

🕷️ Crawlee for robust crawling and scraping

🌐 Puppeteer for browser automation and rendering dynamic content

⚙️ Node.js for a fast, scalable backend environment