Deprecated

Pricing

from $1.00 / 1,000 results

Dcinside Scraper

Scrapes DCInside mgallery boards and outputs one dataset item per post including post metadata, full text, and a structured list of comments + replies (plus a commentsText array for easy viewing).

Developer

Rafaz

Last modified: a month ago

DCInside Gallery Scraper

An Apify Actor that scrapes posts and comments from DCInside mgallery boards using CheerioCrawler.

Overview

This Actor scrapes DCInside (디시인사이드) mgallery boards, extracting:

Post metadata (title, author, date)
Post content (clean plain-text + optional HTML)
All comments and replies
Structured JSON output to dataset

The scraper uses CheerioCrawler for fast HTTP-based scraping (no browser required) and fetches comments via the mobile AJAX API endpoint.

Quick Start

Install dependencies:

npm install

Run the Actor locally:

apify run

Deploy to Apify Platform:

apify login
apify push

Input Parameters

The Actor accepts the following input parameters (defined in .actor/input_schema.json):

Basic Settings

Parameter	Type	Default	Description
`galleryId`	string	`"tomoo"`	The DCInside gallery ID or full URL to scrape (e.g., `"tomoo"` or `"https://gall.dcinside.com/mgallery/board/lists?id=tomoo"`)
`startPage`	integer	`1`	Page number to start scraping from
`endPage`	integer	auto-detect	Page number to stop scraping at. Leave empty to auto-detect last page
`maxPosts`	integer	`0`	Maximum number of posts to scrape (0 = unlimited)

Date Filtering

Parameter	Type	Description
`startDate`	string	Only scrape posts on or after this date (YYYY-MM-DD format)
`endDate`	string	Only scrape posts on or before this date (YYYY-MM-DD format)
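A minimal sketch of how an inclusive date filter like this could work, assuming post dates arrive in the `2024.01.15 14:30:25` form shown in the output examples. The helper name `isWithinDateRange` is hypothetical, and rejecting unparsable dates when a filter is set is an assumption of this sketch, not documented Actor behavior.

```typescript
// Hypothetical helper; the Actor's real filtering code is not shown in this README.
function isWithinDateRange(
  postCreatedAt: string,
  startDate?: string,
  endDate?: string,
): boolean {
  // Assumed input shape: "YYYY.MM.DD HH:mm:ss" (only the date part is used).
  const match = /^(\d{4})\.(\d{2})\.(\d{2})/.exec(postCreatedAt.trim());
  if (!match) {
    // Assumption: an unparsable date passes only when no filter is set.
    return !startDate && !endDate;
  }
  const day = `${match[1]}-${match[2]}-${match[3]}`;
  // Zero-padded YYYY-MM-DD strings sort chronologically, so string comparison is enough.
  if (startDate && day < startDate) return false;
  if (endDate && day > endDate) return false;
  return true;
}

console.log(isWithinDateRange("2024.01.15 14:30:25", "2024-01-01", "2024-01-31")); // true
console.log(isWithinDateRange("2024.02.01 00:00:05", "2024-01-01", "2024-01-31")); // false
```

Both bounds are inclusive, matching the "on or after" and "on or before" wording in the table.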

Comment Options

Parameter	Type	Default	Description
`includeComments`	boolean	`true`	Whether to fetch comments for each post. Disable for faster scraping
`maxCommentsPerPost`	integer	`0`	Maximum comments per post (0 = unlimited). Useful for posts with thousands of comments

Output & Performance

Parameter	Type	Default	Description
`outputFormat`	string	`"nested"`	Output format: `nested`, `flat`, `minimal`, or `text-only`
`skipExisting`	boolean	`false`	Skip posts already in dataset. Useful for resuming failed runs
`maxRequestsPerCrawl`	integer	`10000`	Maximum HTTP requests (safety limit)

Example Input

{
  "galleryId": "tomoo",
  "startPage": 1,
  "endPage": 10,
  "startDate": "2024-01-01",
  "endDate": "2024-01-31",
  "maxPosts": 100,
  "includeComments": true,
  "maxCommentsPerPost": 500,
  "outputFormat": "nested",
  "skipExisting": false
}
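The parameter tables above can be mirrored as a TypeScript type with the documented defaults. This is only a sketch: the names `ActorInput` and `withDefaults` are hypothetical, and the Actor's real input handling in src/main.ts may differ.

```typescript
type OutputFormat = "nested" | "flat" | "minimal" | "text-only";

// Field names and defaults are taken from the parameter tables in this README.
interface ActorInput {
  galleryId?: string;
  startPage?: number;
  endPage?: number; // omitted = auto-detect the last page
  maxPosts?: number; // 0 = unlimited
  startDate?: string; // YYYY-MM-DD
  endDate?: string; // YYYY-MM-DD
  includeComments?: boolean;
  maxCommentsPerPost?: number; // 0 = unlimited
  outputFormat?: OutputFormat;
  skipExisting?: boolean;
  maxRequestsPerCrawl?: number;
}

// Hypothetical helper: fills in the documented defaults for missing fields.
function withDefaults(input: ActorInput) {
  return {
    galleryId: "tomoo",
    startPage: 1,
    maxPosts: 0,
    includeComments: true,
    maxCommentsPerPost: 0,
    outputFormat: "nested" as OutputFormat,
    skipExisting: false,
    maxRequestsPerCrawl: 10000,
    ...input,
  };
}

const resolved = withDefaults({ galleryId: "tomoo", endPage: 10 });
console.log(resolved.outputFormat); // "nested"
console.log(resolved.endPage); // 10
```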

Using Full URLs

You can provide a full gallery URL instead of just the ID:

{
  "galleryId": "https://gall.dcinside.com/mgallery/board/lists?id=tomoo"
}

Or even a specific post URL:

{
  "galleryId": "https://gall.dcinside.com/mgallery/board/view/?id=tomoo&no=123456"
}

The Actor will automatically extract the gallery ID.
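As an illustration, the extraction could look roughly like the sketch below, which reads the `id` query parameter visible in both example URLs. The function name `extractGalleryId` is hypothetical, and the Actor's actual parsing may handle more cases.

```typescript
// Hypothetical helper; not the Actor's actual implementation.
function extractGalleryId(input: string): string {
  const trimmed = input.trim();
  // Plain IDs such as "tomoo" are returned unchanged.
  if (!/^https?:\/\//i.test(trimmed)) {
    return trimmed;
  }
  // For URLs, the gallery ID is carried in the "id" query parameter.
  const id = new URL(trimmed).searchParams.get("id");
  if (!id) {
    throw new Error(`No "id" query parameter found in URL: ${trimmed}`);
  }
  return id;
}

console.log(extractGalleryId("tomoo")); // "tomoo"
console.log(
  extractGalleryId("https://gall.dcinside.com/mgallery/board/view/?id=tomoo&no=123456"),
); // "tomoo"
```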

Output

The Actor outputs structured JSON objects to the dataset. The exact structure depends on the outputFormat setting:

Nested Format (default)

One object per post, with a nested comments array:

{
  "galleryId": "tomoo",
  "postNo": "123456",
  "url": "https://gall.dcinside.com/mgallery/board/view/?id=tomoo&no=123456",
  "postTitle": "Post Title",
  "postCreatedAt": "2024.01.15 14:30:25",
  "postAuthor": "nickname",
  "postAuthorNick": "nickname",
  "postAuthorUid": "user123",
  "postAuthorIp": "",
  "postText": "Clean post text (readable plain text)...",
  "postHtml": "<div>Raw HTML inside .write_div...</div>",
  "comments": [
    {
      "commentId": "789",
      "parentCommentId": "",
      "commentAuthor": "commenter1",
      "commentCreatedAt": "01.15 15:00",
      "commentText": "Comment text",
      "commentDepth": 0
    }
  ],
  "commentsText": ["commenter1 (01.15 15:00): Comment text"],
  "commentsCount": 1
}
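For consumers of the nested format, the comment shape above can be typed as follows. The sketch also shows one way the `commentsText` lines could be derived from `comments`; the `author (time): text` pattern is inferred from the single example above, and `toCommentsText` is a hypothetical name.

```typescript
// Comment fields as they appear in the nested example above.
interface Comment {
  commentId: string;
  parentCommentId: string; // empty string for top-level comments
  commentAuthor: string;
  commentCreatedAt: string;
  commentText: string;
  commentDepth: number; // 0 = top-level
}

// Hypothetical helper: one readable line per comment.
function toCommentsText(comments: Comment[]): string[] {
  return comments.map(
    (c) => `${c.commentAuthor} (${c.commentCreatedAt}): ${c.commentText}`,
  );
}

const sample: Comment[] = [
  {
    commentId: "789",
    parentCommentId: "",
    commentAuthor: "commenter1",
    commentCreatedAt: "01.15 15:00",
    commentText: "Comment text",
    commentDepth: 0,
  },
];

console.log(toCommentsText(sample)); // [ "commenter1 (01.15 15:00): Comment text" ]
```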

Flat Format

One row per comment (great for CSV export):

{
  "galleryId": "tomoo",
  "postNo": "123456",
  "postTitle": "Post Title",
  "commentId": "789",
  "commentAuthor": "commenter1",
  "commentText": "Comment text",
  "commentDepth": 0
}
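If you already have nested items and want flat rows afterwards, a conversion along these lines should work. It is a sketch under assumptions: `flattenPost` is a hypothetical name, only the fields shown in the flat example are copied, and a post without comments produces no rows here, which may not match what the Actor itself does.

```typescript
interface NestedComment {
  commentId: string;
  commentAuthor: string;
  commentText: string;
  commentDepth: number;
}

interface NestedPost {
  galleryId: string;
  postNo: string;
  postTitle: string;
  comments: NestedComment[];
}

interface FlatRow {
  galleryId: string;
  postNo: string;
  postTitle: string;
  commentId: string;
  commentAuthor: string;
  commentText: string;
  commentDepth: number;
}

// Hypothetical helper: repeats the post fields on every comment row.
function flattenPost(post: NestedPost): FlatRow[] {
  return post.comments.map((c) => ({
    galleryId: post.galleryId,
    postNo: post.postNo,
    postTitle: post.postTitle,
    commentId: c.commentId,
    commentAuthor: c.commentAuthor,
    commentText: c.commentText,
    commentDepth: c.commentDepth,
  }));
}

const rows = flattenPost({
  galleryId: "tomoo",
  postNo: "123456",
  postTitle: "Post Title",
  comments: [
    { commentId: "789", commentAuthor: "commenter1", commentText: "Comment text", commentDepth: 0 },
  ],
});
console.log(rows.length); // 1
```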

Minimal Format

Posts only, no comments:

{
  "galleryId": "tomoo",
  "postNo": "123456",
  "postTitle": "Post Title",
  "postText": "Post body text..."
}

Text-Only Format

Condensed text format:

{
  "galleryId": "tomoo",
  "postNo": "123456",
  "postTitle": "Post Title",
  "postText": "Post body text...",
  "allCommentsText": "commenter1: Comment text\n↳ commenter2: Reply text",
  "commentsCount": 2
}
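The `allCommentsText` string above could be produced from the comment list roughly as sketched here. Marking replies with `↳` whenever `commentDepth` is greater than 0 is an assumption inferred from the example, and `toAllCommentsText` is a hypothetical name.

```typescript
interface TextComment {
  commentAuthor: string;
  commentText: string;
  commentDepth: number; // 0 = top-level, greater than 0 = reply
}

// Hypothetical helper: joins comments into one newline-separated string.
function toAllCommentsText(comments: TextComment[]): string {
  return comments
    .map((c) => {
      // Assumption: replies are marked with a leading arrow.
      const prefix = c.commentDepth > 0 ? "↳ " : "";
      return `${prefix}${c.commentAuthor}: ${c.commentText}`;
    })
    .join("\n");
}

const text = toAllCommentsText([
  { commentAuthor: "commenter1", commentText: "Comment text", commentDepth: 0 },
  { commentAuthor: "commenter2", commentText: "Reply text", commentDepth: 1 },
]);
console.log(text);
```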

How It Works

1. Gallery ID Extraction: The Actor accepts gallery IDs or full URLs and extracts the ID automatically
2. List Page Discovery: Fetches list pages from https://gall.dcinside.com/mgallery/board/lists/ and extracts post URLs
3. Date Filtering: If date filters are set, posts outside the range are skipped
4. Deduplication: If skipExisting is enabled, already-scraped posts are skipped
5. Post Extraction: For each post, it extracts metadata plus post content from the desktop mgallery view page (.write_div)
6. Comment Fetching: Comments are fetched via the mobile AJAX endpoint (https://m.dcinside.com/ajax/response-comment) with pagination support
7. Data Output: Each post (and optionally comments) is pushed to the dataset in the requested format
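To make the list page discovery step concrete, here is a sketch that builds the list page URLs for a page range. The `id` parameter comes from the URLs shown in this README; the `page` query parameter name is an assumption that the README does not confirm, and `buildListPageUrls` is a hypothetical name.

```typescript
const LIST_BASE_URL = "https://gall.dcinside.com/mgallery/board/lists/";

// Hypothetical helper: one list URL per page, startPage..endPage inclusive.
function buildListPageUrls(
  galleryId: string,
  startPage: number,
  endPage: number,
): string[] {
  const urls: string[] = [];
  for (let page = startPage; page <= endPage; page++) {
    const url = new URL(LIST_BASE_URL);
    url.searchParams.set("id", galleryId);
    // Assumption: pagination uses a "page" query parameter.
    url.searchParams.set("page", String(page));
    urls.push(url.toString());
  }
  return urls;
}

console.log(buildListPageUrls("tomoo", 1, 3));
```

In a crawler these URLs would be enqueued as start requests, with post links from each list page enqueued in turn.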

Project Structure

.actor/
├── actor.json              # Actor configuration
├── input_schema.json       # Input parameter definitions
├── output_schema.json      # Output schema
└── dataset_schema.json     # Dataset view configuration
src/
└── main.ts                 # Main Actor code
storage/                    # Local storage (development only)
├── datasets/              # Output items
├── key_value_stores/      # INPUT.json and other files
└── request_queues/        # Crawl request queue

Features

✅ Full URL Support: Accept gallery URLs or IDs
✅ Date-Based Filtering: Scrape posts from specific date ranges
✅ Smart Deduplication: Skip existing posts for resume/incremental runs
✅ Flexible Output Formats: Nested, flat, minimal, or text-only
✅ Comment Control: Enable/disable comments, set max per post
✅ Fast HTTP-based scraping (no browser overhead)
✅ Automatic last page detection
✅ Comment pagination support
✅ Structured comment hierarchy (top-level + replies)
✅ Configurable page ranges and post limits
✅ Proxy support via Apify Proxy
✅ Graceful abort handling

Limitations

Image downloading is not included (text and comments only)
Built-in CSV export has been removed; export the dataset as CSV from Apify instead
Date filtering requires the post date to be parsable from the page

Tips

Resuming a Failed Run

If a run fails partway through, enable skipExisting to avoid re-scraping posts:

{
  "galleryId": "tomoo",
  "skipExisting": true
}
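Conceptually, skipping existing posts is a set lookup on the post number. The sketch below illustrates the idea only: `filterNewPosts` is a hypothetical name, and how the Actor actually loads the already-scraped post numbers from the dataset is not described in this README.

```typescript
// Hypothetical helper: keeps only post numbers that are not already in the dataset.
function filterNewPosts(
  candidatePostNos: string[],
  existingPostNos: Iterable<string>,
): string[] {
  const seen = new Set(existingPostNos);
  return candidatePostNos.filter((postNo) => !seen.has(postNo));
}

// Post numbers already stored by an earlier, partially completed run.
const alreadyScraped = ["123454", "123455"];
console.log(filterNewPosts(["123454", "123455", "123456"], alreadyScraped)); // [ "123456" ]
```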

Scraping Only Recent Posts

Use date filtering instead of page numbers:

{
  "galleryId": "tomoo",
  "startDate": "2024-01-01",
  "endDate": "2024-01-31"
}

Fast Scraping (Posts Only)

Disable comments for much faster scraping:

{
  "galleryId": "tomoo",
  "includeComments": false,
  "outputFormat": "minimal"
}

Handling Posts with Many Comments

Some posts have thousands of comments. Cap how many are fetched per post:

{
  "galleryId": "tomoo",
  "maxCommentsPerPost": 100
}
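Both `maxCommentsPerPost` and `maxPosts` treat 0 as "unlimited". A sketch of that convention, using a hypothetical helper named `applyLimit`:

```typescript
// Hypothetical helper: 0 (or a negative value) means "no limit".
function applyLimit<T>(items: T[], max: number): T[] {
  return max > 0 ? items.slice(0, max) : items;
}

const comments = ["first", "second", "third"];
console.log(applyLimit(comments, 2)); // [ "first", "second" ]
console.log(applyLimit(comments, 0)); // [ "first", "second", "third" ]
```

This sketch trims a list after it has been collected; a real crawler would more likely stop requesting further comment pages once the cap is reached.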

CSV Export

Use outputFormat: "flat" for easier CSV export (one row per comment).

Changelog

v1.3.0

New: Extract post content more reliably (clean postText) and include postHtml (raw HTML of the post body)
Improved: Friendlier input descriptions/tooltips for non-technical users
Improved: Dataset overview view now includes postText

v1.1.0

New: Accept full gallery URLs (not just IDs)
New: Date-based filtering (startDate, endDate)
New: Skip existing posts (skipExisting)
New: Output format options (nested, flat, minimal, text-only)
New: Comment control (includeComments, maxCommentsPerPost)
Improved input validation and error messages

v1.0.0

Initial release

License

ISC
