Skool Community Posts and Classroom Courses Scraper

Pay $8.00 for 1,000 results

memo23/skool-posts-with-comments-scraper

Scrape Skool.com community discussions and classroom courses. The actor extracts comprehensive post data, classroom course content, and nested comments, along with user interactions and engagement data, to support community analysis and content strategy.

Overview

This actor allows you to scrape posts, classroom courses, and their associated comments from Skool.com. It extracts detailed information about each post and course, including content, metadata, user information, and nested comments, providing a complete picture of discussions and learning materials within Skool communities.

Features

  • Detailed Post Information: Extract comprehensive data about each post, including title, content, and metadata.
  • Classroom Courses: Scrape content from classroom courses, including lessons and related details.
  • User Details: Retrieve information about post authors, commenters, and course creators.
  • Nested Comments: Capture full comment threads, including replies to comments.
  • Customizable Depth: Option to include or exclude comments on each post or course.
  • Flexible Input: Support for specific Skool.com community URLs.
  • Engagement Metrics: Capture upvotes, comment counts, and other relevant statistics.
  • Proxy Support: Built-in proxy configuration to enhance scraping reliability and avoid blocks.

How to Use

  1. Set Up: Ensure you have an Apify account and access to the Apify platform.
  2. Install a cookie management extension in your browser that can export cookies.
  3. Log in to your Skool account.
  4. While on the Skool tab, click the extension to export the cookies.
  5. Paste the cookies into this actor's Cookie input field (clear the field's existing contents before pasting).
  6. Specify the Skool group URL you want to scrape. Input URLs in the format https://www.skool.com/{group-name} (see Input Configuration section). Add multiple URLs for broader scraping scope.
  7. Select Tab: Choose whether to scrape the Community or Classroom tab by setting the tab option.
  8. Include Comments: Choose whether to include comments for each post or course by setting the includeComments option.
  9. (Optional) Customize Settings: Adjust scraper settings, including the maximum number of items to scrape, maximum concurrency, minimum concurrency, max request retries, and any specific data fields to collect.
  10. (Optional) Bypass Site Protection: If requests are being blocked, use residential proxies from the country you are scraping from. This helps mimic regular user behavior and reduces the risk of detection and blocking.
  11. Run the Scraper: Launch the scraper on the Apify platform. Monitor its progress and adjust settings as needed.
  12. Data Collection: Extracted data is available in your preferred format, with support for JSON, HTML, CSV, Excel, and other formats provided by Apify.
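
The URL, tab, and comment steps above can be sketched as a small helper that assembles the input object before a run. This is a minimal sketch, not part of the actor: `build_run_input` and the `GROUP_URL` pattern are hypothetical names, and the pattern assumes group names contain only letters, digits, and hyphens, which the page does not confirm.

```python
import re
from typing import Any, Dict, List, Optional

# Assumed pattern for the documented format https://www.skool.com/{group-name}.
# The allowed characters in a group name are a guess, not taken from Skool.
GROUP_URL = re.compile(r"^https://www\.skool\.com/[A-Za-z0-9][A-Za-z0-9-]*/?$")


def build_run_input(
    group_urls: List[str],
    tab: str = "community",
    include_comments: bool = False,
    cookies: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Assemble an input object shaped like the actor's input example."""
    if tab not in ("community", "classroom"):
        raise ValueError(f"tab must be 'community' or 'classroom', got {tab!r}")
    bad = [url for url in group_urls if not GROUP_URL.match(url)]
    if bad:
        raise ValueError(f"not a Skool group URL: {bad}")

    run_input: Dict[str, Any] = {
        # One {"url": ...} entry per group, trailing slash removed.
        "startUrls": [{"url": url.rstrip("/")} for url in group_urls],
        "tab": tab,
        "includeComments": include_comments,
    }
    if cookies is not None:
        # The cookie array exported from the browser extension, passed through as-is.
        run_input["cookies"] = cookies
    return run_input
```

Calling `build_run_input(["https://www.skool.com/ai-automation-mastery"], tab="classroom", include_comments=True)` yields a dictionary that can be pasted into the actor's JSON input editor after serializing it with `json.dumps`.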

Input Configuration

Here's an example of how to set up the input for the Skool Scraper:

{
    "startUrls": [
        {
            "url": "https://www.skool.com/ai-automation-mastery"
        }
    ],
    "tab": "classroom",
    "includeComments": true,
    "maxItems": 1000,
    "maxConcurrency": 100,
    "minConcurrency": 1,
    "maxRequestRetries": 30
}

Input Fields Explanation

  • startUrls: Array of Skool.com community URLs to scrape posts, courses, and comments from.
  • tab: Specifies which tab to scrape (community or classroom).
  • includeComments: Boolean to determine whether to scrape comments for each post or course (default: false).
  • maxItems: Maximum number of items to scrape (default: 1000).
  • maxConcurrency: Maximum number of pages processed simultaneously (default: 100).
  • minConcurrency: Minimum number of pages processed simultaneously (default: 1).
  • maxRequestRetries: Number of retries for failed requests (default: 30).
  • cookies: JSON array of cookies for authentication with Skool.com.
  • proxyConfig: Proxy configuration to specify proxy servers for scraping.
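
To illustrate how the documented defaults combine with a partial input, here is a minimal local sketch. The default values are the ones listed above; `resolve_input` and its two sanity checks are hypothetical and are not the actor's own validation logic.

```python
from typing import Any, Dict

# Defaults as listed in the field explanations above.
DEFAULTS: Dict[str, Any] = {
    "includeComments": False,
    "maxItems": 1000,
    "maxConcurrency": 100,
    "minConcurrency": 1,
    "maxRequestRetries": 30,
}


def resolve_input(user_input: Dict[str, Any]) -> Dict[str, Any]:
    """Return user_input with the documented defaults filled in."""
    # Values supplied by the user override the defaults.
    resolved = {**DEFAULTS, **user_input}
    if not resolved.get("startUrls"):
        raise ValueError("startUrls must contain at least one URL")
    if resolved["minConcurrency"] > resolved["maxConcurrency"]:
        raise ValueError("minConcurrency cannot exceed maxConcurrency")
    return resolved
```

For example, an input that sets only `startUrls`, `tab`, and `maxItems` resolves with `includeComments` set to `false` and the remaining limits at their listed defaults.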

Output Structure

Community Tab Output

The output data includes detailed information about each post and its comments. Here's a sample of the structure:

{
    "id": "aab147fa0ea4420d83e8d3a9214f5203",
    "name": "roadmap-update",
    "metadata": {
        "action": 0,
        "content": "Post content here...",
        "comments": 37,
        "upvotes": 50,
        "title": "Roadmap Update",
        "pinned": 1,
        "imagePreview": "",
        "imagePreviewSmall": "",
        "videoLinksData": "[]",
        "contributors": "[{...}]",
        "labels": "3916973e45d64416917aaba09edff141",
        "hasNewComments": 1,
        "lastComment": 1731431342139887000
    },
    "createdAt": "2024-11-07T23:26:18.04203Z",
    "updatedAt": "2024-11-14T09:50:05.802436Z",
    "groupId": "b575158c8d8240b88e9f13da74aa66cc",
    "userId": "5222fdb103d340ecaf61d47f35302f52",
    "postType": "generic",
    "rootId": "aab147fa0ea4420d83e8d3a9214f5203",
    "labelId": "3916973e45d64416917aaba09edff141",
    "user": {
        "id": "5222fdb103d340ecaf61d47f35302f52",
        "name": "username",
        "metadata": {
            "actStatus": "hardcore",
            "bio": "User bio...",
            "pictureBubble": "URL to bubble picture",
            "pictureProfile": "URL to profile picture",
            "location": "User location",
            "linkWebsite": "User website URL",
            "linkYoutube": "User YouTube URL"
        },
        "createdAt": "2020-05-14T00:51:12.09168Z",
        "updatedAt": "2024-11-14T15:45:27.923959Z",
        "firstName": "First",
        "lastName": "Last"
    },
    "url": "https://www.skool.com/group-name/post-name",
    "comments": [
        {
            "post": {
                "id": "comment-id",
                "metadata": {
                    "action": 0,
                    "content": "Comment content...",
                    "upvotes": 4,
                    "attachments": "attachment-id",
                    "attachments_data": "[{...}]"
                },
                "created_at": "2024-11-07T23:28:36.995Z",
                "updated_at": "2024-11-08T00:40:38.94154Z",
                "user_id": "user-id",
                "post_type": "comment",
                "parent_id": "parent-post-id",
                "root_id": "root-post-id",
                "user": {
                    "id": "user-id",
                    "name": "username",
                    "metadata": {
                        "bio": "User bio",
                        "picture_bubble": "URL to bubble picture",
                        "picture_profile": "URL to profile picture"
                    },
                    "created_at": "Creation timestamp",
                    "updated_at": "Update timestamp",
                    "first_name": "First",
                    "last_name": "Last"
                }
            }
        }
    ]
}

Output Fields Explanation

Post Object:

  • id: Unique identifier for the post
  • name: URL-friendly name of the post
  • metadata: Contains various post metadata including:
    • content: The main content of the post
    • comments: Number of comments
    • upvotes: Number of upvotes
    • title: Post title
    • pinned: Whether the post is pinned (1 for yes, 0 for no)
    • imagePreview: URL to preview image if any
    • videoLinksData: Video links, if any, as a JSON-encoded array string
    • contributors: Users who contributed to the post, as a JSON-encoded array string
    • labels: Label identifiers for the post
  • createdAt: Timestamp of post creation
  • updatedAt: Timestamp of last update
  • postType: Type of post (e.g., "generic")
  • user: Detailed information about the post author
  • url: Full URL to the post
  • comments: Array of comment objects
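
The post fields above can be reduced to a flat summary record, for instance before exporting to CSV. The sketch below is an illustration under two assumptions that the page does not state: that `videoLinksData` arrives as a JSON-encoded string (as it appears in the sample), and that `lastComment` is a Unix timestamp in nanoseconds (the sample value is consistent with that reading). `summarize_post` and its helpers are hypothetical names.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def decode_json_string(value: Any) -> List[Any]:
    """Decode a JSON-encoded array string; return [] if empty or unparsable."""
    if not isinstance(value, str) or not value:
        return []
    try:
        decoded = json.loads(value)
    except json.JSONDecodeError:
        return []
    return decoded if isinstance(decoded, list) else []


def epoch_ns_to_datetime(ns: int) -> datetime:
    """Convert nanoseconds since the Unix epoch to a UTC datetime (assumed unit)."""
    # Integer arithmetic avoids float rounding on such large values.
    seconds, remainder_ns = divmod(ns, 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=remainder_ns // 1000)


def summarize_post(post: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the commonly used fields out of one scraped post object."""
    meta = post.get("metadata", {})
    last_comment = meta.get("lastComment")
    return {
        "id": post.get("id"),
        "title": meta.get("title"),
        "url": post.get("url"),
        "upvotes": meta.get("upvotes", 0),
        "commentCount": meta.get("comments", 0),
        "pinned": meta.get("pinned") == 1,  # 1 for yes, 0 for no
        "videoLinks": decode_json_string(meta.get("videoLinksData")),
        "lastCommentAt": epoch_ns_to_datetime(last_comment) if last_comment else None,
    }
```

Placeholder strings such as `"[{...}]"` in the sample are not valid JSON, so `decode_json_string` deliberately falls back to an empty list rather than raising.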

Comment Object:

  • post: Contains the comment data including:
    • id: Unique identifier for the comment
    • metadata: Comment metadata including content and upvotes
    • created_at: Timestamp of comment creation
    • updated_at: Timestamp of last update
    • user: Detailed information about the comment author
    • post_type: Type (always "comment")
    • parent_id: ID of the parent post
    • root_id: ID of the root post
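
The `parent_id` and `root_id` fields make it possible to rebuild reply threads. The following is a minimal sketch that assumes replies appear in the same flat `comments` array as top-level comments and point at their parent through `parent_id`; the sample shows only one comment, so the actual nesting layout may differ. `flatten_thread` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def flatten_thread(
    root_id: str, comments: List[Dict[str, Any]]
) -> List[Tuple[int, Dict[str, Any]]]:
    """Return (depth, comment) pairs in thread order, starting at the root post."""
    children: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for entry in comments:
        comment = entry["post"]  # each comment is wrapped in a "post" key
        children[comment.get("parent_id")].append(comment)

    ordered: List[Tuple[int, Dict[str, Any]]] = []
    seen = set()  # guards against malformed data with cyclic parent links

    def walk(parent_id: Any, depth: int) -> None:
        for comment in children.get(parent_id, []):
            if comment["id"] in seen:
                continue
            seen.add(comment["id"])
            ordered.append((depth, comment))
            walk(comment["id"], depth + 1)

    walk(root_id, 0)
    return ordered
```

Comments whose parent is missing from the array are not reachable from the root and are therefore left out of the result; depth 0 means a direct reply to the post.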

Classroom Tab Output

The output data includes information about each module within a classroom course. Here's a sample structure:

{
    "type": "module",
    "title": "Module Title",
    "postTitle": "Specific Lecture Title",
    "content": "Module content and description...",
    "id": "unique-module-id",
    "urlAjax": "https://api.skool.com/posts/module-id/comments",
    "url": "https://www.skool.com/group-name/classroom/module-name",
    "media": [
        "https://www.loom.com/share/video-id"
    ],
    "courseMetaDetails": {
        "id": "course-id",
        "name": "course-name",
        "title": "Course Title",
        "createdAt": "2024-09-20T16:51:46.06926Z",
        "updatedAt": "2024-11-12T23:47:47.319915Z"
    },
    "comments": [ ... ] // Same structure as community comments
}

Classroom Output Fields Explanation

Module Object:

  • type: Type of content (e.g., "module")
  • title: Title of the module or section
  • postTitle: Specific title of the lecture or content
  • content: Module description or content
  • id: Unique identifier for the module
  • urlAjax: API endpoint for loading module comments
  • url: Full URL to access the module
  • media: Array of media URLs (typically video lectures)
  • courseMetaDetails: Course metadata including:
    • id: Course identifier
    • name: Course name/code
    • title: Full course title
    • createdAt: Course creation timestamp
    • updatedAt: Last update timestamp
  • comments: Array of comments following the same structure as community posts

Related actor: Skool Members Scraper gathers extensive user details such as first and last names, email, and links to social profiles (e.g., Facebook, Instagram, LinkedIn).
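
Because each classroom record carries its own `courseMetaDetails`, module records can be grouped back into courses after a run. This is a minimal sketch over the module fields documented above; `group_by_course` and the shape of its result are hypothetical choices made for illustration.

```python
from typing import Any, Dict, List


def group_by_course(modules: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Group classroom records by course id, collecting lesson titles and media URLs."""
    courses: Dict[str, Dict[str, Any]] = {}
    for module in modules:
        meta = module.get("courseMetaDetails")
        if not meta or "id" not in meta:
            continue  # skip records that carry no course metadata
        course = courses.setdefault(
            meta["id"],
            {"title": meta.get("title"), "lessons": [], "media": []},
        )
        # Prefer the specific lecture title, falling back to the module title.
        course["lessons"].append(module.get("postTitle") or module.get("title"))
        course["media"].extend(module.get("media", []))
    return courses
```

The result maps each course id to its title, the lesson titles in the order they were scraped, and every media URL found in that course.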

Developer
Maintained by Community
Actor metrics
  • 11 monthly users
  • 1 star
  • 80.0% runs succeeded
  • Created in Sep 2024
  • Modified 2 days ago