Skool classroom Content scraper

Pricing

$15.00/month + usage

A web scraper that extracts content from Skool Classroom courses. It automates the collection of course materials, modules, videos, and text content from multiple classrooms in a single run.

Rating

3.2 (2)

Developer

Shaheer khan (maintained by Community)

Actor stats

  • Bookmarked: 2
  • Total users: 20
  • Monthly active users: 9
  • Issues response: 2 hours
  • Last modified: 3 days ago

Skool.com Classroom Scraper

This Apify actor scrapes content from Skool.com classrooms, extracting course modules, text content, video links, and course structure.

Features

  • Enhanced Scraping: Extracts complete course structure from page data
  • Regular Scraping: Falls back to DOM-based scraping if enhanced method fails
  • Smart Section Expansion: Automatically expands collapsed course sections
  • Content Extraction: Scrapes text content, video links, images, and metadata
  • Duplicate Handling: Removes duplicate modules automatically
  • Error Recovery: Continues scraping even if individual modules fail
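
The duplicate handling listed above can be pictured with a small sketch. It assumes that modules are considered duplicates when they share the same moduleId (a field that appears in the output format); the actor's real matching rule is not documented here, and the function name is hypothetical.

```python
from typing import Any, Dict, List


def dedupe_modules(modules: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first module seen for each moduleId, preserving order.

    Assumption: moduleId is the duplicate key. Modules without an id
    are kept as-is because they cannot be compared.
    """
    seen = set()
    unique = []
    for module in modules:
        module_id = module.get("moduleId")
        if module_id is not None:
            if module_id in seen:
                continue  # already collected this module
            seen.add(module_id)
        unique.append(module)
    return unique
```

Keeping the first occurrence preserves the order in which modules appear in the course.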

How to Scrape Skool Classroom Content in Bulk

If you want to know more about how Skool Classroom Scraper works, the sections below explain the input parameters, output format, and scraping methods step by step.

Input Parameters

Required

  • Email: Your Skool.com account email
  • Password: Your Skool.com account password
  • Classroom URLs: Array of classroom URLs to scrape

Optional

  • Use Enhanced Scraping: Enable enhanced scraping method (default: true)
  • Max Concurrency: Maximum concurrent requests (1-10, default: 1)
  • Delay Between Requests: Delay in milliseconds between requests (default: 2000ms)
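
As a rough illustration of the parameters above, the sketch below assembles and validates an input object before a run. The JSON key names (email, password, classroomUrls, useEnhancedScraping, maxConcurrency, delayBetweenRequests) are assumptions made for this example; check the actor's input schema for the real field names.

```python
from typing import Any, Dict, List


def build_run_input(
    email: str,
    password: str,
    classroom_urls: List[str],
    use_enhanced_scraping: bool = True,
    max_concurrency: int = 1,
    delay_ms: int = 2000,
) -> Dict[str, Any]:
    """Validate the documented constraints and return an input dict.

    The key names in the returned dict are hypothetical.
    """
    if not email or not password:
        raise ValueError("email and password are required")
    if not classroom_urls:
        raise ValueError("at least one classroom URL is required")
    if not 1 <= max_concurrency <= 10:
        raise ValueError("max_concurrency must be between 1 and 10")
    if delay_ms < 0:
        raise ValueError("delay_ms must not be negative")
    return {
        "email": email,
        "password": password,
        "classroomUrls": list(classroom_urls),
        "useEnhancedScraping": use_enhanced_scraping,
        "maxConcurrency": max_concurrency,
        "delayBetweenRequests": delay_ms,
    }
```

Validating locally catches an out-of-range concurrency value or an empty URL list before a paid run starts.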

Output

The actor outputs scraped data in the following format:

Output Format

{
  "type": "direct_id_scraping",
  "totalClassrooms": 2,
  "totalSections": 10,
  "totalModules": 50,
  "classrooms": [
    {
      "url": "https://www.skool.com/classroom/1",
      "sections": 5,
      "modules": 25
    }
  ],
  "data": [
    {
      "classroomUrl": "https://www.skool.com/classroom/1",
      "courseTitle": "Introduction to Marketing",
      "moduleTitle": "Getting Started",
      "moduleId": "abc123",
      "videoLink": "https://vimeo.com/123456789",
      "secondaryvideoLink": "https://player.vimeo.com/video/123456789",
      "content": "Module text content...",
      "scrapedAt": "2024-01-01T12:00:00.000Z"
    }
  ],
  "rawStructures": [
    {
      "url": "https://www.skool.com/classroom/1",
      "structure": { /* Complete course structure */ }
    }
  ]
}
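
Once a run finishes, the result can be post-processed with a few lines of code. The sketch below uses only field names shown in the output format (data, classroomUrl, videoLink) and assumes the result has already been loaded into a Python dict; the helper names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_modules_by_classroom(result: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Group the entries of result["data"] by their classroomUrl."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for item in result.get("data", []):
        grouped[item["classroomUrl"]].append(item)
    return dict(grouped)


def collect_video_links(result: Dict[str, Any]) -> List[str]:
    """Return the non-empty videoLink values in their original order."""
    return [item["videoLink"] for item in result.get("data", []) if item.get("videoLink")]
```

For example, `group_modules_by_classroom(result)["https://www.skool.com/classroom/1"]` would hold every module record scraped from that classroom.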

How It Works

  1. Authentication: Logs into Skool.com using provided credentials
  2. Multi-Classroom Processing: Processes each classroom URL in sequence
  3. Content Extraction: Uses direct ID-based scraping for reliable content access
  4. Video Link Processing: Extracts and transforms video links (Vimeo, YouTube, etc.)
  5. Data Organization: Structures content by classroom, section, and module
  6. Excel Generation: Creates a multi-tab Excel file with classroom content
  7. Progress Tracking: Maintains state for migration and error recovery
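
Step 4 can be illustrated with the Vimeo pair shown in the output example, where videoLink is a vimeo.com page URL and secondaryvideoLink is the matching player.vimeo.com URL. The sketch below derives one from the other. It is an illustration inferred from that sample, not the actor's actual transformation code, and it only handles plain numeric Vimeo URLs.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def vimeo_player_url(link: str) -> Optional[str]:
    """Map https://vimeo.com/<id> to https://player.vimeo.com/video/<id>.

    Returns None for anything that is not a plain numeric Vimeo page URL.
    """
    parsed = urlparse(link)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host != "vimeo.com":
        return None
    match = re.fullmatch(r"/(\d+)/?", parsed.path)
    if match is None:
        return None
    return f"https://player.vimeo.com/video/{match.group(1)}"
```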

Scraping Methods

Enhanced Scraping

  • Extracts course structure from the page's __NEXT_DATA__ payload
  • Matches scraped content with structured course data
  • Provides better organization and metadata
  • More reliable for complex course structures
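
Next.js pages embed their initial data as JSON inside a script tag with the id __NEXT_DATA__. The sketch below shows one way to read that payload from raw HTML using only the Python standard library. The actor itself runs on Node.js with Puppeteer, so this is an illustration of the idea rather than its implementation, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser
from typing import Any, List, Optional


class NextDataParser(HTMLParser):
    """Collect the text inside <script id="__NEXT_DATA__">."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    @property
    def payload(self) -> str:
        return "".join(self._chunks)


def extract_next_data(html: str) -> Optional[Any]:
    """Return the parsed __NEXT_DATA__ JSON, or None if the tag is absent."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    raw = parser.payload.strip()
    return json.loads(raw) if raw else None
```

The shape of the parsed object depends on the site, so any keys read from it should be treated as subject to change.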

Regular Scraping (Fallback)

  • DOM-based scraping approach
  • Extracts content directly from page elements
  • Used when enhanced method fails
  • Still provides comprehensive content extraction

Best Practices

  1. Rate Limiting: Use appropriate delays to avoid being blocked
  2. Single Concurrency: Keep max concurrency at 1 for stability
  3. Valid Credentials: Ensure your Skool.com account has access to the classroom
  4. Full URLs: Use complete classroom URLs including all parameters
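
The first two practices amount to processing URLs one at a time with a pause between them. A minimal sketch, assuming a caller-supplied handler function (hypothetical) and the default delay of 2000 ms from the input parameters:

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def process_sequentially(
    urls: Iterable[str],
    handler: Callable[[str], T],
    delay_ms: int = 2000,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Call handler on each URL in order, pausing between calls.

    The sleep function is injectable so the pacing can be tested
    without actually waiting.
    """
    results: List[T] = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_ms / 1000)  # pause before every request after the first
        results.append(handler(url))
    return results
```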

Troubleshooting

Common Issues

  • Login Failed: Check email/password and account status
  • No Content Found: Verify classroom URL and access permissions
  • Timeout Errors: Increase timeout settings or reduce concurrency

Error Handling

The actor includes comprehensive error handling:

  • Continues scraping if individual modules fail
  • Falls back to regular scraping if enhanced method fails
  • Logs detailed error information for debugging
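
The behaviour described above follows a common pattern: try the preferred method, fall back to the alternative, and record a failure without stopping the run. The sketch below shows that pattern under the assumption that each scraping method is a function taking a module ID; the function names and logger name are hypothetical and do not reflect the actor's code.

```python
import logging
from typing import Any, Callable, Iterable, List, Tuple

logger = logging.getLogger("skool_scraper_sketch")


def scrape_module(module_id: str,
                  enhanced: Callable[[str], Any],
                  regular: Callable[[str], Any]) -> Any:
    """Try the enhanced method first; fall back to the regular one."""
    try:
        return enhanced(module_id)
    except Exception as exc:
        logger.warning("enhanced scraping failed for %s: %s", module_id, exc)
        return regular(module_id)


def scrape_all(module_ids: Iterable[str],
               enhanced: Callable[[str], Any],
               regular: Callable[[str], Any]) -> Tuple[List[Any], List[Tuple[str, str]]]:
    """Scrape every module, collecting failures instead of aborting."""
    results: List[Any] = []
    failures: List[Tuple[str, str]] = []
    for module_id in module_ids:
        try:
            results.append(scrape_module(module_id, enhanced, regular))
        except Exception as exc:
            logger.error("module %s failed: %s", module_id, exc)
            failures.append((module_id, str(exc)))
    return results, failures
```

Returning the failures alongside the results makes it possible to retry only the modules that did not succeed.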

Technical Details

  • Runtime: Node.js with Puppeteer
  • Memory: 4GB recommended for large classrooms
  • Timeout: 1 hour default (adjust based on classroom size)
  • Dependencies: Apify SDK, Puppeteer, XLSX

Privacy & Security

  • Credentials are handled securely using Apify's secret input fields
  • No credentials or sensitive data are logged or stored
  • All scraping respects Skool.com's structure and rate limits

Support

For issues or questions:

  1. Check the actor logs for detailed error information
  2. Verify input parameters are correct
  3. Ensure your Skool.com account has proper access
  4. Contact support with specific error messages

Changelog

v1.0.0

  • Initial release
  • Enhanced and regular scraping methods
  • Smart section expansion
  • Comprehensive content extraction
  • Error recovery and fallback mechanisms