Skool classroom Content scraper
Pricing
$15.00/month + usage
Go to Apify Store

Skool classroom Content scraper
Skool Classroom Content Scraper A powerful web scraper designed to extract comprehensive content from Skool Classroom courses. This scraper enables automated collection of course materials, modules, videos, and content from multiple classrooms in a single run.
Pricing
$15.00/month + usage
Rating
1.0
(1)
Developer

Shaheer khan
Maintained by Community
Actor stats
0
Bookmarked
3
Total users
3
Monthly active users
23 days ago
Last modified
Categories
Share
Skool.com Classroom Scraper
This Apify actor scrapes content from Skool.com classrooms, extracting course modules, text content, video links, and course structure.
Features
- Enhanced Scraping: Extracts complete course structure from page data
- Regular Scraping: Falls back to DOM-based scraping if enhanced method fails
- Smart Section Expansion: Automatically expands collapsed course sections
- Content Extraction: Scrapes text content, video links, images, and metadata
- Duplicate Handling: Removes duplicate modules automatically
- Error Recovery: Continues scraping even if individual modules fail
Input Parameters
Required
- Email: Your Skool.com account email
- Password: Your Skool.com account password
- Classroom URL: Full URL of the classroom to scrape
Optional
- Use Enhanced Scraping: Enable enhanced scraping method (default: true)
- Max Concurrency: Maximum concurrent requests (1-10, default: 1)
- Delay Between Requests: Delay in milliseconds between requests (default: 2000ms)
Output
The actor outputs scraped data in the following format:
Enhanced Scraping Output
{"type": "enhanced","totalSections": 5,"totalModules": 25,"data": [{"courseTitle": "Introduction to Marketing","moduleTitle": "Getting Started","videoLink": "https://example.com/video","content": "Module text content...","scrapedAt": "2024-01-01T12:00:00.000Z"}],"rawStructure": { /* Complete course structure */ }}
Regular Scraping Output
{"type": "regular","totalModules": 25,"data": [{"tabTitle": "Module 1: Getting Started","videoTitle": "Introduction Video","videoUrl": "https://example.com/video","videoDuration": "15:30","textContent": "Module content...","paragraphs": ["Paragraph 1", "Paragraph 2"],"images": [{"src": "image.jpg", "alt": "Description"}],"links": [{"href": "link.com", "text": "Link Text"}],"scrapedAt": "2024-01-01T12:00:00.000Z"}]}
How It Works
- Authentication: Logs into Skool.com using provided credentials
- Navigation: Navigates to the specified classroom URL
- Section Analysis: Analyzes dropdown arrows to identify collapsed sections
- Smart Expansion: Expands only collapsed sections (avoids unnecessary clicks)
- Module Discovery: Finds all module links after expansion
- Content Scraping: Visits each module and extracts content
- Data Processing: Structures and deduplicates the scraped data
Scraping Methods
Enhanced Scraping (Recommended)
- Extracts course structure from page's
__NEXT_DATA__ - Matches scraped content with structured course data
- Provides better organization and metadata
- More reliable for complex course structures
Regular Scraping (Fallback)
- DOM-based scraping approach
- Extracts content directly from page elements
- Used when enhanced method fails
- Still provides comprehensive content extraction
Best Practices
- Rate Limiting: Use appropriate delays to avoid being blocked
- Single Concurrency: Keep max concurrency at 1 for stability
- Valid Credentials: Ensure your Skool.com account has access to the classroom
- Full URLs: Use complete classroom URLs including all parameters
Troubleshooting
Common Issues
- Login Failed: Check email/password and account status
- No Content Found: Verify classroom URL and access permissions
- Timeout Errors: Increase timeout settings or reduce concurrency
Error Handling
The actor includes comprehensive error handling:
- Continues scraping if individual modules fail
- Falls back to regular scraping if enhanced method fails
- Logs detailed error information for debugging
Technical Details
- Runtime: Node.js with Puppeteer
- Memory: 4GB recommended for large classrooms
- Timeout: 1 hour default (adjust based on classroom size)
- Dependencies: Apify SDK, Puppeteer, XLSX
Privacy & Security
- Credentials are handled securely using Apify's secret input fields
- No credentials or sensitive data are logged or stored
- All scraping respects Skool.com's structure and rate limits
Support
For issues or questions:
- Check the actor logs for detailed error information
- Verify input parameters are correct
- Ensure your Skool.com account has proper access
- Contact support with specific error messages
Changelog
v1.0.0
- Initial release
- Enhanced and regular scraping methods
- Smart section expansion
- Comprehensive content extraction
- Error recovery and fallback mechanisms