Coursera Scraper
Pricing
from $1.00 / 1,000 results
Unlock the power of e-learning data! Easily scrape course details, reviews, syllabus, and instructor info from Coursera. Perfect for market research, edtech analysis, and tracking online education trends. Get accurate, structured data to fuel your next big project!
Rating
5.0
(2)
Developer
Shahid Irfan
Maintained by Community
Actor stats
Bookmarked: 1
Total users: 20
Monthly active users: 3
Last modified: 10 days ago
Coursera Course Scraper
Extract rich Coursera course and program data with search-based collection. Gather titles, ratings, difficulty, provider details, language coverage, badges, and product classification in a clean dataset for education research, catalog monitoring, and competitive analysis.
Features
- Richer Course Records: Collect ratings, review counts, skills, providers, language availability, badges, and catalog classification
- Search-Driven Collection: Use a keyword or a Coursera search URL to target the exact topic you need
- Clean Output: Duplicate records, null-only values, and incomplete items are filtered before they reach the dataset
- Pagination Support: Continue across result pages until you reach your target volume or page cap
- Research-Ready Metadata: Keep search rank, page number, catalog source, and eligibility indicators for downstream analysis
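The Clean Output behaviour can be pictured with a short sketch. This is not the actor's own code: treating `id`, `name`, and `url` as the minimum identity fields, and using `id` as the deduplication key, are assumptions made for illustration.

```python
from typing import Any, Iterable, Iterator

# Assumption: an item must carry these fields to count as complete.
REQUIRED_FIELDS = ("id", "name", "url")


def is_empty(value: Any) -> bool:
    """Treat None, empty strings, and empty lists/dicts as empty values."""
    return value is None or value == "" or value == [] or value == {}


def clean_records(items: Iterable[dict]) -> Iterator[dict]:
    """Drop empty values, skip incomplete items, and filter duplicates by id."""
    seen_ids = set()
    for item in items:
        # Keep False and 0: they are real values, not missing ones.
        record = {key: value for key, value in item.items() if not is_empty(value)}
        if any(field not in record for field in REQUIRED_FIELDS):
            continue  # incomplete item
        if record["id"] in seen_ids:
            continue  # duplicate of an earlier record
        seen_ids.add(record["id"])
        yield record
```

The same function can be applied to an exported dataset when you merge several runs and want to deduplicate across them.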
Use Cases
Education Research
Track how subjects, skill clusters, and learning pathways are represented across Coursera. Compare catalog breadth, language support, and difficulty distribution over time.
Program Discovery
Build searchable datasets for internal tools, recommendation systems, or content curation workflows. Use provider names, badges, and program types to segment the catalog quickly.
Market Intelligence
Monitor which topics, institutions, and credential formats are prominent for specific keywords. Spot trends in new programs, subscription inclusion, and translated content availability.
Content Benchmarking
Compare competing programs by ratings, review volume, skills coverage, and duration category. This is useful for analyzing certification landscapes or designing similar learning offers.
Localization Analysis
Measure how widely programs are translated and subtitled. Language coverage fields make it easier to identify global-ready content without manual checking.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | String | No | - | Search term such as python, machine learning, or data science |
| startUrl | String | No | - | Specific Coursera search URL; the actor reads the query value from it |
| results_wanted | Integer | No | 20 | Maximum number of records to collect |
| max_pages | Integer | No | 10 | Maximum number of result pages to process |
| proxyConfiguration | Object | No | - | Optional proxy settings for request routing |
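If you assemble inputs in code, a small helper can apply the defaults from the table and catch obvious mistakes before a run starts. The sketch below is hypothetical: the function name `build_run_input` and the rule that at least one of `query` or `startUrl` must be present are assumptions, since the table marks every parameter as optional.

```python
from typing import Optional


def build_run_input(
    query: Optional[str] = None,
    start_url: Optional[str] = None,
    results_wanted: int = 20,
    max_pages: int = 10,
    proxy_configuration: Optional[dict] = None,
) -> dict:
    """Assemble an input object using the parameter names from the table."""
    if not query and not start_url:
        # Assumption: a run needs at least one search source.
        raise ValueError("Provide either query or start_url")
    if results_wanted < 1 or max_pages < 1:
        raise ValueError("results_wanted and max_pages must be positive")

    run_input = {"results_wanted": results_wanted, "max_pages": max_pages}
    if query:
        run_input["query"] = query
    if start_url:
        run_input["startUrl"] = start_url
    if proxy_configuration is not None:
        run_input["proxyConfiguration"] = proxy_configuration
    return run_input
```

For example, `build_run_input(query="python")` returns an input with `results_wanted` set to 20 and `max_pages` set to 10.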
Output Data
Each item in the dataset contains:
| Field | Type | Description |
|---|---|---|
| id | String | Coursera product identifier |
| name | String | Course or program title |
| url | String | Absolute Coursera URL |
| imageUrl | String | Course or program image |
| avgProductRating | Number | Average rating |
| numProductRatings | Integer | Number of ratings |
| productDifficultyLevel | String | Difficulty level |
| productDuration | String | Duration bucket |
| productType | String | Catalog type such as course or certificate |
| skills | Array | Skills associated with the item |
| partners | Array | Provider or institution names |
| partnerLogos | Array | Provider logo URLs |
| tagline | String | Subtitle or short pitch |
| isCourseFree | Boolean | Whether the item is listed as free |
| isCreditEligible | Boolean | Whether the item is credit eligible |
| isNewContent | Boolean | Whether the item is marked new |
| isPartOfCourseraPlus | Boolean | Whether the item is included in Coursera Plus |
| cobrandingEnabled | Boolean | Whether co-branding is enabled |
| fullyTranslatedLanguages | Array | Fully translated languages |
| subtitlesOnlyLanguages | Array | Subtitle-only languages |
| translatedName | String | Localized title when available |
| translatedSkills | Array | Localized skills when available |
| parentCourseName | String | Parent course name for lesson-level results when present |
| parentLessonName | String | Parent lesson name when present |
| productCardId | String | Product card identifier |
| canonicalType | String | Canonical catalog type |
| marketingProductType | String | Marketing-facing product type |
| badges | Array | Labels such as Free Trial or NEW |
| isPathwayContent | Boolean | Whether the item is part of a pathway |
| courseCardRating | Number | Card-level rating when available |
| courseCardReviewCount | Integer | Card-level review count when available |
| searchQuery | String | Search term used for collection |
| searchRank | Integer | Position in the collected result set |
| page | Integer | Result page number |
| totalElements | Integer | Total items reported for the query |
| totalPages | Integer | Total pages reported for the query |
| sourceIndexName | String | Source index label from the catalog |
| aiSearchSummaryEligible | Boolean | Whether the query is eligible for summary features |
Usage Examples
Basic Search
Collect the first 20 Python-related records:
{"query": "python", "results_wanted": 20}
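One way to submit this input from a script is the Apify API client for Python (the `apify-client` package). The sketch below assumes that package is installed and that you have an API token. `ACTOR_ID` is a hypothetical placeholder, because the actor's ID is not stated here, and `run_scraper` is an illustrative name rather than part of any library.

```python
# Hypothetical placeholder: replace with the actor's real ID from the Apify Console.
ACTOR_ID = "<username>/<actor-name>"

# The Basic Search input shown above.
BASIC_SEARCH_INPUT = {"query": "python", "results_wanted": 20}


def run_scraper(token: str, actor_id: str, run_input: dict) -> list:
    """Start the actor, wait for it to finish, and return the dataset items."""
    # Imported here so this module still loads when apify-client is not installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    # call() starts a run and blocks until it finishes.
    run = client.actor(actor_id).call(run_input=run_input)
    # Results are stored in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call such as `run_scraper(os.environ["APIFY_TOKEN"], ACTOR_ID, BASIC_SEARCH_INPUT)` would then return the collected records as a list of dictionaries, assuming the token is stored in an `APIFY_TOKEN` environment variable.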
Higher Volume Collection
Collect more records while capping the number of pages:
{"query": "machine learning", "results_wanted": 100, "max_pages": 8}
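To see how `results_wanted` and `max_pages` interact, it can help to estimate how many pages a run will touch. The sketch below assumes roughly 20 results per page, which is only inferred from the sample output in this document (605 total elements across 31 pages); the actor's real page size and stopping logic may differ.

```python
import math
from typing import Optional

# Assumption: about 20 results per page (605 elements / 31 pages in the sample).
ASSUMED_PAGE_SIZE = 20


def pages_to_process(
    results_wanted: int,
    max_pages: int,
    total_pages: Optional[int] = None,
    page_size: int = ASSUMED_PAGE_SIZE,
) -> int:
    """Estimate the number of result pages a run will read."""
    # Pages needed to reach the target volume, capped by max_pages.
    pages = min(math.ceil(results_wanted / page_size), max_pages)
    # A query cannot yield more pages than the catalog reports.
    if total_pages is not None:
        pages = min(pages, total_pages)
    return pages
```

Under that assumption, the input above (100 results, 8 pages) would stop after about 5 pages, so the page cap only takes effect for targets above roughly 160 records.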
Start From a Coursera Search URL
Use an existing Coursera search URL instead of a separate keyword:
{"startUrl": "https://www.coursera.org/search?query=data%20analytics", "results_wanted": 50}
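If you want to check which search term a URL will produce before starting a run, the `query` parameter can be read with Python's standard library. This is a sketch of the idea, not the actor's own parsing code, and `query_from_start_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def query_from_start_url(start_url: str) -> Optional[str]:
    """Return the decoded value of the `query` parameter, or None if absent."""
    params = parse_qs(urlparse(start_url).query)
    values = params.get("query")
    return values[0] if values else None


print(query_from_start_url("https://www.coursera.org/search?query=data%20analytics"))
# prints: data analytics
```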
Sample Output
{
  "id": "s12n~F-h1g0w7EeWeOApO_l5R1w",
  "name": "Python for Everybody",
  "url": "https://www.coursera.org/specializations/python",
  "imageUrl": "https://d2j5ihb19pt1hq.cloudfront.net/sdp_page/s12n_logos/python.jpg",
  "avgProductRating": 4.815413103278494,
  "numProductRatings": 280281,
  "productDifficultyLevel": "BEGINNER",
  "productDuration": "THREE_TO_SIX_MONTHS",
  "productType": "SPECIALIZATION",
  "skills": ["Database Design", "Debugging", "Web Scraping", "SQL", "JSON", "Python Programming"],
  "partners": ["University of Michigan"],
  "partnerLogos": ["https://coursera-university-assets.s3.amazonaws.com/70/de505d47be7d3a063b51b6f856a6e2/New-Block-M-Stacked-Blue-295C_600x600.png"],
  "tagline": "Learn to Program and Analyze Data with Python\nOffered by University of Michigan",
  "isCourseFree": false,
  "isCreditEligible": false,
  "isNewContent": false,
  "isPartOfCourseraPlus": true,
  "fullyTranslatedLanguages": ["English"],
  "subtitlesOnlyLanguages": ["Arabic", "French", "German", "Japanese", "Spanish"],
  "productCardId": "F-h1g0w7EeWeOApO_l5R1w",
  "canonicalType": "SPECIALIZATION",
  "marketingProductType": "SPECIALIZATION",
  "badges": ["Free Trial"],
  "isPathwayContent": true,
  "searchQuery": "python",
  "searchRank": 1,
  "page": 1,
  "totalElements": 605,
  "totalPages": 31,
  "sourceIndexName": "consumer_products_cohere_embed_english_v3_cold_start_alias"
}
Tips for Best Results
Choose Precise Keywords
- Use focused phrases such as python for beginners or data analytics
- Add institution or credential terms when you want narrower results
- Run separate searches for different subject clusters to keep datasets clean
Keep Initial Runs Small
- Start with results_wanted: 20 to validate output shape
- Increase gradually for larger exports
- Use max_pages as a safety cap for broad queries
Review Language Fields
- Check fullyTranslatedLanguages and subtitlesOnlyLanguages when analyzing international reach
- Use those fields to segment content by localization maturity
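Segmenting by localization maturity could look like the sketch below. The tier names and thresholds are hypothetical and exist only to illustrate the idea; choose cut-offs that match your own analysis.

```python
def localization_summary(record: dict) -> dict:
    """Count language coverage and assign a rough, illustrative maturity tier."""
    # Either field may be missing, because empty values are dropped from records.
    full = record.get("fullyTranslatedLanguages") or []
    subs = record.get("subtitlesOnlyLanguages") or []

    # Hypothetical tiers, not defined by the actor or by Coursera.
    if len(full) > 1:
        tier = "multi-language"
    elif subs:
        tier = "subtitles-only"
    else:
        tier = "single-language"

    return {"fullyTranslated": len(full), "subtitlesOnly": len(subs), "tier": tier}
```

Applied to the sample record in this document (one fully translated language, five subtitle-only languages), the function assigns the subtitles-only tier.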
Use Proxies When Needed
- Add proxy settings for larger or repeated runs
- Residential routing can help smooth out collection over time
Preserve Search Context
- Keep searchQuery, searchRank, and page for downstream ranking analysis
- Combine those fields with ratings and badges for richer comparisons
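As a rough illustration of combining search context with ratings and badges, the sketch below orders records by position and averages ratings per badge. The function names are hypothetical. Sorting by `page` and then `searchRank` is a defensive assumption that gives the right order whether `searchRank` runs across the whole result set or restarts on each page.

```python
from collections import defaultdict
from statistics import mean


def ranking_table(records: list, top_n: int = 10) -> list:
    """Return the top-ranked records with rating and badge context."""
    ordered = sorted(records, key=lambda r: (r["page"], r["searchRank"]))
    return [
        {
            "searchQuery": r["searchQuery"],
            "page": r["page"],
            "searchRank": r["searchRank"],
            "name": r.get("name"),
            "avgProductRating": r.get("avgProductRating"),
            "badges": r.get("badges", []),
        }
        for r in ordered[:top_n]
    ]


def mean_rating_by_badge(records: list) -> dict:
    """Average avgProductRating for each badge label, skipping unrated items."""
    ratings = defaultdict(list)
    for r in records:
        rating = r.get("avgProductRating")
        if rating is None:
            continue
        for badge in r.get("badges", []):
            ratings[badge].append(rating)
    return {badge: mean(values) for badge, values in ratings.items()}
```

Run the comparison per `searchQuery` when a dataset mixes several searches, since ranks from different queries are not comparable.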
Integrations
Connect your data with:
- Google Sheets: Review and share search results quickly
- Airtable: Build searchable learning catalogs
- Slack: Notify teams when new keyword pulls finish
- Webhooks: Forward results into custom pipelines
- Make: Automate research and monitoring workflows
- Zapier: Trigger follow-up actions from new datasets
Export Formats
- JSON: Best for structured pipelines
- CSV: Easy spreadsheet analysis
- Excel: Business reporting and review
- XML: Legacy system integration
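If you start from the JSON export and need a spreadsheet-friendly file of your own, array fields such as skills and partners have to be flattened first. The sketch below joins them into one cell; the `"; "` separator and the function name `records_to_csv` are arbitrary illustrative choices, and the platform's built-in CSV export may lay out arrays differently.

```python
import csv
import io


def records_to_csv(records: list, fields: list, list_separator: str = "; ") -> str:
    """Write selected fields to CSV text, joining list values into one cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    for record in records:
        row = {}
        for field in fields:
            # Missing fields become empty cells.
            value = record.get(field, "")
            if isinstance(value, list):
                value = list_separator.join(str(item) for item in value)
            row[field] = value
        writer.writerow(row)
    return buffer.getvalue()
```

For example, `records_to_csv(items, ["name", "partners", "skills", "avgProductRating"])` produces one row per record with partners and skills collapsed into single cells.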
Frequently Asked Questions
Can I collect both courses and certificate programs?
Yes. Search results can include courses, specializations, professional certificates, and other product types, and each record includes its catalog classification.
What happens if some fields are missing?
The actor drops empty values from each record and skips incomplete items that do not have the minimum required course identity fields.
Does the dataset include duplicate courses?
No. Duplicate records are filtered during collection before they are written to the dataset.
Can I start from a Coursera search link?
Yes. Pass a Coursera search URL in startUrl, and the actor will extract the query value from that link.
How many results can I collect?
You can request large collections, but broad queries may span many pages. Use results_wanted and max_pages together to control run size.
Is language coverage included?
Yes. When Coursera provides it, the output includes fully translated languages and subtitle-only languages.
Can I use the data for ranking analysis?
Yes. Each record includes search position context such as searchRank, page, and total result counts.
Support
For issues or feature requests, contact support through the Apify Console.
Legal Notice
This actor is intended for lawful data collection and analysis. Users are responsible for ensuring compliance with Coursera terms and all applicable laws before running large-scale collection jobs.