edX Scraper | University Courses and Programs
Pricing
from $19.00 / 1,000 results
Extract edX course catalog data including title, university, instructors, level, duration, price, language, subject, prerequisites, and full description. Track MicroMasters, professional certificates, and degree programs for education analytics, lead generation, and market research.
Developer
ParseForge

🎓 edX Scraper
🚀 Export edX course data in seconds. Search by keyword or subject and collect course titles, partners, prices, durations, enrollment counts, and direct links in one clean dataset.
🕒 Last updated: 2026-05-22 · 📊 14 fields per record · Up to 1,000,000 courses · Global course catalog coverage
The edX Scraper extracts course and program listings from edX's public course catalog using Algolia's search index - the same fast search engine edX uses for its own website. Each record includes the course title, subtitle, offering partner (MIT, Harvard, Microsoft, Google, etc.), subject category, difficulty level, language, duration, price, recent enrollment count, course type, and a direct link to the course page.
edX hosts thousands of courses and professional programs from over 160 partner universities and institutions worldwide, covering subjects from data science and machine learning to business, humanities, and language learning. This scraper gives you structured access to that entire catalog in one run.
Target Audience
| Who | Why |
|---|---|
| EdTech researchers | Analyze course catalog composition, pricing, and partner distribution |
| Curriculum developers | Benchmark course structures and durations across providers |
| HR and L&D teams | Discover upskilling programs for specific skills and subjects |
| Competitive intelligence teams | Track which skills edX partners are prioritizing |
| Content creators and educators | Research what already exists before building new courses |
📋 What the edX Scraper does
- Queries edX's Algolia search index by keyword and optional subject filter
- Returns courses, programs, executive education, and degree products
- Captures partner name and logo, pricing, enrollment count, language, and duration
- Formats duration as a human-readable string (e.g. "6 weeks (5-8 hrs/wk)")
- Strips HTML from descriptions to return clean plain text
- Paginates automatically until `maxItems` is reached or all results are exhausted
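The two formatting steps in the list above (plain-text descriptions and a readable duration string) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the actor's actual code; the function names and the `weeks`, `min_hours`, and `max_hours` parameters are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Entities such as &amp; are already decoded by HTMLParser here.
        self.parts.append(data)


def strip_html(fragment: str) -> str:
    """Return the plain text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())


def format_duration(weeks: int, min_hours: int, max_hours: int) -> str:
    """Combine course length and weekly effort into one readable string.

    The three inputs are assumed raw values; the real index may expose
    them under different names or shapes.
    """
    unit = "week" if weeks == 1 else "weeks"
    if min_hours == max_hours:
        effort = f"{min_hours} hrs/wk"
    else:
        effort = f"{min_hours}-{max_hours} hrs/wk"
    return f"{weeks} {unit} ({effort})"


print(strip_html("<p>Learn <b>Python</b> &amp; data analysis.</p>"))
print(format_duration(6, 5, 8))
```

With these inputs, `format_duration(6, 5, 8)` yields the same shape as the example in the list, `6 weeks (5-8 hrs/wk)`.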
💡 Why it matters: edX's course catalog spans thousands of offerings across 160+ partner institutions - far too many to browse manually. This actor lets researchers, product teams, and L&D professionals access the full catalog in minutes, with structured data ready for analysis or import.
🎬 Full Demo
🚧 Coming soon
⚙️ Input
| Field | Type | Default | Description |
|---|---|---|---|
| `query` | string | `python` | Search keyword (e.g. "machine learning", "data science", "project management") |
| `subject` | string | (none) | Subject filter (e.g. "Computer Science", "Business & Management") |
| `maxItems` | integer | 10 | Maximum number of courses to collect (free: 10, paid: up to 1,000,000) |
Example 1 - Search for data science courses:
{"query": "data science","maxItems": 100}
Example 2 - Browse computer science courses:
{"query": "","subject": "Computer Science","maxItems": 200}
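The same inputs can also be built and submitted from code. The sketch below assumes the third-party `apify-client` package is installed and that you supply your own API token and the actor's ID (neither is stated in this README, so both are placeholders). `build_run_input` and `run_scraper` are hypothetical helper names, and the lower bound of 1 for `maxItems` is an assumption.

```python
def build_run_input(query="python", subject=None, max_items=10):
    """Build the actor input described in the Input table.

    Defaults mirror the table; the upper bound comes from the table,
    the lower bound of 1 is an assumption.
    """
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 1 <= max_items <= 1_000_000:
        raise ValueError("maxItems must be between 1 and 1,000,000")
    run_input = {"query": query, "maxItems": max_items}
    if subject:
        # Must match an edX subject category exactly.
        run_input["subject"] = subject
    return run_input


def run_scraper(token, actor_id, run_input):
    """Run the actor and return its dataset items as a list of dicts.

    Requires `pip install apify-client`; imported lazily so the rest of
    this example works without it.
    """
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Equivalent to Example 2 above.
print(build_run_input(query="", subject="Computer Science", max_items=200))
```

Leaving `query` empty with a large `max_items` is how a full-catalog run would be expressed with this helper.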
⚠️ Good to Know: The `subject` field must match an edX subject category exactly (e.g. "Computer Science", "Business & Management", "Data Analysis & Statistics"). Free users are limited to 10 courses per run. All course types are included: individual courses, programs, executive education, and degrees.
📊 Output
| Field | Type | Description |
|---|---|---|
| 🖼️ `imageUrl` | string | Course cover image URL |
| 📝 `title` | string | Full course or program name |
| 📄 `subtitle` | string | Short course description (HTML-stripped) |
| 🏛️ `partner` | string | Offering institution (e.g. MIT, Harvard, Google) |
| 🖼️ `partnerLogo` | string | Partner institution logo URL |
| 🗂️ `subject` | string | Subject category |
| 📊 `level` | string | Difficulty level (Introductory, Intermediate, Advanced) |
| 🌐 `language` | string | Language of instruction |
| ⏱️ `duration` | string | Course length (e.g. "6 weeks (5-8 hrs/wk)") |
| 💰 `price` | string | Price or "Free" if no cost |
| 👥 `enrollmentCount` | number | Recent enrollment count |
| 🏷️ `courseType` | string | Product type (Course, Program, Executive Education, 2U Degree) |
| 🔗 `url` | string | Direct link to the edX course page |
| 🕒 `scrapedAt` | string | ISO timestamp of when the record was collected |
Sample record:
{"imageUrl": "https://prod-discovery.edx-cdn.org/media/programs/card_images/abc123.jpg","title": "Python for Data Science","subtitle": "Learn Python programming fundamentals and apply them to real-world data analysis problems.","partner": "MIT","partnerLogo": "https://prod-discovery.edx-cdn.org/organization/logos/mit.png","subject": "Computer Science","level": "Introductory","language": "English","duration": "4 weeks (5-7 hrs/wk)","price": "USD 149","enrollmentCount": 48200,"courseType": "Course","url": "https://www.edx.org/learn/python/massachusetts-institute-of-technology-python-for-data-science","scrapedAt": "2026-05-22T09:15:00.000Z"}
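Because `price` is a string, numeric comparisons need a small parsing step. The sketch below assumes the only two shapes are the ones shown in this README, `"Free"` and a currency code followed by an amount such as `"USD 149"`; other formats may exist in real output. `Price` and `parse_price` are hypothetical names.

```python
from decimal import Decimal, InvalidOperation
from typing import NamedTuple, Optional


class Price(NamedTuple):
    currency: Optional[str]  # None for free courses
    amount: Decimal


def parse_price(raw: str) -> Price:
    """Split a price string like "USD 149" into currency and amount."""
    text = raw.strip()
    if text.lower() == "free":
        return Price(None, Decimal("0"))
    currency, _, amount = text.partition(" ")
    try:
        # Tolerate thousands separators such as "USD 1,499".
        return Price(currency, Decimal(amount.replace(",", "")))
    except InvalidOperation:
        raise ValueError(f"unrecognised price format: {raw!r}") from None


print(parse_price("USD 149"))
print(parse_price("Free"))
```

`Decimal` is used instead of `float` so that amounts survive aggregation without rounding drift.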
✨ Why choose this Actor
- Algolia-powered - uses edX's own fast search index, not brittle HTML scraping
- All product types - courses, programs, executive education, and degree programs in one dataset
- Human-readable duration - combines weeks and hours/week into a single readable string
- HTML-stripped descriptions - plain text subtitles, no markup to clean up
- Partner logo included - institution logos for display in apps or reports
- Pay-per-item pricing - only pay for the course records you collect
📈 How it compares to alternatives
| Feature | ParseForge edX Scraper | Manual catalog browsing | Coursera scraper |
|---|---|---|---|
| Full catalog export | Yes | No | Different platform |
| Enrollment counts | Yes | Sometimes | Sometimes |
| Partner logos | Yes | Yes | Varies |
| Duration data | Yes | Yes | Varies |
| Price extraction | Yes | Yes | Yes |
| Free tier | 10 courses | Unlimited (slow) | N/A |
🚀 How to use
- Create a free Apify account (includes $5 credit)
- Open the edX Scraper actor page and click Try for free
- Enter a search keyword (e.g. `machine learning`, `finance`, `UX design`)
- Optionally add a subject filter
- Set `maxItems` to the number of courses you want
- Click Start - results are typically ready in under 30 seconds
- Download as CSV, JSON, Excel, or connect via API
💼 Business use cases
Learning and Development (L&D) Planning
HR and L&D teams can pull all courses in a given subject area, compare partner quality and pricing, and build a curated list of upskilling resources for employees.
EdTech Market Research
EdTech companies and investors can analyze the full edX catalog to understand market saturation, pricing norms, popular topics, and which institutions are most active.
Curriculum Gap Analysis
Educators building new courses can scan existing edX offerings in their subject to identify what already exists and where gaps remain in the curriculum landscape.
Enrollment Trend Analysis
The `enrollmentCount` field shows which topics are most in demand. Researchers and product teams can track which subjects are growing in popularity over time by running periodic scans and comparing the results.
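One way to compare two periodic scans is to match records by `url` and sum the change in `enrollmentCount` per subject. This is a sketch under two assumptions: that `url` is a stable key across runs, and that both snapshots are lists of records shaped like the sample record. Since `enrollmentCount` reflects recent activity rather than an all-time total, the result is a change in recent activity. The function name is hypothetical.

```python
from collections import defaultdict


def enrollment_change_by_subject(earlier, later):
    """Return (subject, total change) pairs, largest increase first.

    Only courses present in both snapshots are compared; courses that
    appear in just one snapshot are ignored.
    """
    before = {r["url"]: r["enrollmentCount"] for r in earlier}
    totals = defaultdict(int)
    for record in later:
        if record["url"] in before:
            delta = record["enrollmentCount"] - before[record["url"]]
            totals[record["subject"]] += delta
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up snapshots for illustration only.
march = [
    {"url": "a", "subject": "Computer Science", "enrollmentCount": 100},
    {"url": "b", "subject": "Business & Management", "enrollmentCount": 50},
]
april = [
    {"url": "a", "subject": "Computer Science", "enrollmentCount": 160},
    {"url": "b", "subject": "Business & Management", "enrollmentCount": 40},
]
print(enrollment_change_by_subject(march, april))
```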
🔌 Automating edX Scraper
- Make (formerly Integromat) - Schedule monthly catalog scans and update an Airtable course database automatically
- Zapier - Notify your team when new courses matching your keyword are added to the catalog
- Google Sheets - Export course lists and share with your L&D team for review and curation
- REST API - Embed course search into internal learning portals or recommendation tools
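For the "notify when new courses are added" workflow, the core step is a diff between the previous run and the current one. A minimal sketch, assuming both runs were exported as JSON arrays and that `url` uniquely identifies a course (`new_courses` is a hypothetical helper):

```python
import json


def new_courses(previous, current):
    """Return records from `current` whose url was absent in `previous`."""
    seen = {record["url"] for record in previous}
    return [record for record in current if record["url"] not in seen]


# Inline JSON stands in for two exported dataset files.
last_month = json.loads('[{"url": "https://www.edx.org/learn/a", "title": "A"}]')
this_month = json.loads(
    '[{"url": "https://www.edx.org/learn/a", "title": "A"},'
    ' {"url": "https://www.edx.org/learn/b", "title": "B"}]'
)
for course in new_courses(last_month, this_month):
    print(course["title"], course["url"])
```

The resulting list is what an automation step would forward to a chat message, email, or spreadsheet row.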
🌟 Beyond business use cases
Personal Learning Discovery
Students and self-learners can scan the full edX catalog for a topic they want to master, comparing multiple courses from different institutions side by side before enrolling.
Academic Research on Online Education
Researchers studying MOOCs and the online learning landscape can collect systematic data on course offerings, pricing models, and institutional participation.
Journalism and Reporting
Education journalists can track how the edX catalog evolves over time - which subjects are growing, which partners are expanding, and how pricing changes.
Open Data Projects
Developers and data enthusiasts can contribute edX course data to open educational resource directories or build recommendation engines.
🤖 Ask an AI assistant about this scraper
Not sure how to work with your results? Ask an AI:
"I have JSON data from the edX Scraper with fields like title, partner, subject, level, price, and enrollmentCount. How do I identify the 10 most popular free courses and group them by subject?"
The structured output is designed to be immediately usable with spreadsheets, databases, or AI-powered analysis tools.
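The question above can also be answered directly in a few lines of Python. The sketch below assumes records shaped like the sample record, with `price` equal to `"Free"` for free courses; the function name is hypothetical.

```python
from collections import defaultdict


def top_free_courses_by_subject(records, limit=10):
    """Pick the `limit` most-enrolled free courses and group titles by subject."""
    free = [r for r in records if r.get("price") == "Free"]
    top = sorted(
        free, key=lambda r: r.get("enrollmentCount") or 0, reverse=True
    )[:limit]
    grouped = defaultdict(list)
    for record in top:
        grouped[record["subject"]].append(record["title"])
    return dict(grouped)


# Made-up records for illustration only.
records = [
    {"title": "A", "subject": "Computer Science", "price": "Free", "enrollmentCount": 300},
    {"title": "B", "subject": "Computer Science", "price": "USD 149", "enrollmentCount": 900},
    {"title": "C", "subject": "Business & Management", "price": "Free", "enrollmentCount": 500},
]
print(top_free_courses_by_subject(records))
```

Within each subject, titles stay ordered from most to least enrolled because the grouping walks the already sorted list.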
❓ Frequently Asked Questions
Is this scraper affiliated with edX or 2U?
No. This is an independent tool that accesses edX's public Algolia search index - the same index powering edX's own website search.
What course types are included?
All edX product types: individual Courses, Programs (MicroMasters, MicroBachelors, Professional Certificate), Executive Education, and 2U Degrees.
What does price: "Free" mean?
The course is available to audit for free. Some courses offer a paid verified certificate option but can be taken without charge.
What does enrollmentCount represent?
The recent enrollment count as reported by edX's Algolia index. This is not a total all-time enrollment - it reflects recent activity.
Can I filter by language?
Not at the input level. The language field in the output lets you filter after export.
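A post-export language filter can be as small as the sketch below. It assumes each record carries a single language name in `language`, as in the sample record ("English"); if a record lists several languages, the comparison would need adjusting. `filter_by_language` is a hypothetical helper.

```python
def filter_by_language(records, language):
    """Keep records whose language matches, ignoring case."""
    wanted = language.casefold()
    return [
        record
        for record in records
        if str(record.get("language", "")).casefold() == wanted
    ]


# Made-up records for illustration only.
records = [
    {"title": "Python for Data Science", "language": "English"},
    {"title": "Introducción a Python", "language": "Spanish"},
]
for course in filter_by_language(records, "spanish"):
    print(course["title"])
```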
How current is the data?
Data is fetched live from Algolia at run time. It reflects the edX catalog as it was at that moment.
Can I collect all courses on edX?
Yes - leave query empty and set a large maxItems. The catalog has several thousand courses and programs.
What is the subject filter?
It maps to edX's subject taxonomy. Examples: "Computer Science", "Data Analysis & Statistics", "Business & Management", "Language", "Engineering".
Does it include course syllabi or module breakdowns?
No. The scraper extracts catalog-level metadata. For detailed course content, visit the course URL directly.
How many courses does edX have?
edX hosts several thousand courses and programs from 160+ partner institutions.
Does it capture discounts or promotional pricing?
The price field reflects the list price from the catalog. Promotional pricing is not captured.
Can I run this on a schedule?
Yes. Use Apify's built-in scheduler to run monthly or quarterly catalog snapshots.
🔌 Integrate with any app
Export your dataset to:
Spreadsheets: Google Sheets, Microsoft Excel, Airtable
Databases: PostgreSQL, MySQL, MongoDB, Supabase
LMS and HR Tools: Workday Learning, Cornerstone, Degreed
Automation: Make, Zapier, n8n, Pipedream
Analytics: Tableau, Power BI, Metabase, Google Looker Studio
🔗 Recommended Actors
| Actor | Description |
|---|---|
| Coursera Scraper | Extract course listings and reviews from Coursera |
| Udemy Scraper | Scrape Udemy courses with pricing and ratings |
| LinkedIn Learning Scraper | Collect professional courses from LinkedIn Learning |
💡 Pro Tip: browse the complete ParseForge collection for 50+ ready-to-use data extractors covering EdTech platforms, job boards, marketplaces, and more.
🆘 Need Help? Open our contact form and we will get back to you within one business day.
⚠️ Disclaimer: This actor is an independent tool not affiliated with, endorsed by, or connected to edX Inc. or 2U Inc. It accesses only publicly available course catalog data. Use responsibly and in accordance with edX's Terms of Service. ParseForge is not responsible for how collected data is used.