edX Scraper | University Courses and Programs

Pricing

from $19.00 / 1,000 results

Extract edX course catalog data including title, subtitle, partner institution, subject, level, language, duration, price, and recent enrollment count. Track MicroMasters, professional certificates, and degree programs for education analytics, lead generation, and market research.

Developer

ParseForge

Maintained by Community

🎓 edX Scraper

🚀 Export edX course data in seconds. Search by keyword or subject and collect course titles, partners, prices, durations, enrollment counts, and direct links in one clean dataset.

🕒 Last updated: 2026-05-22 · 📊 14 fields per record · Up to 1,000,000 courses · Global course catalog coverage

The edX Scraper extracts course and program listings from edX's public course catalog by querying the Algolia search index that powers search on edX's own website. Each record includes the course title, subtitle, offering partner (MIT, Harvard, Microsoft, Google, etc.), subject category, difficulty level, language, duration, price, recent enrollment count, course type, and a direct link to the course page.

edX hosts thousands of courses and professional programs from over 160 partner universities and institutions worldwide, covering subjects from data science and machine learning to business, humanities, and language learning. This scraper gives you structured access to that entire catalog in one run.

Target Audience

| Who | Why |
| --- | --- |
| EdTech researchers | Analyze course catalog composition, pricing, and partner distribution |
| Curriculum developers | Benchmark course structures and durations across providers |
| HR and L&D teams | Discover upskilling programs for specific skills and subjects |
| Competitive intelligence teams | Track which skills edX partners are prioritizing |
| Content creators and educators | Research what already exists before building new courses |

📋 What the edX Scraper does

  • Queries edX's Algolia search index by keyword and optional subject filter
  • Returns courses, programs, executive education, and degree products
  • Captures partner name and logo, pricing, enrollment count, language, and duration
  • Formats duration as a human-readable string (e.g. "6 weeks (5-8 hrs/wk)")
  • Strips HTML from descriptions to return clean plain text
  • Paginates automatically until maxItems is reached or all results are exhausted
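The duration formatting and HTML stripping described above can be illustrated with a minimal sketch. This is not the Actor's actual implementation; the function names and the idea that the raw index exposes a week count plus a minimum and maximum weekly effort are assumptions made for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only text nodes and discards the tags around them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup: str) -> str:
    """Return the plain text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join("".join(parser.parts).split())


def format_duration(weeks: int, min_hours: int, max_hours: int) -> str:
    """Combine weeks and weekly effort into a single readable string.

    The three inputs are hypothetical raw values; the real index may
    name or structure them differently.
    """
    unit = "week" if weeks == 1 else "weeks"
    effort = str(min_hours) if min_hours == max_hours else f"{min_hours}-{max_hours}"
    return f"{weeks} {unit} ({effort} hrs/wk)"
```

For example, `format_duration(6, 5, 8)` yields the same shape as the duration string shown in the list above.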

💡 Why it matters: edX's course catalog spans thousands of offerings across 160+ partner institutions - far too many to browse manually. This actor lets researchers, product teams, and L&D professionals access the full catalog in minutes, with structured data ready for analysis or import.

🎬 Full Demo

🚧 Coming soon

⚙️ Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | "python" | Search keyword (e.g. "machine learning", "data science", "project management") |
| subject | string | (none) | Subject filter (e.g. "Computer Science", "Business & Management") |
| maxItems | integer | 10 | Maximum number of courses to collect (free: 10, paid: up to 1,000,000) |

Example 1 - Search for data science courses:

{
  "query": "data science",
  "maxItems": 100
}

Example 2 - Browse computer science courses:

{
  "query": "",
  "subject": "Computer Science",
  "maxItems": 200
}

⚠️ Good to Know: The subject field must match an edX subject category exactly (e.g. "Computer Science", "Business & Management", "Data Analysis & Statistics"). Free users are limited to 10 courses per run. All course types are included: individual courses, programs, executive education, and degrees.
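The input fields above can also be sent programmatically. The sketch below builds a request for Apify's synchronous "run and return dataset items" endpoint using only the Python standard library. The `ACTOR_ID` value is a placeholder, not this Actor's real ID (copy that from the Actor page), and the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.apify.com/v2"
# Placeholder only: replace with the real "username~actor-name" ID from the Actor page.
ACTOR_ID = "your-username~edx-scraper"


def build_run_input(query: str = "python", subject: Optional[str] = None,
                    max_items: int = 10) -> dict:
    """Assemble the Actor input, omitting the optional subject filter when unset."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    run_input = {"query": query, "maxItems": max_items}
    if subject:
        run_input["subject"] = subject
    return run_input


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Create (but do not send) the POST request that starts a synchronous run."""
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(token: str, run_input: dict) -> list:
    """Send the request and return the dataset items as a list of dicts."""
    with urllib.request.urlopen(build_request(token, run_input)) as response:
        return json.load(response)
```

Calling `run_actor(token, build_run_input("", "Computer Science", 200))` would correspond to Example 2. Synchronous runs suit small jobs; for very large `maxItems` values an asynchronous run is likely the safer choice.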

📊 Output

| Field | Type | Description |
| --- | --- | --- |
| 🖼️ imageUrl | string | Course cover image URL |
| 📝 title | string | Full course or program name |
| 📄 subtitle | string | Short course description (HTML-stripped) |
| 🏛️ partner | string | Offering institution (e.g. MIT, Harvard, Google) |
| 🖼️ partnerLogo | string | Partner institution logo URL |
| 🗂️ subject | string | Subject category |
| 📊 level | string | Difficulty level (Introductory, Intermediate, Advanced) |
| 🌐 language | string | Language of instruction |
| ⏱️ duration | string | Course length (e.g. "6 weeks (5-8 hrs/wk)") |
| 💰 price | string | Price or "Free" if no cost |
| 👥 enrollmentCount | number | Recent enrollment count |
| 🏷️ courseType | string | Product type (Course, Program, Executive Education, 2U Degree) |
| 🔗 url | string | Direct link to the edX course page |
| 🕒 scrapedAt | string | ISO timestamp of when the record was collected |

Sample record:

{
  "imageUrl": "https://prod-discovery.edx-cdn.org/media/programs/card_images/abc123.jpg",
  "title": "Python for Data Science",
  "subtitle": "Learn Python programming fundamentals and apply them to real-world data analysis problems.",
  "partner": "MIT",
  "partnerLogo": "https://prod-discovery.edx-cdn.org/organization/logos/mit.png",
  "subject": "Computer Science",
  "level": "Introductory",
  "language": "English",
  "duration": "4 weeks (5-7 hrs/wk)",
  "price": "USD 149",
  "enrollmentCount": 48200,
  "courseType": "Course",
  "url": "https://www.edx.org/learn/python/massachusetts-institute-of-technology-python-for-data-science",
  "scrapedAt": "2026-05-22T09:15:00.000Z"
}
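Because price is a string, numeric comparisons need a small parsing step first. The helper below is a sketch that assumes every paid price follows the "CURRENCY AMOUNT" shape seen in the sample record ("USD 149"); other formats, if they occur, would need extra handling. The function name is hypothetical.

```python
from typing import Optional, Tuple


def parse_price(price: str) -> Tuple[Optional[str], float]:
    """Split a price string into (currency, amount).

    "Free" maps to (None, 0.0). Any other value is assumed to look like
    "USD 149"; a string without a numeric part raises ValueError.
    """
    text = price.strip()
    if text.lower() == "free":
        return None, 0.0
    currency, _, amount = text.partition(" ")
    return currency, float(amount.replace(",", ""))
```

With prices parsed this way, records can be sorted or averaged by cost per subject or partner.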

✨ Why choose this Actor

  • Algolia-powered - uses edX's own fast search index, not brittle HTML scraping
  • All product types - courses, programs, executive education, and degree programs in one dataset
  • Human-readable duration - combines weeks and hours/week into a single readable string
  • HTML-stripped descriptions - plain text subtitles, no markup to clean up
  • Partner logo included - institution logos for display in apps or reports
  • Pay-per-item pricing - only pay for the course records you collect

📈 How it compares to alternatives

| Feature | ParseForge edX Scraper | Manual catalog browsing | Coursera scraper |
| --- | --- | --- | --- |
| Full catalog export | Yes | No | Different platform |
| Enrollment counts | Yes | Sometimes | Sometimes |
| Partner logos | Yes | Yes | Varies |
| Duration data | Yes | Yes | Varies |
| Price extraction | Yes | Yes | Yes |
| Free tier | 10 courses | Unlimited (slow) | N/A |

🚀 How to use

  1. Create a free Apify account (includes $5 credit)
  2. Open the edX Scraper actor page and click Try for free
  3. Enter a search keyword (e.g. machine learning, finance, UX design)
  4. Optionally add a subject filter
  5. Set maxItems to the number of courses you want
  6. Click Start - results are typically ready in under 30 seconds
  7. Download as CSV, JSON, Excel, or connect via API

💼 Business use cases

Learning and Development (L&D) Planning

HR and L&D teams can pull all courses in a given subject area, compare partner quality and pricing, and build a curated list of upskilling resources for employees.

EdTech Market Research

EdTech companies and investors can analyze the full edX catalog to understand market saturation, pricing norms, popular topics, and which institutions are most active.

Curriculum Gap Analysis

Educators building new courses can scan existing edX offerings in their subject to identify what already exists and where gaps remain in the curriculum landscape.

Enrollment Trend Analysis

The enrollmentCount field reveals which topics are most in-demand. Researchers and product teams can track which subjects are growing in popularity over time by running periodic scans.
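One way to compare two periodic scans is to match records by url and look at how enrollmentCount moved. The sketch below assumes each snapshot is a list of record dicts as exported by the Actor; the function name is hypothetical. Since enrollmentCount reflects recent activity rather than an all-time total, the difference is a change in recent demand, not a count of new learners.

```python
def enrollment_changes(previous, current):
    """Return (url, title, delta) for courses present in both snapshots.

    Results are sorted with the largest increase first. Missing or null
    enrollment counts are treated as 0.
    """
    before = {r["url"]: r.get("enrollmentCount") or 0 for r in previous}
    changes = []
    for record in current:
        url = record["url"]
        if url in before:
            delta = (record.get("enrollmentCount") or 0) - before[url]
            changes.append((url, record.get("title", ""), delta))
    return sorted(changes, key=lambda change: change[2], reverse=True)
```

Aggregating the deltas by the subject field would give a rough per-subject trend.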

🔌 Automating edX Scraper

  • Make (formerly Integromat) - Schedule monthly catalog scans and update an Airtable course database automatically
  • Zapier - Notify your team when new courses matching your keyword are added to the catalog
  • Google Sheets - Export course lists and share with your L&D team for review and curation
  • REST API - Embed course search into internal learning portals or recommendation tools
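A "new course" notification like the one described for Zapier comes down to comparing the latest run against the previous one. The following sketch shows that comparison in isolation, assuming both runs are available as lists of record dicts and that url uniquely identifies a course; the function name is hypothetical and no particular automation platform is implied.

```python
def new_courses(previous, current, keyword=""):
    """Return records from `current` whose url was absent from `previous`.

    If `keyword` is given, only titles containing it (case-insensitive)
    are kept.
    """
    known_urls = {record["url"] for record in previous}
    needle = keyword.lower()
    return [
        record
        for record in current
        if record["url"] not in known_urls
        and needle in record.get("title", "").lower()
    ]
```

The returned list can then be passed to whatever notification step the workflow uses.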

🌟 Beyond business use cases

Personal Learning Discovery

Students and self-learners can scan the full edX catalog for a topic they want to master, comparing multiple courses from different institutions side by side before enrolling.

Academic Research on Online Education

Researchers studying MOOCs and the online learning landscape can collect systematic data on course offerings, pricing models, and institutional participation.

Journalism and Reporting

Education journalists can track how the edX catalog evolves over time - which subjects are growing, which partners are expanding, and how pricing changes.

Open Data Projects

Developers and data enthusiasts can contribute edX course data to open educational resource directories or build recommendation engines.

🤖 Ask an AI assistant about this scraper

Not sure how to work with your results? Ask an AI:

"I have JSON data from the edX Scraper with fields like title, partner, subject, level, price, and enrollmentCount. How do I identify the 10 most popular free courses and group them by subject?"

The structured output is designed to be immediately usable with spreadsheets, databases, or AI-powered analysis tools.
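The question quoted above can also be answered directly in a few lines of Python. This is a sketch that assumes the records are loaded as a list of dicts with the documented field names; the function name is hypothetical.

```python
from collections import defaultdict


def top_free_by_subject(records, limit=10):
    """Pick the `limit` most-enrolled free courses and group titles by subject."""
    free = [
        r for r in records
        if str(r.get("price", "")).strip().lower() == "free"
    ]
    # Highest recent enrollment first; missing counts are treated as 0.
    top = sorted(free, key=lambda r: r.get("enrollmentCount") or 0, reverse=True)[:limit]
    grouped = defaultdict(list)
    for record in top:
        grouped[record.get("subject") or "Unknown"].append(record["title"])
    return dict(grouped)
```

Titles within each subject keep their popularity order, so the first entry in each group is that subject's most-enrolled free course among the top results.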

❓ Frequently Asked Questions

Is this scraper affiliated with edX or 2U? No. This is an independent tool that accesses edX's public Algolia search index - the same index powering edX's own website search.

What course types are included? All edX product types: individual Courses, Programs (MicroMasters, MicroBachelors, Professional Certificate), Executive Education, and 2U Degrees.

What does price: "Free" mean? The course is available to audit for free. Some courses offer a paid verified certificate option but can be taken without charge.

What does enrollmentCount represent? The recent enrollment count as reported by edX's Algolia index. This is not a total all-time enrollment - it reflects recent activity.

Can I filter by language? Not at the input level. The language field in the output lets you filter after export.
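A post-export language filter might look like the sketch below. It assumes the language field holds a single language name, as in the sample record ("English"); the function names are hypothetical.

```python
from collections import Counter


def filter_by_language(records, language):
    """Keep records whose language field matches, ignoring case."""
    wanted = language.casefold()
    return [
        r for r in records
        if str(r.get("language", "")).casefold() == wanted
    ]


def language_counts(records):
    """Count records per language to see which values appear in a dataset."""
    return Counter(r.get("language") or "Unknown" for r in records)
```

Running `language_counts` first shows the exact spellings present in the export before a filter is applied.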

How current is the data? Data is fetched live from Algolia at run time. It reflects the edX catalog as it was at that moment.

Can I collect all courses on edX? Yes - set query to an empty string ("") and set a large maxItems. The catalog has several thousand courses and programs.

What is the subject filter? It maps to edX's subject taxonomy. Examples: "Computer Science", "Data Analysis & Statistics", "Business & Management", "Language", "Engineering".

Does it include course syllabi or module breakdowns? No. The scraper extracts catalog-level metadata. For detailed course content, visit the course URL directly.

How many courses does edX have? edX hosts several thousand courses and programs from 160+ partner institutions.

Does it capture discounts or promotional pricing? The price field reflects the list price from the catalog. Promotional pricing is not captured.

Can I run this on a schedule? Yes. Use Apify's built-in scheduler to run monthly or quarterly catalog snapshots.

🔌 Integrate with any app

Export your dataset to:

Spreadsheets: Google Sheets, Microsoft Excel, Airtable

Databases: PostgreSQL, MySQL, MongoDB, Supabase

LMS and HR Tools: Workday Learning, Cornerstone, Degreed

Automation: Make, Zapier, n8n, Pipedream

Analytics: Tableau, Power BI, Metabase, Google Looker Studio

| Actor | Description |
| --- | --- |
| Coursera Scraper | Extract course listings and reviews from Coursera |
| Udemy Scraper | Scrape Udemy courses with pricing and ratings |
| LinkedIn Learning Scraper | Collect professional courses from LinkedIn Learning |

💡 Pro Tip: browse the complete ParseForge collection for 50+ ready-to-use data extractors covering EdTech platforms, job boards, marketplaces, and more.


🆘 Need Help? Open our contact form and we will get back to you within one business day.


⚠️ Disclaimer: This actor is an independent tool not affiliated with, endorsed by, or connected to edX Inc. or 2U Inc. It accesses only publicly available course catalog data. Use responsibly and in accordance with edX's Terms of Service. ParseForge is not responsible for how collected data is used.