Pricing

$2.99 / 1,000 results

edX Scraper | All In One | $3 / 1k

Scrape edX into clean, structured course and program data. Capture titles, partners, descriptions, skills, level, language, pacing, duration, availability and enrollment signals. Perfect for curriculum research, catalog building, market analysis and competitive tracking.

Developer

Fatih Tahta

Last modified

3 months ago

EDX Scraper | All In One

Slug: fatihtahta/edx-scraper

Overview

EDX Scraper collects structured listings for courses and programs, including titles, URLs, partner names, descriptions, skills, pacing, and availability metadata. It also captures key catalog attributes such as levels, durations, languages, and enrollment signals when available. edX is a leading online learning marketplace, and its catalog data is valuable for curriculum research, market analysis, and enrichment workflows. Runs are fully automated and consistent, reducing manual effort and keeping datasets current.

Why Use This Actor

Market research & analytics: Track catalog growth, skills coverage, partners, and topic trends across courses and programs.
Product & content teams: Identify gaps, benchmark offerings, and validate curriculum direction with real catalog signals.
Developers & data engineering: Feed clean JSON into pipelines for dashboards, search indexes, or internal catalogs.
Lead gen & enrichment: Enrich partner or institution profiles with offerings, levels, and skill coverage.
Monitoring & competitive tracking: Watch new launches, availability shifts, and catalog changes over time.

Input Parameters

Provide any combination of URLs, queries, and filters. Leave optional fields empty to collect broader results.

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| `startUrls` | `string[]` | One or more edX URLs to collect directly. Accepts search results, category pages, and individual course or program pages. | – |
| `contentType` | `string` | Target content type for search-based discovery. Allowed values: `Courses`, `Programs`, `Executive education courses`, `Degree programs`. | `Courses` |
| `queries` | `string[]` | Keyword searches to discover listings (e.g., subject, skill, partner name). | – |
| `limit` | `integer` | Maximum listings to save per query. Use smaller values for sampling or higher values for broader coverage. | `50000` |
| `proxyConfiguration` | `object` | Optional connection settings for reliability on larger runs. | Apify proxy with `RESIDENTIAL` group |

Example Input

{
  "startUrls": [
    "https://www.edx.org/search?q=data%20science",
    "https://www.edx.org/learn/artificial-intelligence"
  ],
  "contentType": "Courses",
  "queries": ["data", "machine learning"],
  "limit": 2000,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
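
A minimal sketch of assembling this input in code before starting a run. The helper name `build_input` and its validation rules are illustrative assumptions, not part of the actor; only the parameter names, allowed `contentType` values, and defaults come from the table above.

```python
# Allowed values copied from the Input Parameters table.
ALLOWED_CONTENT_TYPES = {
    "Courses",
    "Programs",
    "Executive education courses",
    "Degree programs",
}


def build_input(start_urls=None, queries=None, content_type="Courses", limit=50000):
    """Assemble a run input dict (hypothetical helper, not shipped with the actor)."""
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported contentType: {content_type!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")

    run_input = {
        "contentType": content_type,
        "limit": limit,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }
    # Optional list fields are omitted entirely when empty.
    if start_urls:
        run_input["startUrls"] = list(start_urls)
    if queries:
        run_input["queries"] = list(queries)
    return run_input


example = build_input(queries=["data", "machine learning"], limit=2000)
```

The resulting dict can be serialized with `json.dumps` and pasted into the actor's input editor, or passed to whichever client is used to start the run.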

Output

6.1 Output destination

The actor writes results to an Apify dataset as JSON records.

6.2 Record envelope (all items)

Each record includes the following stable identifiers:

type (string, required)
id (string, required)
url (string, required)

Recommended idempotency key: type + ":" + id. Use this key for deduplication and upserts when the same entity appears across multiple runs or inputs.
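
The key described above can be sketched as follows. The function names and the in-memory dict used as a store are illustrative assumptions; a real pipeline would more likely upsert into a database table keyed the same way.

```python
def record_key(record):
    """Build the recommended idempotency key: type + ":" + id."""
    return f"{record['type']}:{record['id']}"


def upsert(store, records):
    """Insert or overwrite records in a dict keyed by idempotency key."""
    for record in records:
        # A record seen again in a later run replaces the earlier copy.
        store[record_key(record)] = record
    return store


store = {}
upsert(store, [{"type": "course", "id": "abc", "url": "u1", "title": "Old"}])
upsert(store, [{"type": "course", "id": "abc", "url": "u1", "title": "New"}])
```

Including `type` in the key keeps a course and a program from colliding if they ever share an `id`.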

6.3 Examples

Example: course (type = "course")

{
  "type": "course",
  "id": "c2004e8e-3882-4927-a883-1c5f39a28865",
  "url": "https://www.edx.org/learn/data-science/harvard-university-introduction-to-data-science-with-python",
  "title": "Introduction to Data Science with Python",
  "source_url": "https://www.edx.org/search?page=1&q=data&tab=course",
  "seed_type": "query",
  "seed_value": "data",
  "data": {
    "productImageUrl": "https://prod-discovery.edx-cdn.org/cdn-cgi/image/width=auto,height=auto,quality=75,format=webp/media/course/image/c2004e8e-3882-4927-a883-1c5f39a28865-19b5ff5b0248.jpeg",
    "productType": "Course",
    "product": "Course",
    "attributes": [
      "Data Analysis & Statistics",
      "Computer Science"
    ],
    "partnerName": "Harvard University",
    "partnerLogoUrl": "https://prod-discovery.edx-cdn.org/organization/logos/44022f13-20df-4666-9111-cede3e5dc5b6-2cc39992c67a.png",
    "fullDescription": "<p>Every single minute, computers across the world collect millions of gigabytes of data. What can you do to make sense of this mountain of data? How do data scientists use this data for the applications that power our modern world?</p>\n<p>Data science is an ever-evolving field, using algorithms and scientific methods to parse complex data sets. Data scientists use a range of programming languages, such as Python and R, to harness and analyze data. This course focuses on using Python in data science. By the end of the course, you’ll have a fundamental understanding of machine learning models and basic concepts around Machine Learning (ML) and Artificial Intelligence (AI).</p>\n<p>Using Python, learners will study regression models (Linear, Multilinear, and Polynomial) and classification models (kNN, Logistic), utilizing popular libraries such as sklearn, Pandas, matplotlib, and numPy. The course will cover key concepts of machine learning such as: picking the right complexity, preventing overfitting, regularization, assessing uncertainty, weighing trade-offs, and model evaluation. Participation in this course will build your confidence in using Python, preparing you for more advanced study in Machine Learning (ML) and Artificial Intelligence (AI), and advancement in your career.</p>\n<p>Learners must have a minimum baseline of programming knowledge (preferably in Python) and statistics in order to be successful in this course. Python prerequisites can be met with an introductory Python course offered through CS50’s Introduction to Programming with Python, and statistics prerequisites can be met via Fat Chance or with Stat110 offered through HarvardX.</p>",
    "shortDescription": "<p>Learn the concepts and techniques that make up the foundation of data science and machine learning.</p>",
    "productOverview": "<ul>\n<li>Gain hands-on experience and practice using Python to solve real data science challenges</li>\n<li>Practice Python programming and coding for modeling, statistics, and storytelling</li>\n<li>Utilize popular libraries such as Pandas, numPy, matplotlib, and SKLearn</li>\n<li>Run basic machine learning models using Python, evaluate how those models are performing, and apply those models to real-world problems</li>\n<li>Build a foundation for the use of Python in machine learning and artificial intelligence, preparing you for future Python study</li>\n</ul>",
    "locationRestrictions": [
      {
        "allowedIn": [
          "null"
        ],
        "blockedIn": [
          "null"
        ]
      }
    ],
    "skills": [
      "NumPy",
      "Data Science",
      "Python (Programming Language)",
      "Algorithms",
      "Artificial Intelligence",
      "Machine Learning",
      "R (Programming Language)",
      "Parsing",
      "Scientific Methods",
      "Matplotlib",
      "Pandas (Python Package)",
      "Scikit-learn (Machine Learning Library)"
    ],
    "skillsData": [
      {
        "skill": "NumPy",
        "category": "Analysis",
        "subcategory": "Mathematical Software"
      },
      {
        "skill": "Data Science",
        "category": "Analysis",
        "subcategory": "Data Science"
      },
      {
        "skill": "Python (Programming Language)",
        "category": "Information Technology",
        "subcategory": "Scripting Languages"
      },
      {
        "skill": "Algorithms",
        "category": "Information Technology",
        "subcategory": "Computer Science"
      },
      {
        "skill": "Artificial Intelligence",
        "category": "Information Technology",
        "subcategory": "Artificial Intelligence and Machine Learning (AI/ML)"
      },
      {
        "skill": "Machine Learning",
        "category": "Information Technology",
        "subcategory": "Artificial Intelligence and Machine Learning (AI/ML)"
      },
      {
        "skill": "R (Programming Language)",
        "category": "Information Technology",
        "subcategory": "Other Programming Languages"
      },
      {
        "skill": "Parsing",
        "category": "Information Technology",
        "subcategory": "Computer Science"
      },
      {
        "skill": "Scientific Methods",
        "category": "Science and Research",
        "subcategory": "Research Methodology"
      },
      {
        "skill": "Matplotlib",
        "category": "Analysis",
        "subcategory": "Data Visualization"
      },
      {
        "skill": "Pandas (Python Package)",
        "category": "Analysis",
        "subcategory": "Data Science"
      },
      {
        "skill": "Scikit-learn (Machine Learning Library)",
        "category": "Information Technology",
        "subcategory": "Artificial Intelligence and Machine Learning (AI/ML)"
      }
    ],
    "level": [
      "Intermediate"
    ],
    "flexibility": "self_paced",
    "weeksToComplete": 8,
    "minHoursEffortPerWeek": 3,
    "maxHoursEffortPerWeek": 4,
    "isActive": true,
    "isPartOfProgram": true,
    "recentEnrollmentCount": 47680,
    "availability": [
      "Current"
    ],
    "language": [
      "English"
    ],
    "partner": [
      "Harvard University"
    ],
    "staff": [
      "pavlos-protopapas-3"
    ],
    "aiLanguages": {
      "translationLanguages": [
        "Arabic",
        "German - Germany",
        "Greek",
        "English",
        "Spanish - Latin America and Caribbean",
        "French",
        "Indonesian",
        "Italian - Italy",
        "Portuguese - Brazil",
        "Russian",
        "Thai",
        "Chinese - China"
      ]
    }
  }
}
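
Description fields such as `fullDescription`, `shortDescription`, and `productOverview` arrive as HTML strings. A minimal sketch of reducing them to plain text with the Python standard library follows; the helper name `html_to_text` is an assumption, and the approach simply discards all tags rather than preserving structure.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and ignore every tag."""

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text content of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps adjacent paragraphs and list items apart;
    # the split/join pass then collapses runs of whitespace.
    return " ".join(" ".join(parser.parts).split())


summary = html_to_text(
    "<p>Learn the concepts and techniques that make up the foundation "
    "of data science and machine learning.</p>"
)
```

Joining text nodes with a space is a trade-off: it separates block elements cleanly but would also insert a space inside a word that is split by an inline tag.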

Example: program (type = "program")

{
  "type": "program",
  "id": "0a185424-0687-4bdc-aed9-4574b736c1d6",
  "url": "https://www.edx.org/certificates/professional-certificate/harvardx-computer-science-for-databases-using-sql",
  "title": "Computer Science for Databases using SQL",
  "source_url": "https://www.edx.org/search?page=1&q=data&tab=program",
  "seed_type": "query",
  "seed_value": "data",
  "data": {
    "productImageUrl": "https://prod-discovery.edx-cdn.org/cdn-cgi/image/width=auto,height=auto,quality=75,format=webp/media/programs/card_images/0a185424-0687-4bdc-aed9-4574b736c1d6-dac9a8bd5abf.png",
    "productType": "Professional Certificate",
    "product": "Program",
    "attributes": [
      "Computer Science",
      "Data Analysis & Statistics",
      "Engineering",
      "Business & Management"
    ],
    "partnerName": "Harvard University",
    "partnerLogoUrl": "https://prod-discovery.edx-cdn.org/organization/logos/44022f13-20df-4666-9111-cede3e5dc5b6-2cc39992c67a.png",
    "fullDescription": "A comprehensive understanding of computer science principles, including algorithmic thinking, software development, and problem-solving. ,How to utilize real-world datasets to apply programming knowledge using languages like C, Python, and SQL. ,Database design principles and the importance of SQL language for effective data management. ,How to create and build databases, understanding how to connect SQL with other coding languages. ,Career-relevant skills with hands-on practice developing efficient information management strategies.",
    "shortDescription": "Gain hands-on experience building and analyzing datasets; building relational databases; and understanding how to connect SQL with other popular coding languages like Python and Java.",
    "productOverview": "<p>An estimated 120 zettabytes of data are created each year—that’s 21 zeroes—including new data captured, copied, and consumed. With that number growing annually, the requirements for database infrastructure, architecture, and storage are evolving just as rapidly.</p>\r\n \r\n<p>According to the U.S. Bureau of Labor Statistics, computer science for databases, including database administration, analysts, and architects, corresponds with these numbers with anticipated growth of 8% over the next 10 years, faster than the average for all occupations. To prepare yourself for a career in the industry, you must not only understand the basics of computer science, but also how to create relationships with the data being created or ingested.</p>\r\n \r\n \r\n<p>Using HarvardX’s most popular courses, CS50: Introduction to Computer Science as the foundation, learners explore how to think algorithmically and how to solve problems efficiently, using real-world data sets.You will build on those skills by developing the core competencies needed for database development and structures. By focusing on the primary database language of SQL, you will learn how to create data relationships, normalize data to decrease the potential for errors or redundancy, and automate and optimize searches.</p>",
    "courseCount": 2,
    "locationRestrictions": [
      {
        "allowedIn": [
          "null"
        ],
        "blockedIn": [
          "null"
        ]
      }
    ],
    "skills": [
      "Database Development",
      "Database Administration",
      "SQL (Programming Language)",
      "Computer Science",
      "Infrastructure Architecture"
    ],
    "skillsData": [
      {
        "skill": "Database Development",
        "category": "Information Technology",
        "subcategory": "Database Administration"
      },
      {
        "skill": "Database Administration",
        "category": "Information Technology",
        "subcategory": "Database Administration"
      },
      {
        "skill": "SQL (Programming Language)",
        "category": "Information Technology",
        "subcategory": "Query Languages"
      },
      {
        "skill": "Computer Science",
        "category": "Information Technology",
        "subcategory": "Computer Science"
      },
      {
        "skill": "Infrastructure Architecture",
        "category": "Information Technology",
        "subcategory": "IT Management"
      }
    ],
    "level": [
      "Introductory"
    ],
    "weeksToComplete": 19,
    "weeksToCompleteMin": 18,
    "weeksToCompleteMax": 19,
    "minHoursEffortPerWeek": 6,
    "maxHoursEffortPerWeek": 18,
    "courseUuids": [
      "da1b2400-322b-459b-97b0-0c557f05d017",
      "3e45c431-10df-423e-9f03-fb98b713cd4a"
    ],
    "isActive": true,
    "isPartOfProgram": false,
    "recentEnrollmentCount": 372100,
    "availability": [
      "Current"
    ],
    "staff": [
      "david-j-malan",
      "carter-zenke",
      "doug-lloyd",
      "brian-yu"
    ],
    "tags": [
      "c",
      "c-programming",
      "computer-programming",
      "partner",
      "python",
      "sql",
      "stardust-2019",
      "web-development"
    ],
    "language": [
      "English"
    ],
    "partner": [
      "Harvard University"
    ],
    "contentfulFields": {
      "excludedFromSearch": false,
      "excludedFromSeo": false
    }
  }
}

Example: executive_education_course (type = "executive_education_course")

{
  "type": "executive_education_course",
  "id": "3cb02711-99a9-44ac-9acd-ef20cb346e43",
  "url": "https://www.edx.org/executive-education/the-london-school-of-economics-and-political-science-data-analysis-for-management",
  "title": "Data Analysis for Management",
  "source_url": "https://www.edx.org/search?page=1&q=data&tab=executive-education",
  "seed_type": "query",
  "seed_value": "data",
  "productImageUrl": "https://prod-discovery.edx-cdn.org/cdn-cgi/image/width=auto,height=auto,quality=75,format=webp/media/course/image/3cb02711-99a9-44ac-9acd-ef20cb346e43-77f0a1058824.jpg",
  "productType": "Executive Education",
  "product": "Executive Education",
  "attributes": [
    "Data Analysis & Statistics"
  ],
  "partnerName": "LSE",
  "partnerLogoUrl": "https://prod-discovery.edx-cdn.org/organization/logo_override/3cb02711-99a9-44ac-9acd-ef20cb346e43-4a69d72dd837.png",
  "fullDescription": "<p>As the world becomes more data driven, future-focused professionals need to develop the quantitative skills to inform corporate decision-making and managerial strategy. Guided by experts from the London School of Economics and Political Science (LSE), this online certificate course empowers you with the knowledge and practical tools to understand, interpret, and communicate data relevant to your role and organisation. </p>\n<p>Develop an understanding of how data-driven models can improve your ability to make decisions in a fast-paced world. Over the course of eight weeks, you’ll participate in a capstone project and apply the techniques and concepts covered to extract business insights from a real data set. You’ll also gain experience using Tableau – a leading business intelligence and data analytics software – to visualize and report on insights extracted from data sets.  </p>\n<p>This LSE course will equip managers and analysts in a variety of industries with the skills to make data-driven decisions. Marketing and sales analysts will gain the ability to extract and interpret key business insights for competitive advantage. Similarly, finance, HR, and business analysts will develop data analysis skills that can be directly applied in their role and organisation. There are no formal prerequisites for this course, but some numerical literacy is advantageous, as well as a basic working knowledge of Microsoft Excel. You’ll be granted a student license to download and use Tableau free of charge, for the duration of the course.</p>\n<p><em>The content of this course also forms part of the 16-week online<a href=\"https://www.edx.org/executive-education/lse-data-driven-management-executive-programme\">LSE Data-Driven Management: Analysis, Visualisation, and Storytelling Executive Programme</a>, a powerful offering from LSE that combines visualisation and data analysis with business management acumen.</em><br />\n__</p>\n<p><img src='https://lh7-rt.googleusercontent.com/docsz/AD_4nXdFJ92h9_Tqq_Fyb-xo1f-H5dwRvbJ9vrBbBESQWbJzvZO9_YJrNjYXqCWoFfvR8q61o2YQIDSef_DkXUJsKGKriQq4xLnyQy44iWalmHhuQqyur-nclToluGv0hCxzXMqEuy9gRQ?key=soh3Wm7bwYjJDHysbRJaHm_f' width='195' height='178' />This Data Analysis for Management course is certified by the United Kingdom CPD Certification Service, and may be applicable to individuals who are members of, or are associated with, UK-based professional bodies. The course has an estimated 70 hours of learning.</p>\n<p>Note: should you wish to claim CPD activity, the onus is on you. The London School of Economics and Political Science (LSE) and GetSmarter accept no responsibility, and cannot be held responsible, for the claiming or validation of hours or points.</p>",
  "shortDescription": "<p>Learn the skills to analyse, interpret, and communicate data with confidence and impact within your organisation.</p>",
  "productOverview": "<p>On completion of this course, you’ll walk away with:An understanding of how data-driven models can improve your ability to make decisions in a fast-paced and uncertain world, and the ability to use modelling to predict future trends. Tableau data visualisation and reporting skills, with which to clearly communicate your findings and business needs. A capstone project as proof of your ability to analyse, summarise, visualise, and report on insights extracted from a dataset. Unlimited access to edX’s Career Engagement Network, offering you exclusive resources and events to support your professional journey and drive your career forward.</p>",
  "locationRestrictions": [
    {
      "allowedIn": [
        "null"
      ],
      "blockedIn": [
        "null"
      ]
    }
  ],
  "skills": [
    "Sales",
    "Economics",
    "Business Intelligence",
    "Ruby On Rails",
    "Project Management Institute (PMI) Methodology",
    "Data-Driven Decision-Making",
    "Management",
    "Political Sciences",
    "Statistical Analysis",
    "Microsoft Excel",
    "Marketing",
    "Data Analysis",
    "Blobs",
    "Tableau (Business Intelligence Software)",
    "Big Data",
    "Finance",
    "Decision Making"
  ],
  "skillsData": [
    {
      "skill": "Sales",
      "category": "Sales",
      "subcategory": "General Sales Practices"
    },
    {
      "skill": "Economics",
      "category": "Economics, Policy, and Social Studies",
      "subcategory": "Economics"
    },
    {
      "skill": "Business Intelligence",
      "category": "Analysis",
      "subcategory": "Business Intelligence"
    },
    {
      "skill": "Ruby On Rails",
      "category": "Information Technology",
      "subcategory": "Web Design and Development"
    },
    {
      "skill": "Project Management Institute (PMI) Methodology",
      "category": "Business",
      "subcategory": "Project Management"
    },
    {
      "skill": "Data-Driven Decision-Making",
      "category": "Business",
      "subcategory": "Business Analysis"
    },
    {
      "skill": "Management",
      "category": "Business",
      "subcategory": "Business Management"
    },
    {
      "skill": "Political Sciences",
      "category": "Economics, Policy, and Social Studies",
      "subcategory": "Social Studies"
    },
    {
      "skill": "Statistical Analysis",
      "category": "Analysis",
      "subcategory": "Data Analysis"
    },
    {
      "skill": "Microsoft Excel",
      "category": "Administration",
      "subcategory": "Office and Productivity Software"
    },
    {
      "skill": "Marketing",
      "category": "Marketing and Public Relations",
      "subcategory": "Marketing Management"
    },
    {
      "skill": "Data Analysis",
      "category": "Analysis",
      "subcategory": "Data Analysis"
    },
    {
      "skill": "Blobs",
      "category": "Information Technology",
      "subcategory": "Computer Science"
    },
    {
      "skill": "Tableau (Business Intelligence Software)",
      "category": "Analysis",
      "subcategory": "Business Intelligence Software"
    },
    {
      "skill": "Big Data",
      "category": "Analysis",
      "subcategory": "Data Science"
    },
    {
      "skill": "Finance",
      "category": "Finance",
      "subcategory": "Financial Accounting"
    },
    {
      "skill": "Decision Making",
      "category": "Physical and Inherent Abilities",
      "subcategory": "Initiative and Leadership"
    }
  ],
  "level": [
    "Introductory"
  ],
  "flexibility": "instructor_paced",
  "weeksToComplete": 8,
  "minHoursEffortPerWeek": 7,
  "maxHoursEffortPerWeek": 10,
  "isActive": true,
  "isPartOfProgram": false,
  "recentEnrollmentCount": 10,
  "availability": [
    "Upcoming"
  ],
  "language": [
    "English"
  ],
  "partner": [
    "LSE"
  ],
  "externalUrl": "https://onlinecertificatecourses.lse.ac.uk/presentations/lp/lse-data-analysis-for-management-online-certificate-course-pr/",
  "tags": [
    "brand_lse",
    "category_data-analytics",
    "category_management",
    "location_london",
    "location_uk",
    "vertical_executive-education"
  ],
  "aiLanguages": {
    "translationLanguages": [
      "Arabic",
      "English",
      "Spanish - Latin America and Caribbean",
      "Indonesian",
      "Portuguese - Brazil"
    ]
  }
}
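
In the examples, course and program records nest their listing attributes under `data`, while this executive education record keeps them at the top level. A small accessor can hide that difference from downstream code. This is a sketch under that observation only; the name `payload` and the `ENVELOPE_KEYS` set are illustrative assumptions.

```python
# Keys shared by every record regardless of type, per the examples.
ENVELOPE_KEYS = {
    "type", "id", "url", "title", "source_url", "seed_type", "seed_value",
}


def payload(record):
    """Return listing attributes whether nested under "data" or stored flat."""
    nested = record.get("data")
    if isinstance(nested, dict):
        return nested
    # Flat layout: everything outside the envelope is a listing attribute.
    return {k: v for k, v in record.items() if k not in ENVELOPE_KEYS}


course = {"type": "course", "id": "c1", "data": {"partnerName": "Harvard University"}}
exec_ed = {"type": "executive_education_course", "id": "e1", "partnerName": "LSE"}
partners = [payload(r).get("partnerName") for r in (course, exec_ed)]
```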

Example: degree_program (type = "degree_program")

{
  "type": "degree_program",
  "id": "f968c923-2db3-4d29-8c0b-bbb0236f10bd",
  "url": "https://www.edx.org/masters/lancasteruniversity-msc-data-science-online",
  "title": "MSc Data Science online",
  "source_url": "https://www.edx.org/search?page=1&q=data&tab=degree-program",
  "seed_type": "query",
  "seed_value": "data",
  "productImageUrl": "https://prod-discovery.edx-cdn.org/cdn-cgi/image/width=auto,height=auto,quality=75,format=webp/media/programs/card_images/f968c923-2db3-4d29-8c0b-bbb0236f10bd-26dc8a398216.jpg",
  "productType": "Masters",
  "product": "2U Degree",
  "partnerName": "Lancaster University",
  "partnerLogoUrl": "https://prod-discovery.edx-cdn.org/organization/logos/8f9e16c2-cb0d-4e37-999a-4e7090ea9b69-9dd6e7cfbec5.png",
  "productOverview": "Fast-track your career with a programme that emphasises the latest research and applications in artificial intelligence (AI) and natural language processing (NLP). Graduate with a technical, analytical, and presentation skill set that’s in-demand and valued by employers across industries.",
  "courseCount": 1,
  "locationRestrictions": [
    {
      "allowedIn": [
        "null"
      ],
      "blockedIn": [
        "null"
      ]
    }
  ],
  "skills": [
    "Data Science",
    "Natural Language Processing",
    "Research",
    "Communications",
    "Teaching",
    "Artificial Intelligence",
    "Data Analysis",
    "Data Modeling",
    "Mathematical Sciences",
    "Machine Learning"
  ],
  "skillsData": [
    {
      "skill": "Data Science",
      "category": "Analysis",
      "subcategory": "Data Science"
    },
    {
      "skill": "Natural Language Processing",
      "category": "Analysis",
      "subcategory": "Natural Language Processing (NLP)"
    },
    {
      "skill": "Research",
      "category": "Science and Research",
      "subcategory": "Research Methodology"
    },
    {
      "skill": "Communications",
      "category": "Physical and Inherent Abilities",
      "subcategory": "Communication"
    },
    {
      "skill": "Teaching",
      "category": "Education and Training",
      "subcategory": "Teaching"
    },
    {
      "skill": "Artificial Intelligence",
      "category": "Information Technology",
      "subcategory": "Artificial Intelligence and Machine Learning (AI/ML)"
    },
    {
      "skill": "Data Analysis",
      "category": "Analysis",
      "subcategory": "Data Analysis"
    },
    {
      "skill": "Data Modeling",
      "category": "Analysis",
      "subcategory": "Data Analysis"
    },
    {
      "skill": "Mathematical Sciences",
      "category": "Analysis",
      "subcategory": "Mathematics and Mathematical Modeling"
    },
    {
      "skill": "Machine Learning",
      "category": "Information Technology",
      "subcategory": "Artificial Intelligence and Machine Learning (AI/ML)"
    }
  ],
  "weeksToComplete": -1,
  "weeksToCompleteMin": -1,
  "weeksToCompleteMax": -1,
  "minHoursEffortPerWeek": -1,
  "maxHoursEffortPerWeek": -1,
  "isActive": false,
  "isPartOfProgram": false,
  "recentEnrollmentCount": 0,
  "availability": [
    "Current"
  ],
  "partner": [
    "Lancaster University"
  ],
  "externalUrl": "https://www.lancaster.ac.uk/study/postgraduate/postgraduate-courses/data-science-msc/2025/",
  "contentfulFields": {
    "pageTitle": "Lancaster University MSc Data Science online",
    "excludedFromSearch": false,
    "excludedFromSeo": false,
    "subheading": "Fast-track your career with a programme that emphasises the latest research and applications in artificial intelligence (AI) and natural language processing (NLP). Graduate with a technical, analytical, and presentation skill set that’s in-demand and valued by employers across industries.",
    "featuredProducts": {
      "heading": "Curriculum",
      "introduction": "The online MSc Data Science programme delivers a career-centric curriculum that’s packed with practical, hands-on learning experiences. The programme is part of Lancaster’s renowned Data Science Institute, a leading research institute driving data science and AI innovations across industries in the UK and abroad. \n\nYou will graduate with specialised skills in various facets of data science, including AI, NLP, data modelling and machine learning. As part of the programme, you will independently design and conduct an original data analysis project to illustrate your mastery of industry-relevant AI skills.",
      "productList": [
        {
          "header": "Learn in a holistic curriculum",
          "description": "Academics from the Data Science Institute, School of Computing and Communications, School of Mathematical Sciences and the Security Lancaster institute collaborate to deliver coursework that reflects the interconnected nature of data science."
        },
        {
          "header": "Build a supportive learning community",
          "description": "A live, one-hour online class each week empowers you to connect with teaching staff and peers in real time to discuss coursework and develop lasting relationships."
        },
        {
          "header": "Go from introductory to advanced data science knowledge",
          "description": "The online MSc Data Science consists of six 20-credit modules that cover programming, natural language processing, AI ethics and large-scale platforms, culminating in one 60-credit project dissertation."
        }
      ]
    }
  }
}
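
The degree program example carries `-1` in its duration and effort fields, and every example lists the string `"null"` inside `locationRestrictions`. The sketch below treats both as placeholders for "no value", which is an assumption drawn from the examples rather than documented behavior, and the helper name `clean_placeholders` is hypothetical.

```python
NUMERIC_FIELDS = (
    "weeksToComplete",
    "weeksToCompleteMin",
    "weeksToCompleteMax",
    "minHoursEffortPerWeek",
    "maxHoursEffortPerWeek",
)


def clean_placeholders(fields):
    """Return a copy with -1 durations set to None and "null" strings dropped."""
    cleaned = dict(fields)  # shallow copy; the input dict is left untouched
    for name in NUMERIC_FIELDS:
        if cleaned.get(name) == -1:  # assumption: -1 means "unknown"
            cleaned[name] = None
    if "locationRestrictions" in cleaned:
        cleaned["locationRestrictions"] = [
            # assumption: the string "null" means "no country listed"
            {key: [v for v in values if v != "null"] for key, values in entry.items()}
            for entry in cleaned["locationRestrictions"]
        ]
    return cleaned


degree = {
    "weeksToComplete": -1,
    "courseCount": 1,
    "locationRestrictions": [{"allowedIn": ["null"], "blockedIn": ["null"]}],
}
normalized = clean_placeholders(degree)
```

Converting placeholders to `None` early keeps them out of averages and range filters computed later.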

Field reference

Course fields (`type = "course"`)

title (string, required): Course title.
source_url (string, optional): Source page where the listing was discovered.
seed_type (string, optional): Input source type (e.g., query or URL).
seed_value (string, optional): Input value that produced the record.
data.productImageUrl (string, optional): Primary image URL.
data.productType (string, optional): Displayed product type.
data.product (string, optional): High-level category label.
data.attributes (array[string], optional): Category/subject labels.
data.partnerName (string, optional): Partner or institution name.
data.partnerLogoUrl (string, optional): Partner logo URL.
data.fullDescription (string, optional): HTML description.
data.shortDescription (string, optional): Short HTML description.
data.productOverview (string, optional): Overview HTML.
data.locationRestrictions (array[object], optional): Availability by region if present.
data.partnerKeys (array[string], optional): Partner identifiers.
data.skills (array[string], optional): Skill labels.
data.skillsData (array[object], optional): Skills with taxonomy metadata.
data.level (array[string], optional): Level labels.
data.flexibility (string, optional): Pacing mode.
data.weeksToComplete (number, optional): Estimated duration in weeks.
data.minHoursEffortPerWeek (number, optional): Minimum hours/week.
data.maxHoursEffortPerWeek (number, optional): Maximum hours/week.
data.isActive (boolean, optional): Active listing indicator.
data.isPartOfProgram (boolean, optional): Program membership flag.
data.recentEnrollmentCount (number, optional): Recent enrollment count.
data.productSource (string, optional): Source label.
data.activeRunKey (string, optional): Run key when available.
data.displayOnOrgPage (boolean, optional): Display flag.
data.availability (array[string], optional): Availability status.
data.language (array[string], optional): Language labels.
data.partner (array[string], optional): Partner names.
data.staff (array[string], optional): Staff identifiers.
data.aiLanguages.translationLanguages (array[string], optional): Translation languages list.
data.showInAlgoliaSearchResults (boolean, optional): Search visibility flag.
data.availabilityRank (number, optional): Availability ranking value.
data.productCreated (number, optional): Creation timestamp.
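
Because every `data.*` field is optional, downstream code should tolerate missing keys. As one illustration, the sketch below counts skill taxonomy categories across course records using `data.skillsData`; the function name `skill_categories` is an assumption, and the `category` key inside each entry is taken from the course example rather than from this field list.

```python
from collections import Counter


def skill_categories(records):
    """Count skillsData categories across records, skipping missing fields."""
    counts = Counter()
    for record in records:
        data = record.get("data") or {}
        for entry in data.get("skillsData") or []:
            category = entry.get("category")
            if category:
                counts[category] += 1
    return counts


courses = [
    {"type": "course", "id": "a", "data": {"skillsData": [
        {"skill": "NumPy", "category": "Analysis"},
        {"skill": "Algorithms", "category": "Information Technology"},
    ]}},
    {"type": "course", "id": "b", "data": {}},  # no skillsData at all
]
totals = skill_categories(courses)
```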

Program fields (`type = "program"`)

title (string, required): Program title.
source_url (string, optional): Source page where the listing was discovered.
seed_type (string, optional): Input source type.
seed_value (string, optional): Input value that produced the record.
data.productImageUrl (string, optional): Primary image URL.
data.productType (string, optional): Displayed product type.
data.product (string, optional): High-level category label.
data.attributes (array[string], optional): Category/subject labels.
data.partnerName (string, optional): Partner or institution name.
data.partnerLogoUrl (string, optional): Partner logo URL.
data.fullDescription (string, optional): Program description.
data.shortDescription (string, optional): Short description.
data.productOverview (string, optional): Overview HTML.
data.courseCount (number, optional): Number of courses.
data.locationRestrictions (array[object], optional): Availability by region if present.
data.partnerKeys (array[string], optional): Partner identifiers.
data.skills (array[string], optional): Skill labels.
data.skillsData (array[object], optional): Skills with taxonomy metadata.
data.level (array[string], optional): Level labels.
data.weeksToComplete (number, optional): Estimated duration in weeks.
data.weeksToCompleteMin (number, optional): Minimum duration in weeks.
data.weeksToCompleteMax (number, optional): Maximum duration in weeks.
data.minHoursEffortPerWeek (number, optional): Minimum hours/week.
data.maxHoursEffortPerWeek (number, optional): Maximum hours/week.
data.courseUuids (array[string], optional): Included course identifiers.
data.isActive (boolean, optional): Active listing indicator.
data.isPartOfProgram (boolean, optional): Program membership flag.
data.recentEnrollmentCount (number, optional): Recent enrollment count.
data.productSource (string, optional): Source label.
data.displayOnOrgPage (boolean, optional): Display flag.
data.availability (array[string], optional): Availability status.
data.staff (array[string], optional): Staff identifiers.
data.tags (array[string], optional): Tags.
data.language (array[string], optional): Language labels.
data.partner (array[string], optional): Partner names.
data.showInAlgoliaSearchResults (boolean, optional): Search visibility flag.
data.contentfulFields.excludedFromSearch (boolean, optional): Search exclusion flag.
data.contentfulFields.excludedFromSeo (boolean, optional): SEO exclusion flag.
data.productCreated (number, optional): Creation timestamp.

Executive education course fields (`type = "executive_education_course"`)

title (string, required): Course title.
source_url (string, optional): Source page where the listing was discovered.
seed_type (string, optional): Input source type.
seed_value (string, optional): Input value that produced the record.
productImageUrl (string, optional): Primary image URL.
productType (string, optional): Displayed product type.
product (string, optional): High-level category label.
attributes (array[string], optional): Category/subject labels.
partnerName (string, optional): Partner or institution name.
partnerLogoUrl (string, optional): Partner logo URL.
fullDescription (string, optional): Course description.
shortDescription (string, optional): Short description.
productOverview (string, optional): Overview HTML.
locationRestrictions (array[object], optional): Availability by region if present.
partnerKeys (array[string], optional): Partner identifiers.
skills (array[string], optional): Skill labels.
skillsData (array[object], optional): Skills with taxonomy metadata.
level (array[string], optional): Level labels.
flexibility (string, optional): Pacing mode.
weeksToComplete (number, optional): Estimated duration in weeks.
minHoursEffortPerWeek (number, optional): Minimum hours/week.
maxHoursEffortPerWeek (number, optional): Maximum hours/week.
isActive (boolean, optional): Active listing indicator.
isPartOfProgram (boolean, optional): Program membership flag.
recentEnrollmentCount (number, optional): Recent enrollment count.
productSource (string, optional): Source label.
activeRunKey (string, optional): Run key when available.
partnerLogoOverride (string, optional): Alternate partner logo URL.
displayOnOrgPage (boolean, optional): Display flag.
availability (array[string], optional): Availability status.
language (array[string], optional): Language labels.
partner (array[string], optional): Partner names.
externalUrl (string, optional): External landing page URL.
tags (array[string], optional): Tags.
aiLanguages.translationLanguages (array[string], optional): Translation languages list.
showInAlgoliaSearchResults (boolean, optional): Search visibility flag.
availabilityRank (number, optional): Availability ranking value.
productCreated (number, optional): Creation timestamp.

Degree program fields (`type = "degree_program"`)

title (string, required): Degree program title.
source_url (string, optional): Source page where the listing was discovered.
seed_type (string, optional): Input source type.
seed_value (string, optional): Input value that produced the record.
productImageUrl (string, optional): Primary image URL.
productType (string, optional): Displayed product type.
product (string, optional): High-level category label.
partnerName (string, optional): Partner or institution name.
partnerLogoUrl (string, optional): Partner logo URL.
productOverview (string, optional): Overview description.
courseCount (number, optional): Number of courses.
locationRestrictions (array[object], optional): Availability by region if present.
partnerKeys (array[string], optional): Partner identifiers.
skills (array[string], optional): Skill labels.
skillsData (array[object], optional): Skills with taxonomy metadata.
weeksToComplete (number, optional): Estimated duration in weeks.
weeksToCompleteMin (number, optional): Minimum duration in weeks.
weeksToCompleteMax (number, optional): Maximum duration in weeks.
minHoursEffortPerWeek (number, optional): Minimum hours/week.
maxHoursEffortPerWeek (number, optional): Maximum hours/week.
isActive (boolean, optional): Active listing indicator.
isPartOfProgram (boolean, optional): Program membership flag.
recentEnrollmentCount (number, optional): Recent enrollment count.
productSource (string, optional): Source label.
displayOnOrgPage (boolean, optional): Display flag.
availability (array[string], optional): Availability status.
partner (array[string], optional): Partner names.
externalUrl (string, optional): External landing page URL.
showInAlgoliaSearchResults (boolean, optional): Search visibility flag.
contentfulFields.pageTitle (string, optional): Page title.
contentfulFields.excludedFromSearch (boolean, optional): Search exclusion flag.
contentfulFields.excludedFromSeo (boolean, optional): SEO exclusion flag.
contentfulFields.subheading (string, optional): Subheading text.
contentfulFields.featuredProducts.heading (string, optional): Featured section heading.
contentfulFields.featuredProducts.introduction (string, optional): Featured section introduction.
contentfulFields.featuredProducts.productList (array[object], optional): Featured items list.
contentfulFields.featuredProducts.productList.header (string, optional): Featured item header.
contentfulFields.featuredProducts.productList.description (string, optional): Featured item description.
productCreated (number, optional): Creation timestamp.
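
Many fields in the lists above are optional, and several are written as dotted paths (for example `contentfulFields.featuredProducts.heading`). The following is a minimal Python sketch of a null-safe lookup, assuming the dotted names correspond to nested JSON objects in each dataset item. The `get_path` helper and the sample record are hypothetical illustrations, not part of the actor.

```python
from typing import Any


def get_path(record: dict, path: str, default: Any = None) -> Any:
    """Follow a dotted path through nested dicts.

    Returns `default` if any step is missing, null, or not an object.
    """
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict):
            return default
        current = current.get(key)
        if current is None:
            return default
    return current


# Hypothetical record shaped like the degree program field list (values invented).
sample = {
    "type": "degree_program",
    "title": "Example Degree",
    "contentfulFields": {"excludedFromSearch": False, "featuredProducts": None},
}

print(get_path(sample, "contentfulFields.excludedFromSearch"))
print(get_path(sample, "contentfulFields.featuredProducts.heading", "n/a"))
print(get_path(sample, "recentEnrollmentCount", 0))
```

The same helper works for paths that start with `data.`, as in the course and program lists, since it only splits on dots and walks whatever nesting the item actually has.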

Data guarantees & handling

Best-effort extraction: the fields returned may vary by region, session, availability, or UI experiments.
Optional fields: any optional field may be missing or null, so null-check it in downstream code.
Deduplication: the recommended unique key is `type + ":" + id`.
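
The deduplication recommendation can be sketched in Python as follows. This assumes each item carries top-level `type` and `id` values, as the key format implies; the `dedupe` function and the sample items are illustrative only.

```python
from typing import Iterable, Iterator


def dedupe(items: Iterable[dict]) -> Iterator[dict]:
    """Yield the first item seen for each type + ":" + id key."""
    seen: set[str] = set()
    for item in items:
        item_type, item_id = item.get("type"), item.get("id")
        if item_type is None or item_id is None:
            # No key can be built; pass the item through rather than drop it.
            yield item
            continue
        key = f"{item_type}:{item_id}"
        if key not in seen:
            seen.add(key)
            yield item


# Invented sample items: two share a key, one reuses the id under another type.
items = [
    {"type": "course", "id": "abc", "title": "First"},
    {"type": "course", "id": "abc", "title": "Duplicate"},
    {"type": "program", "id": "abc", "title": "Same id, different type"},
]
unique = list(dedupe(items))
print([item["title"] for item in unique])
```

Including `type` in the key keeps a course and a program apart even if they happen to share an identifier.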

Notes & Limitations

Respect edX terms of service and applicable laws.
Avoid excessive frequency or unnecessary repeat runs.
Pricing, availability, and enrollment indicators may vary by region, time, or session.
Validate collected data for compliance with your policies.

Support

For help, open an issue on the actor page. Include the input you used (redact sensitive data), the run ID, expected vs. actual behavior, and a small sample of output if available.
