# Jobsplus Course Info Parser Spider (`getdataforme/jobsplus-course-info-parser-spider`) Actor

This specialized spider extracts comprehensive, structured data on jobseeker courses from the Jobsplus portal. It captures over 15 key data points—including title, fee, duration, MQF level, and detailed aims—for market analysis and educational research....

- **URL**: https://apify.com/getdataforme/jobsplus-course-info-parser-spider.md
- **Developed by:** [GetDataForMe](https://apify.com/getdataforme) (community)
- **Categories:** AI, Automation, E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $9.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
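
As a rough illustration of the pricing above, the sketch below estimates the charge for a given number of results. It assumes every result is billed at the listed starting rate of $9.00 per 1,000 and ignores any other billable events or pricing tiers; `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumption: flat starting rate taken from the pricing line above.
PRICE_PER_1000_RESULTS_USD = Decimal("9.00")


def estimate_cost_usd(result_count: int) -> Decimal:
    """Estimate the charge for a number of results at the starting rate."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS_USD * result_count / 1000
    return cost.quantize(Decimal("0.01"))


# 250 results at $9.00 per 1,000
print(estimate_cost_usd(250))
```

Using `Decimal` rather than `float` keeps the arithmetic exact for currency values.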

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 📚 Jobsplus Course Info Parser Spider

The Jobsplus Course Info Parser Spider is a specialized web scraping tool designed to extract comprehensive and structured data from the Jobsplus portal. It efficiently collects detailed information about jobseeker courses, making it invaluable for market analysis, educational research, and content aggregation.

### ✨ Features

*   **Comprehensive Data Extraction:** Captures over 15 key data points for each course, including title, fee, duration, location, and detailed aims.
*   **Structured Output:** Provides clean, JSON-formatted data, making it immediately usable for databases and analytics tools.
*   **Multi-URL Support:** Easily process multiple course URLs in a single run, saving time and effort.
*   **Rich Metadata Capture:** Extracts specific details like MQF level, delivery mode, and category tags for deep analysis.
*   **Robust Handling:** Designed to handle variations in course page layouts and data availability.

### ⚙️ Input Parameters

The spider is configured via the following parameter:

| Parameter | Type | Required | Description | Example |
| :--- | :--- | :--- | :--- | :--- |
| `Urls` | Array of Strings | No | The list of specific URLs containing the course details that you wish to scrape. | `["https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991", "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4315"]` |
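
If you track courses by their numeric ID, a small helper can assemble the `Urls` array for you. The following is a minimal sketch: the base path is copied from the example URLs in the table, while `build_actor_input` and `is_course_url` are hypothetical names introduced here for illustration.

```python
import json
from urllib.parse import urlencode, urlparse

# Base path taken from the example URLs in the table above.
COURSE_DETAILS_URL = "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker"


def build_actor_input(course_ids):
    """Build the `Urls` input object from course IDs (hypothetical helper)."""
    urls = [f"{COURSE_DETAILS_URL}?{urlencode({'id': cid})}" for cid in course_ids]
    if not urls:
        raise ValueError("at least one course ID is needed")
    return {"Urls": urls}


def is_course_url(url):
    """Loose check that a URL looks like a Jobsplus course details page."""
    parsed = urlparse(url)
    return (
        parsed.netloc == "jobsplus.gov.mt"
        and parsed.path.endswith("course-details-jobseeker")
        and "id=" in parsed.query
    )


actor_input = build_actor_input([4991, 4315])
print(json.dumps(actor_input, indent=2))
```

Running `is_course_url` over a hand-written list before starting a run is a cheap way to catch pasted URLs that point elsewhere.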

### 🚀 Example Usage

#### Input JSON

To run the spider, provide an array of URLs:

```json
{
  "Urls": [
    "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
    "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4315"
  ]
}
```

#### Output JSON

The resulting data is an array of objects, where each object represents a scraped course:

```json
[
  {
    "url": "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
    "course_title": "AWARD IN MOVING AND HANDLING FOR CARE WORKERS",
    "image_url": "https://stjobspluslegacyprod001.blob.core.windows.net/libx-4/4406/Moving and Handling.jpg",
    "delivery_mode": "Classroom",
    "fee": "Free",
    "mqf_level": "MQF Level 3",
    "duration_text": "6 hours",
    "duration_hours": 6,
    "location": "HAL FAR, BIRZEBBUGA",
    "language": "ENGLISH",
    "tags": [],
    "aim": "This module aims to give learners the required skills and knowledge on how to safely move and handle patients so that neither the patients nor themselves get hurt.",
    "course_contents_text": "Please click on the following link to view the detailed course content - https://jobsplus.gov.mt/media/10zldinr/award-in-moving-and-safe-handling-for-care-workers.pdf",
    "course_contents_url": "https://jobsplus.gov.mt/media/10zldinr/award-in-moving-and-safe-handling-for-care-workers.pdf",
    "category_tags": [
      "Care Workers"
    ],
    "sessions": [],
    "actor_id": "kVB0TyCOxrtI0vvAE",
    "run_id": "PbmXYVvHSTfc6LBFh"
  }
]
```
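
Records shaped like the one above contain list fields (`tags`, `category_tags`, `sessions`), which do not map directly onto spreadsheet columns. The sketch below shows one possible way to flatten downloaded items into CSV text locally; `items_to_csv` is a hypothetical helper, and joining list values with `"; "` is an arbitrary choice made for this example.

```python
import csv
import io


def items_to_csv(items):
    """Serialize course items to CSV text, joining list fields with '; '."""
    # Collect column names in first-seen order across all items.
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for item in items:
        row = {
            key: "; ".join(map(str, value)) if isinstance(value, list) else value
            for key, value in item.items()
        }
        writer.writerow(row)
    return buffer.getvalue()


# Trimmed copy of the sample record above.
sample = [{
    "course_title": "AWARD IN MOVING AND HANDLING FOR CARE WORKERS",
    "fee": "Free",
    "duration_hours": 6,
    "tags": [],
    "category_tags": ["Care Workers"],
}]
print(items_to_csv(sample))
```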

### 💡 Use Cases

- **Market Research:** Analyze the current educational offerings and trends in the Maltese job market.
- **Academic Research:** Gather structured data for studies on vocational training and skill gaps.
- **Content Aggregation:** Build comprehensive directories or databases of professional development courses.
- **Competitive Intelligence:** Track the types of courses and providers available in a specific sector.
- **Business Automation:** Automate the process of gathering training data for internal knowledge bases.

### 🛠️ Installation and Usage

1. **Search:** Search for "Jobsplus Course Info Parser Spider" in the Apify Store.
2. **Run:** Click "Try for free" or "Run" to start the process.
3. **Configure:** Input the list of URLs you wish to scrape into the `Urls` parameter.
4. **Start:** Click "Start" to begin extraction.
5. **Monitor:** Monitor the progress and logs in the Apify console.
6. **Export:** Once complete, export the results in your preferred format (JSON, CSV, or Excel).

### 📄 Output Format Details

The output is an array of JSON objects. Each object represents a single course and contains the following key fields:

- `url`: The source URL of the course page.
- `course_title`: The full title of the course.
- `image_url`: Direct link to the course image.
- `delivery_mode`: How the course is delivered (e.g., Classroom, Online).
- `fee`: The cost of the course (e.g., Free, Paid).
- `mqf_level`: The recognized qualification level (e.g., MQF Level 3).
- `duration_text`: The human-readable duration (e.g., "6 hours").
- `duration_hours`: The duration expressed as a numerical hour value.
- `location`: Physical location of the course.
- `language`: Language of instruction.
- `aim`: A detailed description of the course objectives.
- `course_contents_text`: Text content from the course syllabus/description.
- `course_contents_url`: Direct link to the course syllabus PDF or document.
- `category_tags`: List of relevant industry or skill tags.
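
A quick structural check can catch items that are missing fields before they reach a database. The sketch below is illustrative only: the expected types are inferred from the sample output rather than from a published schema, `None` is tolerated on the assumption that unavailable data may come back empty, and `find_schema_problems` is a hypothetical helper.

```python
# Expected Python types per field, inferred from the sample output
# (an assumption, not a published schema).
EXPECTED_TYPES = {
    "url": str,
    "course_title": str,
    "image_url": str,
    "delivery_mode": str,
    "fee": str,
    "mqf_level": str,
    "duration_text": str,
    "duration_hours": (int, float),
    "location": str,
    "language": str,
    "aim": str,
    "course_contents_text": str,
    "course_contents_url": str,
    "category_tags": list,
}


def find_schema_problems(item):
    """Return a list of human-readable problems for one course item."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in item:
            problems.append(f"missing field: {field}")
        elif item[field] is not None and not isinstance(item[field], expected):
            problems.append(
                f"unexpected type for {field}: {type(item[field]).__name__}"
            )
    return problems


# Deliberately incomplete item with a string where a number is expected.
item = {
    "url": "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
    "course_title": "AWARD IN MOVING AND HANDLING FOR CARE WORKERS",
    "duration_hours": "6",
}
for problem in find_schema_problems(item):
    print(problem)
```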

### ⚠️ Limitations and Best Practices

- **Rate Limiting:** To ensure stable operation and respect the target site's infrastructure, it is best practice to run the spider with a manageable number of URLs (e.g., batches of 50-100).
- **Dynamic Content:** The spider is optimized for the current structure of the Jobsplus site. Significant changes to the target website may require updates to the Actor.
- **URL Validity:** Ensure all provided URLs are active and accessible to the spider.
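
The batching advice above can be sketched as a small generator that splits a long URL list into one input object per run. `batch_urls` is a hypothetical helper, the default of 50 is taken from the lower end of the suggested range, and the URLs built here use placeholder IDs purely for illustration.

```python
from typing import Iterator, List, Sequence


def batch_urls(urls: Sequence[str], batch_size: int = 50) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield list(urls[start:start + batch_size])


# Placeholder IDs for illustration only; real course IDs will differ.
base = "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id="
all_urls = [f"{base}{n}" for n in range(1, 121)]

# One Actor input object per batch.
inputs = [{"Urls": batch} for batch in batch_urls(all_urls, batch_size=50)]
print([len(i["Urls"]) for i in inputs])
```

Each element of `inputs` can then be passed to a separate run.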

### 🚨 Error Handling

If the spider encounters a page that is inaccessible, redirects, or has a significantly different structure, it will log an error for that specific URL and continue processing the remaining URLs in the batch. Review the logs for detailed error messages.
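
Because failed URLs are skipped rather than aborting the run, it can help to compare what you asked for with what came back. The sketch below assumes each item's `url` field matches the requested URL exactly, which may not hold if the site redirects; `find_missing_urls` is a hypothetical helper.

```python
def find_missing_urls(requested_urls, items):
    """Return requested URLs that have no matching `url` in the dataset items."""
    scraped = {item.get("url") for item in items}
    return [url for url in requested_urls if url not in scraped]


requested = [
    "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
    "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4315",
]
# Pretend only the first URL produced an item.
items = [{
    "url": requested[0],
    "course_title": "AWARD IN MOVING AND HANDLING FOR CARE WORKERS",
}]

# Input object for a follow-up run covering only the missing URLs.
retry_input = {"Urls": find_missing_urls(requested, items)}
print(retry_input)
```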

### Support

For custom/simplified outputs or bug reports, please contact:

- Email: support@getdataforme.com
- Subject line: "custom support"
- Contact form: https://getdataforme.com

***

### Concise Summary (For quick reference)

**Use this tool to scrape structured data about educational courses from the Maltese government portal.**

**Key Data Points Extracted:**

- Course title
- Aim and course contents
- Fee and duration
- MQF level
- Delivery mode, location, and language
- Source URL of the course page

**Best Practice:**

- Run the scraper in batches to avoid IP blocking.
- Validate the extracted data against the documented output fields before loading it into downstream systems.

# Actor input Schema

## `Urls` (type: `array`):

The course detail URLs for the spider to scrape.

## Actor input object example

```json
{
  "Urls": [
    "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
    "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4315"
  ]
}
```

# Actor output Schema

## `results` (type: `string`):

Scraped data items from dataset

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

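// Prepare Actor input (course detail URLs to scrape)
const input = {
    "Urls": [
        "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
        "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4315"
    ]
};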

// Run the Actor and wait for it to finish
const run = await client.actor("getdataforme/jobsplus-course-info-parser-spider").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

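# Prepare the Actor input (course detail URLs to scrape)
run_input = {
    "Urls": [
        "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
        "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4315",
    ],
}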

# Run the Actor and wait for it to finish
run = client.actor("getdataforme/jobsplus-course-info-parser-spider").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{"Urls": ["https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991"]}' |
apify call getdataforme/jobsplus-course-info-parser-spider --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=getdataforme/jobsplus-course-info-parser-spider",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Jobsplus Course Info Parser Spider",
        "description": "This specialized spider extracts comprehensive, structured data on jobseeker courses from the Jobsplus portal. It captures over 15 key data points—including title, fee, duration, MQF level, and detailed aims—for market analysis and educational research....",
        "version": "0.0",
        "x-build-id": "bJaHOsTUtzfbSy2ef"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/getdataforme~jobsplus-course-info-parser-spider/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-getdataforme-jobsplus-course-info-parser-spider",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/getdataforme~jobsplus-course-info-parser-spider/runs": {
            "post": {
                "operationId": "runs-sync-getdataforme-jobsplus-course-info-parser-spider",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/getdataforme~jobsplus-course-info-parser-spider/run-sync": {
            "post": {
                "operationId": "run-sync-getdataforme-jobsplus-course-info-parser-spider",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "Urls": {
                        "title": "Urls",
                        "minItems": 1,
                        "type": "array",
                        "description": "The urls for the spider.",
                        "default": [
                            "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991",
                            "https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4315"
                        ],
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
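
For projects without the client library, the `run-sync-get-dataset-items` path in the specification above can be called with the standard library alone. The sketch below only builds the request and does not send it; `build_run_request` is a hypothetical helper, and the token is passed as a query parameter as the specification describes.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Server URL and path copied from the specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/getdataforme~jobsplus-course-info-parser-spider/run-sync-get-dataset-items"


def build_run_request(token, urls):
    """Build (but do not send) the POST request described by the specification."""
    body = json.dumps({"Urls": urls}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "<YOUR_API_TOKEN>",
    ["https://jobsplus.gov.mt/jobseeker-courses/course-details-jobseeker?id=4991"],
)
print(request.get_method(), request.full_url)
```

To execute the run, pass the request to `urllib.request.urlopen` with a real token and a timeout long enough for the Actor to finish.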
