Bayt Jobs Scraper

Pricing

Pay per usage

Developed by

Shahid Irfan

Maintained by Community

A simple, lightweight actor for quickly scraping job listings from Bayt.com. It is designed to be fast and easy to use and to produce a clean dataset. Residential proxies are highly recommended to reduce the risk of blocking and keep data extraction reliable.

5.0 (1)

Last modified

2 days ago

Bayt.com Jobs Scraper

This Apify actor is a fast and efficient scraper for extracting job postings from Bayt.com. It is built with Node.js, using Crawlee and Cheerio for high-performance crawling without a full browser, making it lightweight and cost-effective.

The scraper starts from a given Bayt.com jobs URL, follows pagination through the search results, and extracts detailed information for each job it finds.

Features

  • URL-First Scraping: Simply provide a Bayt.com search or listing URL to start.
  • Robust Pagination: Automatically finds and follows "Next" page links to crawl through all available results.
  • Flexible Limits: Control the scope of the crawl with maxJobs and maxPages to manage costs and runtime.
  • Detailed Data Extraction: Scrapes comprehensive job details, prioritizing structured data (JSON-LD) and falling back to HTML elements for reliability.
  • Lightweight & Fast: Uses Cheerio for server-side HTML parsing, avoiding the overhead of a headless browser.
  • Proxy Support: Integrates with Apify Proxy, including residential proxies, to avoid blocking.
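The "structured data first, HTML fallback" idea from the feature list can be illustrated with a small sketch. This is not the actor's code: the actor is described as using Cheerio, while this example uses regular expressions only so that it runs without dependencies. The function names and the choice of `<h1>` as the fallback element are assumptions made for illustration.

```javascript
// Hypothetical helpers, not the actor's real implementation.

// Return the first JobPosting object found in any JSON-LD <script> block, or null.
function extractJobPostingJsonLd(html) {
  const scriptRe = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  for (const match of html.matchAll(scriptRe)) {
    let data;
    try {
      data = JSON.parse(match[1]);
    } catch {
      continue; // malformed JSON-LD: try the next block
    }
    const candidates = Array.isArray(data) ? data : [data];
    const posting = candidates.find((item) => item && item['@type'] === 'JobPosting');
    if (posting) return posting;
  }
  return null;
}

// Fallback (assumed selector): read the title from the first <h1>, stripping inner tags.
function extractTitleFromHtml(html) {
  const m = html.match(/<h1[^>]*>([\s\S]*?)<\/h1>/i);
  return m ? m[1].replace(/<[^>]+>/g, '').trim() : null;
}

// Prefer the JSON-LD title; fall back to the HTML heading when it is missing.
function extractTitle(html) {
  const posting = extractJobPostingJsonLd(html);
  if (posting && typeof posting.title === 'string') return posting.title.trim();
  return extractTitleFromHtml(html);
}
```

The same pattern would apply to other fields such as company or location: try the structured value first, and only parse the markup when it is absent.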

Input

The actor requires a Bayt.com jobs URL to begin. You can customize the run with the following input fields.

Field | Type | Description
--- | --- | ---
url | String | The starting URL for a Bayt.com job listing or search results page. Marked as required; if it is not provided, the actor falls back to the international jobs page.
maxJobs | Integer | The maximum number of jobs to scrape. The scraper stops once this limit is reached. If empty, it scrapes all available jobs.
maxPages | Integer | A safety limit on the number of listing pages to visit. The scraper stops after visiting this many pages. If empty, it follows pagination until the last page.
proxyConfiguration | Object | The proxy settings for the run. Using Apify Proxy with residential proxies is recommended for longer or larger crawls to reduce the risk of being blocked.
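One way to picture how maxJobs and maxPages could interact is the sketch below. It is a hypothetical illustration, not the actor's code: the helper names are invented, and treating a missing, empty, or non-positive value as "no limit" is an assumption based on the field descriptions above.

```javascript
// Hypothetical limiter; field names come from the input fields, the logic is assumed.

// Treat a missing, empty, or non-positive limit as "no limit".
function normalizeLimit(value) {
  const n = Number(value);
  return Number.isInteger(n) && n > 0 ? n : Infinity;
}

function createLimiter(input) {
  const maxJobs = normalizeLimit(input.maxJobs);
  const maxPages = normalizeLimit(input.maxPages);
  let jobs = 0;
  let pages = 0;
  return {
    // Check before following the next listing page.
    canVisitPage() { return pages < maxPages && jobs < maxJobs; },
    // Check before saving a job; false once maxJobs is reached.
    canSaveJob() { return jobs < maxJobs; },
    recordPage() { pages += 1; },
    recordJob() { jobs += 1; },
  };
}
```

With this shape, whichever limit is hit first ends the crawl: reaching maxJobs stops both saving and pagination, while reaching maxPages only stops pagination.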

Example Input

{
  "url": "https://www.bayt.com/en/uae/jobs/software-engineer-jobs/",
  "maxJobs": 100,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
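As a rough sketch of how this input might be submitted programmatically, the snippet below builds (but does not send) a request to the Apify API endpoint that starts an actor run. The actor ID and token are placeholders, since the listing does not state the actor's ID, and `buildRunRequest` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds the HTTP request for starting an actor run.
// actorId ("username~actor-name" form) and token are placeholders you must supply.
function buildRunRequest(actorId, token, input) {
  return {
    url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/runs`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      // The request body is the actor input, serialized as JSON.
      body: JSON.stringify(input),
    },
  };
}

const exampleInput = {
  url: 'https://www.bayt.com/en/uae/jobs/software-engineer-jobs/',
  maxJobs: 100,
  proxyConfiguration: { useApifyProxy: true },
};
```

To send it, pass the two parts to `fetch(url, options)`, which is available globally in Node.js 18 and later. Keeping the request construction separate from the network call makes it easy to inspect or log before spending any platform credits.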

Output

The scraper stores each job posting as a JSON object in the dataset. Here is an example of the output structure:

{
  "source": "bayt.com",
  "url": "https://www.bayt.com/en/international/jobs/senior-software-engineer-backend-1234567/",
  "jobId": "1234567",
  "title": "Senior Software Engineer (Backend)",
  "company": "Awesome Tech Inc.",
  "location": "Dubai, United Arab Emirates",
  "postedAt": "2023-10-27T10:00:00Z",
  "validThrough": "2023-11-26T23:59:59Z",
  "employmentType": "Full-time",
  "salary": "15000 - 20000 AED",
  "descriptionText": "Job summary and responsibilities...",
  "descriptionHtml": "<div>Job summary and responsibilities...</div>",
  "requirements": [
    "5+ years of experience in Backend development.",
    "Proficiency in Node.js and Python."
  ],
  "scrapedAt": "2023-10-27T12:34:56.789Z"
}
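When post-processing the dataset, the salary string often needs to be split into numbers. The sketch below assumes the salary always looks like the example above ("15000 - 20000 AED"); real records may use other formats or omit the field, in which case this hypothetical `parseSalary` helper returns null.

```javascript
// Hypothetical parser for salary strings shaped like "15000 - 20000 AED".
// The format is inferred from the single example record, not from documentation.
function parseSalary(salary) {
  if (typeof salary !== 'string') return null;
  const m = salary.match(/^\s*(\d[\d,]*)\s*-\s*(\d[\d,]*)\s+([A-Za-z]{3})\s*$/);
  if (!m) return null; // unknown format: leave interpretation to the caller
  const toNumber = (s) => Number(s.replace(/,/g, ''));
  return {
    min: toNumber(m[1]),
    max: toNumber(m[2]),
    currency: m[3].toUpperCase(),
  };
}
```

Returning null for anything that does not match keeps downstream code from silently treating free-text salaries as numbers.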