Jora Jobs Scraper

Pricing

Pay per usage

A lightweight actor to scrape Jora jobs. Extracts job titles, companies, locations, and descriptions. For best results and to avoid blocks, use residential proxies. This fast and efficient scraper is perfect for reliable, up-to-date job data collection.

Rating: 0.0 (0)

Developer: Shahid Irfan (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 7
  • Monthly active users: 1
  • Last modified: 4 days ago

Jora.com Jobs Scraper ⚡

Fast and reliable Apify actor that collects Jora.com job listings across multiple regional domains with a stealthy, production-ready extraction stack.

What does Jora Jobs Scraper do?

This actor harvests job data from Jora.com (Australia, New Zealand, Singapore, Hong Kong, Indonesia, Malaysia) by leaning on the platform's RPC JSON API first, falling back to HTML/JSON-LD parsing, and only resorting to Playwright when stronger evasion is required. It can optionally enrich results by fetching job detail pages when collectDetails is enabled.

Features

  • Tiered Extraction Stack: RPC JSON API → HTML/JSON-LD parsing → Playwright browser only when needed.
  • Comprehensive Listing Data: Captures titles, companies, locations, salaries, job types, posting dates, and descriptions exposed via JSON-LD or listing cards.
  • Multi-Region Support: Switch between Jora domains (Australia, NZ, Singapore, Hong Kong, Indonesia, Malaysia) with one input parameter.
  • Pagination Handling: Automatically follows paginated search results until the requested job count or page limit is reached.
  • Flexible Filtering: Supports keyword search, location restrictions, and date-based filters.
  • Stealthy Requests: Randomizes headers, rotates proxies, and sanitizes payloads to reduce blocking.
  • Structured Output: Pushes clean records that match the dataset schema, avoiding schema-validation errors.
  • Configurable Limits: Control the scope with results_wanted and max_pages.
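
The pagination and limit behaviour listed above can be pictured with a short Python sketch. It is not the actor's source code: `fetch_page` is a hypothetical callable standing in for whichever extraction tier is active, and only the names `results_wanted` and `max_pages` (with their defaults) come from the input schema.

```python
from typing import Callable, Dict, List

Job = Dict[str, str]


def collect_jobs(
    fetch_page: Callable[[int], List[Job]],  # hypothetical: returns the jobs found on one results page
    results_wanted: int = 50,
    max_pages: int = 20,
) -> List[Job]:
    """Follow result pages until a limit is reached or a page comes back empty."""
    jobs: List[Job] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:  # no further results to paginate through
            break
        jobs.extend(batch)
        if len(jobs) >= results_wanted:  # requested job count reached
            break
    # Trim any overshoot from the last page.
    return jobs[:results_wanted]
```

Whichever limit is hit first ends the crawl, so a low `max_pages` can cap the run below `results_wanted`.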

Extraction Strategy

  1. Tier 1 - JSON RPC API: Posts to /rpc/job_searches to grab the richest payload (titles, companies, descriptions, URLs) with one request.
  2. Tier 2 - gotScraping + HTML/JSON-LD: Parses listing pages with Cheerio, extracts JSON-LD job postings, and falls back to manual HTML scraping within the same request.
  3. Tier 3 - Playwright Browser (Listing Discovery): Uses a fingerprinted Firefox session to discover listing URLs when the first two tiers are blocked or incomplete.
  4. Detail Enrichment (Fast Path): When collectDetails is enabled, job detail pages are fetched via gotScraping + Cheerio (with browser cookies when available) and only fall back to Playwright for blocked pages.
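
As a rough illustration of the tiered fallback described above, the Python sketch below tries a list of extractors in order and stops at the first one that returns jobs. The function name, the tier labels, and the extractor callables are all hypothetical; the actor's real implementation is not shown in this document and may differ.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Job = Dict[str, str]
# A tier is a label plus a callable that takes a search URL and returns jobs (or nothing).
Tier = Tuple[str, Callable[[str], Optional[List[Job]]]]


def extract_with_fallback(url: str, tiers: Sequence[Tier]) -> Tuple[str, List[Job]]:
    """Try each tier in order; return the name and jobs of the first tier that yields results."""
    for name, extractor in tiers:
        try:
            jobs = extractor(url)
        except Exception:
            # A blocked or failing tier simply hands over to the next one.
            continue
        if jobs:
            return name, jobs
    return "none", []
```

Ordering the tiers from cheapest (a single API request) to most expensive (a full browser) means the browser cost is paid only when the lighter tiers fail or return nothing.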

Input Configuration

The actor accepts input parameters to customize the scraping behavior. All parameters are optional except where noted.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| keyword | string | "software engineer" | Job title or keywords to search for |
| location | string | - | Geographic location to filter jobs (e.g., "Sydney, NSW") |
| posted_date | string | "anytime" | Filter by posting date: anytime, 24h, 7d, 30d |
| collectDetails | boolean | true | Enrich results by fetching job detail pages (HTTP-first, browser fallback) |
| results_wanted | integer | 50 | Maximum number of jobs to scrape |
| max_pages | integer | 20 | Maximum number of search result pages to visit |
| startUrl | string | - | Direct Jora.com search URL (overrides keyword/location) |
| country | string | Australia | Jora regional site to search; Australia is the QA-friendly default |
| proxyConfiguration | object | - | Proxy settings (Apify Proxy recommended) |
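
For readers who assemble the input programmatically, here is a minimal Python sketch that merges overrides onto the documented defaults. The helper name `build_input` and its sanity checks are assumptions made for illustration; the actor may validate its input differently.

```python
from typing import Any, Dict

# Defaults copied from the parameter list above.
DEFAULT_INPUT: Dict[str, Any] = {
    "keyword": "software engineer",
    "posted_date": "anytime",
    "collectDetails": True,
    "results_wanted": 50,
    "max_pages": 20,
    "country": "Australia",
}
ALLOWED_POSTED_DATE = {"anytime", "24h", "7d", "30d"}


def build_input(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides onto the documented defaults and run basic sanity checks (assumed, not official)."""
    run_input = {**DEFAULT_INPUT, **overrides}
    if run_input["posted_date"] not in ALLOWED_POSTED_DATE:
        raise ValueError(f"posted_date must be one of {sorted(ALLOWED_POSTED_DATE)}")
    for key in ("results_wanted", "max_pages"):
        if not isinstance(run_input[key], int) or run_input[key] < 1:
            raise ValueError(f"{key} must be a positive integer")
    return run_input
```

Calling `build_input(keyword="nurse", posted_date="7d", max_pages=10)` yields a dictionary that can be serialized to JSON and pasted into the actor's input editor.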

Output Data

The actor outputs a dataset of job listings with the following standardized fields:

| Field | Type | Description |
| --- | --- | --- |
| title | string | Job title |
| company | string | Company name |
| location | string | Job location |
| salary | string | Salary or compensation information |
| job_type | string | Employment type (Full-time, Part-time, etc.) |
| date_posted | string | Posting date |
| description_html | string | Job description in HTML format |
| description_text | string | Job description in plain text |
| url | string | Direct URL to the job posting |

Sample Output

{
  "title": "Senior Software Engineer",
  "company": "Tech Solutions Inc",
  "location": "Sydney, NSW",
  "salary": "$120,000 - $150,000 per year",
  "job_type": "Full time",
  "date_posted": "2 days ago",
  "description_html": "<div>We are seeking an experienced Senior Software Engineer...</div>",
  "description_text": "We are seeking an experienced Senior Software Engineer to join our growing team...",
  "url": "https://au.jora.com/job/senior-software-engineer-123456"
}

Usage

Basic Usage

  1. Navigate to the actor's page on Apify Store.
  2. Click Start to run with default settings, or configure inputs as needed.
  3. Monitor the run progress and download results when complete.

Advanced Configuration

For customized scraping:

  1. Provide a startUrl to pin the actor to a specific Jora search results page; it overrides keyword and location.
  2. Combine keyword, location, and country to target a specific regional market.
  3. Adjust results_wanted and max_pages to control run length and volume.
  4. Tune posted_date to favor recent postings.
  5. Enable Apify Proxy (via proxyConfiguration) to stay under rate limits.

Examples

Example 1: Keyword and Location Search

{
  "keyword": "marketing manager",
  "location": "Sydney, NSW",
  "results_wanted": 50
}

This configuration scrapes up to 50 marketing manager positions in Sydney, using the default Australian site.

Example 2: Recent Postings Only

{
  "keyword": "nurse",
  "posted_date": "7d",
  "max_pages": 10
}

Searches for nursing jobs posted in the last 7 days, limiting the crawl to 10 pages.

Example 3: Custom Start URL

{
  "startUrl": "https://au.jora.com/j?sp=search&q=developer&l=Sydney%2C+NSW",
  "results_wanted": 100
}

Uses a direct Jora.com search URL for Australian developer jobs in Sydney, aiming for 100 results.
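
A startUrl like the one in Example 3 can be composed rather than hand-encoded. The Python sketch below reuses the query parameter names visible in that URL (`sp`, `q`, `l`); the helper name `build_start_url` is hypothetical, and whether other regional hosts follow exactly the same URL pattern is an assumption, so the host is left as a parameter.

```python
from urllib.parse import urlencode


def build_start_url(keyword: str, location: str = "", host: str = "au.jora.com") -> str:
    """Compose a search URL shaped like the one in Example 3."""
    params = {"sp": "search", "q": keyword}
    if location:
        # urlencode escapes the comma and turns spaces into '+'.
        params["l"] = location
    return f"https://{host}/j?{urlencode(params)}"


print(build_start_url("developer", "Sydney, NSW"))
# https://au.jora.com/j?sp=search&q=developer&l=Sydney%2C+NSW
```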

Configuration Tips

  • Proxy Usage: For large-scale scraping, configure Apify Proxy (RESIDENTIAL group recommended) to avoid IP blocking.
  • Performance Optimization: Reduce results_wanted and max_pages for quick tests or larger keyword sweeps.
  • Result Limits: Keep max_pages roughly between results_wanted / 12 and results_wanted / 5 to balance speed with completeness.
  • Date Filtering: Use posted_date to focus on the most recent opportunities.

Use Cases

  • Market Research: Analyze job trends, salary ranges, and industry demands.
  • Recruitment: Identify potential candidates or job openings.
  • Career Planning: Research opportunities in specific locations or fields.
  • Data Analytics: Build datasets for machine learning or business intelligence applications.
  • Automated Monitoring: Set up recurring runs to track new job postings.

Performance

  • Speed: Typically completes within 3-5 minutes for 50 jobs thanks to the JSON API and listing-level JSON-LD data.
  • Efficiency: ~10-15 seconds per page, ~15-20 jobs per minute thanks to the RPC + HTML/JSON-LD hybrid.
  • Resource Usage: Optimized for standard Apify compute units.
  • Reliability: Built-in retry logic, proxy rotation, and anti-blocking measures.
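
For rough planning, the quoted throughput of about 15-20 jobs per minute can be turned into a run-time estimate. The Python sketch below covers extraction time only; actor startup, retries, and detail-page fetching are not modelled, which may explain why the 3-5 minute figure quoted for 50 jobs sits somewhat above the raw result. The function name is hypothetical.

```python
from typing import Tuple


def estimate_minutes(
    results_wanted: int,
    jobs_per_minute: Tuple[float, float] = (15.0, 20.0),  # throughput range quoted above
) -> Tuple[float, float]:
    """Return (best_case, worst_case) extraction time in minutes."""
    slow, fast = jobs_per_minute
    return results_wanted / fast, results_wanted / slow


best, worst = estimate_minutes(50)
print(f"{best:.1f} to {worst:.1f} minutes")  # 2.5 to 3.3 minutes
```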

Best Practices

  • Proxy Configuration: Always use Apify Proxy (RESIDENTIAL group recommended) for reliable scraping and to avoid IP blocks.
  • Data Quality: The actor pulls JSON-LD whenever available, so include location filters and a realistic results_wanted to ensure structured data.
  • Result Limits: Balance results_wanted and max_pages appropriately (recommended: max_pages ≈ results_wanted / 12 + 2).
  • Performance vs Quality: If you only need sampling data, lower results_wanted first, then trim max_pages.

Notes

  • Default settings are optimized to pass Apify's quality assurance tests.
  • Supports all Jora country-specific domains (Australia, NZ, Singapore, Hong Kong, Indonesia, Malaysia).
  • For extensive datasets (500+ jobs), consider increasing max_pages and allowing longer run times.
  • The actor normalizes every record before saving to prevent schema-validation errors.
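
The kind of record normalization mentioned in the last note might look like the Python sketch below. It is only an illustration: the field list comes from the Output Data section, while the function name, the whitespace handling, and the use of None for missing values are assumptions rather than the actor's documented behaviour.

```python
import re
from typing import Any, Dict

# Field names taken from the Output Data section.
OUTPUT_FIELDS = (
    "title", "company", "location", "salary", "job_type",
    "date_posted", "description_html", "description_text", "url",
)


def normalize_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the documented fields, coerce values to trimmed strings, use None when missing."""
    record: Dict[str, Any] = {}
    for field in OUTPUT_FIELDS:
        value = raw.get(field)
        if value is None or str(value).strip() == "":
            record[field] = None  # assumed placeholder for missing data
        elif field == "description_html":
            record[field] = str(value).strip()  # leave markup whitespace untouched
        else:
            record[field] = re.sub(r"\s+", " ", str(value)).strip()
    return record
```

Dropping unknown keys and giving every record the same set of fields is one common way to keep a dataset compatible with a fixed schema.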