Hiring Cafe Jobs Scraper

Extract global job postings from hiring.cafe including title, company, salary, location, remote status, seniority, visa sponsorship, and more.

Pricing: from $3.00 / 1,000 results
Rating: 5.0 (14 reviews)
Developer: Crawler Bros (Maintained by Community)
Last modified: 4 days ago

Extract global job postings from hiring.cafe — an AI-powered job aggregator that indexes 2.9+ million listings from Greenhouse, Lever, Workable, Workday, SuccessFactors, Hirebridge, BambooHR, and 14,000+ direct company career pages. Returns 32 structured fields per job including title, company, workplace type, seniority, salary (when disclosed), geolocation, required technical tools, company industries, and a direct apply URL.

Features

  • 32 output fields per job — complete flat schema with typed defaults (zero nulls)
  • Direct apply URLs — every job links back to the original ATS posting
  • Rich metadata — workplace type (Remote/Hybrid/Onsite), seniority, commitment, category, required technical tools, min years experience, salary range when disclosed
  • Geolocation — latitude/longitude per listing
  • Company enrichment — hiring company name, website, industries, HQ country, employee count bucket
  • Filter support — keyword search, workplace type, seniority level, commitment type, date range
  • Hardcoded RESIDENTIAL US proxy — required to bypass Cloudflare Managed Challenge
  • Automatic Cloudflare bypass — Patchright Chromium session with session rotation (typically solves in 8–20 seconds)

Input

| Field | Type | Description |
|---|---|---|
| searchQueries | Array of strings | Keywords to search on hiring.cafe (e.g., "software engineer", "data scientist"). An empty array returns unfiltered results. |
| workplaceTypes | Array | Filter by workplace type: Remote, Hybrid, Onsite (default: all three) |
| seniorityLevels | Array | Filter by seniority: No Prior Experience Required, Entry Level, Mid Level, Senior Level, Director, Executive (default: all six) |
| commitmentTypes | Array | Filter by commitment: Full Time, Part Time, Contract, Internship, Temporary, Seasonal, Volunteer (default: all seven) |
| dateFetchedPastNDays | Integer | Only include jobs fetched within the last N days (default 30, max 365) |
| maxItems | Integer | Maximum number of jobs to return across all queries (default 50, max 1000) |

Example Input

{
  "searchQueries": ["software engineer", "data scientist"],
  "workplaceTypes": ["Remote", "Hybrid"],
  "seniorityLevels": ["Mid Level", "Senior Level"],
  "commitmentTypes": ["Full Time"],
  "dateFetchedPastNDays": 7,
  "maxItems": 100
}

Minimal input (just a keyword):

{
  "searchQueries": ["nurse"],
  "maxItems": 20
}
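If you build run inputs programmatically, the schema's defaults and caps can be encoded in a small helper. This is a minimal sketch: `build_run_input`, `SENIORITY_LEVELS`, and `COMMITMENT_TYPES` are hypothetical names that simply mirror the input table above, not part of the Actor itself.

```python
# Hypothetical helper mirroring the input schema table above.
SENIORITY_LEVELS = [
    "No Prior Experience Required", "Entry Level", "Mid Level",
    "Senior Level", "Director", "Executive",
]
COMMITMENT_TYPES = [
    "Full Time", "Part Time", "Contract", "Internship",
    "Temporary", "Seasonal", "Volunteer",
]

def build_run_input(search_queries=(), workplace_types=("Remote", "Hybrid", "Onsite"),
                    seniority_levels=None, commitment_types=None,
                    date_fetched_past_n_days=30, max_items=50):
    """Assemble a run-input dict, clamping numeric fields to the schema's maxima."""
    return {
        "searchQueries": list(search_queries),
        "workplaceTypes": list(workplace_types),
        "seniorityLevels": list(seniority_levels or SENIORITY_LEVELS),
        "commitmentTypes": list(commitment_types or COMMITMENT_TYPES),
        "dateFetchedPastNDays": max(1, min(date_fetched_past_n_days, 365)),
        "maxItems": max(1, min(max_items, 1000)),
    }

# Equivalent to the minimal example above.
run_input = build_run_input(search_queries=["nurse"], max_items=20)
```

Pass the resulting dict as the run input when calling the Actor via the Apify API or client library.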

Output

Each job has 32 fields. Every field is always present with a typed default (empty string, zero, empty list, or false) — never null.

Identity

| Field | Type | Description |
|---|---|---|
| id | String | Unique job ID |
| objectID | String | Search index object ID |
| source | String | Source ATS (greenhouse, lever, workable, successfactors, hirebridge, ...) |
| boardToken | String | Job board token on that ATS |
| applyUrl | String | Direct apply URL (redirects to the original source) |
| title | String | Job title |
| description | String | Job description (HTML stripped, truncated to 2,000 chars) |
| isExpired | Boolean | Whether the listing is expired |

Classification

| Field | Type | Description |
|---|---|---|
| coreJobTitle | String | Canonicalized core job title |
| category | String | Job category (e.g., Engineering, Data and Analytics, Marketing) |
| seniorityLevel | String | Comma-joined seniority tags (e.g., "Mid Level", "Senior Level") |
| roleType | String | Individual Contributor or People Manager |
| commitment | String | Commitment type (e.g., "Full Time") |
| workplaceType | String | Remote, Hybrid, or Onsite |

Location

| Field | Type | Description |
|---|---|---|
| workplaceCountries | Array | Country codes (e.g., ["US", "GB"]) |
| workplaceStates | Array | State / region names |
| workplaceCities | Array | City names |
| latitude | Number | Primary location latitude |
| longitude | Number | Primary location longitude |

Compensation & Requirements

| Field | Type | Description |
|---|---|---|
| salaryMin | Number | Minimum salary (yearly, if disclosed; 0 otherwise) |
| salaryMax | Number | Maximum salary (yearly, if disclosed) |
| salaryCurrency | String | Salary currency code |
| salaryFrequency | String | Yearly, Hourly, Monthly, etc. |
| technicalTools | Array | Required technologies / tools |
| minYearsExperience | Integer | Minimum years of experience required (0 if not specified) |
| bachelorsDegreeRequirement | String | Bachelor's degree requirement level |

Company

| Field | Type | Description |
|---|---|---|
| companyName | String | Hiring company name |
| companyWebsite | String | Company homepage URL |
| companyIndustries | Array | Industry tags |
| companyEmployees | String | Employee count bucket (e.g., "1001-5000") |
| companyHqCountry | String | Company HQ country code |

Metadata

| Field | Type | Description |
|---|---|---|
| scrapedAt | String | ISO 8601 scrape timestamp |

Example Output

{
  "id": "greenhouse___acme___4567890",
  "objectID": "greenhouse_acme_4567890",
  "source": "greenhouse",
  "boardToken": "acme",
  "applyUrl": "https://boards.greenhouse.io/acme/jobs/4567890",
  "title": "Senior Software Engineer, Backend",
  "description": "We are looking for a senior backend engineer to join our Platform team...",
  "coreJobTitle": "Software Engineer",
  "category": "Engineering",
  "seniorityLevel": "Senior Level",
  "roleType": "Individual Contributor",
  "commitment": "Full Time",
  "workplaceType": "Remote",
  "workplaceCountries": ["US"],
  "workplaceStates": ["California", "New York"],
  "workplaceCities": ["San Francisco", "New York"],
  "latitude": 37.7749,
  "longitude": -122.4194,
  "salaryMin": 160000.0,
  "salaryMax": 220000.0,
  "salaryCurrency": "USD",
  "salaryFrequency": "Yearly",
  "technicalTools": ["Python", "PostgreSQL", "Kubernetes", "Django"],
  "minYearsExperience": 5,
  "bachelorsDegreeRequirement": "Required",
  "companyName": "Acme Corp",
  "companyWebsite": "https://acme.example.com",
  "companyIndustries": ["Software", "SaaS"],
  "companyEmployees": "1001-5000",
  "companyHqCountry": "US",
  "isExpired": false,
  "scrapedAt": "2026-04-11T11:05:00+00:00"
}
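Because every field carries a typed default, downstream code can filter and aggregate without null checks. A small illustrative sketch; the `jobs` list here is a stand-in for items read from the Actor's dataset:

```python
# Stand-in for records read from the Actor's dataset.
jobs = [
    {"title": "Senior Software Engineer, Backend",
     "salaryMin": 160000.0, "salaryMax": 220000.0, "salaryCurrency": "USD"},
    {"title": "Registered Nurse",
     "salaryMin": 0.0, "salaryMax": 0.0, "salaryCurrency": ""},
]

# Salary is "disclosed" when the typed default (0.0) has been replaced,
# so a simple numeric comparison works -- no None handling needed.
disclosed = [j for j in jobs if j["salaryMax"] > 0]
midpoints = {j["title"]: (j["salaryMin"] + j["salaryMax"]) / 2 for j in disclosed}
# midpoints -> {"Senior Software Engineer, Backend": 190000.0}
```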

FAQ

Q: Why is a RESIDENTIAL proxy required?
Hiring.cafe uses a Cloudflare Managed Challenge (Turnstile) that blocks all Apify datacenter IPs with a 403 "Just a moment..." page. A real residential IP plus a Chrome browser session with a ~10–20 second challenge-solve wait is needed. The proxy is hardcoded and applied automatically; no configuration is needed on your side.

Q: How does the Cloudflare bypass work?
The scraper launches a Patchright Chromium browser on a RESIDENTIAL US proxy session, navigates to hiring.cafe, and waits for the Cloudflare Managed Challenge to solve itself (typically 8–20 seconds). Once document.title changes from "Just a moment..." to "HiringCafe - AI Job Search", the browser holds a valid cf_clearance cookie. All API calls are then made via in-browser fetch() so the Cloudflare session cookies are reused. If the first attempt doesn't solve within 60 seconds, the scraper rotates the proxy session and retries (up to 6 attempts).

Q: How does pagination work?
The API returns roughly 120–160 jobs per page. The scraper walks pages via &page=N until maxItems is reached or the results run out.
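The page walk can be sketched as follows. This is an illustrative model, not the Actor's actual code; `fetch_page` is a hypothetical stub standing in for the in-browser API call:

```python
def fetch_page(page, page_size=3):
    """Stub standing in for the in-browser API call: pretend 7 jobs exist in total."""
    total = 7
    start = page * page_size
    return [{"id": f"job-{i}"} for i in range(start, min(start + page_size, total))]

def collect_jobs(max_items):
    """Walk &page=N until max_items is reached or a page comes back empty."""
    jobs, page = [], 0
    while len(jobs) < max_items:
        batch = fetch_page(page)
        if not batch:
            break  # results ran out before hitting max_items
        jobs.extend(batch[: max_items - len(jobs)])
        page += 1
    return jobs
```

With the stub's 7-job universe, `collect_jobs(5)` stops mid-page at 5 items, while `collect_jobs(100)` exhausts all pages and returns 7.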

Q: Why are some salaryMin / salaryMax values zero?
Fewer than half of the listings on hiring.cafe disclose a salary. When it is not disclosed, both fields are 0.0 (a typed default, not null).

Q: Can I search by company?
Not yet through this scraper's input schema, but you can approximate it by putting the company name in searchQueries, which matches against the job title, description, and company name.

Q: What's the difference between seniorityLevel and minYearsExperience?
seniorityLevel is the categorical tag (Mid Level, Senior Level, etc.) assigned by hiring.cafe's classifier. minYearsExperience is the numeric minimum years of experience extracted from the job description (0 if not specified).

Q: Are expired jobs included?
No. By default hiring.cafe returns only active listings (isExpired: false). The field is kept in the output for schema consistency.

Q: How fresh is the data?
dateFetchedPastNDays controls freshness; the default is 30 days. Set it to 7 for last-week postings, or 1 for last-24-hours monitoring.

Use Cases

  • Talent intelligence — monitor hiring velocity for specific roles, companies, or regions
  • Compensation research — aggregate salary ranges by role, seniority, and location (where disclosed)
  • Remote-work trends — filter by workplaceTypes: ["Remote"] to track the remote-first market
  • ATS market share analysis — group by source to see which ATS platforms are most used
  • Skills demand tracking — aggregate technicalTools frequencies to spot rising technologies
  • Job alerts — daily runs with narrow filters (e.g., ["python developer", "Remote"]) to monitor new postings
  • Recruitment pipelines — bulk-import matching listings into CRMs with all 32 fields ready to use
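For example, the skills-demand use case above reduces to a frequency count over technicalTools. A minimal sketch, with the `jobs` list standing in for items read from the Actor's dataset:

```python
from collections import Counter

# Stand-in for dataset items; technicalTools is always a list (typed default).
jobs = [
    {"technicalTools": ["Python", "PostgreSQL", "Kubernetes"]},
    {"technicalTools": ["Python", "Django"]},
    {"technicalTools": []},  # empty list, never null
]

# Flatten all tool lists and count occurrences of each technology.
tool_counts = Counter(tool for job in jobs for tool in job["technicalTools"])
# tool_counts.most_common(1) -> [("Python", 2)]
```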