Workday Enterprise Jobs Scraper

Pricing: from $2.40 / 1,000 job-results

Scrape public Workday career sites (*.myworkdayjobs.com) for enterprise job postings and turn them into clean, CSV-ready hiring-intelligence data. No login, cookies, or residential proxy required.

Rating: 0.0 (0)

Developer: Delowar Munna (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user

Last modified: 2 days ago

Scrape public Workday career sites (*.myworkdayjobs.com) for enterprise job postings and turn them into clean, flat, CSV-ready rows enriched with lightweight, non-AI hiring-signal fields (remote/hybrid flags, location breakdown, seniority hint, recency label, keyword hits, and signal tags).

It uses Workday's public, unauthenticated career-site JSON API ("CXS") over plain HTTP — no login, no cookies, no tenant credentials, no residential proxy. Output is a flat 28-field row per job, built for recruiters, sales teams, agencies, lead-gen, and market researchers.


✨ What it does

  • Accepts one or more public Workday career/search/job URLs.
  • Pages the public Workday listing endpoint and (optionally) fetches each job's public detail page for the full description, employment type, exact locations, posting date, and apply URL.
  • Normalizes everything into a flat, stable, CSV-friendly schema.
  • Deduplicates by Workday requisition ID and canonical job URL.
  • Adds non-AI hiring-signal fields derived only from visible scraped data.
  • Charges only for valid, unique, saved rows (pay-per-result).
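
The deduplication step can be pictured with a minimal sketch. The canonical-URL rule here (drop the query string and a leading locale segment such as /en-US/) and the choice to scope requisition IDs by tenant are assumptions made for illustration, not the actor's confirmed logic; `canonical_job_url` and `deduplicate` are hypothetical names.

```python
import re
from urllib.parse import urlsplit

# Assumed canonical form: lowercase host, no query/fragment, and no leading
# locale segment such as /en-US/. This is illustrative, not the actor's rule.
_LOCALE = re.compile(r"^/[a-z]{2}-[A-Z]{2}(?=/)")


def canonical_job_url(url: str) -> str:
    parts = urlsplit(url)
    path = _LOCALE.sub("", parts.path).rstrip("/")
    return f"https://{parts.netloc.lower()}{path}"


def deduplicate(rows):
    """Keep the first row seen for each (tenant, requisition ID) or canonical URL."""
    seen_ids, seen_urls, unique = set(), set(), []
    for row in rows:
        req_id = row.get("requisition_id")
        # Scope the ID by tenant so two companies can reuse the same number.
        id_key = (row.get("workday_tenant"), req_id) if req_id else None
        url_key = canonical_job_url(row["job_url"]) if row.get("job_url") else None
        if (id_key and id_key in seen_ids) or (url_key and url_key in seen_urls):
            continue
        if id_key:
            seen_ids.add(id_key)
        if url_key:
            seen_urls.add(url_key)
        unique.append(row)
    return unique
```

Under this rule, the job_url and apply_url of the sample record shown later in this listing reduce to the same canonical string, since they differ only by the /en-US/ segment.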

🔗 Supported URLs

Any public Workday career site, for example:

https://company.wd1.myworkdayjobs.com/en-US/External
https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite
https://company.wd3.myworkdayjobs.com/External/job/Location/Title_R-12345 (single job)

Only *.myworkdayjobs.com URLs are supported. Other URLs are skipped and counted as failed inputs (the run still succeeds).
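
As a rough illustration of the URL rule above, the following sketch classifies an input URL using only the standard library. The function name `parse_workday_url` and the way the locale segment is detected are assumptions for the example; the actor's own parsing may differ.

```python
from urllib.parse import urlsplit


def parse_workday_url(url: str):
    """Return (tenant, site, is_single_job), or None for unsupported URLs."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host.endswith(".myworkdayjobs.com"):
        return None  # skipped and counted as a failed input
    tenant = host.split(".")[0]
    segments = [s for s in parts.path.split("/") if s]
    # Skip an optional leading locale segment such as "en-US" (assumed shape).
    if segments and len(segments[0]) == 5 and segments[0][2] == "-":
        segments = segments[1:]
    if not segments:
        return None
    site = segments[0]
    is_single_job = len(segments) > 1 and segments[1] == "job"
    return tenant, site, is_single_job
```

With the example URLs above, the NVIDIA career site would parse as a paginated listing and the /job/... URL as a single posting, while a URL on any other domain returns None.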


📥 Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | array of strings | NVIDIA example | Public Workday career/search/job URLs. Career/search URLs are paginated; /job/... URLs return that single posting. |
| maxResults | integer | 500 | Max saved unique jobs across the whole run (1–5000). |
| keywords | array of strings | [] | Keep only jobs whose title/department/job family/location/description contains one of these terms. |
| locations | array of strings | [] | Keep only jobs whose location text contains one of these terms. |
| remoteOnly | boolean | false | Keep only jobs that look remote or hybrid. |
| postedWithinDays | integer / null | null | Keep only jobs posted within N days (1–365). |
| strictDateFilter | boolean | false | When a recency filter is set, drop jobs whose date couldn't be parsed. |
| includeJobDescription | boolean | true | Fetch each job's detail page for description + richer fields. Off = faster, fewer fields. |
| includeDerivedSignals | boolean | true | Add signal_tags and keyword_hits. |
| deduplicate | boolean | true | Remove duplicate jobs by requisition ID / canonical URL. |
| proxyConfiguration | object | Apify Proxy | Datacenter, no-proxy, or custom proxy URLs. Apify Residential is rejected. |

Sample input — keyword + location filtering

{
  "startUrls": ["https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite"],
  "maxResults": 250,
  "keywords": ["data", "engineer", "security"],
  "locations": ["Australia", "Remote"],
  "remoteOnly": false,
  "postedWithinDays": 30,
  "includeJobDescription": true,
  "deduplicate": true,
  "proxyConfiguration": { "useApifyProxy": true }
}

Sample input — fast listing-only run (no descriptions)

{
  "startUrls": ["https://company.wd1.myworkdayjobs.com/en-US/External"],
  "maxResults": 1000,
  "includeJobDescription": false,
  "proxyConfiguration": { "useApifyProxy": true }
}
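
To make the filter semantics in the input table more concrete, here is a minimal sketch of how keywords, locations, and remoteOnly could be combined. It assumes case-insensitive substring matching, "any term matches" within a list, and "all filters must pass" across lists; none of these details are confirmed by the listing, and `passes_filters` is a hypothetical helper.

```python
def matches_any(text: str, terms) -> bool:
    """True if any term occurs in text, ignoring case (assumed matching rule)."""
    text = text.lower()
    return any(term.lower() in text for term in terms)


def passes_filters(row, keywords=(), locations=(), remote_only=False) -> bool:
    # Fields searched for keywords, following the input table's description.
    searchable = " ".join(
        str(row.get(field) or "")
        for field in ("job_title", "department", "job_family",
                      "locations_text", "description_text")
    )
    if keywords and not matches_any(searchable, keywords):
        return False
    if locations and not matches_any(str(row.get("locations_text") or ""), locations):
        return False
    if remote_only and not (row.get("is_remote") or row.get("is_hybrid")):
        return False
    return True
```

An empty keywords or locations list disables that filter, which matches the [] defaults in the input table.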

📤 Output

Each row is a flat object with these 28 fields:

job_id, requisition_id, job_title, company_name, workday_tenant, department, job_family, employment_type, locations_text, primary_location, city, region, country, is_remote, is_hybrid, posted_date, posted_date_raw, recency_label, job_url, apply_url, description_text, description_html, seniority_hint, keyword_hits, signal_tags, source_input_url, source_platform, scraped_at.
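
Because every row is flat, exporting to CSV needs nothing beyond the standard library. The sketch below writes rows with the 28 columns in the order listed above; `rows_to_csv` is a hypothetical helper, and how the actor or the Apify platform actually serializes values (for example booleans or nulls) is not specified here.

```python
import csv
import io

FIELDS = [
    "job_id", "requisition_id", "job_title", "company_name", "workday_tenant",
    "department", "job_family", "employment_type", "locations_text",
    "primary_location", "city", "region", "country", "is_remote", "is_hybrid",
    "posted_date", "posted_date_raw", "recency_label", "job_url", "apply_url",
    "description_text", "description_html", "seniority_hint", "keyword_hits",
    "signal_tags", "source_input_url", "source_platform", "scraped_at",
]


def rows_to_csv(rows) -> str:
    """Serialize flat job rows to CSV text with a fixed column order."""
    buf = io.StringIO(newline="")
    # Missing fields become empty cells; unknown keys are ignored.
    writer = csv.DictWriter(buf, fieldnames=FIELDS, restval="", extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()
```

A fixed field list keeps the column order stable across runs even when some tenants leave fields empty.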

Output preview

[Figure: Workday Enterprise Jobs Scraper output, all fields]

Sample record

{
  "job_id": "JR2019248",
  "requisition_id": "JR2019248",
  "job_title": "Software Engineer, DGX Cloud AI Infrastructure",
  "company_name": "NVIDIA",
  "workday_tenant": "nvidia",
  "department": "Engineering",
  "job_family": "Engineering",
  "employment_type": "Full time",
  "locations_text": "US, CA, Santa Clara; US, TX, Austin; US, OR, Remote; US, WA, Remote; US, WA, Redmond",
  "primary_location": "US, CA, Santa Clara",
  "city": "Santa Clara",
  "region": "CA",
  "country": "United States of America",
  "is_remote": true,
  "is_hybrid": false,
  "posted_date": "2026-06-03",
  "posted_date_raw": "Posted Today",
  "recency_label": "last_7_days",
  "job_url": "https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Software-Engineer--DGX-Cloud-AI-Infrastructure_JR2019248",
  "apply_url": "https://nvidia.wd5.myworkdayjobs.com/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Software-Engineer--DGX-Cloud-AI-Infrastructure_JR2019248",
  "description_text": "NVIDIA is at the forefront of the generative AI revolution, building the software and systems that power the world's most advanced large language model workloads. We are ...",
  "description_html": "<p><span>NVIDIA is at the forefront of the generative AI revolution, building th ...",
  "seniority_hint": "mid",
  "keyword_hits": 0,
  "signal_tags": "remote;recent_posting;newer_than_30_days;engineering;data",
  "source_input_url": "https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite",
  "source_platform": "workday",
  "scraped_at": "2026-06-04T05:50:24.326Z"
}

Sample from a real run. keyword_hits is 0 here because this run supplied no keywords; set keywords to count and filter on your own terms.

region and description_html depend on what a given Workday tenant exposes publicly and may be null.

department and job_family are rarely published by Workday tenants. With includeDerivedSignals on (the default), they are filled from a title-inferred role category (e.g. Engineering, Data & Analytics); this is an inferred label, not Workday's own classification. Turn includeDerivedSignals off to keep them strictly source-only.
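
The relationship between posted_date_raw, posted_date, and the postedWithinDays / strictDateFilter inputs can be sketched as follows. This is an assumption-laden illustration, not the actor's parser: it handles only relative labels of the form "Posted Today", "Posted Yesterday", and "Posted N Days Ago", treats anything else (such as an open-ended "30+" label) as unparseable, and takes the reference date as a parameter. The actor may use other sources for the date, so this sketch is not expected to reproduce the sample record exactly.

```python
import re
from datetime import date, timedelta


def parse_posted_raw(raw: str, today: date):
    """Turn a relative posting label into a date, or None if it is ambiguous."""
    text = raw.strip().lower()
    if text == "posted today":
        return today
    if text == "posted yesterday":
        return today - timedelta(days=1)
    match = re.fullmatch(r"posted (\d+) days? ago", text)
    if match:
        return today - timedelta(days=int(match.group(1)))
    return None  # e.g. "Posted 30+ Days Ago" carries no exact date


def keep_for_recency(posted, today: date, within_days, strict: bool) -> bool:
    """Apply a postedWithinDays-style filter; strict drops unparseable dates."""
    if within_days is None:
        return True
    if posted is None:
        return not strict
    return (today - posted).days <= within_days
```

In this sketch a job with an unparseable date survives a recency filter unless strict mode is on, which mirrors the description of strictDateFilter in the input table.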

Run summary

A 14-field RUN_SUMMARY object is written to the default key-value store: inputs_total, successful_inputs, failed_inputs, raw_results_found, results_saved, duplicates_removed, filtered_out, charged_events, blocked_requests, retry_count, detail_pages_requested, detail_pages_failed, runtime_seconds, scraped_at.


💲 Pricing — Pay Per Result

This actor uses pay-per-event pricing with a single event:

| Event | Fires |
| --- | --- |
| job-result | Once per unique job row that passed all filters and was successfully saved to the dataset. |

Duplicates, filtered-out rows, and failed requests are never charged. The per-event price is set on the Apify Store listing.
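
For budgeting, the per-result model reduces to simple arithmetic. The sketch below uses the "from $2.40 / 1,000 job-results" figure quoted in this listing as an assumed price; the actual per-event price depends on the Apify Store listing and your plan, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Assumed price: the "from" figure quoted in the listing. Your tier may differ.
PRICE_PER_1000 = Decimal("2.40")


def estimate_cost(results_saved: int) -> Decimal:
    """Estimated charge in USD for a given number of saved job-result events."""
    return (PRICE_PER_1000 * results_saved / 1000).quantize(Decimal("0.01"))
```

At that assumed price, a run that saves 250 rows would cost about $0.60, and a run that hits the 5,000-row maxResults ceiling would cost about $12.00.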

🚦 Proxy policy

Use Apify Datacenter proxy or no proxy for normal runs — both work reliably for public Workday career sites at this actor's conservative concurrency.

Apify Residential proxy is not supported. The actor fails at startup if apifyProxyGroups includes RESIDENTIAL. Reason: in pay-per-event actors, residential bandwidth (billed per GB) is charged to the developer, not the run user, so a single bandwidth-heavy run could exceed the per-result revenue.

If you genuinely need residential routing, supply your own provider via the proxy editor's Custom proxy URLs field — that traffic goes through your provider, not Apify, and is unaffected:

http://user:pass@proxy.iproyal.com:12321
http://user:pass@proxy.brightdata.com:22225
http://user:pass@proxy.oxylabs.io:7777
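
The startup check described above could look roughly like this. It is a sketch only: `validate_proxy_configuration` is a hypothetical name, and the case-insensitive comparison is an assumption; the apifyProxyGroups key is the one named in the policy text.

```python
def validate_proxy_configuration(config) -> None:
    """Raise ValueError if the Apify Residential proxy group is requested."""
    groups = (config or {}).get("apifyProxyGroups") or []
    if any(str(group).upper() == "RESIDENTIAL" for group in groups):
        raise ValueError(
            "Apify Residential proxy is not supported; use Datacenter, "
            "no proxy, or custom proxy URLs."
        )
```

Custom proxy URLs pass this check untouched, since they carry no Apify proxy group.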

⚙️ How it works

  1. Collection — for each search/career URL the actor POSTs the public Workday CXS /jobs endpoint and pages by offset, building one row per posting. When includeJobDescription is on, each kept job's public detail endpoint is fetched and the row is enriched in place. A directly-pasted /job/... URL is fetched as a single posting.
  2. Finalize — keyword filters are applied, signal tags + keyword hits are computed, the valid-row rule is enforced (job_title plus a job_url/job_id/requisition_id), and each surviving row is pushed and charged.

No browser is used; everything runs over HTTP/JSON, so runs are fast and cheap.
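
The two phases above can be sketched without any network code by passing the page fetcher in as a function. Everything specific here is an assumption for illustration: the `fetch_page(offset, limit)` callable stands in for the POST to the listing endpoint, the `jobPostings` response key and the page size of 20 are assumed rather than taken from this listing, and `iter_postings` / `is_valid_row` are hypothetical names.

```python
from typing import Callable, Iterator


def iter_postings(fetch_page: Callable[[int, int], dict],
                  limit: int = 20, max_results: int = 500) -> Iterator[dict]:
    """Page a listing endpoint by offset until it runs dry or the cap is hit."""
    offset = 0
    yielded = 0
    while yielded < max_results:
        page = fetch_page(offset, limit)
        postings = page.get("jobPostings") or []  # assumed response key
        for posting in postings:
            yield posting
            yielded += 1
            if yielded >= max_results:
                return
        if len(postings) < limit:
            break  # a short or empty page means there is nothing left
        offset += len(postings)


def is_valid_row(row: dict) -> bool:
    """Valid-row rule: a job_title plus at least one identifier."""
    has_identifier = any(row.get(k) for k in ("job_url", "job_id", "requisition_id"))
    return bool(row.get("job_title")) and has_identifier
```

Stopping on a short page rather than on a reported total keeps the loop correct even if the endpoint omits the total on later pages; that choice is a design assumption of this sketch.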

🚫 Limitations

  • Public data only — no login, cookies, internal Workday APIs, or candidate data.
  • Fields are limited to what each tenant exposes publicly; some tenants expose less.
  • Workday caps search results at ~10,000 per query; slice by site/keyword for more.
  • No AI scoring, enrichment, salary normalization, or cross-site crawling in V1.