All notable changes to this Actor are documented here.
- Improved the public README, Actor metadata, input labels, output links, dataset field descriptions, and run log messages.
- Added live filters for title, location, department, organization, description, remote-only jobs, workplace type, employment type, experience level, and posted-after dates.
- Added `descriptionType` so users can keep text descriptions, HTML descriptions, or both.
- Added `csv_export` to the Actor output schema.
- Changed `maxItems` to limit jobs after filters are applied.
- Added salary range parsing from description text when Greenhouse pay transparency fields are not available.
- Added better company name and website derivation from Greenhouse payloads and custom company domains.
- Added canonical URL/date fields without duplicate aliases.
- Added best-effort derived fields, including `normalized_title`, `employment_type`, `workplace_type`, `experience_level`, `compensation`, `responsibilities`, `key_skills`, and `qualifications`.
- Improved responsibility extraction so generic "About the role" sections do not pollute the `responsibilities` array.
- Improved role summaries so common company boilerplate is skipped in favor of role-specific intro text.
- Improved requirement extraction so section labels such as `Requirements:` and `You'll thrive...` prompts are not saved as requirement items.
- Normalized Greenhouse date fields to UTC `Z` timestamps.
- Improved salary extraction by inferring annual pay periods for large compensation ranges when the posting omits the unit.
- Changed run planning so `maxItems` and filters are applied before job detail requests, avoiding unnecessary network calls on small or narrowly filtered runs.
- Improved derived field quality for `employment_type`, `experience_level`, and `responsibilities`.
- Avoided expensive pre-detail HTML extraction when no filters are configured.
- Removed duplicate description aliases from dataset items. Output now keeps one text field, `description`, and one HTML field, `descriptionHtml`.
- Removed empty and repetitive compatibility fields from dataset items and schema.
- Removed raw `departments` and `offices` objects from dataset items; normalized `department`, `location`, and `locations` remain.
- Improved compensation extraction to aggregate multiple salary zones into one min/max range with nested `ranges`.
- Changed `source` to the string value `greenhouse`.
- Updated the Output tab links and dataset views for easier scanning and exporting.
- Changed per-URL failures to run-summary errors so users get a clear explanation instead of an empty failed run when one input URL cannot be scraped.
- Added clear run summaries for missing URLs and invalid `maxItems` API input.
- Added the initial Python Apify Actor for scraping public Greenhouse job boards.
- Added reliable live Greenhouse requests.
- Added a simple input schema focused on URLs and filters.
- Added clear output and dataset views for Apify Console.
- Added extraction for requirements, benefits, pay transparency ranges, and application questions.
- Added coverage for URL parsing, HTML extraction, output transformation, and schema wiring.
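
For readers who consume the date fields, the UTC `Z` timestamp normalization mentioned above can be pictured roughly as follows. This is a minimal sketch, not the Actor's actual code: the helper name `to_utc_z` is hypothetical, and it assumes the input is an ISO 8601 string with a numeric offset and that naive timestamps are already in UTC.

```python
from datetime import datetime, timezone


def to_utc_z(value: str) -> str:
    """Convert an ISO 8601 timestamp to a UTC timestamp ending in 'Z'.

    Hypothetical helper for illustration only.
    """
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Shift to UTC and format with a literal 'Z' suffix, dropping sub-second precision.
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    print(to_utc_z("2024-05-01T09:30:00-04:00"))  # 2024-05-01T13:30:00Z
```

Under these assumptions, every date field in a dataset item sorts and compares consistently regardless of the offset used in the original posting.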