Detect companies actively hiring GTM teams. Scrapes Greenhouse, Lever, Ashby, Workday, and Rippling for Sales, RevOps, and Growth roles. Tracks velocity and role changes across runs. Flat output built for Clay enrichment columns. Pay per result.
GTM_REGEX_PATTERNS export. Catches Sales Leadership "Director, X Sales" word-order variants (e.g. "Director, Enterprise Sales") that the seed list cannot express because "sales director" requires the inverse word order.
New EXCLUSION_PATTERNS entries with unless-clause support: finance manager (unless revenue), product manager (unless product marketing), data engineer (unless gtm/revenue), hr business partner, people partner, hrbp, recruiting, recruiter, talent acquisition, talent partner, accounting, accountant, legal, counsel, paralegal, software engineer (unless gtm/growth/revenue/forward deployed/forward-deployed/fde).
CRO_DISQUALIFIERS rule applied to the GTM keyword path: cro match is dropped when the title contains conversion, optimization, website, rate optimization, or growth manager. Prevents Growth Manager, Website CRO style titles from classifying as Sales Leadership.
Changed
matchesKeyword helper replaces the previous title.includes(keyword) check. Word-boundary regex with escaped special chars, case-insensitive. Punctuation (comma, pipe, hyphen, parenthesis) is treated as a non-word boundary so "Manager, Solutions Engineering" and "Director, Enterprise Sales" match correctly.
classifyRoles now runs the exclusion check before keyword matching. Recruiter and software-engineer exclusions take precedence over any GTM seed match (the close-out fix for Recruiter, Forward Deployed Engineering).
isExcludedRole supports unless clauses on each exclusion entry. Pattern is matched via the same word-boundary helper.
Fixed
head of revenue keyword is now in GTM_KEYWORDS. The phrase was already listed in main.js TIER_1_KEYWORDS for signal-strength tiering but missing from the detection list, so head-of-revenue postings were never matched. Confirmed in c3-p2-02 taxonomy notes as the head of revenue detection bug.
"forward deployed software engineer" added as an explicit GTM Engineering seed plus "forward deployed" / "forward-deployed" / "fde" added to the software-engineer exclusion's unless-list. Recovers FDE Software Engineer city variants from the generic software-engineer exclusion.
openai.com (ashby): 171 GTM roles inside 80-220 band. Single-call ashby runtime 42s for 660 jobs, within the 60s local threshold.
gitlab.com (greenhouse): 69 GTM roles (taxonomy v1 ceiling for gitlab; below the original 80-180 band, see Known limitations).
Positive controls from prior runs unchanged in shape: aircall.io (lever) 32, linear.app (ashby) 7.
Known limitations
hubspot.com returns 0 GTM roles. HubSpot publishes via Workday and Actor 1 has no Workday scraper. Tracked as c3-p2-06 (Actor 1 Workday Scraper sprint).
gitlab.com tops out at 69 GTM roles against an 80-180 spec band. The 11 missing titles are Field CTO / Professional Services Engineer / Engagement Manager / Program Manager / Enablement Lead variants that are real GTM-adjacent roles but are not taxonomy v1 seeds. Expansion is deferred so v0.1.0 ships against the c3-p2-02 source of truth without ad-hoc additions.
Title typos in source data (e.g. gitlab.com's "Forward Deployed Engnieer") are not catchable without spell tolerance.
Source
Sprint c3-p2-03. Keyword set sourced directly from gtm-title-scraper output/phase_3_categorization/category_keywords.json (source="seed" entries, 206 unique). See Notion v0.1.0 Keyword Refresh Source of Truth page.
[0.0.19] - 2026-05-11
Added
Runtime output validator at src/lib/validateOutput.js. Plain JS, no external dependencies. Type and enum checks across all 22 dataset fields (domain, company_name, ats_platform, slug_source, gtm_hiring_signal, gtm_role_count, gtm_roles, gtm_signals, gtm_roles_detected, signal_strength, top_gtm_role, career_page_url, run_date, scraped_at, role_count_delta, velocity_signal, days_between_runs, change_detected, change_type, new_gtm_roles, removed_gtm_roles, error). Enum locked on ats_platform (greenhouse|lever|ashby|null) and signal_strength (high|medium|low|null).
buildEmptyRecord(domain, errorMessage) helper in src/utils.js. Returns the canonical 22-field null-shape record with run_date set inside the helper. Replaces two duplicate literal blocks in main.js (early-bail and catch paths).
Changed
All three Actor.pushData() call sites in main.js now pass through validateOutput() before push. Validation failures log to console and attach the error array as _validation_error JSON string on the record. Record is still pushed so downstream Clay can see and triage broken rows.
Early-bail at main.js:84 (missing or empty domain in input) now pushes a fully-shaped 22-field record instead of a 2-field stub. Guarantees dataset-shape consistency across all push paths so Clay templates see uniform columns.
[0.0.18] - 2026-04-23
Added
ATS slug override map integrated as lookup-first layer. Reduces probe requests for known companies.
[0.0.17] - 2026-04-20
Removed
Workday and Rippling fallbacks dropped from the ATS cascade. Verification against v0.0.16 returned zero successful Workday results on hubspot.com, salesforce.com, and workday.com. Discovery failed on two (careers pages either bot-blocked or did not expose a Workday URL to a bare fetch) and the myworkdayjobs.com jobs endpoint returned empty postings on the third. The fallbacks were adding 20 to 30 seconds of empty-result runtime per unmatched domain without ever surfacing data.
src/workday.js and src/rippling.js deleted. Imports and Phase 3 cascade removed from main.js. deriveCareerPageUrl no longer branches on workday or rippling.
Changed
ATS scope documented in README as Greenhouse, Lever, and Ashby only. v0.2.0 Workday and Rippling language removed from the overview, features list, supported platforms table, cascade order, and known limitations. Rationale for the removal is included in the overview so future readers understand why the scope was narrowed.
ats_platform output is now constrained to greenhouse, lever, ashby, or null.
Performance
hubspot.com and other domains without a supported ATS drop from ~20s to ~2s of empty-result runtime.
Ramp, Clay, Stripe, Brex unchanged.
[0.0.16] - 2026-04-20
Changed
ATS cascade restructured for speed. Greenhouse, Lever, and Ashby now probe the base slug in parallel instead of sequentially running each one through every slug variant. Variant probing (labs, -labs) only fires on ATS platforms where the base slug returned 404, and only if no base slug succeeded anywhere. Workday and Rippling still run sequentially after the slug-based cascade since they discover slugs from the careers page, not from the domain.
Slug-based scrapers (Greenhouse, Lever, Ashby) now accept a single slug and return {jobs, slug, is404}. Variant orchestration moved into main.js so the three ATS probes can run in parallel.
Performance
Ramp drops from ~13s to ~3s (base slug hit on Ashby, no variant probing needed).
Clay stays at ~7s (base slug 404 on all three, one variant probe finds claylabs on Ashby).
Stripe, Brex, and other happy-path domains unchanged at ~2s.
[0.0.15] - 2026-04-20
Fixed
GTM role matcher no longer pulls in administrative titles that happened to contain GTM keywords. New exclusion pass rejects Executive Assistant, Administrative Assistant, Admin Assistant, Personal Assistant, Assistant to, EA to, Receptionist, Office Manager, Office Coordinator, and Facilities roles before GTM keyword matching runs. Applied centrally in classifyRoles so every ATS adapter benefits.
top_gtm_role selection now uses a seniority-weighted ranker instead of returning the first match by array order. Roles are scored by the highest-ranked title token they contain (Chief / CRO / CMO / CCO / CSO / CFO at 100, Head of / SVP / EVP / VP / Vice President at 80, Director at 60, Senior Manager / Principal / Staff / Lead at 40, Manager / Senior at 20, everything else at 10), and the highest-scoring role wins. Ties break on original array order.
CRO homograph guard on the seniority ranker. CRO stops counting as a Tier 1 keyword when the title also contains website, conversion rate, conversion, seo, digital marketing, landing page, or optimization within 20 characters of cro. The role can still score on other keywords present in the title, so it does not get dropped from matching, just denied the fake Tier 1 boost. Fixes cases like Ramp picking Growth Manager, Website CRO over Head of Customer Success.
ATS slug probe. Scrapers now try base slug plus labs and -labs suffixes before giving up, which fixes companies whose ATS slug differs from their domain (e.g. clay.com uses claylabs on Ashby). New optional ats_slug input lets you pin the slug directly when auto-probing is not enough. Scrapers also now return the slug that actually worked so career_page_url reflects the real board.
[0.2.0] - 2026-04-15
Added
Workday ATS detection: scrapes the company careers page for a Workday board link, extracts the slug and tenant, then queries the Workday jobs JSON API and filters results by GTM keywords
Rippling ATS detection: scrapes the company careers page for a Rippling board link, then queries the Rippling jobs API and filters by GTM keywords
Velocity tracking: three new output fields (role_count_delta, velocity_signal, days_between_runs) derived by comparing the current run against the optional previous_gtm_role_count and previous_run_date inputs
Change detection: four new output fields (change_detected, change_type, new_gtm_roles, removed_gtm_roles) derived by comparing current gtm_roles_detected against the optional previous_gtm_roles_detected input
Three new optional input fields: previous_gtm_role_count, previous_run_date, previous_gtm_roles_detected — all null-safe, first run returns first_run state with no delta computed
ats_platform output field now supports "workday" and "rippling" in addition to existing values
input_schema.json updated with three new optional fields
output_schema.json updated with seven new output fields
Known Limitations
Workday detection is best-effort: coverage estimated at 60-70% due to non-deterministic company slug patterns. Workday slugs do not reliably map to the input domain. The careers page scrape is the only reliable resolution method. Failures return ats_platform = none gracefully.
Rippling detection uses unofficial API endpoints that are not publicly documented. Coverage is lower confidence than Greenhouse, Lever, and Ashby. Failures return ats_platform = none gracefully.
Neither Workday nor Rippling failures affect the core GTM signal output. If both fail, the actor returns ats_platform = none with all GTM fields null rather than erroring out.
[0.1.1] - 2026-04-07
Changed
Price reduced from $0.12/result to $0.07/result for volume strategy
GitHub repo visibility changed to private
[0.1.0] - 2026-04-03
Added
Initial release: Greenhouse, Lever, Ashby ATS cascade via HTTP (no Playwright)