Levels.fyi Scraper
Pricing
from $3.50 / 1,000 results
Scrape verified tech compensation data from Levels.fyi. Extract total compensation, base salary, stock grants, bonuses, levels and vesting schedules for 3,000+ companies including Google, Meta, Amazon, Apple and Microsoft.
Developer
Haketa
Maintained by Community
a day ago
Last modified
Levels.fyi Scraper — Tech Salary & Total Compensation Data Extractor for FAANG, Big Tech & Startups
A Levels.fyi scraper on Apify for downloading verified tech compensation data (base salary, stock grants, bonuses, and total comp) by company, level, role, location, and years of experience. Built for candidate negotiation, recruiter intelligence, comp strategy, VC portfolio benchmarking, and equity research on engineering cost trends. A clean, structured Levels.fyi API alternative with no login, no manual copy-paste, and no Cloudflare headaches.
What This Actor Does
The Levels.fyi Scraper is a production-grade Apify actor that turns levels.fyi — the leading crowdsourced tech compensation database — into a clean, queryable JSON feed. Point it at a company slug (e.g. google, meta, amazon, apple, microsoft, netflix, nvidia, stripe, databricks) and a job family slug (e.g. software-engineer, product-manager, data-scientist, product-designer, engineering-manager) and the actor returns structured total compensation records for every level — L3 through L8+, IC4 through Distinguished, E3 through E9 — with base salary, stock grant value, bonus, location, and years of experience broken out.
In a single run you get the kind of structured tech salary data that engineering leaders, recruiters, and equity analysts otherwise build manually by clicking through dozens of Levels.fyi pages and copying numbers into spreadsheets.
The actor returns structured records covering:
- Company-and-role salary pages — e.g. Google Software Engineer, Meta Product Manager, Amazon Data Scientist
- Per-level compensation — L3/L4/L5/L6/L7/L8 (Google), E3/E4/E5/E6/E7/E8/E9 (Meta), SDE I/II/III/Sr/Principal (Amazon), ICT2-ICT6 (Apple), SDE-SDE 2-Senior-Principal-Partner (Microsoft), and equivalents at thousands of other companies
- Total comp components — base salary, annual stock vest value, target bonus, total compensation
- Submission metadata — years of experience, years at company, offer location, offer date (when published)
- Aggregate stats — median TC (p50) and percentile rollups (p10/p25/p50/p75/p90) where Levels.fyi exposes them
- Geographic breakouts — SF Bay Area, Seattle, NYC, Austin, Boston, Toronto, London, Bangalore, Dublin, Tel Aviv, Singapore
Each record carries the source URL and an ISO-8601 scrapedAt timestamp so you can stitch results into time-series dashboards or treat the dataset as an evidentiary log.
Why scrape Levels.fyi yourself when this exists?
Levels.fyi looks "scrapeable" until you actually try. Then you hit:
- Cloudflare bot challenges that reject plain `curl` and most headless browsers immediately
- A Next.js front-end with a rotating buildId that breaks any URL hard-coded against the build hash
- Salary data buried inside `__NEXT_DATA__` JSON blobs with shifting key names (`averages` vs `percentiles` vs `median` vs `submissions`)
- React hydration that makes naive Cheerio scraping return half-empty DOMs
- Aggressive rate-limiting on unauthenticated traffic, especially from cloud IP ranges (AWS, GCP, Azure)
- Markdown endpoints (`/companies/{slug}/salaries/{role}.md`) that exist but aren't documented and silently return "can't be found"
- Inconsistent salary string formats (`$185k`, `185,000`, `$185,000.00`, `185K USD`) that all need normalizing to a single integer
- No public Levels.fyi API for outside developers; the internal one requires session cookies and is not for redistribution
- Hand-copying compensation data into a spreadsheet is mind-numbing and goes stale within weeks
This actor solves all of it: a Playwright session that survives Cloudflare, a markdown-first parser that bypasses buildId rotation, three fallback strategies inside __NEXT_DATA__ (averages.samples → percentiles → median), a final HTML-table parser, and normalized integer USD output. No login. No browser farm. No pandas glue code.
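The normalization step mentioned above can be illustrated with a minimal sketch. `parse_money` is a hypothetical helper written for this example, not the actor's actual code; it only assumes the string formats listed in this section.

```python
import re
from typing import Optional

# Digits with optional thousands separators and decimals, then an optional k/m suffix.
_MONEY_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)([kKmM])?")


def parse_money(raw: str) -> Optional[int]:
    """Turn strings such as '$185k' or '185,000' into an integer amount."""
    match = _MONEY_RE.search(raw)
    if match is None:
        return None  # no digits at all, e.g. "n/a"
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").lower()
    multiplier = {"k": 1_000, "m": 1_000_000}.get(suffix, 1)
    return round(number * multiplier)


for sample in ["$185k", "185,000", "$185,000.00", "185K USD"]:
    print(sample, "->", parse_money(sample))
```

Each of the four sample strings resolves to the same integer, which is the property the output schema relies on.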
Quick Start
One-Click Run
- Click Try for free on the Apify Store page
- Leave `companies` at the default (`["google"]`) and `jobFamilies` at `["software-engineer"]`, or replace with your target list
- Optionally set a `location` filter (e.g. `san-francisco-bay-area`, `seattle`, `new-york`, `london`)
- Hit Start. Your dataset of compensation records lands in the dataset view, ready to download as JSON, CSV, Excel, or HTML
API Run (Python)
    from apify_client import ApifyClient

    client = ApifyClient("YOUR_APIFY_TOKEN")

    # Start the actor and block until the run finishes.
    run = client.actor("haketa/levels-fyi-scraper").call(run_input={
        "companies": ["google", "meta", "amazon", "apple", "microsoft"],
        "jobFamilies": ["software-engineer"],
        "location": "san-francisco-bay-area",
        "maxResults": 200,
    })

    # Print one summary line per record. Nullable fields fall back to a
    # placeholder so that a null level does not break the format spec.
    for r in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(
            f"{r['company']:<12} {r.get('level') or '?':<6} "
            f"TC ${r.get('totalCompensation') or 0:>7,} "
            f"Base ${r.get('baseSalary') or 0:>7,} "
            f"Stock ${r.get('stockGrant') or 0:>7,} "
            f"YoE {r.get('yearsOfExperience', '?')} {r.get('location') or ''}"
        )
API Run (Node.js / TypeScript)
    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

    const run = await client.actor('haketa/levels-fyi-scraper').call({
        companies: ['stripe', 'databricks', 'snowflake'],
        jobFamilies: ['software-engineer', 'product-manager'],
        location: 'new-york',
        maxResults: 150,
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(`Pulled ${items.length} compensation records`);

    // Top 10 records by total compensation.
    items
        .filter(r => r.totalCompensation)
        .sort((a, b) => b.totalCompensation - a.totalCompensation)
        .slice(0, 10)
        .forEach(r => console.log(`${r.company} ${r.level} — $${r.totalCompensation.toLocaleString()} (${r.location})`));
API Run (cURL)
    curl -X POST "https://api.apify.com/v2/acts/haketa~levels-fyi-scraper/runs?token=YOUR_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "companies": ["nvidia", "openai", "anthropic"],
        "jobFamilies": ["software-engineer", "machine-learning-engineer"],
        "location": "san-francisco-bay-area",
        "maxResults": 100
      }'
Direct URL mode
If you already have a Levels.fyi URL — from search results, a shared link, or a manual filter — paste it straight in:
    {
      "urls": [
        "https://www.levels.fyi/companies/google/salaries/software-engineer",
        "https://www.levels.fyi/companies/meta/salaries/product-manager",
        "https://www.levels.fyi/companies/amazon/salaries/data-scientist?location=seattle"
      ],
      "maxResults": 250
    }
When urls is provided, companies and jobFamilies are ignored.
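The precedence rule can be pictured with a short sketch. `resolve_targets` and `BASE_URL` are hypothetical names for this example; the URL pattern is taken from the sample URLs above and the defaults from the parameter reference, but the actor's real task-building code may differ.

```python
from itertools import product

BASE_URL = "https://www.levels.fyi"  # assumed base, matching the sample URLs


def resolve_targets(run_input: dict) -> list[str]:
    """Return the page URLs a run would visit for a given input object."""
    urls = run_input.get("urls") or []
    if urls:
        # Direct URL mode: companies and jobFamilies are ignored entirely.
        return list(urls)
    companies = run_input.get("companies") or ["google"]
    job_families = run_input.get("jobFamilies") or ["software-engineer"]
    # Otherwise, one page per company x job family pair.
    return [
        f"{BASE_URL}/companies/{company}/salaries/{family}"
        for company, family in product(companies, job_families)
    ]


print(resolve_targets({"companies": ["google", "meta"], "jobFamilies": ["product-manager"]}))
```

An input with three companies and two job families therefore expands to six page fetches, which is useful when estimating run time.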
How It Works
Levels.fyi serves three useful URL families:
| Endpoint | Pattern | Used For |
|---|---|---|
| Company × Role page (HTML) | /companies/{company}/salaries/{jobFamily} | Primary salary breakdown by level |
| Company × Role markdown | /companies/{company}/salaries/{jobFamily}.md | Stable structured fallback — no buildId dependency |
| Role overview | /t/{jobFamily} | Cross-company role-level aggregates |
The three-tier extraction pipeline
This scraper combines a real Playwright browser (Chromium, headed, AutomationControlled disabled, webdriver flag patched) with three parsing layers, in priority order:
- Markdown endpoint first (`.md` URL). When Levels.fyi returns a salary table in markdown, this is the fastest, most stable shape, independent of the rotating Next.js `buildId`. The parser walks the markdown table, fuzzy-matches column names (`base`, `salary`, `total comp`, `tc`, `stock`, `rsu`, `equity`, `bonus`, `yoe`, `location`), and normalizes to integer USD.
- HTML page with `__NEXT_DATA__` JSON. When markdown is missing or empty, the actor loads the full HTML page, parses the `<script id="__NEXT_DATA__">` blob, and extracts in fall-through order: `pageProps.averages[].samples[]` (individual submissions per level) → `pageProps.percentiles` (p10/p25/p50/p75/p90 aggregate) → `pageProps.median` (single median submission).
- HTML table parser. If `__NEXT_DATA__` is missing or schema-shifted, Cheerio walks rendered `<table>` elements, infers headers from the first row, and extracts the same fields.
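The fall-through order of the second tier can be sketched as follows. This is an illustration only: `extract_page_props` and `pick_tier` are hypothetical helpers, and the key names (`averages`, `samples`, `percentiles`, `median`) are the ones named in this section, assumed to sit under the standard Next.js `props.pageProps` path.

```python
import json
import re

_NEXT_DATA_RE = re.compile(
    r'<script id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)


def extract_page_props(html: str) -> dict:
    """Pull pageProps out of the embedded Next.js JSON blob, if present."""
    match = _NEXT_DATA_RE.search(html)
    if match is None:
        return {}
    return json.loads(match.group(1)).get("props", {}).get("pageProps", {})


def pick_tier(page_props: dict) -> tuple[str, list]:
    """Apply the samples -> percentiles -> median fall-through."""
    samples = [
        sample
        for average in page_props.get("averages") or []
        for sample in average.get("samples") or []
    ]
    if samples:
        return "samples", samples
    if page_props.get("percentiles"):
        return "percentiles", [page_props["percentiles"]]
    if page_props.get("median"):
        return "median", [page_props["median"]]
    return "none", []  # caller would move on to the HTML table parser
```

Returning the tier name alongside the rows makes it easy to log which strategy produced a given record.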
Engineering details
- Cloudflare-aware fetch loop: waits up to 30 s for "Just a moment…" / "Verify you are human" challenges to resolve, with three exponential-backoff retries per URL
- Anti-detection patches: `navigator.webdriver` is forced to `undefined`, `window.chrome.runtime` is stubbed, real Chrome/Windows UA
- Apify Proxy by default: residential pool recommended when Cloudflare gets aggressive
- Number normalization: `$185k`, `185,000`, `$185K`, and `$185,000.00` all parse to the integer `185000`
- In-actor per-level percentile computation: Levels.fyi's `__NEXT_DATA__` exposes only page-level aggregates, not per-level percentiles. The actor solves this by sorting each level group's TC, base, stock, and bonus arrays and linearly interpolating p10/p25/p50/p75/p90 in-process, then broadcasting the resulting `medianTC` / `medianBase` / `medianStock` / `medianBonus` / `p10TC` / `p25TC` / `p75TC` / `p90TC` / `levelSampleCount` onto every individual sample row in that level. Verified live on Google L3 SWE (28 samples, medianTC $219,850, p25 $192,700, p75 $220,900, p90 $236,400, medianBase $165,000). Zero client-side aggregation needed.
- Schema-stable output: every record has the same flat shape regardless of which extraction tier produced it
- Polite delays: configurable `requestDelay` (default 2000 ms) with jitter
- Append-only dataset writes: re-running the same input pushes fresh time-stamped snapshots, perfect for diff-based change detection
Input Parameters
    {
      "urls": [],
      "companies": ["google", "meta", "amazon"],
      "jobFamilies": ["software-engineer", "product-manager"],
      "location": "san-francisco-bay-area",
      "maxResults": 100,
      "requestDelay": 2000,
      "proxyConfiguration": { "useApifyProxy": true }
    }
Parameter reference
| Parameter | Type | Default | Description |
|---|---|---|---|
| `urls` | `array<string>` | `[]` | Direct Levels.fyi URLs. Supports company salary pages, role pages, and `.md` endpoints. Example: https://www.levels.fyi/companies/google/salaries/software-engineer. When provided, `companies` and `jobFamilies` are ignored. |
| `companies` | `array<string>` | `["google"]` | Company slugs from the Levels.fyi URL path. Examples: google, meta, amazon, apple, microsoft, netflix, nvidia, stripe, databricks, snowflake, airbnb, uber, lyft, pinterest, linkedin, salesforce. |
| `jobFamilies` | `array<string>` | `["software-engineer"]` | Job family slugs. Examples: software-engineer, product-manager, data-scientist, product-designer, engineering-manager, machine-learning-engineer, solution-architect, technical-program-manager, data-engineer, hardware-engineer. Leave empty for software-engineer only. |
| `location` | string | `""` | Location filter applied after extraction. Matches case-insensitively against the record's location. Examples: san-francisco-bay-area, new-york, seattle, london, bangalore. Empty = no location filter. |
| `maxResults` | integer | `100` | Hard cap on dataset items pushed across all tasks. 0 = unlimited. |
| `requestDelay` | integer | `2000` | Milliseconds to sleep between page fetches (with small jitter added). Set to 0 for the fastest crawl, higher for politeness. Max 10000. |
| `proxyConfiguration` | object | Apify Proxy ON | Standard Apify proxy block. Residential pool is strongly recommended because Levels.fyi sits behind Cloudflare and may challenge datacenter IPs. |
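As a rough illustration of how `requestDelay` with jitter might behave, the sketch below clamps the delay to the documented maximum and adds a random extra. The size of the jitter is not stated in this document, so `jitter_ratio` (up to 20% here) is purely an assumption, as is the helper name.

```python
import random

MAX_DELAY_MS = 10_000  # documented upper bound for requestDelay


def jittered_delay_seconds(request_delay_ms: int = 2000,
                           jitter_ratio: float = 0.2,
                           rng=random) -> float:
    """Return a sleep duration in seconds: the base delay plus random jitter."""
    base_ms = max(0, min(request_delay_ms, MAX_DELAY_MS))
    jitter_ms = base_ms * jitter_ratio * rng.random()  # assumed jitter size
    return (base_ms + jitter_ms) / 1000


print(round(jittered_delay_seconds(2000), 3))
```

A caller would pass the result to `time.sleep()` between page fetches; a `requestDelay` of 0 yields no pause at all.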
Output Schema
Every record uses one flat shape regardless of which extraction tier produced it.
| Field | Type | Description |
|---|---|---|
| `company` | string | Display company name extracted from page heading (e.g. Google, Meta, Amazon) |
| `companySlug` | string | Lowercased URL slug used by Levels.fyi (e.g. google, meta-platforms) |
| `jobFamily` | string | Role family slug (e.g. software-engineer, product-manager) |
| `jobTitle` | string or null | Raw job title string if present in the row |
| `level` | string or null | Company-specific level label (e.g. L4, L5, E5, IC4, SDE II, Senior) |
| `baseSalary` | number or null | Annual base salary in USD (integer) |
| `stockGrant` | number or null | Annual RSU / equity grant value in USD (integer) |
| `bonus` | number or null | Annual target or signing bonus in USD (integer) |
| `totalCompensation` | number or null | Total annual comp in USD (base + stock + bonus) |
| `yearsOfExperience` | number or null | Reported years of professional experience |
| `yearsAtCompany` | number or null | Reported years at the listed company |
| `location` | string or null | Free-text location label (e.g. San Francisco, CA; Seattle, WA; London, UK) |
| `country` | string or null | Country when Levels.fyi exposes it separately |
| `offerDate` | string or null | Submission date string as reported by Levels.fyi |
| `medianTC` | number or null | Per-level median total compensation (p50) in USD, broadcast onto every sample row in the level group |
| `medianBase` | number or null | Per-level median base salary in USD |
| `medianStock` | number or null | Per-level median annualized stock grant in USD |
| `medianBonus` | number or null | Per-level median bonus in USD |
| `p10TC` | number or null | 10th-percentile total compensation for the level group |
| `p25TC` | number or null | 25th-percentile total compensation for the level group |
| `p75TC` | number or null | 75th-percentile total compensation for the level group |
| `p90TC` | number or null | 90th-percentile total compensation for the level group |
| `levelSampleCount` | number or null | Total submissions Levels.fyi has for that (company × level); a sample-size signal for reliability |
| `levelBreakdown` | array or null | Reserved for future per-level percentile arrays |
| `sourceUrl` | string | Full Levels.fyi page URL the record was extracted from |
| `scrapedAt` | string | ISO-8601 timestamp of extraction |
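Because `totalCompensation` is described as base + stock + bonus, a consumer can sanity-check rows after download. The helper below is a hypothetical sketch, not part of the actor; it returns `None` whenever a component is null so that aggregate-only rows are not flagged as mismatches.

```python
from typing import Optional


def components_match_total(record: dict, tolerance: int = 0) -> Optional[bool]:
    """Check that base + stock + bonus adds up to totalCompensation."""
    parts = [record.get("baseSalary"), record.get("stockGrant"), record.get("bonus")]
    total = record.get("totalCompensation")
    if total is None or any(part is None for part in parts):
        return None  # not enough data to decide
    return abs(sum(parts) - total) <= tolerance


row = {"baseSalary": 232000, "stockGrant": 215000, "bonus": 48000,
       "totalCompensation": 495000}
print(components_match_total(row))
```

A non-zero `tolerance` can absorb rounding when the source page reports components and totals at different precision.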
Example: Individual submission (Google L5 SWE, SF Bay Area)
    {
      "company": "Google",
      "companySlug": "google",
      "jobFamily": "software-engineer",
      "jobTitle": "Software Engineer",
      "level": "L5",
      "baseSalary": 232000,
      "stockGrant": 215000,
      "bonus": 48000,
      "totalCompensation": 495000,
      "yearsOfExperience": 7,
      "yearsAtCompany": 2,
      "location": "San Francisco, CA",
      "country": null,
      "offerDate": "2026-02-14",
      "medianTC": 219850,
      "medianBase": 165000,
      "medianStock": 38000,
      "medianBonus": 16500,
      "p10TC": 178400,
      "p25TC": 192700,
      "p75TC": 220900,
      "p90TC": 236400,
      "levelSampleCount": 28,
      "levelBreakdown": null,
      "sourceUrl": "https://www.levels.fyi/companies/google/salaries/software-engineer",
      "scrapedAt": "2026-05-18T11:42:08.119Z"
    }
Every individual sample row in a level group carries the same medianTC, medianBase, medianStock, medianBonus, p10TC/p25TC/p75TC/p90TC, and levelSampleCount — so a single row tells you both this person's offer and where it sits in the level distribution without a second query.
Example: Aggregate percentile rollup (Meta E5, all locations)
    {
      "company": "Meta",
      "companySlug": "meta",
      "jobFamily": "software-engineer",
      "jobTitle": null,
      "level": "E5",
      "baseSalary": 230000,
      "stockGrant": 240000,
      "bonus": 60000,
      "totalCompensation": 530000,
      "yearsOfExperience": null,
      "yearsAtCompany": null,
      "location": "United States",
      "country": null,
      "offerDate": null,
      "medianTC": 530000,
      "medianBase": 230000,
      "medianStock": 240000,
      "medianBonus": 60000,
      "p10TC": 430000,
      "p25TC": 480000,
      "p75TC": 585000,
      "p90TC": 640000,
      "levelSampleCount": 142,
      "levelBreakdown": null,
      "sourceUrl": "https://www.levels.fyi/companies/meta/salaries/software-engineer",
      "scrapedAt": "2026-05-18T11:42:14.557Z"
    }
Common Companies, Levels & Job Families
Big Tech & FAANG-equivalent — common level taxonomies
| Company | Slug | IC Level Range | Senior / Staff / Principal |
|---|---|---|---|
| Google / Alphabet | google | L3 → L8 | L5 = Senior · L6 = Staff · L7 = Sr Staff · L8 = Principal |
| Meta (Facebook) | meta | E3 → E9 | E5 = Senior · E6 = Staff · E7 = Sr Staff · E8 = Principal |
| Amazon | amazon | SDE I → Principal | SDE III = Senior · Principal SDE · Sr Principal SDE |
| Apple | apple | ICT2 → ICT6 | ICT4 = Senior · ICT5 = Staff · ICT6 = Principal |
| Microsoft | microsoft | 59 → 70 | 63 = Senior · 65 = Principal · 68 = Partner |
| Netflix | netflix | Single-band | Senior SWE · Staff SWE (intentionally flat ladder) |
| Nvidia | nvidia | IC3 → IC7 | IC5 = Senior · IC6 = Principal · IC7 = Distinguished |
| Stripe | stripe | L1 → L5 | L3 = Senior · L4 = Staff · L5 = Principal |
| Databricks | databricks | IC3 → IC7 | IC5 = Senior · IC6 = Staff · IC7 = Principal |
| Airbnb | airbnb | L3 → L7 | L5 = Senior · L6 = Staff · L7 = Principal |
Job family slugs you can pass to jobFamilies
software-engineer · product-manager · data-scientist · data-engineer · machine-learning-engineer · product-designer · engineering-manager · technical-program-manager · solution-architect · hardware-engineer · security-software-engineer · site-reliability-engineer
Tip: When a slug returns no data, Levels.fyi has likely renamed the role. Open the company's page in a browser, copy the slug from the URL, and feed it back in.
Use Cases
Candidate Negotiation & Career Planning
Software engineers, PMs, data scientists, designers, and EMs walk into salary negotiations with hard numbers instead of vibes:
- Offer negotiation in one query: every row carries `medianTC`, `p25TC`, and `p75TC`, so you can instantly place a new offer in the level's distribution without any client-side aggregation
- Salary band benchmarking: read p10/p25/p50/p75/p90 straight off a single record to see the entire compensation distribution at a glance
- Comp gap detection: filter rows where `baseSalary < medianBase` to surface roles or companies where your base is below market (instant motivation to negotiate or move)
- Benchmark a competing offer against the median TC at the same company × level × location before signing
- Pinpoint the right level — find out whether the company is offering an L4 number for an L5 scope
- Stack-rank target companies by total comp for your seniority before kicking off an interview loop
- Build a per-component breakdown — base vs RSU vs bonus — so you can talk concretely about cliffs, refreshers, and vesting
Recruiter & Talent Intelligence
In-house recruiters and external search firms use Levels.fyi data to close more offers:
- Recruiter intel with reliability scoring: `levelSampleCount` plus the p10/p90 spread tells you at a glance whether a level's distribution is statistically meaningful or built on three submissions
- Build market-aligned offer ranges before sourcing, so packages don't get rejected at the verbal-offer stage
- Identify which competitors out-pay you at each level — the source of most candidate losses
- Tailor counter-offers with sourced benchmarks rather than gut feel
- Coach hiring managers on what "Senior" actually costs in SF Bay Area vs Austin vs Toronto
- Map the comp ladder for emerging companies (post-Series-D, post-IPO) where public salary disclosures don't exist yet
Tech HR & Total-Rewards Strategy
Heads of People, comp analysts, and total-rewards consultants use this data to:
- Refresh comp bands quarterly without paying for full Radford / Mercer / Aon survey access every cycle
- Validate Radford / Mercer / OptionImpact against crowdsourced ground truth
- Justify level-realignment proposals to the exec team with external comparables
- Run geographic differentials — what does the same L5 cost in Seattle vs NYC vs Boston vs London vs Bangalore?
- Model the cost of a level promotion at scale across an engineering org
VC Portfolio Salary Benchmarking
Operating partners, portfolio talent leads, and platform teams at VC funds use this dataset to:
- Build a portfolio-wide comp benchmark — "what does a Series B startup pay a Staff Engineer in 2026?"
- Push back on founder offers that are over or under market and burn runway prematurely
- Compare cash vs equity weighting between portfolio companies and Big Tech
- Inform secondaries and tender-offer pricing with current comp comparables
- Brief LP updates with hard data on how engineering-cost trends are reshaping burn
Founder Hiring Plan Budgeting
Early-stage and growth-stage founders use this scraper to:
- Build a defensible engineering payroll model for the next 18 months at seed / Series A / Series B
- Decide where to base the team — a Toronto or Bangalore satellite can cut average TC 40-60% vs SF-only
- Set founding-engineer equity bands by reverse-engineering the Big Tech alternative a candidate is walking away from
- Plan funding rounds around realistic salary requirements, not optimistic assumptions
Tech Journalism & Comp Reporting
Reporters at The Information, Bloomberg Tech, Business Insider, Wired, Rest of World, and independent newsletters use Levels.fyi data to:
- Source "what FAANG pays" features with up-to-date, attribution-friendly aggregate figures
- Track the comp impact of layoffs / hiring freezes / RIFs across companies
- Compare AI-lab compensation (OpenAI, Anthropic, DeepMind, xAI) against incumbent Big Tech
- Quantify the post-IPO comp reset when high-flying private companies go public
Equity Research — Engineering Cost Trends
Sell-side and buy-side analysts covering Big Tech use this dataset as a forward indicator of:
- Median TC drift over time: schedule weekly runs and diff `medianTC` per (company × level) to catch comp resets, refreshers, and quiet pay cuts months before earnings calls
- R&D opex pressure at MSFT, GOOG, META, AAPL, AMZN, NVDA before it shows up in 10-Qs
- Talent flight signals — when a target company's median TC drifts below peers, retention erodes
- Capacity buildouts — new ML-engineer hiring at AI labs predicts future cloud and compute commitments
- Geographic restructuring — Bangalore / Dublin / Toronto hub growth vs SF / Seattle contraction
- Stock-comp dilution forecasts — pairing TC with grant-vesting schedules surfaces future dilution
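Diffing `medianTC` between two scheduled runs can be done with a small amount of client code. The sketch below assumes each run has been loaded as a list of record dicts using the field names from the output schema; `median_tc_by_level` and `median_tc_drift` are hypothetical helpers.

```python
def median_tc_by_level(records: list[dict]) -> dict:
    """Index medianTC by (companySlug, level); rows in a level share one value."""
    index = {}
    for record in records:
        if record.get("medianTC") is not None and record.get("level"):
            index[(record.get("companySlug"), record["level"])] = record["medianTC"]
    return index


def median_tc_drift(previous: list[dict], current: list[dict],
                    threshold: float = 0) -> dict:
    """Return the change in medianTC for every level present in both runs."""
    before = median_tc_by_level(previous)
    after = median_tc_by_level(current)
    drift = {}
    for key in before.keys() & after.keys():
        delta = after[key] - before[key]
        if abs(delta) > threshold:
            drift[key] = delta
    return drift
```

Setting `threshold` filters out small week-over-week noise so that only meaningful resets surface in an alert or dashboard.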
Compensation Transparency Advocacy
Pay-transparency activists, employee resource groups, and DEI-comp consultants use this data to:
- Document pay disparities between in-office and remote, US and international hires
- Support state-level pay-transparency laws (CA, CO, WA, NY) with empirical benchmark data
- Identify under-leveling patterns — promotions delayed relative to market for specific groups
- Provide free benchmarking to underrepresented candidates who otherwise lack negotiation data
Sample Queries & Recipes
Recipe 1: Compare FAANG SWE comp in SF Bay Area
    {
      "companies": ["google", "meta", "amazon", "apple", "netflix"],
      "jobFamilies": ["software-engineer"],
      "location": "san-francisco-bay-area",
      "maxResults": 200
    }
Recipe 2: AI-lab compensation benchmark
    {
      "companies": ["openai", "anthropic", "google-deepmind", "xai", "cohere"],
      "jobFamilies": ["machine-learning-engineer", "software-engineer"],
      "maxResults": 150
    }
Recipe 3: Senior PM comp across post-IPO unicorns
    {
      "companies": ["stripe", "databricks", "snowflake", "airbnb", "uber", "lyft"],
      "jobFamilies": ["product-manager"],
      "maxResults": 200
    }
Recipe 4: Geographic comp differential — same role, five cities
    {
      "companies": ["google"],
      "jobFamilies": ["software-engineer"],
      "location": "seattle",
      "maxResults": 50
    }
Run separately with location set to san-francisco-bay-area, new-york, austin, london, and bangalore — then diff in your spreadsheet to quantify the geographic premium.
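The run-per-city workflow can also be scripted instead of diffed by hand. The sketch below assumes you have already collected each run's records into a dict keyed by location; `build_inputs` and `geo_premium` are hypothetical helpers, and the comparison uses the median of `totalCompensation` per city.

```python
from statistics import median


def build_inputs(base_input: dict, locations: list[str]) -> list[dict]:
    """One actor input per city, identical apart from the location filter."""
    return [{**base_input, "location": location} for location in locations]


def geo_premium(records_by_location: dict, baseline: str) -> dict:
    """Median TC of each city relative to the baseline city (0.0 = equal)."""
    medians = {}
    for location, records in records_by_location.items():
        values = [r["totalCompensation"] for r in records if r.get("totalCompensation")]
        if values:
            medians[location] = median(values)
    base = medians[baseline]
    return {location: round(value / base - 1, 3) for location, value in medians.items()}


inputs = build_inputs(
    {"companies": ["google"], "jobFamilies": ["software-engineer"], "maxResults": 50},
    ["seattle", "san-francisco-bay-area", "new-york", "austin", "london", "bangalore"],
)
print(len(inputs), "inputs prepared")
```

A result of `-0.2` for a city means its median TC sits 20% below the baseline city's.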
Recipe 5: Engineering Manager ladder at Big Tech
    {
      "companies": ["google", "meta", "amazon", "microsoft", "apple"],
      "jobFamilies": ["engineering-manager"],
      "maxResults": 200
    }
Recipe 6: Direct URL crawl — preserve manual filters
    {
      "urls": [
        "https://www.levels.fyi/companies/google/salaries/software-engineer?location=seattle",
        "https://www.levels.fyi/companies/meta/salaries/product-manager?location=new-york",
        "https://www.levels.fyi/companies/amazon/salaries/data-scientist?location=austin"
      ],
      "maxResults": 300
    }
Recipe 7: Sample run — sanity-check 10 records before a full crawl
    {
      "companies": ["google"],
      "jobFamilies": ["software-engineer"],
      "maxResults": 10
    }
Integration Examples
Google Sheets
- Schedule the actor weekly in Apify
- Add the Export to Google Sheets integration
- Open the sheet: `=AVERAGE(...)`, pivot tables, and charts work directly on the typed numeric fields
Make.com / Zapier / n8n
Use the Apify connector. Useful triggers:
- New median TC delta > $20k week-over-week for a watchlist company
- New record where `level = "L7"` and `totalCompensation > $1M`
- Geographic-pay-gap watch: first SF record under the 25th percentile of NYC
- Job-family expansion — new submissions for a role that previously had none
Power BI / Tableau / Looker
Wire the Apify dataset REST endpoint as a data source. Build dashboards covering median TC by company × level, geographic premium curves (SF vs Seattle vs NYC vs Toronto vs London vs Bangalore), base-vs-stock-vs-bonus mix shifts over time, YoE-to-TC scatter plots, and per-company comp ladder steepness (L3→L8 multiplier).
Postgres / Snowflake / BigQuery
Use the Apify webhook integration to POST run results to your ingest endpoint. Suggested schema:
    CREATE TABLE levels_fyi_comp (
      id BIGSERIAL PRIMARY KEY,
      company TEXT,
      company_slug TEXT,
      job_family TEXT,
      level TEXT,
      base_salary NUMERIC,
      stock_grant NUMERIC,
      bonus NUMERIC,
      total_compensation NUMERIC,
      years_of_experience NUMERIC,
      years_at_company NUMERIC,
      location TEXT,
      country TEXT,
      offer_date DATE,
      median_tc NUMERIC,
      source_url TEXT,
      scraped_at TIMESTAMPTZ
    );
    CREATE INDEX idx_company_level ON levels_fyi_comp (company_slug, level);
    CREATE INDEX idx_scraped_at ON levels_fyi_comp (scraped_at);
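An ingest endpoint needs to map the camelCase record fields onto the snake_case columns of this schema. The following is a minimal sketch of that mapping; `to_row`, `COLUMNS`, and `INSERT_SQL` are hypothetical names, and the `%s` placeholders assume a psycopg-style Postgres driver.

```python
COLUMNS = [
    "company", "company_slug", "job_family", "level",
    "base_salary", "stock_grant", "bonus", "total_compensation",
    "years_of_experience", "years_at_company",
    "location", "country", "offer_date", "median_tc",
    "source_url", "scraped_at",
]

# Columns whose record field is not a plain snake_case -> camelCase conversion.
FIELD_OVERRIDES = {"median_tc": "medianTC"}


def snake_to_camel(name: str) -> str:
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_row(record: dict) -> tuple:
    """Order a scraped record's values to match COLUMNS (missing keys -> None)."""
    return tuple(
        record.get(FIELD_OVERRIDES.get(column) or snake_to_camel(column))
        for column in COLUMNS
    )


INSERT_SQL = (
    f"INSERT INTO levels_fyi_comp ({', '.join(COLUMNS)}) "
    f"VALUES ({', '.join(['%s'] * len(COLUMNS))})"
)
```

With a DB-API cursor this would typically be used as `cursor.executemany(INSERT_SQL, [to_row(r) for r in items])`.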
Greenhouse / Lever / Ashby / Workday — ATS Enrichment
When a candidate moves to "Offer" stage, fire a webhook that looks up the candidate's current employer × role × level via this actor and pre-populates the comp-justification field with median TC.
Salesforce / HubSpot — Recruiter CRM
Tag prospective hires with the median TC at their current company × level. Recruiters see the number in the contact view and stop sending insulting offers.
Major Tech Hubs Covered
| Metro / Region | Why It Matters | Typical Senior SWE TC Range (USD) |
|---|---|---|
| San Francisco Bay Area | Global epicenter — highest comp band, deepest data | $400k–$700k |
| Seattle | AMZN/MSFT/Meta presence, no state income tax | $360k–$600k |
| New York City | Finance + tech overlap (HFT, Bloomberg, Two Sigma) | $380k–$650k |
| Austin | Texas hub for AAPL, GOOG, Tesla, Indeed | $300k–$500k |
| Boston | HubSpot, Wayfair, Toast, Klaviyo, biotech-tech | $280k–$480k |
| Los Angeles | Snap, Disney, Riot, SpaceX | $300k–$520k |
| Toronto | Largest Canadian hub — Shopify, Wealthsimple, Big Tech satellites | CAD 200k–CAD 350k |
| London | European HQ for most Big Tech | GBP 130k–GBP 280k |
| Dublin | EU tax-favored HQ for Google, Meta, Apple, Stripe | EUR 110k–EUR 220k |
| Berlin | SAP, N26, Delivery Hero, Zalando, Big Tech R&D | EUR 90k–EUR 180k |
| Tel Aviv | World-class R&D centers for AAPL, GOOG, MSFT, META, NVDA | NIS 600k–NIS 1.1M |
| Bangalore | Largest non-US Big Tech hub — Google, Meta, AMZN, MSFT | INR 70 LPA–INR 200+ LPA |
| Singapore | APAC HQ for Stripe, Meta, ByteDance, Shopee | SGD 180k–SGD 350k |
| Sydney | Atlassian, Canva, Google AU, AWS APAC | AUD 220k–AUD 400k |
All numeric examples are illustrative — actual values are scraped live from Levels.fyi at run time.
Cost & Performance
| Metric | Value |
|---|---|
| Engine | Playwright (Chromium, headed, anti-detection) + Cheerio fallback |
| Runtime (single company × role) | 20–60 seconds |
| Runtime (10 companies × 1 role) | 4–10 minutes |
| Runtime (50 companies × 3 roles) | 30–60 minutes |
| Cost per run | Pay-per-event — typically pennies for a single company, low single-digit dollars for large multi-company crawls |
| Pricing model | Pay-per-event (transparent per-record billing) |
| Data freshness | Live at run time — reflects the latest submissions visible on Levels.fyi |
| Auth required | None |
| Proxy required | Apify Proxy enabled by default; residential pool recommended when Cloudflare gets aggressive |
| Concurrency | Safe to run multiple parallel configurations with different company/role splits |
| Memory footprint | 1024 MB minimum, 4096 MB recommended for large multi-company crawls |
Compliance, Privacy & Legal Notes
- Public data only — every field returned is published openly on levels.fyi and visible to any web visitor without authentication
- No PII — Levels.fyi submissions are crowdsourced anonymously by definition; the dataset contains no names, emails, phone numbers, employee IDs, or contact info
- No PHI / financial-account data — only compensation figures, levels, and high-level offer metadata
- Respect Levels.fyi's Terms of Service: keep crawl volume modest, use `requestDelay`, do not redistribute the dataset commercially without verifying ToS, do not rebuild the site as a clone
- GDPR / CCPA: submissions are anonymous and do not constitute personal data under most jurisdictions, but the data consumer is responsible for compliance with applicable privacy law
- Attribution — cite Levels.fyi when republishing aggregate figures publicly
- Negotiation use — using crowdsourced comp data in personal salary negotiation is, in nearly every jurisdiction, completely legal and increasingly expected
Important: This actor is built for research, benchmarking, transparency, and individual negotiation support. Do not use it for any purpose that would breach Levels.fyi's ToS or applicable law.
Frequently Asked Questions
How fresh is the data?
Live at run time. Each run hits Levels.fyi directly and parses whatever the site currently shows. Levels.fyi accepts new compensation submissions continuously, so daily or weekly schedules will capture the latest entries.
How many records will I get per company × role?
It depends on how many anonymous submissions Levels.fyi has for that page. Popular pages (Google SWE, Meta SWE, Amazon SWE) often return dozens of individual submissions plus aggregate percentiles. Niche pages may return only a percentile rollup or a single median record. The maxResults cap applies across all tasks.
Does this require a Levels.fyi login or API key, and does Cloudflare block it?
No login or API key — Levels.fyi has no public API, and this actor uses no session cookie. Cloudflare is handled by a real Playwright Chromium browser with anti-detection patches and exponential-backoff retries. If a specific Apify Proxy IP gets challenged repeatedly, increase requestDelay or switch the proxy group to residential.
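The retry behaviour can be pictured with a generic exponential-backoff wrapper. This is a sketch of the general technique rather than the actor's code: `fetch_with_backoff` is a hypothetical name, the 1-second base delay is an assumption, and `fetch` stands for any callable that raises when a page is still blocked.

```python
import time


def fetch_with_backoff(fetch, url: str, retries: int = 3,
                       base_delay: float = 1.0, sleep=time.sleep):
    """Call fetch(url), retrying up to `retries` times with doubling delays."""
    attempt = 0
    while True:
        try:
            return fetch(url)
        except Exception:  # a real client would catch a narrower error type
            if attempt >= retries:
                raise  # give up and let the caller log the failure
            sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s with the defaults
            attempt += 1
```

Injecting `sleep` as a parameter keeps the function testable without real waiting.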
Can I scrape compensation data outside the US?
Yes. Levels.fyi has international submissions for London, Dublin, Berlin, Amsterdam, Tel Aviv, Toronto, Vancouver, Bangalore, Hyderabad, Singapore, Tokyo, Sydney, and more. Either pass an international location filter or hit URLs that already carry the ?location= query parameter via the urls input.
Are individual submitters identified?
No. Levels.fyi submissions are anonymous; the site does not publish names, emails, or contact info, and neither does this scraper.
Can I get a complete dump of every company on Levels.fyi?
The actor does not auto-discover companies — Levels.fyi covers 3,000+ companies and one-shotting them all is impractical. Maintain your own watchlist (S&P 500 tech, your portfolio companies, your competitor set) and pass slugs via companies.
Can I get historical data?
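Levels.fyi shows current submissions. To build a historical series, schedule this actor weekly or monthly and persist each run's dataset to your warehouse. Do not rely on the default run dataset for long-term storage: on Apify, unnamed datasets expire after your plan's data-retention period, while named datasets are kept until you delete them.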
What's the difference between an individual submission and a percentile rollup?
When Levels.fyi exposes per-submission samples for a level, you get one record per submission with YoE, location, and offer date. When only aggregates are published, you get a single rollup record with medianTC set to the p50 figure.
Does each row include the salary median?
Yes. Every sample row carries its level group's full distribution — medianTC, medianBase, medianStock, medianBonus, plus p10TC, p25TC, p75TC, p90TC, and levelSampleCount — computed in-actor from the level's submissions array. A single record is enough to place any offer in context:
{"company": "Google", "level": "L3", "baseSalary": 165000, "stockGrant": 38000, "bonus": 16500, "totalCompensation": 219500, "medianTC": 219850, "medianBase": 165000, "medianStock": 38000, "medianBonus": 16500, "p10TC": 178400, "p25TC": 192700, "p75TC": 220900, "p90TC": 236400, "levelSampleCount": 28}
That means SQL like WHERE totalCompensation < p25TC (under-market) or WHERE totalCompensation > p75TC (top-quartile offers) works directly against the dataset — no GROUP BY required.
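The same comparison works outside SQL. The sketch below classifies a record against its own quartile fields, using the sample record above; the market_position helper and its three labels are hypothetical names chosen for the example.

```python
def market_position(record):
    """Classify an offer using the percentile fields carried on the same row."""
    tc = record["totalCompensation"]
    if tc < record["p25TC"]:
        return "under-market"
    if tc > record["p75TC"]:
        return "top-quartile"
    return "mid-range"


sample = {
    "company": "Google", "level": "L3",
    "totalCompensation": 219500,
    "p25TC": 192700, "p75TC": 220900,
}
position = market_position(sample)  # 219500 falls between p25TC and p75TC
```

Because the percentiles travel with each row, no second pass over the dataset is needed to classify an offer.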
Can I filter by salary range, YoE, or specific location?
Apply range/YoE filters downstream — SQL WHERE, Python list comprehension, Sheets filter, Power BI slicer — on the typed numeric fields. The location input is a case-insensitive substring match against the record's location field (set seattle and you'll get "Seattle, WA" and "Greater Seattle Area"). For source-side filters, pass a Levels.fyi URL with the right ?location= query parameter via urls.
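A minimal sketch of both kinds of filtering in Python. The matches_location helper mirrors the substring behavior described above but is not the actor's own code, and the four records are made-up values for illustration.

```python
def matches_location(record, needle):
    """Case-insensitive substring match against the record's location field."""
    location = record.get("location") or ""
    return needle.lower() in location.lower()


# Illustrative records only; real rows carry many more fields.
records = [
    {"location": "Seattle, WA", "yearsOfExperience": 3, "totalCompensation": 210000},
    {"location": "Greater Seattle Area", "yearsOfExperience": 8, "totalCompensation": 340000},
    {"location": "New York, NY", "yearsOfExperience": 5, "totalCompensation": 260000},
    {"location": "Seattle, WA", "yearsOfExperience": None, "totalCompensation": 310000},
]

seattle = [r for r in records if matches_location(r, "seattle")]

# Downstream range filter: 5+ years of experience and TC between 300k and 400k.
senior_seattle = [
    r for r in seattle
    if r["yearsOfExperience"] is not None
    and r["yearsOfExperience"] >= 5
    and 300_000 <= r["totalCompensation"] <= 400_000
]
```

Checking for None before comparing keeps rows with a missing yearsOfExperience from raising a TypeError.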
What if I want the role overview page instead of company × role?
Pass a /t/{jobFamily} URL directly via urls — for example https://www.levels.fyi/t/software-engineer. The extraction pipeline returns whatever Levels.fyi exposes in __NEXT_DATA__ for that page.
Does this work on the Apify Free Plan, and can I schedule it?
Yes to both — full functionality on the free tier (pay-per-event means a test run costs a few cents), and Apify's built-in Scheduler supports any cron expression. Pair with a Google Sheets / webhook / Postgres integration for fully automated comp refresh.
What export formats are supported?
JSON, CSV, Excel (XLSX), HTML, XML, RSS, and JSON Lines — directly from the Apify dataset view or the REST API.
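For the REST route, the sketch below builds a download URL for a dataset in a chosen format. It assumes the Apify API v2 dataset items endpoint and its format query parameter; the dataset ID is a placeholder, and private datasets will also need your API token.

```python
from urllib.parse import urlencode

# Format identifiers as the Apify dataset items endpoint is assumed to accept them.
SUPPORTED_FORMATS = {"json", "jsonl", "csv", "xlsx", "html", "xml", "rss"}


def dataset_items_url(dataset_id, fmt="json"):
    """Build the dataset items URL for one export format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


# "abc123" is a placeholder; copy the real ID from the run's Storage tab.
csv_url = dataset_items_url("abc123", "csv")
```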
How are errors and partial results handled?
When a single company × role page fails (a Cloudflare hard block, a slug that returns 404, or a role that doesn't exist at that company), the actor logs a warning and moves on. Records that were scraped successfully are still pushed. Check the run log to see which pages failed and why.
Why are some records missing level or yearsOfExperience?
Aggregate percentile records (pageProps.percentiles fall-through) and some legacy submissions don't expose those fields. The schema returns null rather than dropping the record, so the TC / base / stock figures remain usable.
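When grouping downstream, it helps to give null levels an explicit bucket rather than dropping them. A minimal sketch, with a hypothetical tc_by_level helper and made-up records:

```python
from collections import defaultdict


def tc_by_level(records):
    """Group total compensation by level, bucketing null levels as 'unknown'."""
    groups = defaultdict(list)
    for r in records:
        groups[r.get("level") or "unknown"].append(r["totalCompensation"])
    return dict(groups)


# Illustrative rows: the third one is an aggregate record with no level.
records = [
    {"level": "L3", "totalCompensation": 219500},
    {"level": "L3", "totalCompensation": 205000},
    {"level": None, "totalCompensation": 240000},
]
grouped = tc_by_level(records)
```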
Is this a Levels.fyi API alternative?
For read-only public compensation data, yes — this is the practical alternative when you need structured Levels.fyi data programmatically without a documented public API. For commercial redistribution, contact Levels.fyi directly.
Related Apify Actors by Haketa
If you're building tech-comp, talent, or HR-intelligence pipelines, these companion actors are useful next stops:
- H1B Visa Database Scraper — broad-industry US salary benchmarks beyond tech (HR, finance, ops, healthcare, sales) for sanity-checking Levels.fyi numbers against non-tech comparables
- SEEK Scraper (Australia / New Zealand) — APAC job ads and posted salary ranges to complement Levels.fyi's lighter ANZ coverage
- BBB Business Scraper — Better Business Bureau records, useful for vetting smaller employers that show up in Levels.fyi submissions
- Apartments.com Scraper (US) — pair tech salary data with rent data to compute true take-home value in SF, Seattle, NYC, Austin, Boston
- Rent.com Scraper (US) — same idea, secondary rent source
- Realtor.ca Scraper (Canada) — for Toronto / Vancouver cost-of-living overlays on Canadian tech offers
- ProductHunt Launches & Makers Scraper — daily startup launches, makers, votes & reviews — VC/founder/recruiter intel
- SAM.gov Federal Contractor Entity Scraper — defense-tech and gov-contractor employer intel (where comp data is otherwise opaque)
- Domain.com.au Property Scraper (Australia) — Sydney / Melbourne housing for ANZ tech-package modeling
Comparison vs. Alternatives
| Approach | Setup time | Data freshness | Cost (1 company, ~30 records) | Cloudflare handled | Schema normalization | Schedulable |
|---|---|---|---|---|---|---|
| This actor | < 1 minute | Live | Pennies | Built-in | Built-in | Yes (Apify scheduler) |
| Manual Levels.fyi browsing + spreadsheet copy | Hours per company | Stale immediately | Free | N/A | DIY | No |
| Custom Playwright script | Days | Live | Free + infra + maintenance | DIY | DIY | DIY |
| Paid comp-survey subscription (Radford, Mercer, Aon) | Weeks (procurement) | Quarterly | $20k–$100k+ /yr | N/A | Yes | Yes |
| Internal recruiter heads-up calls | Days per data point | Anecdotal | Soft-cost | N/A | None | No |
| Levels.fyi Premium / End Game | Minutes | Live | Per-user subscription | N/A | UI only | No |
Why Pay-Per-Event Pricing?
This actor uses Apify's pay-per-event pricing model, which means:
- You only pay when the actor actually runs and returns records
- Charges scale with how many compensation records you actually consume
- No monthly subscription, no recurring minimum
- Transparent line-item billing inside the Apify Console
- Free to evaluate: run with maxResults: 5 for a few cents before committing to a bigger crawl
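As a rough budgeting aid, the sketch below estimates the per-result charge for a run. It assumes the listed starting rate of $3.50 per 1,000 results and ignores any other billed events or platform usage, so treat the output as a lower bound rather than a quote.

```python
from decimal import Decimal

# Assumption: the "from $3.50 / 1,000 results" rate shown on the Store listing.
PRICE_PER_1000_RESULTS = Decimal("3.50")


def estimated_cost(result_count):
    """Estimate the per-result charge in USD for a given number of records."""
    return (PRICE_PER_1000_RESULTS * result_count / 1000).quantize(Decimal("0.0001"))


test_run = estimated_cost(5)      # a maxResults: 5 evaluation run
one_company = estimated_cost(30)  # roughly one company at ~30 records
```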
Changelog
| Version | Date | Notes |
|---|---|---|
| 1.1 | 2026-05 | Added in-actor per-level percentile computation: every sample row now carries medianTC, medianBase, medianStock, medianBonus, p10TC/p25TC/p75TC/p90TC, and levelSampleCount — interpolated from each level group's submissions array |
| 1.0 | 2026-05 | Initial public release — Playwright Cloudflare-aware fetch, markdown-first parser, three-tier __NEXT_DATA__ fallback (averages.samples → percentiles → median), HTML-table parser, normalized integer USD output, configurable requestDelay, location filter, direct-URL mode |
Keywords
Levels.fyi scraper · Levels.fyi API alternative · Levels.fyi data extraction · tech salary data · tech compensation scraper · software engineer compensation scraper · FAANG comp data · total compensation benchmarking · tech offer letter data · L4 L5 L6 comp data · L7 L8 staff principal engineer salary · E5 E6 E7 Meta compensation · Google SWE compensation data · Meta SWE comp scraper · Amazon SDE salary data · Apple ICT compensation · Microsoft 63 65 67 levels · Netflix SWE salary · Nvidia ML engineer comp · Stripe Databricks Snowflake comp data · base salary stock grant bonus scraper · RSU vesting schedule data · tech recruiter intelligence · tech HR comp strategy data · VC portfolio salary benchmarking · founder hiring plan budgeting · tech journalism comp reporting · equity research engineering cost · pay transparency tech data · SF Bay Area tech salary · Seattle tech salary · NYC tech salary · Austin tech salary · Boston tech salary · Toronto tech salary · London tech salary · Bangalore tech salary · machine learning engineer compensation · product manager compensation scraper · data scientist comp benchmarking · engineering manager comp data · technical program manager salary · Apify Levels.fyi actor · tech salary negotiation data · crowdsourced compensation API · tech compensation transparency · annual TC scraper · stock refresher RSU benchmark
Support
- Bug reports: Use the Issues tab on the Apify Store page
- Feature requests: Same place — describe the company, role, or comp dimension you need
- Direct contact: Through the Apify developer profile
If this actor saves you an evening of clicking around Levels.fyi or rescues a salary negotiation, a 5-star rating on the Apify Store helps other engineers, recruiters, and comp analysts find it. Thank you.