Levels.fyi Scraper

Pricing

from $3.50 / 1,000 results

Scrape verified tech compensation data from Levels.fyi. Extract total compensation, base salary, stock grants, bonuses, levels and vesting schedules for 3,000+ companies including Google, Meta, Amazon, Apple and Microsoft.

Developer

Haketa

Maintained by Community

Levels.fyi Scraper — Tech Salary & Total Compensation Data Extractor for FAANG, Big Tech & Startups

The most reliable Levels.fyi scraper on Apify for downloading verified tech compensation data — base salary, stock grants, bonuses, and total comp by company, level, role, location, and years of experience. Built for candidate negotiation, recruiter intelligence, comp strategy, VC portfolio benchmarking, and equity research on engineering cost trends. A clean, structured Levels.fyi API alternative with no login, no manual copy-paste, and no Cloudflare headaches.

What This Actor Does

The Levels.fyi Scraper is a production-grade Apify actor that turns levels.fyi — the leading crowdsourced tech compensation database — into a clean, queryable JSON feed. Point it at a company slug (e.g. google, meta, amazon, apple, microsoft, netflix, nvidia, stripe, databricks) and a job family slug (e.g. software-engineer, product-manager, data-scientist, product-designer, engineering-manager) and the actor returns structured total compensation records for every level — L3 through L8+, IC4 through Distinguished, E3 through E9 — with base salary, stock grant value, bonus, location, and years of experience broken out.

In a single run you get the kind of structured tech salary data that engineering leaders, recruiters, and equity analysts otherwise build manually by clicking through dozens of Levels.fyi pages and copying numbers into spreadsheets.

The actor returns structured records covering:

  • Company-and-role salary pages — e.g. Google Software Engineer, Meta Product Manager, Amazon Data Scientist
  • Per-level compensation — L3/L4/L5/L6/L7/L8 (Google), E3/E4/E5/E6/E7/E8/E9 (Meta), SDE I/II/III/Sr/Principal (Amazon), ICT2-ICT6 (Apple), SDE-SDE 2-Senior-Principal-Partner (Microsoft), and equivalents at thousands of other companies
  • Total comp components — base salary, annual stock vest value, target bonus, total compensation
  • Submission metadata — years of experience, years at company, offer location, offer date (when published)
  • Aggregate stats — median TC (p50) and percentile rollups (p10/p25/p50/p75/p90) where Levels.fyi exposes them
  • Geographic breakouts — SF Bay Area, Seattle, NYC, Austin, Boston, Toronto, London, Bangalore, Dublin, Tel Aviv, Singapore

Each record carries the source URL and an ISO-8601 scrapedAt timestamp so you can stitch results into time-series dashboards or treat the dataset as an evidentiary log.

Why scrape Levels.fyi yourself when this exists?

Levels.fyi looks "scrapeable" until you actually try. Then you hit:

  • Cloudflare bot challenges that reject plain curl and most headless browsers immediately
  • A Next.js front-end with a rotating buildId that breaks any URL hard-coded against the build hash
  • Salary data buried inside __NEXT_DATA__ JSON blobs with shifting key names (averages vs percentiles vs median vs submissions)
  • React hydration that makes naive Cheerio scraping return half-empty DOMs
  • Aggressive rate-limiting on unauthenticated traffic, especially from cloud IP ranges (AWS, GCP, Azure)
  • Markdown endpoints (/companies/{slug}/salaries/{role}.md) that exist but aren't documented and silently return "can't be found"
  • Inconsistent salary string formats — $185k, 185,000, $185,000.00, 185K USD — that all need normalizing to a single integer
  • No public Levels.fyi API for outside developers; the internal one requires session cookies and is not for redistribution
  • Hand-copying compensation data into a spreadsheet is mind-numbing and goes stale within weeks

This actor solves all of it: a Playwright session that survives Cloudflare, a markdown-first parser that bypasses buildId rotation, three fallback strategies inside __NEXT_DATA__ (averages.samples → percentiles → median), a final HTML-table parser, and normalized integer USD output. No login. No browser farm. No pandas glue code.
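
The salary-string normalization mentioned above can be sketched in a few lines of Python. This is an illustration only, not the actor's published code: the helper name parse_usd and its regular expression are assumptions, and it covers just the formats listed in this section.

```python
import re
from typing import Optional

# Hypothetical helper; the actor's real normalizer is not shown in this README.
# Captures a number with optional thousands separators and decimals,
# followed by an optional k/K/m/M magnitude suffix.
_AMOUNT = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*([kKmM]?)")


def parse_usd(text: str) -> Optional[int]:
    """Turn strings like '$185k' or '185,000.00' into an integer amount."""
    match = _AMOUNT.search(text)
    if match is None:
        return None  # no digits at all, e.g. "n/a"
    number = float(match.group(1).replace(",", ""))
    multiplier = {"": 1, "k": 1_000, "m": 1_000_000}[match.group(2).lower()]
    return int(round(number * multiplier))


for raw in ["$185k", "185,000", "$185,000.00", "185K USD"]:
    print(raw, "->", parse_usd(raw))  # each of these yields 185000
```

Returning None for unparseable input matches the nullable numeric fields in the output schema. A production parser would also need rules for ranges and non-USD currencies.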


Quick Start

One-Click Run

  1. Click Try for free on the Apify Store page
  2. Leave companies at the default (["google"]) and jobFamilies at ["software-engineer"], or replace with your target list
  3. Optionally set a location filter (e.g. san-francisco-bay-area, seattle, new-york, london)
  4. Hit Start — your dataset of compensation records lands in the dataset view, ready to download as JSON, CSV, Excel, or HTML

API Run (Python)

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and block until the run finishes
run = client.actor("haketa/levels-fyi-scraper").call(run_input={
    "companies": ["google", "meta", "amazon", "apple", "microsoft"],
    "jobFamilies": ["software-engineer"],
    "location": "san-francisco-bay-area",
    "maxResults": 200,
})

# Nullable fields arrive as None, so substitute printable defaults
for r in client.dataset(run["defaultDatasetId"]).iterate_items():
    yoe = r.get("yearsOfExperience")
    print(f"{r['company']:<12} {r.get('level') or '?':<6} "
          f"TC ${r.get('totalCompensation') or 0:>7,} "
          f"Base ${r.get('baseSalary') or 0:>7,} "
          f"Stock ${r.get('stockGrant') or 0:>7,} "
          f"YoE {'?' if yoe is None else yoe} {r.get('location') or ''}")

API Run (Node.js / TypeScript)

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });
const run = await client.actor('haketa/levels-fyi-scraper').call({
    companies: ['stripe', 'databricks', 'snowflake'],
    jobFamilies: ['software-engineer', 'product-manager'],
    location: 'new-york',
    maxResults: 150,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Pulled ${items.length} compensation records`);

// Print the ten highest total-compensation records
items
    .filter(r => r.totalCompensation)
    .sort((a, b) => b.totalCompensation - a.totalCompensation)
    .slice(0, 10)
    .forEach(r => console.log(`${r.company} ${r.level} - $${r.totalCompensation.toLocaleString()} (${r.location})`));

API Run (cURL)

curl -X POST "https://api.apify.com/v2/acts/haketa~levels-fyi-scraper/runs?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"companies": ["nvidia", "openai", "anthropic"],
"jobFamilies": ["software-engineer", "machine-learning-engineer"],
"location": "san-francisco-bay-area",
"maxResults": 100
}'

Direct URL mode

If you already have a Levels.fyi URL — from search results, a shared link, or a manual filter — paste it straight in:

{
"urls": [
"https://www.levels.fyi/companies/google/salaries/software-engineer",
"https://www.levels.fyi/companies/meta/salaries/product-manager",
"https://www.levels.fyi/companies/amazon/salaries/data-scientist?location=seattle"
],
"maxResults": 250
}

When urls is provided, companies and jobFamilies are ignored.
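
A minimal sketch of how the input could be expanded into page URLs, assuming the /companies/{company}/salaries/{jobFamily} pattern visible in the example URLs above. The function name build_task_urls is hypothetical, and the actor's real task builder may differ.

```python
from itertools import product
from typing import Iterable, List, Optional

BASE = "https://www.levels.fyi"


def build_task_urls(
    urls: Optional[Iterable[str]] = None,
    companies: Iterable[str] = ("google",),
    job_families: Iterable[str] = ("software-engineer",),
) -> List[str]:
    """Return the pages to crawl; explicit urls override the slug inputs."""
    explicit = [u for u in (urls or []) if u.strip()]
    if explicit:
        return explicit  # companies and job_families are ignored
    # Otherwise crawl every company x job-family combination
    return [
        f"{BASE}/companies/{company}/salaries/{family}"
        for company, family in product(companies, job_families)
    ]


print(build_task_urls(companies=["google", "meta"],
                      job_families=["software-engineer", "product-manager"]))
```

Two companies and two job families expand to four pages, which is worth keeping in mind when sizing maxResults.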


How It Works

Levels.fyi serves three useful URL families:

| Endpoint | Pattern | Used For |
| --- | --- | --- |
| Company × Role page (HTML) | /companies/{company}/salaries/{jobFamily} | Primary salary breakdown by level |
| Company × Role markdown | /companies/{company}/salaries/{jobFamily}.md | Stable structured source, tried first; no buildId dependency |
| Role overview | /t/{jobFamily} | Cross-company role-level aggregates |

The three-tier extraction pipeline

This scraper combines a real Playwright browser (Chromium, headed, AutomationControlled disabled, webdriver flag patched) with three parsing layers, in priority order:

  1. Markdown endpoint first (.md URL). When Levels.fyi returns a salary table in markdown, this is the fastest, most stable shape — independent of the rotating Next.js buildId. The parser walks the markdown table, fuzzy-matches column names (base, salary, total comp, tc, stock, rsu, equity, bonus, yoe, location), and normalizes to integer USD.
  2. HTML page with __NEXT_DATA__ JSON. When markdown is missing or empty, the actor loads the full HTML page, parses the <script id="__NEXT_DATA__"> blob, and extracts in fall-through order: pageProps.averages[].samples[] (individual submissions per level) → pageProps.percentiles (p10/p25/p50/p75/p90 aggregate) → pageProps.median (single median submission).
  3. HTML table parser. If __NEXT_DATA__ is missing or schema-shifted, Cheerio walks rendered <table> elements, infers headers from the first row, and extracts the same fields.
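
The fall-through order in step 2 can be sketched as follows. This assumes the standard Next.js layout in which page data sits under props.pageProps, and that percentiles and median are JSON objects. The function name and the "kind" tag are hypothetical additions for illustration, and the actor's real parser is not shown here.

```python
import json
import re
from typing import Any, Dict, List

# Assumes id is the first attribute on the script tag, as Next.js emits it.
_NEXT_DATA = re.compile(
    r'<script id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)


def extract_from_next_data(html: str) -> List[Dict[str, Any]]:
    """Return entries from the first populated source, or [] if none."""
    match = _NEXT_DATA.search(html)
    if match is None:
        return []  # caller would fall through to the HTML-table parser
    try:
        page_props = json.loads(match.group(1))["props"]["pageProps"]
    except (ValueError, KeyError, TypeError):
        return []
    if not isinstance(page_props, dict):
        return []

    # 1. Individual submissions, grouped per level: averages[].samples[]
    samples = [
        {"kind": "sample", **sample}
        for level in page_props.get("averages") or []
        for sample in level.get("samples") or []
    ]
    if samples:
        return samples

    # 2. Aggregate percentiles (p10/p25/p50/p75/p90)
    percentiles = page_props.get("percentiles")
    if percentiles:
        return [{"kind": "percentiles", **percentiles}]

    # 3. Single median submission
    median = page_props.get("median")
    return [{"kind": "median", **median}] if median else []
```

An empty list signals the caller to try the next tier, so a schema shift in __NEXT_DATA__ degrades to the table parser instead of failing the run.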

Engineering details

  • Cloudflare-aware fetch loop — waits up to 30 s for "Just a moment…" / "Verify you are human" challenges to resolve, with three exponential-backoff retries per URL
  • Anti-detection patches: navigator.webdriver is forced to undefined, window.chrome.runtime is stubbed, and the browser presents a real Chrome/Windows user agent
  • Apify Proxy by default — residential pool recommended when Cloudflare gets aggressive
  • Number normalization: $185k, 185,000, $185K, and $185,000.00 all parse to the integer 185000
  • In-actor per-level percentile computation — Levels.fyi's __NEXT_DATA__ exposes only page-level aggregates, not per-level percentiles. The actor solves this by sorting each level group's TC, base, stock, and bonus arrays and linearly interpolating p10/p25/p50/p75/p90 in-process, then broadcasting the resulting medianTC / medianBase / medianStock / medianBonus / p10TC / p25TC / p75TC / p90TC / levelSampleCount onto every individual sample row in that level — verified live on Google L3 SWE (28 samples, medianTC $219,850, p25 $192,700, p75 $220,900, p90 $236,400, medianBase $165,000). Zero client-side aggregation needed.
  • Schema-stable output — every record has the same flat shape regardless of which extraction tier produced it
  • Polite delays — configurable requestDelay (default 2000 ms) with jitter
  • Append-only dataset writes — re-running the same input pushes fresh time-stamped snapshots, perfect for diff-based change detection
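
The per-level percentile step can be illustrated with a short sketch. It assumes all rows come from a single company × role page and uses linear interpolation between the two nearest ranks. Whether the actor uses this exact interpolation rule is not stated, so treat both function names and the method as assumptions. Only the TC fields are shown; base, stock, and bonus would follow the same pattern.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def percentile(sorted_values: List[float], q: float) -> Optional[float]:
    """Linearly interpolated percentile (q in 0..100) of an ascending list."""
    if not sorted_values:
        return None
    rank = (len(sorted_values) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def add_level_stats(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Attach per-level TC percentiles and the sample count to every row."""
    by_level: Dict[Any, List[float]] = defaultdict(list)
    for row in rows:
        if row.get("totalCompensation") is not None:
            by_level[row.get("level")].append(row["totalCompensation"])

    stats: Dict[Any, Dict[str, Any]] = {}
    for level, values in by_level.items():
        values.sort()
        stats[level] = {
            "medianTC": percentile(values, 50),
            "p10TC": percentile(values, 10),
            "p25TC": percentile(values, 25),
            "p75TC": percentile(values, 75),
            "p90TC": percentile(values, 90),
            "levelSampleCount": len(values),
        }
    # Broadcast: every row in a level receives the same aggregate fields
    return [{**row, **stats.get(row.get("level"), {})} for row in rows]
```

Broadcasting the aggregates onto each row trades some redundancy for rows that can be filtered on their own, which is what makes single-record comparisons possible downstream.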

Input Parameters

{
"urls": [],
"companies": ["google", "meta", "amazon"],
"jobFamilies": ["software-engineer", "product-manager"],
"location": "san-francisco-bay-area",
"maxResults": 100,
"requestDelay": 2000,
"proxyConfiguration": { "useApifyProxy": true }
}

Parameter reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| urls | array<string> | [] | Direct Levels.fyi URLs. Supports company salary pages, role pages, and .md endpoints. Example: https://www.levels.fyi/companies/google/salaries/software-engineer. When provided, companies and jobFamilies are ignored. |
| companies | array<string> | ["google"] | Company slugs from the Levels.fyi URL path. Examples: google, meta, amazon, apple, microsoft, netflix, nvidia, stripe, databricks, snowflake, airbnb, uber, lyft, pinterest, linkedin, salesforce. |
| jobFamilies | array<string> | ["software-engineer"] | Job family slugs. Examples: software-engineer, product-manager, data-scientist, product-designer, engineering-manager, machine-learning-engineer, solution-architect, technical-program-manager, data-engineer, hardware-engineer. Leave empty for software-engineer only. |
| location | string | "" | Location filter applied after extraction. Matches case-insensitively against the record's location. Examples: san-francisco-bay-area, new-york, seattle, london, bangalore. Empty = no location filter. |
| maxResults | integer | 100 | Hard cap on dataset items pushed across all tasks. 0 = unlimited. |
| requestDelay | integer | 2000 | Milliseconds to sleep between page fetches (with small jitter added). Set to 0 for the fastest crawl, higher for politeness. Max 10000. |
| proxyConfiguration | object | Apify Proxy ON | Standard Apify proxy block. Residential pool is strongly recommended because Levels.fyi sits behind Cloudflare and may challenge datacenter IPs. |

Output Schema

Every record uses one flat shape regardless of which extraction tier produced it.

| Field | Type | Description |
| --- | --- | --- |
| company | string | Display company name extracted from page heading (e.g. Google, Meta, Amazon) |
| companySlug | string | Lowercased URL slug used by Levels.fyi (e.g. google, meta) |
| jobFamily | string | Role family slug (e.g. software-engineer, product-manager) |
| jobTitle | string or null | Raw job title string if present in the row |
| level | string or null | Company-specific level label (e.g. L4, L5, E5, IC4, SDE II, Senior) |
| baseSalary | number or null | Annual base salary in USD (integer) |
| stockGrant | number or null | Annual RSU / equity grant value in USD (integer) |
| bonus | number or null | Annual target or signing bonus in USD (integer) |
| totalCompensation | number or null | Total annual comp in USD (base + stock + bonus) |
| yearsOfExperience | number or null | Reported years of professional experience |
| yearsAtCompany | number or null | Reported years at the listed company |
| location | string or null | Free-text location label (e.g. San Francisco, CA; Seattle, WA; London, UK) |
| country | string or null | Country when Levels.fyi exposes it separately |
| offerDate | string or null | Submission date string as reported by Levels.fyi |
| medianTC | number or null | Per-level median total compensation (p50) in USD, broadcast onto every sample row in the level group |
| medianBase | number or null | Per-level median base salary in USD |
| medianStock | number or null | Per-level median annualized stock grant in USD |
| medianBonus | number or null | Per-level median bonus in USD |
| p10TC | number or null | 10th-percentile total compensation for the level group |
| p25TC | number or null | 25th-percentile total compensation for the level group |
| p75TC | number or null | 75th-percentile total compensation for the level group |
| p90TC | number or null | 90th-percentile total compensation for the level group |
| levelSampleCount | number or null | Total submissions Levels.fyi has for that (company × level); a sample-size signal for reliability |
| levelBreakdown | array or null | Reserved for future per-level percentile arrays |
| sourceUrl | string | Full Levels.fyi page URL the record was extracted from |
| scrapedAt | string | ISO-8601 timestamp of extraction |

Example: Individual submission (Google L5 SWE, SF Bay Area)

{
"company": "Google", "companySlug": "google",
"jobFamily": "software-engineer", "jobTitle": "Software Engineer",
"level": "L5",
"baseSalary": 232000, "stockGrant": 215000, "bonus": 48000,
"totalCompensation": 495000,
"yearsOfExperience": 7, "yearsAtCompany": 2,
"location": "San Francisco, CA", "country": null,
"offerDate": "2026-02-14",
"medianTC": 219850, "medianBase": 165000, "medianStock": 38000, "medianBonus": 16500,
"p10TC": 178400, "p25TC": 192700, "p75TC": 220900, "p90TC": 236400,
"levelSampleCount": 28,
"levelBreakdown": null,
"sourceUrl": "https://www.levels.fyi/companies/google/salaries/software-engineer",
"scrapedAt": "2026-05-18T11:42:08.119Z"
}

Every individual sample row in a level group carries the same medianTC, medianBase, medianStock, medianBonus, p10TC/p25TC/p75TC/p90TC, and levelSampleCount — so a single row tells you both this person's offer and where it sits in the level distribution without a second query.

Example: Aggregate percentile rollup (Meta E5, all locations)

{
"company": "Meta", "companySlug": "meta",
"jobFamily": "software-engineer", "jobTitle": null,
"level": "E5",
"baseSalary": 230000, "stockGrant": 240000, "bonus": 60000,
"totalCompensation": 530000,
"yearsOfExperience": null, "yearsAtCompany": null,
"location": "United States", "country": null,
"offerDate": null,
"medianTC": 530000, "medianBase": 230000, "medianStock": 240000, "medianBonus": 60000,
"p10TC": 430000, "p25TC": 480000, "p75TC": 585000, "p90TC": 640000,
"levelSampleCount": 142,
"levelBreakdown": null,
"sourceUrl": "https://www.levels.fyi/companies/meta/salaries/software-engineer",
"scrapedAt": "2026-05-18T11:42:14.557Z"
}

Common Companies, Levels & Job Families

Big Tech & FAANG-equivalent — common level taxonomies

| Company | Slug | IC Level Range | Senior / Staff / Principal |
| --- | --- | --- | --- |
| Google / Alphabet | google | L3 → L8 | L5 = Senior · L6 = Staff · L7 = Sr Staff · L8 = Principal |
| Meta (Facebook) | meta | E3 → E9 | E5 = Senior · E6 = Staff · E7 = Sr Staff · E8 = Principal |
| Amazon | amazon | SDE I → Principal | SDE III = Senior · Principal SDE · Sr Principal SDE |
| Apple | apple | ICT2 → ICT6 | ICT4 = Senior · ICT5 = Staff · ICT6 = Principal |
| Microsoft | microsoft | 59 → 70 | 63 = Senior · 65 = Principal · 68 = Partner |
| Netflix | netflix | Single-band | Senior SWE · Staff SWE (intentionally flat ladder) |
| Nvidia | nvidia | IC3 → IC7 | IC5 = Senior · IC6 = Principal · IC7 = Distinguished |
| Stripe | stripe | L1 → L5 | L3 = Senior · L4 = Staff · L5 = Principal |
| Databricks | databricks | IC3 → IC7 | IC5 = Senior · IC6 = Staff · IC7 = Principal |
| Airbnb | airbnb | L3 → L7 | L5 = Senior · L6 = Staff · L7 = Principal |

Job family slugs you can pass to jobFamilies

software-engineer · product-manager · data-scientist · data-engineer · machine-learning-engineer · product-designer · engineering-manager · technical-program-manager · solution-architect · hardware-engineer · security-software-engineer · site-reliability-engineer

Tip: When a slug returns no data, Levels.fyi has likely renamed the role. Open the company's page in a browser, copy the slug from the URL, and feed it back in.


Use Cases

Candidate Negotiation & Career Planning

Software engineers, PMs, data scientists, designers, and EMs walk into salary negotiations with hard numbers instead of vibes:

  • Offer negotiation in one query — every row carries medianTC, p25TC, and p75TC, so you can instantly place a new offer in the level's distribution without any client-side aggregation
  • Salary band benchmarking — read p10/p25/p50/p75/p90 straight off a single record to see the entire compensation distribution at a glance
  • Comp gap detection — filter rows where baseSalary < medianBase to surface roles or companies where your base is below market (instant motivation to negotiate or move)
  • Benchmark a competing offer against the median TC at the same company × level × location before signing
  • Pinpoint the right level — find out whether the company is offering an L4 number for an L5 scope
  • Stack-rank target companies by total comp for your seniority before kicking off an interview loop
  • Build a per-component breakdown — base vs RSU vs bonus — so you can talk concretely about cliffs, refreshers, and vesting

Recruiter & Talent Intelligence

In-house recruiters and external search firms use Levels.fyi data to close more offers:

  • Recruiter intel with reliability scoring: levelSampleCount plus the p10/p90 spread tells you at a glance whether a level's distribution is statistically meaningful or built on three submissions
  • Build market-aligned offer ranges before sourcing, so packages don't get rejected at the verbal-offer stage
  • Identify which competitors out-pay you at each level — the source of most candidate losses
  • Tailor counter-offers with sourced benchmarks rather than gut feel
  • Coach hiring managers on what "Senior" actually costs in SF Bay Area vs Austin vs Toronto
  • Map the comp ladder for emerging companies (post-Series-D, post-IPO) where public salary disclosures don't exist yet

Tech HR & Total-Rewards Strategy

Heads of People, comp analysts, and total-rewards consultants use this data to:

  • Refresh comp bands quarterly without paying for full Radford / Mercer / Aon survey access every cycle
  • Validate Radford / Mercer / OptionImpact against crowdsourced ground truth
  • Justify level-realignment proposals to the exec team with external comparables
  • Run geographic differentials — what does the same L5 cost in Seattle vs NYC vs Boston vs London vs Bangalore?
  • Model the cost of a level promotion at scale across an engineering org

VC Portfolio Salary Benchmarking

Operating partners, portfolio talent leads, and platform teams at VC funds use this dataset to:

  • Build a portfolio-wide comp benchmark — "what does a Series B startup pay a Staff Engineer in 2026?"
  • Push back on founder offers that are over or under market and burn runway prematurely
  • Compare cash vs equity weighting between portfolio companies and Big Tech
  • Inform secondaries and tender-offer pricing with current comp comparables
  • Brief LP updates with hard data on how engineering-cost trends are reshaping burn

Founder Hiring Plan Budgeting

Early-stage and growth-stage founders use this scraper to:

  • Build a defensible engineering payroll model for the next 18 months at seed / Series A / Series B
  • Decide where to base the team — a Toronto or Bangalore satellite can cut average TC 40-60% vs SF-only
  • Set founding-engineer equity bands by reverse-engineering the Big Tech alternative a candidate is walking away from
  • Plan funding rounds around realistic salary requirements, not optimistic assumptions

Tech Journalism & Comp Reporting

Reporters at The Information, Bloomberg Tech, Business Insider, Wired, Rest of World, and independent newsletters use Levels.fyi data to:

  • Source "what FAANG pays" features with up-to-date, attribution-friendly aggregate figures
  • Track the comp impact of layoffs / hiring freezes / RIFs across companies
  • Compare AI-lab compensation (OpenAI, Anthropic, DeepMind, xAI) against incumbent Big Tech
  • Quantify the post-IPO comp reset when high-flying private companies go public

Equity Research & Engineering Cost Trends

Sell-side and buy-side analysts covering Big Tech use this dataset as a forward indicator of:

  • Median TC drift over time: scheduled weekly runs that diff medianTC per (company × level) surface comp resets, refreshers, and quiet pay cuts ahead of earnings calls
  • R&D opex pressure at MSFT, GOOG, META, AAPL, AMZN, NVDA before it shows up in 10-Qs
  • Talent flight signals — when a target company's median TC drifts below peers, retention erodes
  • Capacity buildouts — new ML-engineer hiring at AI labs predicts future cloud and compute commitments
  • Geographic restructuring — Bangalore / Dublin / Toronto hub growth vs SF / Seattle contraction
  • Stock-comp dilution forecasts — pairing TC with grant-vesting schedules surfaces future dilution

Compensation Transparency Advocacy

Pay-transparency activists, employee resource groups, and DEI-comp consultants use this data to:

  • Document pay disparities between in-office and remote, US and international hires
  • Support state-level pay-transparency laws (CA, CO, WA, NY) with empirical benchmark data
  • Identify under-leveling patterns — promotions delayed relative to market for specific groups
  • Provide free benchmarking to underrepresented candidates who otherwise lack negotiation data

Sample Queries & Recipes

Recipe 1: Compare FAANG SWE comp in SF Bay Area

{
"companies": ["google", "meta", "amazon", "apple", "netflix"],
"jobFamilies": ["software-engineer"],
"location": "san-francisco-bay-area",
"maxResults": 200
}

Recipe 2: AI-lab compensation benchmark

{
"companies": ["openai", "anthropic", "google-deepmind", "xai", "cohere"],
"jobFamilies": ["machine-learning-engineer", "software-engineer"],
"maxResults": 150
}

Recipe 3: Senior PM comp across post-IPO unicorns

{
"companies": ["stripe", "databricks", "snowflake", "airbnb", "uber", "lyft"],
"jobFamilies": ["product-manager"],
"maxResults": 200
}

Recipe 4: Geographic comp differential — same role, five cities

{
"companies": ["google"],
"jobFamilies": ["software-engineer"],
"location": "seattle",
"maxResults": 50
}

Run separately with location set to san-francisco-bay-area, new-york, austin, london, and bangalore — then diff in your spreadsheet to quantify the geographic premium.
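
The per-location comparison can also be scripted rather than done in a spreadsheet. The sketch below is one possible approach: collect_by_location expects an already-constructed apify_client.ApifyClient and reuses the same calls as the Python quick-start, while median_tc_premium is a hypothetical helper that compares each location's median TC against a baseline location.

```python
from statistics import median
from typing import Any, Dict, List, Tuple

LOCATIONS = ["seattle", "san-francisco-bay-area", "new-york",
             "austin", "london", "bangalore"]


def collect_by_location(client, locations: List[str],
                        company: str = "google",
                        job_family: str = "software-engineer") -> Dict[str, List[dict]]:
    """Run the actor once per location; client is an apify_client.ApifyClient."""
    results = {}
    for location in locations:
        run = client.actor("haketa/levels-fyi-scraper").call(run_input={
            "companies": [company],
            "jobFamilies": [job_family],
            "location": location,
            "maxResults": 50,
        })
        results[location] = list(
            client.dataset(run["defaultDatasetId"]).iterate_items())
    return results


def median_tc_premium(results: Dict[str, List[Dict[str, Any]]],
                      baseline: str) -> Dict[str, Tuple[float, float]]:
    """Map location -> (median TC, ratio to the baseline location's median)."""
    medians = {
        loc: median(r["totalCompensation"] for r in rows
                    if r.get("totalCompensation"))
        for loc, rows in results.items()
        if any(r.get("totalCompensation") for r in rows)
    }
    base = medians[baseline]  # raises KeyError if the baseline has no data
    return {loc: (value, value / base) for loc, value in medians.items()}
```

Calling median_tc_premium(collect_by_location(client, LOCATIONS), "seattle") gives each location's median and its ratio to Seattle. Locations that returned no usable rows are left out of the result.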

Recipe 5: Engineering Manager ladder at Big Tech

{
"companies": ["google", "meta", "amazon", "microsoft", "apple"],
"jobFamilies": ["engineering-manager"],
"maxResults": 200
}

Recipe 6: Direct URL crawl — preserve manual filters

{
"urls": [
"https://www.levels.fyi/companies/google/salaries/software-engineer?location=seattle",
"https://www.levels.fyi/companies/meta/salaries/product-manager?location=new-york",
"https://www.levels.fyi/companies/amazon/salaries/data-scientist?location=austin"
],
"maxResults": 300
}

Recipe 7: Sample run — sanity-check 10 records before a full crawl

{
"companies": ["google"],
"jobFamilies": ["software-engineer"],
"maxResults": 10
}

Integration Examples

Google Sheets

  1. Schedule the actor weekly in Apify
  2. Add the Export to Google Sheets integration
  3. Open the sheet — =AVERAGE(...), pivot tables, and charts work directly on the typed numeric fields

Make.com / Zapier / n8n

Use the Apify connector. Useful triggers:

  • New median TC delta > $20k week-over-week for a watchlist company
  • New record where level = "L7" and totalCompensation > $1M
  • Geographic-pay-gap watch — first SF record under the 25th percentile of NYC
  • Job-family expansion — new submissions for a role that previously had none

Power BI / Tableau / Looker

Wire the Apify dataset REST endpoint as a data source. Build dashboards covering median TC by company × level, geographic premium curves (SF vs Seattle vs NYC vs Toronto vs London vs Bangalore), base-vs-stock-vs-bonus mix shifts over time, YoE-to-TC scatter plots, and per-company comp ladder steepness (L3→L8 multiplier).

Postgres / Snowflake / BigQuery

Use an Apify webhook to notify your ingest endpoint when a run succeeds, then load that run's dataset items into your warehouse. Suggested schema:

CREATE TABLE levels_fyi_comp (
    id BIGSERIAL PRIMARY KEY,
    company TEXT, company_slug TEXT, job_family TEXT, level TEXT,
    base_salary NUMERIC, stock_grant NUMERIC, bonus NUMERIC, total_compensation NUMERIC,
    years_of_experience NUMERIC, years_at_company NUMERIC,
    location TEXT, country TEXT, offer_date DATE, median_tc NUMERIC,
    source_url TEXT, scraped_at TIMESTAMPTZ
);
CREATE INDEX idx_company_level ON levels_fyi_comp (company_slug, level);
CREATE INDEX idx_scraped_at ON levels_fyi_comp (scraped_at);
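
To load records into a table shaped like the one above, the camelCase dataset fields need to be mapped to the snake_case columns. The sketch below is one way to do that. The mapping and the to_insert helper are hypothetical, and the %s placeholders assume a DB-API driver that uses the "format" parameter style, as psycopg does.

```python
from typing import Any, Dict, Tuple

# Dataset field -> table column, in the column order of the suggested schema
COLUMN_FOR_FIELD = {
    "company": "company",
    "companySlug": "company_slug",
    "jobFamily": "job_family",
    "level": "level",
    "baseSalary": "base_salary",
    "stockGrant": "stock_grant",
    "bonus": "bonus",
    "totalCompensation": "total_compensation",
    "yearsOfExperience": "years_of_experience",
    "yearsAtCompany": "years_at_company",
    "location": "location",
    "country": "country",
    "offerDate": "offer_date",
    "medianTC": "median_tc",
    "sourceUrl": "source_url",
    "scrapedAt": "scraped_at",
}


def to_insert(record: Dict[str, Any]) -> Tuple[str, Tuple[Any, ...]]:
    """Build a parameterized INSERT statement and its value tuple."""
    columns = ", ".join(COLUMN_FOR_FIELD.values())
    placeholders = ", ".join(["%s"] * len(COLUMN_FOR_FIELD))
    sql = f"INSERT INTO levels_fyi_comp ({columns}) VALUES ({placeholders})"
    # Missing or null fields become None, which the driver sends as NULL
    values = tuple(record.get(field) for field in COLUMN_FOR_FIELD)
    return sql, values
```

Passing values separately from the SQL text leaves quoting to the driver, which matters because location and job title are free-text fields.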

Greenhouse / Lever / Ashby / Workday — ATS Enrichment

When a candidate moves to "Offer" stage, fire a webhook that looks up the candidate's current employer × role × level via this actor and pre-populates the comp-justification field with median TC.

Salesforce / HubSpot — Recruiter CRM

Tag prospective hires with the median TC at their current company × level. Recruiters see the number in the contact view and stop sending insulting offers.


Major Tech Hubs Covered

| Metro / Region | Why It Matters | Typical Senior SWE TC Range (local currency) |
| --- | --- | --- |
| San Francisco Bay Area | Global epicenter: highest comp band, deepest data | $400k–$700k |
| Seattle | AMZN/MSFT/Meta presence, no state income tax | $360k–$600k |
| New York City | Finance + tech overlap (HFT, Bloomberg, Two Sigma) | $380k–$650k |
| Austin | Texas hub for AAPL, GOOG, Tesla, Indeed | $300k–$500k |
| Boston | HubSpot, Wayfair, Toast, Klaviyo, biotech-tech | $280k–$480k |
| Los Angeles | Snap, Disney, Riot, SpaceX | $300k–$520k |
| Toronto | Largest Canadian hub: Shopify, Wealthsimple, Big Tech satellites | CAD 200k–CAD 350k |
| London | European HQ for most Big Tech | GBP 130k–GBP 280k |
| Dublin | EU tax-favored HQ for Google, Meta, Apple, Stripe | EUR 110k–EUR 220k |
| Berlin | SAP, N26, Delivery Hero, Zalando, Big Tech R&D | EUR 90k–EUR 180k |
| Tel Aviv | World-class R&D centers for AAPL, GOOG, MSFT, META, NVDA | NIS 600k–NIS 1.1M |
| Bangalore | Largest non-US Big Tech hub: Google, Meta, AMZN, MSFT | INR 70 LPA–INR 200+ LPA |
| Singapore | APAC HQ for Stripe, Meta, ByteDance, Shopee | SGD 180k–SGD 350k |
| Sydney | Atlassian, Canva, Google AU, AWS APAC | AUD 220k–AUD 400k |

All numeric examples are illustrative — actual values are scraped live from Levels.fyi at run time.


Cost & Performance

| Metric | Value |
| --- | --- |
| Engine | Playwright (Chromium, headed, anti-detection) + Cheerio fallback |
| Runtime (single company × role) | 20–60 seconds |
| Runtime (10 companies × 1 role) | 4–10 minutes |
| Runtime (50 companies × 3 roles) | 30–60 minutes |
| Cost per run | Pay-per-event: typically pennies for a single company, low single-digit dollars for large multi-company crawls |
| Pricing model | Pay-per-event (transparent per-record billing) |
| Data freshness | Live at run time; reflects the latest submissions visible on Levels.fyi |
| Auth required | None |
| Proxy required | Apify Proxy enabled by default; residential pool recommended when Cloudflare gets aggressive |
| Concurrency | Safe to run multiple parallel configurations with different company/role splits |
| Memory footprint | 1024 MB minimum, 4096 MB recommended for large multi-company crawls |

Legal & Compliance

  • Public data only — every field returned is published openly on levels.fyi and visible to any web visitor without authentication
  • No PII — Levels.fyi submissions are crowdsourced anonymously by definition; the dataset contains no names, emails, phone numbers, employee IDs, or contact info
  • No PHI / financial-account data — only compensation figures, levels, and high-level offer metadata
  • Respect Levels.fyi's Terms of Service — keep crawl volume modest, use requestDelay, do not redistribute the dataset commercially without verifying ToS, do not rebuild the site as a clone
  • GDPR / CCPA — submissions are anonymous and do not constitute personal data under most jurisdictions, but the data consumer is responsible for compliance with applicable privacy law
  • Attribution — cite Levels.fyi when republishing aggregate figures publicly
  • Negotiation use — using crowdsourced comp data in personal salary negotiation is, in nearly every jurisdiction, completely legal and increasingly expected

Important: This actor is built for research, benchmarking, transparency, and individual negotiation support. Do not use it for any purpose that would breach Levels.fyi's ToS or applicable law.


Frequently Asked Questions

How fresh is the data?

Live at run time. Each run hits Levels.fyi directly and parses whatever the site currently shows. Levels.fyi accepts new compensation submissions continuously, so daily or weekly schedules will capture the latest entries.

How many records will I get per company × role?

It depends on how many anonymous submissions Levels.fyi has for that page. Popular pages (Google SWE, Meta SWE, Amazon SWE) often return dozens of individual submissions plus aggregate percentiles. Niche pages may return only a percentile rollup or a single median record. The maxResults cap applies across all tasks.

Does this require a Levels.fyi login or API key, and does Cloudflare block it?

No login or API key — Levels.fyi has no public API, and this actor uses no session cookie. Cloudflare is handled by a real Playwright Chromium browser with anti-detection patches and exponential-backoff retries. If a specific Apify Proxy IP gets challenged repeatedly, increase requestDelay or switch the proxy group to residential.
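
The retry behavior can be pictured with a small sketch. It assumes a caller-supplied fetch function that returns page HTML, detects a challenge by the two marker strings quoted in this README, and doubles the wait after each failed attempt. The function name, the 2-second base delay, and the 25% jitter are illustrative assumptions, not the actor's documented values.

```python
import random
import time
from typing import Callable, Optional

# Challenge-page markers quoted in this README
CHALLENGE_MARKERS = ("Just a moment", "Verify you are human")


def fetch_with_backoff(
    fetch: Callable[[str], str],
    url: str,
    retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url) until the response stops looking like a challenge."""
    for attempt in range(retries + 1):
        html = fetch(url)
        if not any(marker in html for marker in CHALLENGE_MARKERS):
            return html
        if attempt < retries:
            # Wait 2 s, 4 s, 8 s, ... plus up to 25% random jitter
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
    return None  # still challenged after all retries; caller logs and moves on
```

The sleep function is injectable so the loop can be exercised in tests without real waiting. Returning None instead of raising mirrors the "log a warning and move on" behavior described for failed pages.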

Can I scrape compensation data outside the US?

Yes. Levels.fyi has international submissions for London, Dublin, Berlin, Amsterdam, Tel Aviv, Toronto, Vancouver, Bangalore, Hyderabad, Singapore, Tokyo, Sydney, and more. Either pass an international location filter or hit URLs that already carry the ?location= query parameter via the urls input.

Are individual submitters identified?

No. Levels.fyi submissions are anonymous; the site does not publish names, emails, or contact info, and neither does this scraper.

Can I get a complete dump of every company on Levels.fyi?

The actor does not auto-discover companies — Levels.fyi covers 3,000+ companies and one-shotting them all is impractical. Maintain your own watchlist (S&P 500 tech, your portfolio companies, your competitor set) and pass slugs via companies.

Can I get historical data?

Levels.fyi shows current submissions. To build a historical series, schedule this actor weekly or monthly and persist each run's dataset to your warehouse. Unnamed run datasets expire after your plan's data-retention period, so export them promptly or write to a named dataset, which Apify keeps indefinitely.
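
Once two snapshots are stored, comparing them is straightforward. The sketch below is a hypothetical helper that keys each snapshot by (companySlug, level) and reports median TC changes above a threshold. The $20k default echoes the week-over-week trigger suggested in the integrations section.

```python
from typing import Any, Dict, Iterable, Tuple

Key = Tuple[str, str]  # (companySlug, level)


def median_tc_by_level(rows: Iterable[Dict[str, Any]]) -> Dict[Key, float]:
    """Collapse rows to one medianTC per (companySlug, level).

    medianTC is the same on every row of a level group, so the last
    row seen for a key is as good as any other.
    """
    out: Dict[Key, float] = {}
    for row in rows:
        if row.get("medianTC") is not None and row.get("level"):
            out[(row["companySlug"], row["level"])] = row["medianTC"]
    return out


def median_tc_changes(previous: Iterable[Dict[str, Any]],
                      current: Iterable[Dict[str, Any]],
                      threshold: float = 20_000) -> Dict[Key, float]:
    """Return medianTC deltas whose absolute size exceeds the threshold."""
    before = median_tc_by_level(previous)
    after = median_tc_by_level(current)
    return {
        key: after[key] - before[key]
        for key in before.keys() & after.keys()  # levels present in both runs
        if abs(after[key] - before[key]) > threshold
    }
```

Levels that appear in only one snapshot are ignored here. A fuller version might report them separately as newly added or dropped levels.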

What's the difference between an individual submission and a percentile rollup?

When Levels.fyi exposes per-submission samples for a level, you get one record per submission with YoE, location, and offer date. When only aggregates are published, you get a single rollup record with medianTC set to the p50 figure.

Does each row include the salary median?

Yes. Every sample row carries its level group's full distribution — medianTC, medianBase, medianStock, medianBonus, plus p10TC, p25TC, p75TC, p90TC, and levelSampleCount — computed in-actor from the level's submissions array. A single record is enough to place any offer in context:

{
"company": "Google", "level": "L3",
"baseSalary": 165000, "stockGrant": 38000, "bonus": 16500,
"totalCompensation": 219500,
"medianTC": 219850, "medianBase": 165000, "medianStock": 38000, "medianBonus": 16500,
"p10TC": 178400, "p25TC": 192700, "p75TC": 220900, "p90TC": 236400,
"levelSampleCount": 28
}

That means SQL like WHERE totalCompensation < p25TC (under-market) or WHERE totalCompensation > p75TC (top-quartile offers) works directly against the dataset — no GROUP BY required.
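The same comparison can be sketched in Python for readers who work outside SQL. The function name and the "mid-range" and "unknown" labels are hypothetical; the field names and sample values come from the record above.

```python
def market_position(record):
    """Label a record relative to its own level's quartiles."""
    tc = record.get("totalCompensation")
    p25 = record.get("p25TC")
    p75 = record.get("p75TC")
    if tc is None or p25 is None or p75 is None:
        return "unknown"  # not enough data to place the offer
    if tc < p25:
        return "under-market"
    if tc > p75:
        return "top-quartile"
    return "mid-range"


sample = {
    "company": "Google", "level": "L3",
    "totalCompensation": 219500,
    "p25TC": 192700, "p75TC": 220900,
}
print(market_position(sample))  # mid-range
```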

Can I filter by salary range, YoE, or specific location?

Apply range/YoE filters downstream — SQL WHERE, Python list comprehension, Sheets filter, Power BI slicer — on the typed numeric fields. The location input is a case-insensitive substring match against the record's location field (set seattle and you'll get "Seattle, WA" and "Greater Seattle Area"). For source-side filters, pass a Levels.fyi URL with the right ?location= query parameter via urls.
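As a rough illustration of downstream filtering, the sketch below mirrors the case-insensitive substring match on location and adds range checks on totalCompensation and yearsOfExperience. The helper names are hypothetical and the sample records are made up; records whose filtered field is null are dropped only when a bound is set on that field.

```python
def in_range(value, low, high):
    """True if value is within [low, high]; a None bound is open-ended."""
    if low is None and high is None:
        return True
    if value is None:
        return False  # cannot satisfy a bound without a value
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def filter_records(records, location=None, tc_range=(None, None), yoe_range=(None, None)):
    """Filter records by location substring, total comp range, and YoE range."""
    needle = location.lower() if location else None
    kept = []
    for record in records:
        if needle is not None and needle not in (record.get("location") or "").lower():
            continue
        if not in_range(record.get("totalCompensation"), *tc_range):
            continue
        if not in_range(record.get("yearsOfExperience"), *yoe_range):
            continue
        kept.append(record)
    return kept


# Made-up records for illustration only.
records = [
    {"location": "Seattle, WA", "totalCompensation": 250000, "yearsOfExperience": 5},
    {"location": "Greater Seattle Area", "totalCompensation": 180000, "yearsOfExperience": 2},
    {"location": "London", "totalCompensation": 150000, "yearsOfExperience": 4},
    {"location": "Seattle, WA", "totalCompensation": 300000, "yearsOfExperience": None},
]
seattle = filter_records(records, location="seattle")
senior = filter_records(records, location="seattle", tc_range=(200000, None), yoe_range=(3, None))
```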

What if I want the role overview page instead of company × role?

Pass a /t/{jobFamily} URL directly via urls — for example https://www.levels.fyi/t/software-engineer. The extraction pipeline returns whatever Levels.fyi exposes in __NEXT_DATA__ for that page.
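A small sketch of building role overview URLs in bulk. It assumes the urls input accepts a plain list of URL strings, which is not confirmed here, so check the actor's input schema before relying on it. The helper names are hypothetical.

```python
BASE_URL = "https://www.levels.fyi"


def role_overview_url(job_family):
    """Build the /t/{jobFamily} URL for a job family name or slug."""
    slug = job_family.strip().lower().replace(" ", "-")
    return f"{BASE_URL}/t/{slug}"


def build_input(job_families):
    """Wrap the URLs in an input dict (assumed shape: {"urls": [str, ...]})."""
    return {"urls": [role_overview_url(name) for name in job_families]}


actor_input = build_input(["software-engineer", "Product Manager"])
```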

Does this work on the Apify Free Plan, and can I schedule it?

Yes to both — full functionality on the free tier (pay-per-event means a test run costs a few cents), and Apify's built-in Scheduler supports any cron expression. Pair with a Google Sheets / webhook / Postgres integration for fully automated comp refresh.

What export formats are supported?

JSON, CSV, Excel (XLSX), HTML, XML, RSS, and JSON Lines — directly from the Apify dataset view or the REST API.
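For REST access, the sketch below builds a download URL for Apify's dataset items endpoint, which selects the export format through a format query parameter. The helper name and the dataset ID are placeholders, and the function only constructs the URL without making a request.

```python
from urllib.parse import urlencode

# Format values accepted by the dataset items endpoint.
FORMATS = {"json", "jsonl", "csv", "xlsx", "html", "xml", "rss"}


def dataset_export_url(dataset_id, fmt="json", token=None):
    """Return the dataset items URL for the requested export format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    params = {"format": fmt}
    if token:
        params["token"] = token  # needed for private datasets
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


csv_url = dataset_export_url("YOUR_DATASET_ID", "csv")  # placeholder ID
```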

How are errors and partial results handled?

When a single company × role page fails (Cloudflare hard-blocks, slug 404s, the role doesn't exist on that company), the actor logs a warning and moves on. Records scraped successfully are still pushed to the dataset; check the run log to see which company × role pages failed and why.

Why are some records missing level or yearsOfExperience?

Aggregate percentile records (pageProps.percentiles fall-through) and some legacy submissions don't expose those fields. The schema returns null rather than dropping the record, so the TC / base / stock figures remain usable.
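Because level can be null, downstream grouping needs a fallback bucket. A minimal sketch, with a hypothetical helper name, a hypothetical "unknown" label, and made-up records:

```python
from collections import defaultdict


def group_by_level(records, missing_label="unknown"):
    """Group records by level, collecting null levels under one label."""
    groups = defaultdict(list)
    for record in records:
        groups[record.get("level") or missing_label].append(record)
    return dict(groups)


# Made-up records for illustration only.
records = [
    {"company": "Google", "level": "L3", "yearsOfExperience": 1},
    {"company": "Google", "level": None, "yearsOfExperience": None},
    {"company": "Google", "level": "L3", "yearsOfExperience": 2},
]
groups = group_by_level(records)
```

Keeping the null bucket separate, instead of discarding it, preserves the compensation figures on those records for company-wide totals.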

Is this a Levels.fyi API alternative?

For read-only public compensation data, yes — this is the practical alternative when you need structured Levels.fyi data programmatically without a documented public API. For commercial redistribution, contact Levels.fyi directly.




Comparison vs. Alternatives

| Approach | Setup time | Data freshness | Cost (1 company, ~30 records) | Cloudflare handled | Schema normalization | Schedulable |
| --- | --- | --- | --- | --- | --- | --- |
| This actor | < 1 minute | Live | Pennies | Built-in | Built-in | Yes (Apify scheduler) |
| Manual Levels.fyi browsing + spreadsheet copy | Hours per company | Stale immediately | Free | N/A | DIY | No |
| Custom Playwright script | Days | Live | Free + infra + maintenance | DIY | DIY | DIY |
| Paid comp-survey subscription (Radford, Mercer, Aon) | Weeks (procurement) | Quarterly | $20k–$100k+ /yr | N/A | Yes | Yes |
| Internal recruiter heads-up calls | Days per data point | Anecdotal | Soft-cost | N/A | None | No |
| Levels.fyi Premium / End Game | Minutes | Live | Per-user subscription | N/A | UI only | No |

Why Pay-Per-Event Pricing?

This actor uses Apify's pay-per-event pricing model, which means:

  • You only pay when the actor actually runs and returns records
  • Charges scale with how many compensation records you actually consume
  • No monthly subscription, no recurring minimum
  • Transparent line-item billing inside the Apify Console
  • Cheap to evaluate: run with maxResults: 5 for a few cents before committing to a bigger crawl

Changelog

| Version | Date | Notes |
| --- | --- | --- |
| 1.1 | 2026-05 | Added in-actor per-level percentile computation: every sample row now carries medianTC, medianBase, medianStock, medianBonus, p10TC/p25TC/p75TC/p90TC, and levelSampleCount, interpolated from each level group's submissions array |
| 1.0 | 2026-05 | Initial public release: Playwright Cloudflare-aware fetch, markdown-first parser, three-tier __NEXT_DATA__ fallback (averages.samples → percentiles → median), HTML-table parser, normalized integer USD output, configurable requestDelay, location filter, direct-URL mode |

Keywords

Levels.fyi scraper · Levels.fyi API alternative · Levels.fyi data extraction · tech salary data · tech compensation scraper · software engineer compensation scraper · FAANG comp data · total compensation benchmarking · tech offer letter data · L4 L5 L6 comp data · L7 L8 staff principal engineer salary · E5 E6 E7 Meta compensation · Google SWE compensation data · Meta SWE comp scraper · Amazon SDE salary data · Apple ICT compensation · Microsoft 63 65 67 levels · Netflix SWE salary · Nvidia ML engineer comp · Stripe Databricks Snowflake comp data · base salary stock grant bonus scraper · RSU vesting schedule data · tech recruiter intelligence · tech HR comp strategy data · VC portfolio salary benchmarking · founder hiring plan budgeting · tech journalism comp reporting · equity research engineering cost · pay transparency tech data · SF Bay Area tech salary · Seattle tech salary · NYC tech salary · Austin tech salary · Boston tech salary · Toronto tech salary · London tech salary · Bangalore tech salary · machine learning engineer compensation · product manager compensation scraper · data scientist comp benchmarking · engineering manager comp data · technical program manager salary · Apify Levels.fyi actor · tech salary negotiation data · crowdsourced compensation API · tech compensation transparency · annual TC scraper · stock refresher RSU benchmark


Support

  • Bug reports: Use the Issues tab on the Apify Store page
  • Feature requests: Same place — describe the company, role, or comp dimension you need
  • Direct contact: Through the Apify developer profile

If this actor saves you an evening of clicking around Levels.fyi or rescues a salary negotiation, a 5-star rating on the Apify Store helps other engineers, recruiters, and comp analysts find it. Thank you.