# H-1B LCA Visa Wage & Employer Data Scraper (`parseforge/h1b-lca-disclosure-scraper`) Actor

Scrape US DOL H-1B Labor Condition Application records: employer, job title, base salary, prevailing wage, work location, case status, SOC/NAICS codes, and decision dates.

- **URL**: https://apify.com/parseforge/h1b-lca-disclosure-scraper.md
- **Developed by:** [ParseForge](https://apify.com/parseforge) (community)
- **Categories:** Business, Jobs, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is priced per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

![ParseForge Banner](https://raw.githubusercontent.com/ParseForge/apify-assets/main/banner.jpg)

## 🛂 H-1B LCA Disclosure Scraper

> 🚀 **Pull every public H-1B Labor Condition Application in seconds.** Filter by employer, job title, work city, or year. No API key, no manual XLSX wrangling, no PDF parsing.

> 🕒 **Last updated:** 2026-05-16 · **📊 60+ fields** per case · **8M+ certified cases** searchable · **2012 to 2025** coverage · **Hourly + annual wages**

Every year the US Department of Labor's Office of Foreign Labor Certification (OFLC) publishes hundreds of thousands of Labor Condition Application (LCA) decisions for H-1B, H-1B1, and E-3 visa workers. Buried in those records is the only public wage data the federal government publishes at the individual case level, along with the sponsoring employer, the job title, the SOC occupation code, the worksite address, the prevailing wage determination, and the immigration attorney representing the employer. Until now the only way to query it was to download a 600 MB quarterly XLSX file and write your own pivot tables.

This scraper turns that public disclosure data into a clean JSON or CSV feed. Plug in an employer name, a city, or a year and get back per-case records with the case status, decision date, base salary, wage range, prevailing wage level, employer POC, and law firm filing on the petition. Use it to benchmark visa sponsorship wages, build a sponsor lookup for an immigration practice, score employer H-1B dependency, or feed a salary comparison product. The data is government-sourced, court-admissible, and refreshed quarterly by DOL.

| 🎯 Built for | 💡 Common use cases |
|---|---|
| 🧑‍⚖️ Immigration attorneys | Case prep, employer due diligence, RFE responses |
| 🏢 Corporate HR + mobility teams | Wage benchmarking, sponsor compliance audits |
| 🧮 Salary research products | Augment Levels.fyi / Glassdoor with visa wages |
| 🎓 Policy researchers + journalists | Track H-1B sponsor concentration, wage trends |

---

### 📋 What the H-1B LCA Disclosure Scraper does

- 🔍 **Search by employer.** Pull every LCA filed by Stripe, Google, Microsoft, Infosys, or any sponsor in the disclosure dataset.
- 🏙️ **Search by work city or year.** Build wage tables for a specific metro or a specific fiscal year.
- 💼 **Search by job title.** Get every Software Engineer, Data Scientist, or Account Executive LCA across all sponsors.
- 💰 **Capture full wage detail.** Base wage, wage range (from + to), wage unit (Year / Hour / Month), prevailing wage, and PW level (I-IV).
- 📂 **Capture full case lifecycle.** Submit date, decision date, employment start + end, status (Certified / Denied / Withdrawn).
- 🏛️ **Capture attorney + law firm filings.** When an agent represents the employer the dataset includes the lawyer name, email, phone, and law firm.

Each record carries the canonical DOL case number, full employer address with NAICS code, the employer point of contact (name, title, phone, email), the worksite address and county, the H-1B dependency flag, the willful violator flag, and the public disclosure election. 60+ fields per case, every value populated when it is filed on the source LCA.

> 💡 **Why it matters:** H-1B wage data is the only public per-person wage dataset the US government publishes. It is the source of truth for visa sponsorship benchmarks, immigration case prep, and any salary product that wants to cover the non-resident segment of the US tech labor market.

---

### 🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing a sponsor lookup, a wage benchmark query, and a salary-by-city export.

---

### ⚙️ Input

<table>
<thead>
<tr><th>Field</th><th>Type</th><th>Required</th><th>What it does</th></tr>
</thead>
<tbody>
<tr><td><code>employer</code></td><td>string</td><td>no</td><td>Filter by sponsor name (case insensitive, partial match). Example: <code>STRIPE</code>, <code>GOOGLE LLC</code>, <code>INFOSYS</code>.</td></tr>
<tr><td><code>jobTitle</code></td><td>string</td><td>no</td><td>Filter by job title keyword. Example: <code>SOFTWARE ENGINEER</code>, <code>DATA SCIENTIST</code>, <code>FINANCIAL ANALYST</code>.</td></tr>
<tr><td><code>city</code></td><td>string</td><td>no</td><td>Filter by employment city. Example: <code>SAN FRANCISCO</code>, <code>AUSTIN</code>, <code>NEW YORK</code>.</td></tr>
<tr><td><code>year</code></td><td>enum</td><td>no</td><td>Single fiscal year (2012 to 2025) or <code>All Years</code>. Default <code>2024</code>. Combine with at least one of employer / job / city for best results.</td></tr>
<tr><td><code>includeDetails</code></td><td>boolean</td><td>no</td><td>Default <code>true</code>. Fetch the per-case detail page to enrich each record with status, decision date, SOC code, NAICS, employer POC, attorney filing, prevailing wage, and worksite address. Set to <code>false</code> for a faster listing-only scrape.</td></tr>
<tr><td><code>startUrl</code></td><td>string</td><td>no</td><td>Paste a search URL from the source site to bypass the filter fields. Useful when copying a saved query.</td></tr>
<tr><td><code>maxItems</code></td><td>integer</td><td>no</td><td>Free plan: capped at 10 (preview). Paid plan: up to 1,000,000.</td></tr>
</tbody>
</table>

Example: every LCA Stripe filed in 2024, full detail.

```json
{
    "employer": "STRIPE",
    "year": "2024",
    "includeDetails": true,
    "maxItems": 500
}
````

Example: every Austin LCA filed in 2024, listing only (fast).

```json
{
    "city": "AUSTIN",
    "year": "2024",
    "includeDetails": false,
    "maxItems": 5000
}
```

> ⚠️ **Good to Know:** Combining an empty employer with `All Years` returns no rows because the source site refuses unbounded queries. Pin at least one of employer, job, city, or a specific year to get results.
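
The rule in the note above can be checked before a run is started. The following is a minimal sketch, assuming the input keys from the table above and the literal `All Years` label for the year enum; `build_input` is a hypothetical helper, not part of the Actor or the Apify client.

```python
from typing import Any, Dict, Optional

ALL_YEARS = "All Years"  # label taken from the input table; the exact enum value is assumed


def build_input(
    employer: Optional[str] = None,
    job_title: Optional[str] = None,
    city: Optional[str] = None,
    year: str = "2024",
    include_details: bool = True,
    max_items: int = 10,
) -> Dict[str, Any]:
    """Assemble an Actor input dict and reject unbounded queries."""
    filters = {"employer": employer, "jobTitle": job_title, "city": city}
    # Keep only the filters that carry a non-blank value.
    active = {key: value.strip() for key, value in filters.items() if value and value.strip()}
    if not active and year == ALL_YEARS:
        raise ValueError(
            "Pin at least one of employer, jobTitle, city, or a specific year."
        )
    return {
        **active,
        "year": year,
        "includeDetails": include_details,
        "maxItems": max_items,
    }
```

Calling `build_input(employer="STRIPE", max_items=500)` produces the same dict as the first example above, while `build_input(year="All Years")` raises before any run is started.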

***

### 📊 Output

Every record is a single LCA case with the employer, job, salary, work location, dates, and the full DOL filing detail.

#### 🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🆔 `caseNumber` | string | `I-200-24138-006531` |
| ✅ `status` | string | `Certified` |
| 🔗 `caseUrl` | string | `https://h1bdata.info/details.php?id=I-200-24138-006531` |
| 🛂 `visaClass` | string | `H-1B` |
| 🏢 `employerLegalName` | string | `Stripe, Inc.` |
| 💼 `jobTitle` | string | `APPLICATION SECURITY ENGINEER` |
| 📑 `socCode` | string | `15-1212.00` |
| 📑 `socTitle` | string | `Information Security Analysts` |
| 💰 `baseSalary` | number | `169395` |
| 💵 `wageRateFrom` | number | `169395` |
| 💵 `wageRateTo` | number | `250000` |
| 📐 `wageRateUnit` | string | `Year` |
| 📊 `prevailingWage` | number | `169395` |
| 🪜 `prevailingWageLevel` | string | `IV` |
| 🗓️ `prevailingWageOesYear` | string | `7/1/2023 - 6/30/2024` |
| 🏙️ `workCity` | string | `SOUTH SAN FRANCISCO` |
| 🗺️ `workState` | string | `CA` |
| 📅 `submitDate` | string | `2024-05-16` |
| 📅 `decisionDate` | string | `2024-05-23` |
| 📅 `employmentStartDate` | string | `2024-10-25` |
| 📅 `employmentEndDate` | string | `2027-10-24` |
| 👥 `totalWorkerPositions` | integer | `1` |
| 🏢 `naicsCode` | string | `522320` |
| 📞 `employerPhone` | string | `14157379490` |
| ✉️ `employerPocEmail` | string | `drivera@stripe.com` |
| ⚖️ `lawFirmName` | string | `FRAGOMEN, DEL REY, BERNSEN & LOEWY, LLP` |
| 🚩 `h1bDependent` | boolean | `false` |
| 🚨 `willfulViolator` | boolean | `false` |
| 🕒 `scrapedAt` | string | `2026-05-16T04:30:07.640Z` |

(60+ fields total. Worksite address, POC contact info, attorney filing, and case workforce breakdown are all included when the source LCA carries them.)

#### 📦 Sample records

<details>
<summary>🟢 Typical: certified H-1B with attorney filing (Stripe, application security engineer)</summary>

```json
{
    "caseNumber": "I-200-24138-006531",
    "status": "Certified",
    "caseUrl": "https://h1bdata.info/details.php?id=I-200-24138-006531",
    "visaClass": "H-1B",
    "employerName": "STRIPE INC",
    "employerLegalName": "Stripe, Inc.",
    "jobTitle": "APPLICATION SECURITY ENGINEER",
    "socCode": "15-1212.00",
    "socTitle": "Information Security Analysts",
    "fullTimePosition": true,
    "baseSalary": 169395,
    "wageRateFrom": 169395,
    "wageRateTo": 250000,
    "wageRateUnit": "Year",
    "prevailingWage": 169395,
    "prevailingWageUnit": "Year",
    "prevailingWageLevel": "IV",
    "prevailingWageOesYear": "7/1/2023 - 6/30/2024",
    "workCity": "SOUTH SAN FRANCISCO",
    "workState": "CA",
    "worksiteAddress": "354 Oyster Point Blvd",
    "worksiteCity": "South San Francisco",
    "worksiteCounty": "SAN MATEO",
    "worksiteState": "CA",
    "worksitePostalCode": "94080",
    "worksiteWorkers": 1,
    "totalWorksiteLocations": 2,
    "submitDate": "2024-05-16",
    "decisionDate": "2024-05-23",
    "employmentStartDate": "2024-10-25",
    "employmentEndDate": "2027-10-24",
    "totalWorkerPositions": 1,
    "newEmployment": 0,
    "continuedEmployment": 0,
    "changePreviousEmployment": 1,
    "newConcurrentEmployment": 0,
    "changeEmployer": 0,
    "amendedPetition": 0,
    "employerAddress": "354 Oyster Point Blvd",
    "employerCity": "South San Francisco",
    "employerState": "CA",
    "employerPostalCode": "94080",
    "employerCountry": "UNITED STATES OF AMERICA",
    "employerPhone": "14157379490",
    "naicsCode": "522320",
    "employerPocFirstName": "Diego",
    "employerPocLastName": "Rivera",
    "employerPocJobTitle": "Talent Mobility Program Specialist",
    "employerPocEmail": "drivera@stripe.com",
    "agentRepresentingEmployer": true,
    "agentAttorneyFirstName": "Daniel",
    "agentAttorneyLastName": "Williamson",
    "agentAttorneyEmail": "dwilliamson@fragomen.com",
    "agentAttorneyPhone": "12022235515",
    "lawFirmName": "FRAGOMEN, DEL REY, BERNSEN & LOEWY, LLP",
    "h1bDependent": false,
    "willfulViolator": false,
    "publicDisclosure": "Disclose Business",
    "secondaryEntity": false,
    "scrapedAt": "2026-05-16T04:30:07.640Z"
}
```

</details>

<details>
<summary>🟡 Edge case: hourly wage filing, staffing employer, no outside counsel (Austin .NET developer)</summary>

```json
{
    "caseNumber": "I-200-24247-309226",
    "status": "Certified",
    "caseUrl": "https://h1bdata.info/details.php?id=I-200-24247-309226",
    "visaClass": "H-1B",
    "employerName": "JUDGE TECHNICAL SERVICES INC",
    "employerLegalName": "JUDGE TECHNICAL SERVICES INC",
    "jobTitle": ".NET C DEVELOPER",
    "socCode": "15-1252.00",
    "socTitle": "Software Developers",
    "fullTimePosition": true,
    "baseSalary": 127180,
    "wageRateFrom": 63.59,
    "wageRateUnit": "Hour",
    "prevailingWage": 63.59,
    "prevailingWageUnit": "Hour",
    "prevailingWageLevel": "III",
    "prevailingWageOesYear": "7/1/2024 - 6/30/2025",
    "workCity": "AUSTIN",
    "workState": "TX",
    "worksiteAddress": "2309 GRACY FARMS LANE",
    "worksiteCity": "AUSTIN",
    "worksiteCounty": "TRAVIS",
    "worksiteState": "TX",
    "worksitePostalCode": "78758",
    "worksiteWorkers": 1,
    "totalWorksiteLocations": 1,
    "submitDate": "2024-09-03",
    "decisionDate": "2024-09-10",
    "employmentStartDate": "2024-09-16",
    "employmentEndDate": "2027-09-01",
    "totalWorkerPositions": 1,
    "changeEmployer": 1,
    "employerAddress": "151 SOUTH WARNER ROAD",
    "employerCity": "WAYNE",
    "employerState": "PA",
    "employerPostalCode": "19087",
    "employerCountry": "UNITED STATES OF AMERICA",
    "employerPhone": "16106677700",
    "naicsCode": "541511",
    "employerPocFirstName": "JEFFREY",
    "employerPocLastName": "SCHOENER",
    "employerPocJobTitle": "IMMIGRATION SUPERVISOR",
    "employerPocEmail": "JSCHOENER@JUDGE.COM",
    "agentRepresentingEmployer": false,
    "h1bDependent": false,
    "willfulViolator": false,
    "publicDisclosure": "Disclose Business",
    "secondaryEntity": true,
    "scrapedAt": "2026-05-16T04:33:33.976Z"
}
```

</details>

<details>
<summary>🔴 Sparse: denied filing with minimal employer detail and no attorney (Fremont small employer)</summary>

```json
{
    "caseNumber": "I-200-24060-757878",
    "status": "Denied",
    "caseUrl": "https://h1bdata.info/details.php?id=I-200-24060-757878",
    "visaClass": "H-1B",
    "employerName": "FLEXON",
    "employerLegalName": "flexon",
    "jobTitle": "SOFTWARE",
    "socCode": "11-2022.00",
    "socTitle": "Sales Managers",
    "fullTimePosition": true,
    "baseSalary": 120000,
    "wageRateFrom": 60,
    "wageRateTo": 100,
    "wageRateUnit": "Hour",
    "prevailingWage": 56.07,
    "prevailingWageUnit": "Hour",
    "prevailingWageLevel": "III",
    "workCity": "FREMONT",
    "workState": "CA",
    "submitDate": "2024-02-29",
    "decisionDate": "2024-03-04",
    "employmentStartDate": "2024-05-02",
    "employmentEndDate": "2025-07-16",
    "totalWorkerPositions": 1,
    "employerAddress": "greek way",
    "employerCity": "Fremont",
    "employerState": "CA",
    "employerPostalCode": "98043",
    "naicsCode": "332992",
    "employerPocFirstName": "feloni",
    "employerPocLastName": "syurey",
    "employerPocJobTitle": "Cheif executive officer",
    "agentRepresentingEmployer": false,
    "h1bDependent": false,
    "willfulViolator": false,
    "publicDisclosure": "Disclose Business",
    "scrapedAt": "2026-05-16T04:34:17.772Z"
}
```

</details>

***

### ✨ Why choose this Actor

| | Capability |
|---|---|
| 🏛️ | **Government-sourced data.** Records originate from DOL OFLC public disclosure files, the only court-admissible H-1B wage source. |
| 💰 | **Full wage detail.** Base wage, wage range, hourly + annual, prevailing wage, and PW level on every record. |
| 🔁 | **Quarterly refresh.** Source data is updated every quarter by DOL, so your benchmarks track the live H-1B market. |
| 🎯 | **Search the way attorneys think.** Filter by sponsor name, work city, job title, or fiscal year. Combine any of them. |
| ⚖️ | **Attorney + law firm detail.** When outside counsel files the petition, the lawyer name, email, phone, and firm are all captured. |
| 🧾 | **60+ fields per case.** Worksite address, NAICS, employer POC, H-1B dependency, willful violator flag, public disclosure election. |
| 🚀 | **No registration anywhere.** No DOL account, no captcha, no manual XLSX wrangling. Hit run, get JSON or CSV. |

> 📊 Over 8 million H-1B / H-1B1 / E-3 LCA decisions are searchable across fiscal years 2012 to 2025.

***

### 📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| **⭐ H-1B LCA Disclosure Scraper** *(this Actor)* | Pay per case | 2012 to 2025, all visa classes | Quarterly with DOL releases | Employer + job + city + year | Zero config |
| Official quarterly downloads | Free | Latest 4 quarters per file | Quarterly | XLSX, no UI | Spreadsheet wrangling per file |
| Paid live APIs | Subscription | Often US-only, gated | Varies | Limited | Account + key + quotas |
| Legacy community dumps | Free | Often years out of date | Rarely | None | DIY parsing |
| Manual case status lookup | Free | One case at a time | Live | Case number required | Captcha per query |

Most teams that try the official XLSX route give up at the second quarter. The scraper turns the same data into a JSON feed you can query like any other API.

***

### 🚀 How to use

1. 🆕 **Create a free Apify account.** [Sign up here](https://console.apify.com/sign-up?fpr=vmoqkp). No credit card needed for the preview tier.
2. 🔎 **Open the H-1B LCA Disclosure Scraper page** in the Apify Console and click "Try for free".
3. ✏️ **Fill in your filters.** Pick a sponsor, a city, a job title, a year, or any combination.
4. ▶️ **Click Start.** The Actor pulls the matching cases and writes one record per LCA.
5. 📥 **Export the dataset.** Download as JSON, CSV, Excel, or HTML, or pipe it into your data warehouse via the Apify API.

> ⏱️ **Total time:** under 60 seconds from sign-up to first dataset export.

***

### 💼 Business use cases

<table>
<tr>
<td width="50%">

#### 🧑‍⚖️ Immigration attorneys

- Sponsor due diligence before taking on a new corporate client
- Wage evidence for RFE responses (other certified cases at the same SOC + level)
- Track competitor law firm filings against shared sponsors
- Build a private case-law database keyed by employer or job title

</td>
<td width="50%">

#### 🏢 Corporate HR + global mobility

- Benchmark visa sponsorship wages against your offer letters
- Detect H-1B dependent flags before extending an offer
- Audit your own filings for SOC + level consistency across worksites
- Pre-fill prevailing wage tables for new requisitions

</td>
</tr>
<tr>
<td width="50%">

#### 🧮 Salary research products

- Augment Levels.fyi / Glassdoor / Payscale with visa wage data
- Show users the H-1B sponsor pool for a given role + city
- Build comp dashboards filtered by NAICS or SOC
- Surface employer wage outliers for editorial coverage

</td>
<td width="50%">

#### 📊 Recruiting + sales intelligence

- Lead lists of H-1B-heavy sponsors by industry
- Score companies by visa filing volume for sponsorship pitches
- Track new market entrants (first H-1B filings in a metro)
- Identify staffing firms and consultancies by NAICS + filing pattern

</td>
</tr>
</table>

***

### 🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

<table>
<tr>
<td width="50%">

#### 🎓 Research and academia

- Empirical datasets for papers, thesis work, and coursework
- Longitudinal studies tracking changes across snapshots
- Reproducible research with cited, versioned data pulls
- Classroom exercises on data analysis and ethical scraping

</td>
<td width="50%">

#### 🎨 Personal and creative

- Side projects, portfolio demos, and indie app launches
- Data visualizations, dashboards, and infographics
- Content research for bloggers, YouTubers, and podcasters
- Hobbyist collections and personal trackers

</td>
</tr>
<tr>
<td width="50%">

#### 🤝 Non-profit and civic

- Transparency reporting and accountability projects
- Advocacy campaigns backed by public-interest data
- Community-run databases for local issues
- Investigative journalism on public records

</td>
<td width="50%">

#### 🧪 Experimentation

- Prototype AI and machine-learning pipelines with real data
- Validate product-market hypotheses before engineering spend
- Train small domain-specific models on niche corpora
- Test dashboard concepts with live input

</td>
</tr>
</table>

***

### 🔌 Automating H-1B LCA Disclosure Scraper

Drive the scraper from your own code via the Apify API. The Actor returns a dataset URL with the full JSON output, ready to push into a warehouse or a BI tool.

- 🟢 [Node.js client](https://docs.apify.com/api/client/js) for JavaScript and TypeScript apps
- 🐍 [Python client](https://docs.apify.com/api/client/python) for data science pipelines and ETL jobs
- 📚 [Apify API docs](https://docs.apify.com/api/v2) for raw REST access from any language

Schedule recurring runs to keep your sponsor lookup table fresh after every DOL quarterly release. The Apify scheduler can fire the Actor weekly, push the results into your storage, and notify your team via webhook when a new batch lands.

***

### ❓ Frequently Asked Questions

<details>
<summary>📅 <strong>How fresh is the data?</strong></summary>

DOL OFLC publishes new LCA disclosure data every fiscal quarter. The source the scraper reads is refreshed within days of each DOL release, so a scrape today reflects every case decided up to the end of the most recent quarter.

</details>

<details>
<summary>🌐 <strong>What visa classes are covered?</strong></summary>

H-1B, H-1B1 Singapore, H-1B1 Chile, and E-3 Australian. All four require an LCA to be filed with DOL before the employer can petition for the visa with USCIS or DOS.

</details>

<details>
<summary>💰 <strong>What wage field should I use for benchmarking?</strong></summary>

Use `baseSalary` for a single normalized number per case. Use `wageRateFrom` / `wageRateTo` / `wageRateUnit` if you need the raw range as filed. `prevailingWage` plus `prevailingWageLevel` tells you what DOL determined was the floor for that role and location.

</details>
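
If you work from the raw fields instead of `baseSalary`, hourly and monthly filings need to be annualized before they can be compared. The sketch below is one way to do that, not the Actor's own normalization: `annualize` and `wage_premium` are hypothetical helpers, only the three units named in this README are handled, and the 2,080-hour default (40 hours x 52 weeks) is an assumption. The two hourly sample records above appear consistent with a 2,000-hour factor (63.59 x 2,000 = 127,180 and 60 x 2,000 = 120,000), so pass `hours_per_year=2000` if you want to match `baseSalary`.

```python
from typing import Any, Dict, Optional


def annualize(
    amount: Optional[float],
    unit: Optional[str],
    hours_per_year: int = 2080,  # assumption: 40 hours x 52 weeks
) -> Optional[float]:
    """Convert a wage filed per Year, Month, or Hour to an annual figure."""
    if amount is None or unit is None:
        return None
    factors = {"year": 1, "month": 12, "hour": hours_per_year}
    factor = factors.get(unit.strip().lower())
    if factor is None:
        return None  # unit not covered by this sketch
    return round(amount * factor, 2)


def wage_premium(record: Dict[str, Any], hours_per_year: int = 2080) -> Optional[float]:
    """Ratio of the offered wage floor to the prevailing wage, both annualized."""
    offered = annualize(
        record.get("wageRateFrom"), record.get("wageRateUnit"), hours_per_year
    )
    prevailing = annualize(
        record.get("prevailingWage"), record.get("prevailingWageUnit"), hours_per_year
    )
    if not offered or not prevailing:
        return None
    return round(offered / prevailing, 4)
```

A ratio above 1.0 means the offered floor sits above the prevailing wage for that case; records missing either wage return `None` rather than a guessed value.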

<details>
<summary>🏢 <strong>Can I search by NAICS or SOC code?</strong></summary>

Not directly from the input filters, but `naicsCode`, `socCode`, and `socTitle` are on every record once you scrape. Pull a broad set (for example by year + city) and filter the resulting dataset by NAICS or SOC.

</details>
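
The client-side filter described above can be as small as a prefix match. This is a rough sketch that assumes the records are plain dicts shaped like the samples in this README; `filter_by_codes` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_by_codes(
    records: Iterable[Dict[str, Any]],
    soc_prefix: Optional[str] = None,
    naics_prefix: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Keep records whose SOC and NAICS codes start with the given prefixes."""
    matched = []
    for record in records:
        # Missing codes become empty strings, so they never match a prefix.
        soc = str(record.get("socCode") or "")
        naics = str(record.get("naicsCode") or "")
        if soc_prefix and not soc.startswith(soc_prefix):
            continue
        if naics_prefix and not naics.startswith(naics_prefix):
            continue
        matched.append(record)
    return matched
```

Prefix matching lets a broad code such as `15-` (the SOC group of the two developer and security samples above) or a full six-digit NAICS code go through the same function.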

<details>
<summary>⚖️ <strong>Does the dataset include attorney information?</strong></summary>

Yes, when an outside agent or attorney filed the LCA. You get the attorney name, email, phone, and the law firm business name. Solo filings show `agentRepresentingEmployer: false` and no attorney fields.

</details>
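
Because the attorney fields are absent on solo filings, any aggregation over them should skip those records instead of assuming the keys exist. A minimal sketch, assuming dict records like the samples above; `filings_by_law_firm` is a hypothetical helper, and upper-casing the firm name is only a crude normalization.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def filings_by_law_firm(records: Iterable[Dict[str, Any]]) -> Counter:
    """Count LCAs per law firm, ignoring filings without outside counsel."""
    counts: Counter = Counter()
    for record in records:
        firm = record.get("lawFirmName")
        # Solo filings carry agentRepresentingEmployer: false and no firm name.
        if record.get("agentRepresentingEmployer") and firm:
            counts[firm.strip().upper()] += 1
    return counts
```

`filings_by_law_firm(items).most_common(10)` then gives the ten most active firms in a pulled dataset.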

<details>
<summary>📜 <strong>Is this data legal to use?</strong></summary>

Yes. LCA filings are public records published by DOL under the H-1B program disclosure rules. They are used by attorneys, researchers, and journalists every day. Always cite DOL OFLC as the source of record in any downstream product.

</details>

<details>
<summary>💼 <strong>Can I use this for commercial products?</strong></summary>

Yes. The underlying data is in the public domain. Build salary products, lead lists, dashboards, or research pipelines on top of it. Standard Apify commercial usage terms apply to the Actor itself.

</details>

<details>
<summary>💳 <strong>Do I need a paid plan?</strong></summary>

The free tier returns a 10-case preview so you can see exactly what comes back. Any production query (full sponsor history, a full city + year, salary tables) needs a paid plan with the higher item cap.

</details>

<details>
<summary>🔁 <strong>What happens if a run fails partway through?</strong></summary>

The scraper writes each case to the dataset as it is parsed, so partial output is preserved. Re-run with the same filters to top up; the source URL is deterministic, so you can resume safely.

</details>
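
When a re-run overlaps with the partial output of a failed run, the two datasets can be merged on `caseNumber`. The sketch below assumes both datasets were exported as lists of dicts and that `scrapedAt` is always an ISO 8601 UTC string as in the samples, so string comparison orders it correctly; `merge_runs` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def merge_runs(
    previous: Iterable[Dict[str, Any]],
    rerun: Iterable[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Merge two result sets, keeping the most recently scraped copy of each case."""
    merged: Dict[str, Dict[str, Any]] = {}
    for record in list(previous) + list(rerun):
        case_number = record.get("caseNumber")
        if not case_number:
            continue  # cannot deduplicate a record without its key
        current = merged.get(case_number)
        if current is None or record.get("scrapedAt", "") >= current.get("scrapedAt", ""):
            merged[case_number] = record
    return list(merged.values())
```

Keeping the newest copy means a case that was first captured listing-only and later captured with details ends up with the later record.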

<details>
<summary>🕒 <strong>How long does a full sponsor history take?</strong></summary>

A few seconds for a small sponsor (under 100 LCAs). A few minutes for a top sponsor (5,000+ LCAs per year). The scraper batches detail fetches with low concurrency to stay polite to the source.

</details>

<details>
<summary>📤 <strong>What output formats are supported?</strong></summary>

JSON, CSV, Excel (XLSX), HTML, RSS, and JSONL. Pick the format from the Apify Console after the run finishes, or query the dataset URL with the `format=` parameter via the Apify API.

</details>
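
As an illustration of the `format=` parameter, the helper below builds an export URL for the Apify dataset items endpoint. It is a sketch under assumptions: `dataset_export_url` is a hypothetical name, the format list mirrors the answer above, and a private dataset would additionally need your API token, which is left out here.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Format identifiers matching the list in the answer above.
SUPPORTED_FORMATS = {"json", "jsonl", "csv", "xlsx", "html", "rss"}


def dataset_export_url(
    dataset_id: str,
    fmt: str = "json",
    fields: Optional[Iterable[str]] = None,
) -> str:
    """Build a dataset items URL for the given export format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {fmt}")
    params = {"format": fmt}
    if fields:
        # Restrict the export to selected columns, comma-separated.
        params["fields"] = ",".join(fields)
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"
```

The dataset ID is the `defaultDatasetId` of a finished run, as used in the client examples in the API section.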

<details>
<summary>🛟 <strong>Where do I report a missing field or a parsing bug?</strong></summary>

Use the contact form linked below. Include the case number, the field you expected, and what you got. Bugs reported through the form get triaged the same week.

</details>

***

### 🔌 Integrate with any app

- [**Zapier**](https://zapier.com/) - trigger flows on new LCA records
- [**Make**](https://www.make.com/) - low-code automation across SaaS apps
- [**n8n**](https://n8n.io/) - self-hosted automation with HTTP + database nodes
- [**Google Sheets**](https://docs.apify.com/platform/integrations/google-sheets) - push results straight into a tab
- [**Slack**](https://docs.apify.com/platform/integrations/slack) - notify a channel when a sponsor crosses a wage threshold
- [**Airbyte**](https://airbyte.com/) - sync the dataset into Snowflake, BigQuery, or Postgres

***

### 🔗 Recommended Actors

- [**💰 Levels.fyi Scraper**](https://apify.com/parseforge/levels-fyi-scraper) - compensation benchmarks for tech roles, ideal companion to H-1B wage data
- [**🏢 Glassdoor Scraper**](https://apify.com/parseforge/glassdoor-scraper) - employer reviews, salary ranges, and interview detail for the same sponsors
- [**💼 LinkedIn Jobs Scraper**](https://apify.com/parseforge/linkedin-jobs-scraper) - live job market context for any sponsor or city you pull from LCA data
- [**🏛️ USAJobs Scraper**](https://apify.com/parseforge/usajobs-scraper) - federal government job openings with salary ranges and series codes
- [**📋 FINRA BrokerCheck Scraper**](https://apify.com/parseforge/finra-brokercheck-scraper) - regulatory disclosure data for the financial industry, similar government-disclosure pattern

> 💡 **Pro Tip:** browse the complete [ParseForge collection](https://apify.com/parseforge) for more government-data and labor-market scrapers.

***

**🆘 Need Help?** [**Open our contact form**](https://tally.so/r/BzdKgA) and we will get back to you within one business day.

***

> ⚖️ **Disclaimer:** This scraper accesses public US Department of Labor disclosure data. Data is provided as-is for research, commercial, and personal use. ParseForge is not affiliated with the US Department of Labor or USCIS. Always verify case status and wage data against the official DOL source of record before relying on it in a legal proceeding.

# Actor input Schema

## `startUrl` (type: `string`):

Paste a search results URL from the source site to scrape exactly that view. When provided, the filter fields below are ignored. Example: `https://h1bdata.info/index.php?em=GOOGLE&job=&city=&year=2024`
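
A `startUrl` of this shape can also be generated instead of pasted. The sketch below assumes the query parameter names (`em`, `job`, `city`, `year`) seen in the example URL are the only ones the source site needs; `build_start_url` is a hypothetical helper, and the value the site expects for an all-years search is not covered.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://h1bdata.info/index.php"  # base of the example URL above


def build_start_url(
    employer: str = "",
    job_title: str = "",
    city: str = "",
    year: str = "2024",
) -> str:
    """Build a search URL with the same parameter order as the example."""
    # Parameter names are inferred from the example URL, not from site documentation.
    params = {"em": employer, "job": job_title, "city": city, "year": year}
    return f"{SEARCH_BASE}?{urlencode(params)}"
```

`build_start_url(employer="GOOGLE")` reproduces the example URL; multi-word values such as `SAN FRANCISCO` are encoded with `+` for the space.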

## `maxItems` (type: `integer`):

Free users: limited to 10 items (preview). Paid users: optional, maximum 1,000,000.

## `employer` (type: `string`):

Filter by employer/sponsor name (case insensitive, partial match allowed). Example: GOOGLE, MICROSOFT, AMAZON, META.

## `jobTitle` (type: `string`):

Filter by job title keyword (case insensitive, partial match allowed). Example: SOFTWARE ENGINEER, DATA SCIENTIST, PRODUCT MANAGER.

## `city` (type: `string`):

Filter by employment city (case insensitive, partial match allowed). Example: SAN FRANCISCO, NEW YORK, AUSTIN, SEATTLE.

## `year` (type: `string`):

Filter by LCA submission year. Pick a single fiscal year or 'All Years' for the full history. An empty employer combined with 'All Years' usually returns no rows, so narrow the query with at least one other filter.

## `includeDetails` (type: `boolean`):

Fetch the per-case detail page for every result to enrich the record with case status, decision date, SOC code, NAICS code, employer address, employer phone, employer POC, and the full workforce position breakdown. Disable for a faster listing-only scrape.

## Actor input object example

```json
{
  "maxItems": 10,
  "year": "2024",
  "includeDetails": true
}
```

# Actor output Schema

## `overview` (type: `string`):

Table view with the most useful LCA fields: case number, status, employer, job title, salary, work location, dates.

## `fullData` (type: `string`):

Complete dataset with all 60+ extracted fields including prevailing wage detail, employer POC contact info, and attorney/law firm filings.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("parseforge/h1b-lca-disclosure-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "maxItems": 10 }

# Run the Actor and wait for it to finish
run = client.actor("parseforge/h1b-lca-disclosure-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "maxItems": 10
}' |
apify call parseforge/h1b-lca-disclosure-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=parseforge/h1b-lca-disclosure-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "H-1B LCA Visa Wage & Employer Data Scraper",
        "description": "Scrape US DOL H-1B Labor Condition Application records: employer, job title, base salary, prevailing wage, work location, case status, SOC/NAICS codes, and decision dates.",
        "version": "0.0",
        "x-build-id": "9ztvNmRC95mcDkr4u"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/parseforge~h1b-lca-disclosure-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-parseforge-h1b-lca-disclosure-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/parseforge~h1b-lca-disclosure-scraper/runs": {
            "post": {
                "operationId": "runs-sync-parseforge-h1b-lca-disclosure-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/parseforge~h1b-lca-disclosure-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-parseforge-h1b-lca-disclosure-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrl": {
                        "title": "Search URL (optional)",
                        "type": "string",
                        "description": "Paste a search results URL from the source site to scrape exactly that view. When provided, the filter fields below are ignored. Example: https://h1bdata.info/index.php?em=GOOGLE&job=&city=&year=2024"
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Free users: Limited to 10 items (preview). Paid users: Optional, max 1,000,000"
                    },
                    "employer": {
                        "title": "Employer",
                        "type": "string",
                        "description": "Filter by employer/sponsor name (case insensitive, partial match allowed). Example: GOOGLE, MICROSOFT, AMAZON, META."
                    },
                    "jobTitle": {
                        "title": "Job Title",
                        "type": "string",
                        "description": "Filter by job title keyword (case insensitive, partial match allowed). Example: SOFTWARE ENGINEER, DATA SCIENTIST, PRODUCT MANAGER."
                    },
                    "city": {
                        "title": "Work City",
                        "type": "string",
                        "description": "Filter by employment city (case insensitive, partial match allowed). Example: SAN FRANCISCO, NEW YORK, AUSTIN, SEATTLE."
                    },
                    "year": {
                        "title": "Year",
                        "enum": [
                            "All Years",
                            "2025",
                            "2024",
                            "2023",
                            "2022",
                            "2021",
                            "2020",
                            "2019",
                            "2018",
                            "2017",
                            "2016",
                            "2015",
                            "2014",
                            "2013",
                            "2012"
                        ],
                        "type": "string",
                        "description": "Filter by LCA submission year. Pick a single fiscal year or 'All Years' for the full history. An empty employer combined with 'All Years' usually returns no rows; narrow the search with at least one other filter.",
                        "default": "2024"
                    },
                    "includeDetails": {
                        "title": "Include full case details",
                        "type": "boolean",
                        "description": "Fetch the per-case detail page for every result to enrich the record with case status, decision date, SOC code, NAICS code, employer address, employer phone, employer POC, and the full workforce position breakdown. Disable for a faster listing-only scrape.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
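
The specification above exposes `run-sync-get-dataset-items` as a POST endpoint that takes the Actor input as a JSON body and the API token as a `token` query parameter. The following is a minimal sketch of calling it with Python's standard library only. `build_request` and `fetch_items` are hypothetical helper names, the `timeout` value is an arbitrary client-side choice, and the sketch assumes the response body is a JSON array of dataset items, which the spec itself does not describe.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"  # taken from the spec's "servers" entry
ACTOR_PATH = "/acts/parseforge~h1b-lca-disclosure-scraper/run-sync-get-dataset-items"


def build_request(token, run_input):
    """Build (but do not send) the POST request described by the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_items(token, run_input, timeout=300):
    """Send the request and decode the body (assumed here to be a JSON array)."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request without touching the network.
request = build_request("<YOUR_API_TOKEN>", {"maxItems": 10, "year": "2024"})
print(request.get_method(), request.full_url)

# To perform the call, supply a real token and uncomment:
# for item in fetch_items("<YOUR_API_TOKEN>", {"maxItems": 10, "year": "2024"}):
#     print(item)
```

Separating request construction from sending keeps the URL and payload easy to inspect or log before any run is started.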
