NMLS Scraper — Mortgage Loan Originators, Lenders & Branches

Pricing

$1.00 / 1,000 results

Pull MLO, lender, and branch records from NMLS Consumer Access. Auditable open-source actor with no survey-form gate. Returns one normalized row per entity with state licenses, disclosures, sponsor links, and computed risk-rollup flags.

Developer

Tony

Maintained by Community

The open-source, audit-ready NMLS Consumer Access scraper: no mandatory intake survey, no hidden source. Pull mortgage loan originator (MLO), lender, and branch records by NMLS ID or by name, including state licenses, federal registrations, regulator disclosures, sponsor relationships, contact info, and computed risk-rollup flags. Agent-ready, with one normalized row per entity.

Use it to verify mortgage license status, build licensee lists for recruiting and sales, run compliance checks against the NMLS database, and enrich your mortgage CRM with structured Nationwide Multistate Licensing System data.

Why this exists

NMLS Consumer Access is CAPTCHA-protected behind Cloudflare, so there's no turnkey way to get its data into a warehouse or CRM. This actor wraps it as a clean, structured feed — and does it transparently. The source is open and auditable, there's no mandatory intake survey standing between you and a run, and every entity comes back as one normalized row with 25+ fields plus computed risk-rollup flags. If you need a mortgage-license data source your compliance team can actually inspect, that's the point.

Who this is for

  • Mortgage recruiters building outreach lists of MLOs by state, sponsoring company, or license status
  • Wholesale lender BD teams mapping broker territories and identifying licensed counterparties in target states
  • Compliance and risk onboarding teams screening MLO and lender counterparties for active regulatory actions, license status, and disclosure history before approval
  • Mortgage CRMs and fintech integrators enriching contact records with verified NMLS IDs, multi-state license coverage, sponsor relationships, and regulator disclosures
  • Market researchers sizing the mortgage origination market by state, license type, sponsor, or branch footprint

What you get

One structured JSON record per NMLS entity (MLO, mortgage company, or branch), written to the default Apify Dataset. Re-runs deduplicate cleanly on the id field (which equals nmls_id), so you can upsert into your warehouse without extra work.
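
As a rough illustration of the upsert pattern described above, here is a minimal in-memory sketch. The `Store` type and `upsert` helper are hypothetical names introduced for this example and are not part of the actor; a real pipeline would use its warehouse's own upsert or merge statement. The sketch assumes `scraped_at` strings all share the fixed ISO-8601 UTC format shown in the example output, so they can be compared as plain strings.

```typescript
// Minimal record shape: only the fields this sketch relies on.
interface NmlsRecord {
  id: string;          // equals nmls_id, per the field notes
  scraped_at: string;  // ISO-8601 UTC timestamp
  [field: string]: unknown;
}

// Hypothetical in-memory store keyed by id; stands in for a warehouse table.
type Store = Record<string, NmlsRecord>;

function upsert(store: Store, incoming: NmlsRecord[]): Store {
  for (let i = 0; i < incoming.length; i++) {
    const rec = incoming[i];
    const existing = store[rec.id];
    // Keep whichever copy was scraped most recently (string compare works
    // only because every timestamp uses the same fixed-width UTC format).
    if (!existing || existing.scraped_at <= rec.scraped_at) {
      store[rec.id] = rec;
    }
  }
  return store;
}

const store: Store = {};
upsert(store, [{ id: "3030", scraped_at: "2026-05-03T03:44:31.340Z" }]);
upsert(store, [{ id: "3030", scraped_at: "2026-05-10T03:44:31.340Z" }]);
console.log(Object.keys(store).length); // prints 1
```

Keying on `id` means a re-run replaces the earlier row instead of appending a duplicate.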

Example input

{
  "mode": "by_nmls_id",
  "nmlsIds": ["3030"],
  "includeDisclosures": true,
  "maxItems": 100,
  "proxyConfiguration": { "useApifyProxy": true }
}

Example output (one row)

{
  "id": "3030",
  "nmls_id": "3030",
  "entity_type": "COMPANY",
  "legal_name": "Rocket Mortgage, LLC",
  "other_trade_names": [
    "Champion Mortgage", "Champion Mortgage Company", "Rocket",
    "Rocket HQ", "Rocket Mortgage", "Rocket Pro", "Rocket Professional",
    "Rushmore Servicing", "Rushmore Servicing Group"
  ],
  "primary_address": {
    "street": "1050 Woodward Avenue",
    "city": "Detroit",
    "state": "MI",
    "zip": "48226",
    "country": "US"
  },
  "mailing_address": {
    "street": "1050 Woodward Avenue",
    "city": "Detroit",
    "state": "MI",
    "zip": "48226",
    "country": "US"
  },
  "state_licenses": [
    {
      "state": "Alabama",
      "regulator": "Alabama",
      "license_name": "Consumer Credit License",
      "license_number": "20979",
      "status": "Approved",
      "issue_date": "2009-11-10",
      "original_issue_date": "2009-11-10"
    }
  ],
  "federal_registrations": [],
  "sponsorships": null,
  "sponsored_individuals_count": 3705,
  "branch_count": 83,
  "employment_history": null,
  "disclosures": [
    {
      "date": "2021-09-22",
      "regulator": "Alabama",
      "action_type": "Order - Settlement Agreement and Order",
      "description": "Docket MC-2018-04; Multi-state ID M104096",
      "length_in_days": null,
      "settled": true
    }
  ],
  "has_active_regulatory_action": true,
  "total_state_licenses_active": 165,
  "contact_email": "CompanyLicensing@rocketmortgage.com",
  "source_url": "https://www.nmlsconsumeraccess.org/EntityDetails.aspx/COMPANY/3030",
  "scraped_at": "2026-05-03T03:44:31.340Z"
}

Field notes

  • id is always equal to nmls_id. Use it as your upsert key.
  • entity_type is one of INDIVIDUAL, COMPANY, or BRANCH. The actor auto-detects when not pinned.
  • state_licenses is one row per state-regulator pairing. An entity licensed in 51 jurisdictions returns 51 rows here.
  • sponsorships is populated for INDIVIDUAL records (current sponsoring company); null otherwise.
  • sponsored_individuals_count and branch_count are populated for COMPANY records; null otherwise.
  • has_active_regulatory_action and total_state_licenses_active are computed rollups — visible at-a-glance in the dataset preview without an ETL step on your side.
  • contact_email is decoded from Cloudflare's data-cfemail obfuscation.
  • All timestamps are ISO-8601 UTC.
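
For readers curious about the contact_email note, the sketch below shows the commonly documented decoding for Cloudflare's `data-cfemail` attribute: the first hex byte is an XOR key and each following byte is one obfuscated character. This is an illustration, not the actor's actual code; `decodeCfEmail` is a name chosen for this example, and the sketch handles ASCII addresses only.

```typescript
// Decode the hex value of a Cloudflare `data-cfemail` attribute.
function decodeCfEmail(encoded: string): string {
  // The first byte is the XOR key for the rest of the string.
  const key = parseInt(encoded.slice(0, 2), 16);
  let email = "";
  for (let i = 2; i < encoded.length; i += 2) {
    const byte = parseInt(encoded.slice(i, i + 2), 16);
    // XOR each obfuscated byte with the key to recover the character.
    email += String.fromCharCode(byte ^ key);
  }
  return email;
}

// Hand-built sample: key 0x2a followed by the XOR-ed bytes of "a@b.c".
console.log(decodeCfEmail("2a4b6a480449")); // prints "a@b.c"
```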

Inputs

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| mode | enum | by_nmls_id | by_nmls_id or by_name |
| nmlsIds | string[] | ["3030"] | Used when mode=by_nmls_id |
| names | string[] | ["Rocket Mortgage"] | Used when mode=by_name |
| entityType | enum | "" | "" (all), INDIVIDUAL, COMPANY, BRANCH |
| includeDisclosures | boolean | true | Parse regulator action history |
| maxItems | int | 100 | Hard cap, max 10,000 |
| proxyConfiguration | object | Apify default | Switch to RESIDENTIAL if you see Cloudflare blocks |
| maxConcurrency | int | 4 | Polite default |
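
If you build inputs programmatically, a small client-side pre-check can catch mistakes before a run is started and billed. The following is a sketch under assumptions: the `ActorInput` type mirrors the input fields listed above (proxyConfiguration omitted for brevity), while `validateInput` and its rules are hypothetical and may be stricter or looser than what the actor itself enforces.

```typescript
type Mode = "by_nmls_id" | "by_name";
type EntityType = "" | "INDIVIDUAL" | "COMPANY" | "BRANCH";

// Mirrors the documented input fields; not the actor's own type definition.
interface ActorInput {
  mode: Mode;
  nmlsIds?: string[];
  names?: string[];
  entityType?: EntityType;
  includeDisclosures?: boolean;
  maxItems?: number;
  maxConcurrency?: number;
}

// Returns a list of human-readable problems; an empty list means the input looks sane.
function validateInput(input: ActorInput): string[] {
  const problems: string[] = [];
  if (input.mode === "by_nmls_id" && !(input.nmlsIds && input.nmlsIds.length > 0)) {
    problems.push("mode=by_nmls_id needs at least one entry in nmlsIds");
  }
  if (input.mode === "by_name" && !(input.names && input.names.length > 0)) {
    problems.push("mode=by_name needs at least one entry in names");
  }
  // maxItems defaults to 100 and is capped at 10,000 per the inputs listed above.
  const maxItems = input.maxItems === undefined ? 100 : input.maxItems;
  if (Math.floor(maxItems) !== maxItems || maxItems < 1 || maxItems > 10000) {
    problems.push("maxItems must be an integer from 1 to 10,000");
  }
  return problems;
}

console.log(validateInput({ mode: "by_nmls_id", nmlsIds: ["3030"], maxItems: 100 }).length); // prints 0
console.log(validateInput({ mode: "by_name", maxItems: 50000 }).length); // prints 2
```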

CAPTCHA handling

NMLS Consumer Access uses BotDetect CAPTCHA at the session-warmup step. This is handled for you — you don't need a CAPTCHA key or any extra setup. The actor solves it once per run on its own solver, then reuses the session for every subsequent record, so the cost amortizes to fractions of a cent per record and is already built into the per-run pricing below.

The solver is provider-agnostic (src/solver/index.ts). If you fork the source, you can point the CAPSOLVER_API_KEY env var at your own CapSolver account, or make a one-line solver swap to use 2Captcha, Anti-Captcha, or a self-hosted OCR instead.
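
To make the "solve once, reuse the session" idea concrete, here is a minimal sketch of a swappable solver behind a cached warmup step. The `CaptchaSolver` interface and `SessionWarmup` class are hypothetical names for illustration; the real contract lives in src/solver/index.ts and may look quite different.

```typescript
// Hypothetical solver contract: any provider that turns a CAPTCHA image into text.
interface CaptchaSolver {
  solve(imageBase64: string): Promise<string>;
}

// Caches the first solve so later callers in the same run reuse it.
class SessionWarmup {
  private solver: CaptchaSolver;
  private fetchCaptchaImage: () => string;
  private pending: Promise<string> | null = null;

  constructor(solver: CaptchaSolver, fetchCaptchaImage: () => string) {
    this.solver = solver;
    this.fetchCaptchaImage = fetchCaptchaImage;
  }

  // Every caller shares one in-flight (or already completed) solve.
  getSolvedText(): Promise<string> {
    if (this.pending === null) {
      this.pending = this.solver.solve(this.fetchCaptchaImage());
    }
    return this.pending;
  }
}

// Stand-in solver for the demo: counts how often it is asked to solve.
let solveCalls = 0;
const fakeSolver: CaptchaSolver = {
  solve(_imageBase64: string): Promise<string> {
    solveCalls += 1;
    return Promise.resolve("ABC123");
  },
};

const warmup = new SessionWarmup(fakeSolver, () => "base64-image-placeholder");
warmup.getSolvedText();
warmup.getSolvedText();
console.log(solveCalls); // prints 1
```

Swapping providers in this design means passing a different `CaptchaSolver` object to the constructor; nothing else changes.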

Pricing

Pay-per-event:

  • Actor Start: ~$0.08 per run, tiered by your Apify plan ($0.082 on Free, as low as $0.07 on higher plans). The event is billed once per GB of run memory; the actor runs at 1 GB by default, so you pay a single start charge. It covers session warmup: the BotDetect CAPTCHA solve and the residential-proxy handshake against Cloudflare.
  • result: $0.001 per entity record saved to the dataset (this is the primary event).

A few worked examples (at the ~$0.08 start tier) so you can budget:

| Records returned | You pay |
| --- | --- |
| 1 (single MLO lookup) | ~$0.081 |
| 100 (recruit / sales list) | ~$0.18 |
| 1,000 (territory map) | ~$1.08 |
| 10,000 (quarterly compliance sweep) | ~$10.08 |

Per-record charges only fire when records actually save to the dataset — so a search that legitimately matches nothing only incurs the one-time start fee for the warmup work, not per-record fees.
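
The worked examples follow a simple formula: one start fee plus $0.001 per saved record. A small estimator, sketched here with the ~$0.08 start tier as the default, reproduces the figures above. `estimateRunCostUsd` is a hypothetical helper, and your actual start fee depends on your Apify plan and run memory.

```typescript
// Estimate the cost of one run in USD, rounded to a tenth of a cent.
function estimateRunCostUsd(
  records: number,
  startFeeUsd: number = 0.08,   // assumed ~$0.08 start tier at 1 GB
  perRecordUsd: number = 0.001  // result event price
): number {
  const total = startFeeUsd + records * perRecordUsd;
  // Round away floating-point noise.
  return Math.round(total * 1000) / 1000;
}

console.log(estimateRunCostUsd(1));     // prints 0.081
console.log(estimateRunCostUsd(100));   // prints 0.18
console.log(estimateRunCostUsd(1000));  // prints 1.08
console.log(estimateRunCostUsd(10000)); // prints 10.08
console.log(estimateRunCostUsd(0));     // prints 0.08 (no matches: start fee only)
```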

v1.1 roadmap (additive — no breaking changes): a Compliance event ($0.005/record) for parsed disclosures + cross-state risk score; a Bulk discount tier above 10K records in a single run; a Standby event ($0.01/call) for synchronous single-record lookups via the Apify Standby endpoint.

Limitations

  • v1 supports by_nmls_id and by_name. by_state, by_company_employees, and by_license_status ship in v1.1.
  • All three entity types are supported (COMPANY, INDIVIDUAL, BRANCH). For INDIVIDUAL records, sponsorships and federal_registrations are populated when present; employment_history is reconstructed from per-license sponsorship history and aggregated to one entry per unique employer.
  • Disclosures are parsed from the public regulator-actions table. Length-in-days and full action descriptions live behind a "View Details" expansion not currently scraped (v1.1).
  • The portal updates nightly on business days — fresher data than that requires hitting NMLS directly.

FAQ

Is there a public NMLS API? No — NMLS does not publish a free public API for company, branch, or MLO records. NMLS Consumer Access (the public web portal at nmlsconsumeraccess.org) is the only public source, and it's CAPTCHA-protected behind a Cloudflare-fronted Turing Test page. This actor wraps the portal as a stable, structured pay-per-record data feed.

What data does NMLS Consumer Access include? Mortgage Loan Originators (~500K active), mortgage companies (~40K), and branches (~125K). Per entity: identity, current and historical legal names, state licenses (regulator, license number, status, dates), federal registrations for depository-institution MLOs, regulator disclosures, sponsor relationships, employment history reconstructed from per-license sponsorships, branch counts, sponsored-MLO counts, and contact info.

How do I verify a mortgage loan originator's license? Pass the MLO's NMLS ID with mode=by_nmls_id. The returned record includes every state license they hold (state, license number, status, issue date), any federal registration, current sponsoring company, and the count and details of any regulator disclosures.
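
Once a record is in hand, a per-state check might look like the sketch below. It reads only fields shown in the example output; `checkLicense` is a hypothetical helper, and treating "Approved" as the only active status is an assumption taken from the example row, so other status strings the portal returns may need their own handling.

```typescript
interface StateLicense {
  state: string;
  license_name: string;
  license_number: string;
  status: string;
}

// Subset of the output record used by this sketch.
interface EntityRecord {
  nmls_id: string;
  state_licenses: StateLicense[];
  has_active_regulatory_action: boolean;
}

interface LicenseCheck {
  nmls_id: string;
  state: string;
  licensed: boolean;
  licenses: string[];
  flagged: boolean;
}

// Assumption: status "Approved" means the license is active.
function checkLicense(record: EntityRecord, state: string): LicenseCheck {
  const approved = record.state_licenses.filter(
    (lic) => lic.state === state && lic.status === "Approved"
  );
  return {
    nmls_id: record.nmls_id,
    state: state,
    licensed: approved.length > 0,
    licenses: approved.map((lic) => lic.license_name),
    flagged: record.has_active_regulatory_action,
  };
}

// Trimmed copy of the example output row.
const sample: EntityRecord = {
  nmls_id: "3030",
  state_licenses: [
    { state: "Alabama", license_name: "Consumer Credit License", license_number: "20979", status: "Approved" },
  ],
  has_active_regulatory_action: true,
};

console.log(checkLicense(sample, "Alabama").licensed); // prints true
console.log(checkLicense(sample, "Texas").licensed);   // prints false
```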

Can I bulk-verify mortgage lender licenses? Yes — pass an array of NMLS IDs in nmlsIds (up to 10,000 per run) and the actor returns one normalized record per ID. Compliance teams use this for quarterly renewal sweeps and pre-onboarding screens.

How do I find an MLO if I don't know their NMLS ID? Use mode=by_name. The actor calls the same JSON search API the NMLS frontend uses and auto-iterates COMPANY and INDIVIDUAL entity types if you don't pin one. For very common names, narrow with entityType to avoid the portal's "too many results" cap.

Why is the source open? Compliance and risk teams audit their data sources. An auditable scraper sells better — and lets you fork it if you ever need a custom field added.

Can I run on a schedule? Yes — Apify Schedules. Recommended cadence: weekly for license-renewal monitoring, monthly for full portfolio refreshes, on-demand for fraud / pre-onboarding lookups.

How does this handle the Cloudflare block and BotDetect CAPTCHA? One CAPTCHA solve at session warmup, then session cookies carry the rest of the run. Solving is handled for you and already priced into the per-run cost, with no key or setup on your side. (Under the hood the default solver is CapSolver; if you fork the source, 2Captcha and Anti-Captcha are one-line drop-in alternatives.)

MCP / agentic support? Planned for v1.2. The actor's input shape is already designed to be MCP-packaged with no breaking changes — AI agents will be able to call it as a tool for live MLO verification.

Running locally

npm install
npm run typecheck
npm run start:dev

Set CAPSOLVER_API_KEY in your environment for the dev loop (e.g. CAPSOLVER_API_KEY=… npm run start:dev). Run apify push to deploy. On the published actor the key lives in Source → Environment variables as a secret, so production runs solve on the owner's account.

License

MIT.