Bulk Address Parser & Normalizer (US / CA)

Pricing

Pay per event

Free-form addresses in, parsed {street, city, state, zip, country} out. 16 patterns cover US, Canada, Cayman. PO Box / unit / suffix detection. Optional OpenStreetMap geocode adds lat/lon. Optional phone normaliser. Built for sales-ops, CRM cleaning, lead enrichment, dataset normalisation.

Developer: BowTiedRaccoon (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 16 days ago

Bulk Address Parser, Normalizer & Validation API (US / CA / Cayman)

Parse free-form address strings into structured {street, city, state, zip, country} records — an address parser and normalizer API for bulk address standardization and validation. Sixteen parse patterns cover US, Canadian, and Cayman addresses, with optional Nominatim geocode and embedded phone normalisation.


Address Parser Features

  • Sixteen parse patterns — US standard, no-comma, multi-location, state-name, state-code, PO Box, unit prefix / suffix, suite, directional, Canadian standard, Canadian postal, Cayman, flex-zip, and a regex fallback.
  • State helpers — full name to two-letter code to URL slug, both directions.
  • PO Box, unit, suite, and directional detection out of the box.
  • Optional OpenStreetMap Nominatim geocode adds {lat, lon, displayName}. Self-host the endpoint when you need more than 1 req/sec.
  • Optional phone normaliser detects an embedded phone number and emits it in canonical form.
  • Pure CPU on the parse path. Geocoded rows trigger a separate premium event so you only pay for what you geocode.
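
The state helpers are described only as name, code, and slug conversions in both directions. A minimal sketch of that idea follows; the function names, the three-entry lookup table, and the hyphenated slug format are all assumptions for illustration, not the actor's actual code.

```python
from typing import Optional

# Hypothetical lookup table: a real helper would cover every state and province.
STATE_NAME_TO_CODE = {
    "Illinois": "IL",
    "New York": "NY",
    "Ontario": "ON",
}
STATE_CODE_TO_NAME = {code: name for name, code in STATE_NAME_TO_CODE.items()}


def name_to_code(name: str) -> Optional[str]:
    """Full state or province name -> two-letter code (case-insensitive)."""
    wanted = name.strip().lower()
    for full_name, code in STATE_NAME_TO_CODE.items():
        if full_name.lower() == wanted:
            return code
    return None


def code_to_name(code: str) -> Optional[str]:
    """Two-letter code -> full name."""
    return STATE_CODE_TO_NAME.get(code.strip().upper())


def name_to_slug(name: str) -> str:
    """Full name -> URL slug, e.g. 'New York' -> 'new-york' (assumed format)."""
    return "-".join(name.strip().lower().split())


def slug_to_name(slug: str) -> Optional[str]:
    """URL slug -> full name, by comparing against the slug of each known name."""
    for full_name in STATE_NAME_TO_CODE:
        if name_to_slug(full_name) == slug:
            return full_name
    return None
```

Keeping a single name-to-code table and deriving the reverse mappings from it avoids the two directions drifting apart.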

Who Uses Address Parser & Validation Data?

  • CRM / sales-ops teams — normalise free-form addresses before deduping. Catches the records that look identical until you read them carefully.
  • Lead enrichment pipelines — promote text into typed fields: state, stateName, lat/lon, normalised phone.
  • Dataset prep engineers — turn scraped seller, agent, or vendor blobs into clean structured rows for warehouse loads.
  • Form validation backends — run user-typed addresses through real-world parser logic instead of a regex you'll regret.
  • Real estate and logistics — normalise property addresses across MLS exports, county records, and broker CSVs.

How Address Parser Works

  1. Pass in a list of free-form address strings. Country defaults to US; pass defaultCountry for CA or KY.
  2. Each string runs through the AddressManager pattern ladder. The first pattern that matches wins; the matched label lands in patternMatched.
  3. If includePhone is on, the actor also scans the raw blob for a phone-shaped substring and normalises it.
  4. If geocode is on and the parse succeeded, the actor hits Nominatim (1 req/sec by default) and adds lat, lon, and displayName.
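
The "first pattern that matches wins" step can be sketched as an ordered list of labelled regexes. This is an illustration only: the two regexes below are invented for the example, the real ladder is described as having sixteen rungs, and returning patternMatched as None for unmatched input is an assumption.

```python
import re
from typing import Any, Dict

# Hypothetical two-rung ladder. The labels echo ones named in this README;
# the regexes themselves are illustrative and far simpler than a real parser.
LADDER = [
    ("us-standard", re.compile(
        r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
        r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$")),
    ("canadian-postal", re.compile(
        r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
        r"(?P<state>[A-Z]{2})\s+(?P<zip>[A-Z]\d[A-Z]\s?\d[A-Z]\d)$")),
]


def parse_address(raw: str, default_country: str = "US") -> Dict[str, Any]:
    """Try each pattern in order; the first match wins and its label is recorded."""
    text = raw.strip()
    for label, pattern in LADDER:
        match = pattern.match(text)
        if match:
            parsed = match.groupdict()
            # Assumption: Canadian patterns imply CA, everything else uses the default.
            parsed["country"] = "CA" if label.startswith("canadian") else default_country
            return {"raw": raw, "parsed": parsed, "valid": True, "patternMatched": label}
    # Nothing matched: mirror the documented shape of an unparseable row.
    return {"raw": raw, "parsed": None, "valid": False, "patternMatched": None}
```

Because order decides the winner, stricter patterns belong above looser ones, with any catch-all fallback last.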

Input

{
  "addresses": [
    "123 Main St, Springfield, IL 62701",
    "500 University Ave, Toronto, ON M5G 1V7",
    "PO Box 1234, George Town, KY1-1107"
  ],
  "defaultCountry": "US",
  "geocode": false,
  "returnUnparseable": true,
  "includePhone": false,
  "maxItems": 15
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| addresses | array | required | Free-form address strings to parse and normalise. |
| defaultCountry | enum | US | Country fallback when the parser cannot detect from input. US, CA, or KY. |
| geocode | boolean | false | Enable Nominatim lookup. Adds 1 req/sec rate limit. Premium event when a hit lands. |
| returnUnparseable | boolean | true | Include rows that failed to parse. When false, only valid=true rows are emitted. |
| includePhone | boolean | false | Detect and normalise embedded phone numbers, emit phoneNormalized. |
| nominatimEndpoint | string | OSM default | BYO Nominatim host. Required when geocode=true and you need more than 1 req/sec. |
| maxItems | integer | 15 | Hard cap on addresses processed per run. |
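
As a rough sketch of calling the actor programmatically, the helper below fills in the defaults from the field table and builds a request for Apify's synchronous run endpoint (run-sync-get-dataset-items). The helper name and the actor ID used in the usage note are placeholders, not values taken from this listing. The request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, List

APIFY_BASE = "https://api.apify.com/v2"

# Defaults copied from the input table above.
INPUT_DEFAULTS = {
    "defaultCountry": "US",
    "geocode": False,
    "returnUnparseable": True,
    "includePhone": False,
    "maxItems": 15,
}
OPTIONAL_FIELDS = set(INPUT_DEFAULTS) | {"nominatimEndpoint"}


def build_run_request(actor_id: str, token: str, addresses: List[str],
                      **options: Any) -> urllib.request.Request:
    """Validate the input and return an unsent POST request (hypothetical helper)."""
    if not addresses:
        raise ValueError("addresses is required and must not be empty")
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {"addresses": list(addresses), **INPUT_DEFAULTS, **options}
    if run_input["defaultCountry"] not in {"US", "CA", "KY"}:
        raise ValueError("defaultCountry must be US, CA, or KY")
    return urllib.request.Request(
        url=f"{APIFY_BASE}/acts/{actor_id}/run-sync-get-dataset-items",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would start a run and return the dataset rows as JSON. Replace the placeholder actor ID (for example "someuser~address-parser") with the ID shown on the actor's API tab.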

Geocode + phone example

{
  "addresses": ["Acme Corp, 123 Main St, Suite 100, Springfield, IL 62701, (415) 555-1234"],
  "defaultCountry": "US",
  "geocode": true,
  "includePhone": true,
  "maxItems": 10
}
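
To make the includePhone behaviour concrete, here is a sketch of pulling a phone-shaped substring out of a blob like the one above. The regex, the function name, and the choice of E.164 (+1XXXXXXXXXX) as the "canonical form" are assumptions; the listing does not say which format the actor emits.

```python
import re
from typing import Optional

# Assumed North American shape: optional +1 / 1 prefix, 3-digit area code,
# 3-digit exchange, 4-digit line number, with optional space, dot, or dash separators.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def normalize_phone(blob: str) -> Optional[str]:
    """Return the first phone-shaped substring as +1XXXXXXXXXX, or None if absent."""
    match = PHONE_RE.search(blob)
    if match is None:
        return None  # Non-matching input is skipped silently, as the limits describe.
    area, exchange, line = match.groups()
    return f"+1{area}{exchange}{line}"
```

The digit lookarounds on both ends keep the pattern from starting or stopping in the middle of a longer number, so a five-digit ZIP on its own is not mistaken for part of a phone number.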

Address Parser Output Fields

{
  "raw": "123 Main St, Suite 100, Springfield, IL 62701",
  "parsed": {
    "street": "123 Main St Suite 100",
    "city": "Springfield",
    "state": "IL",
    "stateName": "Illinois",
    "zip": "62701",
    "country": "US"
  },
  "valid": true,
  "patternMatched": "us-multi-location",
  "geo": "{\"lat\":39.7817,\"lon\":-89.6501,\"displayName\":\"Springfield, ...\"}",
  "phoneNormalized": null,
  "country": "US",
  "normalizedAt": "2026-04-30T12:00:00Z",
  "status": "success",
  "errorMsg": null
}

| Field | Type | Description |
| --- | --- | --- |
| raw | string | The original input address string. |
| parsed | object | {street, city, state, stateName, zip, country}. Null when valid=false. |
| valid | boolean | True when the minimum required fields (city, state, zip) were parsed. |
| patternMatched | string | Which AddressManager pattern fired (e.g. us-standard, canadian-postal, fallback). |
| geo | string | JSON string {lat, lon, displayName} when geocode=true; null otherwise. |
| phoneNormalized | string | Normalised phone (when includePhone=true and a number is present). |
| country | string | ISO2 country code (US, CA, KY). |
| normalizedAt | string | ISO timestamp when the row was processed. |
| status | string | success, unparseable, or error. |
| errorMsg | string | Error message when status=error; null on success. |
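
Because geo arrives as a JSON string rather than an object, and parsed is null on failed rows, downstream code needs a small decoding step. The sketch below flattens one output row for a warehouse load; the function name and the flat column layout are illustrative choices, not part of the actor.

```python
import json
from typing import Any, Dict


def flatten_row(row: Dict[str, Any]) -> Dict[str, Any]:
    """Turn one output row into a flat record, tolerating null parsed/geo fields."""
    parsed = row.get("parsed") or {}                       # null when valid=false
    geo = json.loads(row["geo"]) if row.get("geo") else {}  # geo is a JSON string
    return {
        "raw": row["raw"],
        "street": parsed.get("street"),
        "city": parsed.get("city"),
        "state": parsed.get("state"),
        "zip": parsed.get("zip"),
        "country": row.get("country"),
        "lat": geo.get("lat"),
        "lon": geo.get("lon"),
        "phone": row.get("phoneNormalized"),
        "ok": row.get("status") == "success" and bool(row.get("valid")),
    }
```

Rows with ok set to false can be routed to a review queue together with their patternMatched and errorMsg values.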

Pricing

Two events. Pure-CPU parses are cheap. Geocoded rows trigger a separate premium event because Nominatim adds an HTTP round-trip per record.

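| Event | Price |
| --- | --- |
| Actor start | $0.10 |
| Per parsed address | $0.0005 |
| Per geocoded address | $0.001 |

| Volume | No geocode | Geocoded |
| --- | --- | --- |
| 100 addresses | $0.15 | $0.20 |
| 1,000 addresses | $0.60 | $1.10 |
| 10,000 addresses | $5.10 | $10.10 |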
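
The volume table can be reproduced with a few lines of arithmetic. The sketch assumes what the table's figures imply: a geocoded row is billed at the geocoded rate in place of the parse rate, and every row in a geocoded run geocodes successfully. The function name is hypothetical.

```python
from decimal import Decimal

# Prices from the event table above.
ACTOR_START = Decimal("0.10")
PER_PARSED = Decimal("0.0005")
PER_GEOCODED = Decimal("0.001")


def estimate_cost(n_addresses: int, geocode: bool = False) -> Decimal:
    """Estimated USD cost of one run (assumes all rows geocode when geocode=True)."""
    per_row = PER_GEOCODED if geocode else PER_PARSED
    return ACTOR_START + per_row * n_addresses


for volume in (100, 1_000, 10_000):
    print(volume, estimate_cost(volume), estimate_cost(volume, geocode=True))
```

Decimal is used instead of float so that fractions of a cent add up exactly.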

Limits

  • maxItems caps the number of addresses processed per run. Override the schema default of 15 for production batches.
  • The Apify console tester has a 5-minute timeout. Pure-CPU parses finish well inside it, but geocode mode at 1 req/sec fits at most about 300 addresses in that window.
  • Nominatim's public endpoint enforces 1 req/sec, which is at most 3,600 requests per hour. Geocode mode therefore caps at roughly 3,500 addresses per 1-hour run on the default endpoint. Self-host or bring your own host via nominatimEndpoint for higher throughput.
  • Country detection covers US, Canada, and Cayman. Other ISO regions fall through to the regex fallback and may emit valid=false.
  • Phone normaliser is best-effort — it expects North American formats. Numbers that don't match the regex are silently skipped.
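
For planning production batches around these limits, a sketch like the following splits a large list into runs of at most maxItems addresses and gives a lower bound on geocode time at a given request rate. Both helper names are hypothetical, and the time estimate ignores network latency and actor start-up.

```python
import math
from typing import List


def split_batches(addresses: List[str], max_items: int) -> List[List[str]]:
    """Split addresses into consecutive chunks of at most max_items each."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return [addresses[i:i + max_items] for i in range(0, len(addresses), max_items)]


def geocode_seconds(n_addresses: int, requests_per_second: float = 1.0) -> int:
    """Lower bound on geocoding time in seconds at the given rate limit."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return math.ceil(n_addresses / requests_per_second)
```

At the public endpoint's 1 req/sec, geocode_seconds(3500) is 3,500 seconds, just under an hour, which matches the per-run cap quoted in the list above.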

FAQ

How do I parse a free-form address string into structured fields?
Pass your raw strings in the addresses array. Each one runs through the sixteen-pattern AddressManager ladder and comes back as {street, city, state, stateName, zip, country} plus a patternMatched label.

Is there an API to bulk normalize US and Canadian addresses?
Yes. This actor is a bulk address normalizer for US, Canadian, and Cayman addresses, callable via the Apify API or run on a schedule. It accepts a list and emits one structured row per input.

Can I validate addresses and convert state names to two-letter codes?
The valid flag reports whether city, state, and zip parsed. State helpers convert full names to two-letter codes to URL slugs in both directions, so Illinois becomes IL and back.

How do I geocode addresses to latitude and longitude?
Set geocode to true. Parsed rows are sent to OpenStreetMap Nominatim and come back with lat, lon, and displayName. Self-host the endpoint via nominatimEndpoint for more than 1 req/sec.
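
For readers who want to see what a Nominatim lookup for a parsed row might look like, here is a sketch that builds a /search URL and maps the first result to the {lat, lon, displayName} shape. How the actor itself queries Nominatim is not documented in this listing, so the free-text q query and both helper names are assumptions. The /search endpoint, the format and limit parameters, and the lat, lon, and display_name response fields are part of Nominatim's public API.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode

PUBLIC_NOMINATIM = "https://nominatim.openstreetmap.org"


def nominatim_search_url(parsed: Dict[str, str],
                         endpoint: str = PUBLIC_NOMINATIM) -> str:
    """Build a free-text search URL from the parsed fields (hypothetical helper)."""
    parts = [parsed[key] for key in ("street", "city", "state", "zip", "country")
             if parsed.get(key)]
    params = urlencode({"q": ", ".join(parts), "format": "jsonv2", "limit": 1})
    return f"{endpoint.rstrip('/')}/search?{params}"


def pick_geo(results: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Map the top Nominatim hit to {lat, lon, displayName}; None when there is no hit."""
    if not results:
        return None
    top = results[0]
    # Nominatim returns coordinates as strings, so convert them to floats.
    return {
        "lat": float(top["lat"]),
        "lon": float(top["lon"]),
        "displayName": top["display_name"],
    }
```

Pointing endpoint at a self-hosted instance changes only the base URL, which is the role nominatimEndpoint plays in the actor's input.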


  • DNS Domain Audit — pair with address parser when enriching contact records that include both addresses and email domains.
  • Structured Data Validator Pro — for parsing addresses out of HTML before normalising them.
  • SSL & Security Headers Checker — same utility-actor shape for site-health workflows.

Need More Features?

Need extra countries, alternate state-helper outputs, or a different geocode backend? File an issue or get in touch.

Why Use Address Parser?

  • Cheap on the hot path — $0.0005 per parsed row. Cleaning a million-record CRM costs less than the meeting where you'd discuss it.
  • Sixteen patterns, one row out — the parser handles the realistic mess of US / Canadian / Cayman addresses and tells you which pattern fired, so unparseable rows are easy to triage.
  • Geocode is opt-in and pay-per-hit — Nominatim only fires when you ask for it, and only successful geocodes bill at the premium rate.

Built by OrbTop.