Bulk Address Parser & Normalizer (US / CA)

Free-form addresses in, parsed {street, city, state, zip, country} out. 16 patterns cover US, Canada, Cayman. PO Box / unit / suffix detection. Optional OpenStreetMap geocode adds lat/lon. Optional phone normaliser. Built for sales-ops, CRM cleaning, lead enrichment, dataset normalisation.

Pricing: Pay per event
Rating: 0.0 (0)
Developer: BowTiedRaccoon (maintained by Community)
Actor stats: 0 bookmarked, 1 total users, 1 monthly active users
Last modified: 2 days ago

Bulk Address Parser & Normalizer (US / CA / Cayman)

Parse free-form address strings into structured {street, city, state, zip, country} records. Sixteen parse patterns cover US, Canadian, and Cayman addresses, with optional Nominatim geocode and embedded phone normalisation.


Address Parser Features

  • Sixteen parse patterns — US standard, no-comma, multi-location, state-name, state-code, PO Box, unit prefix / suffix, suite, directional, Canadian standard, Canadian postal, Cayman, flex-zip, and a regex fallback.
  • State helpers — full name to two-letter code to URL slug, both directions.
  • PO Box, unit, suite, and directional detection out of the box.
  • Optional OpenStreetMap Nominatim geocode adds {lat, lon, displayName}. Self-host the endpoint when you need more than 1 req/sec.
  • Optional phone normaliser detects an embedded phone number and emits it in canonical form.
  • Pure CPU on the parse path. Geocoded rows trigger a separate premium event so you only pay for what you geocode.
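
The state-helper bullet describes a name to code to slug mapping that works in both directions. The following is a minimal sketch of that idea, not the actor's code: the three-entry table, the function names (`to_code`, `to_slug`, `from_slug`), and the lowercase-hyphenated slug format are all assumptions made for illustration.

```python
from typing import Optional

# Illustrative only: a three-entry table, not the actor's full state list.
STATES = {"IL": "Illinois", "NY": "New York", "ON": "Ontario"}

# Reverse lookups, built once from the table above.
NAME_TO_CODE = {name.lower(): code for code, name in STATES.items()}
SLUG_TO_CODE = {name.lower().replace(" ", "-"): code for code, name in STATES.items()}


def to_code(name: str) -> Optional[str]:
    """Full name -> two-letter code (case-insensitive)."""
    return NAME_TO_CODE.get(name.strip().lower())


def to_slug(code: str) -> Optional[str]:
    """Two-letter code -> URL slug (assumed format: lowercase, hyphenated)."""
    name = STATES.get(code.strip().upper())
    return name.lower().replace(" ", "-") if name else None


def from_slug(slug: str) -> Optional[str]:
    """URL slug -> two-letter code."""
    return SLUG_TO_CODE.get(slug.strip().lower())
```

Unknown names, codes, and slugs return `None` in this sketch rather than raising, which keeps the helpers safe to call on unvalidated rows.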

Who Uses Address Parser Data?

  • CRM / sales-ops teams — normalise free-form addresses before deduping. Catches the records that look identical until you read them carefully.
  • Lead enrichment pipelines — promote text into typed fields: state, stateName, lat/lon, normalised phone.
  • Dataset prep engineers — turn scraped seller, agent, or vendor blobs into clean structured rows for warehouse loads.
  • Form validation backends — run user-typed addresses through real-world parser logic instead of a regex you'll regret.
  • Real estate and logistics — normalise property addresses across MLS exports, county records, and broker CSVs.

How Address Parser Works

  1. Pass in a list of free-form address strings. Country defaults to US; pass defaultCountry for CA or KY.
  2. Each string runs through the AddressManager pattern ladder. The first pattern that matches wins; the matched label lands in patternMatched.
  3. If includePhone is on, the actor also scans the raw blob for a phone-shaped substring and normalises it.
  4. If geocode is on and the parse succeeded, the actor hits Nominatim (1 req/sec by default) and adds lat, lon, and displayName.
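
The "first pattern that matches wins" ladder in step 2 can be sketched as an ordered list of labelled regexes. This is an illustration under stated assumptions: the two regexes below are hypothetical stand-ins for the actor's sixteen patterns, and `parse` is not the actor's AddressManager API. Only the output keys (`raw`, `parsed`, `valid`, `patternMatched`) come from this page.

```python
import re

# Hypothetical patterns for illustration; order matters because the first match wins.
LADDER = [
    ("us-standard", re.compile(
        r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
        r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$")),
    ("canadian-postal", re.compile(
        r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
        r"(?P<state>[A-Z]{2})\s+(?P<zip>[A-Z]\d[A-Z]\s?\d[A-Z]\d)$")),
]


def parse(raw: str) -> dict:
    """Try each pattern in order and report which label matched."""
    text = raw.strip()
    for label, pattern in LADDER:
        match = pattern.match(text)
        if match:
            return {"raw": raw, "parsed": match.groupdict(),
                    "valid": True, "patternMatched": label}
    # Nothing matched: in this sketch the row is simply marked invalid.
    return {"raw": raw, "parsed": None, "valid": False, "patternMatched": None}
```

Because the label travels with the row, a caller can count rows per pattern to see which address shapes dominate a dataset and which ones fall through.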

Input

{
  "addresses": [
    "123 Main St, Springfield, IL 62701",
    "500 University Ave, Toronto, ON M5G 1V7",
    "PO Box 1234, George Town, KY1-1107"
  ],
  "defaultCountry": "US",
  "geocode": false,
  "returnUnparseable": true,
  "includePhone": false,
  "maxItems": 15
}

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| addresses | array | required | Free-form address strings to parse and normalise. |
| defaultCountry | enum | US | Country fallback when the parser cannot detect from input. US, CA, or KY. |
| geocode | boolean | false | Enable Nominatim lookup. Adds 1 req/sec rate limit. Premium event when a hit lands. |
| returnUnparseable | boolean | true | Include rows that failed to parse. When false, only valid=true rows are emitted. |
| includePhone | boolean | false | Detect and normalise embedded phone numbers, emit phoneNormalized. |
| nominatimEndpoint | string | OSM default | BYO Nominatim host. Required when geocode=true and you need more than 1 req/sec. |
| maxItems | integer | 15 | Hard cap on addresses processed per run. |
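
The input fields can be assembled and sanity-checked before a run is started. The helper below is a hypothetical sketch: `build_run_input` is not part of the actor or of any Apify library, and rejecting an empty address list is an assumption drawn from the "required" marker above.

```python
ALLOWED_COUNTRIES = {"US", "CA", "KY"}


def build_run_input(addresses, default_country="US", geocode=False,
                    include_phone=False, return_unparseable=True,
                    nominatim_endpoint=None, max_items=15):
    """Assemble the actor input, rejecting values the field table rules out."""
    if not addresses:
        raise ValueError("addresses is required and must be non-empty")
    if default_country not in ALLOWED_COUNTRIES:
        raise ValueError(f"defaultCountry must be one of {sorted(ALLOWED_COUNTRIES)}")
    run_input = {
        "addresses": list(addresses),
        "defaultCountry": default_country,
        "geocode": geocode,
        "returnUnparseable": return_unparseable,
        "includePhone": include_phone,
        "maxItems": max_items,
    }
    # Only send nominatimEndpoint when overriding the OSM default.
    if nominatim_endpoint:
        run_input["nominatimEndpoint"] = nominatim_endpoint
    return run_input
```

The resulting dict can be serialised with `json.dumps` and pasted into the console, or passed as the run input through whichever Apify client you use.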

Geocode + phone example

{
  "addresses": ["Acme Corp, 123 Main St, Suite 100, Springfield, IL 62701, (415) 555-1234"],
  "defaultCountry": "US",
  "geocode": true,
  "includePhone": true,
  "maxItems": 10
}
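
The example above embeds a phone number in the address blob. As a rough sketch of what detecting and normalising it might look like, the snippet below pulls the first North American number out of a string. Two things are assumptions here: the regex is illustrative rather than the actor's own, and the output format (E.164, `+1` followed by ten digits) is a guess, since this page only says "canonical form".

```python
import re
from typing import Optional

# Matches layouts such as (415) 555-1234, 415-555-1234, and +1 415.555.1234.
# The lookarounds stop the pattern from matching inside a longer digit run.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def normalize_phone(blob: str) -> Optional[str]:
    """Return the first phone-shaped substring as +1XXXXXXXXXX, or None."""
    match = PHONE_RE.search(blob)
    if not match:
        return None
    return "+1" + "".join(match.groups())
```

A five-digit ZIP code is too short to satisfy the pattern, so an address without a phone number returns `None` instead of a false positive.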

Address Parser Output Fields

{
  "raw": "123 Main St, Suite 100, Springfield, IL 62701",
  "parsed": {
    "street": "123 Main St Suite 100",
    "city": "Springfield",
    "state": "IL",
    "stateName": "Illinois",
    "zip": "62701",
    "country": "US"
  },
  "valid": true,
  "patternMatched": "us-multi-location",
  "geo": "{\"lat\":39.7817,\"lon\":-89.6501,\"displayName\":\"Springfield, ...\"}",
  "phoneNormalized": null,
  "country": "US",
  "normalizedAt": "2026-04-30T12:00:00Z",
  "status": "success",
  "errorMsg": null
}

| Field | Type | Description |
| --- | --- | --- |
| raw | string | The original input address string. |
| parsed | object | {street, city, state, stateName, zip, country}. Null when valid=false. |
| valid | boolean | True when the minimum required fields (city, state, zip) were parsed. |
| patternMatched | string | Which AddressManager pattern fired (e.g. us-standard, canadian-postal, fallback). |
| geo | string | JSON string {lat, lon, displayName} when geocode=true; null otherwise. |
| phoneNormalized | string | Normalised phone (when includePhone=true and a number is present). |
| country | string | ISO2 country code (US, CA, KY). |
| normalizedAt | string | ISO timestamp when the row was processed. |
| status | string | success, unparseable, or error. |
| errorMsg | string | Error message when status=error; null on success. |
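
Because `geo` arrives as a JSON string rather than a nested object, downstream code has to decode it before reading `lat` and `lon`. The sketch below shows one way to do that and to bucket rows by `status`. The function names `decode_row` and `triage` are hypothetical, and the rows are assumed to be plain dicts shaped like the example above.

```python
import json


def decode_row(row: dict) -> dict:
    """Return a copy of the row with `geo` parsed from its JSON string."""
    out = dict(row)
    geo = row.get("geo")
    out["geo"] = json.loads(geo) if isinstance(geo, str) else None
    return out


def triage(rows):
    """Group rows by their `status` field (success, unparseable, error)."""
    buckets = {"success": [], "unparseable": [], "error": []}
    for row in rows:
        # Rows without a status are treated as errors in this sketch.
        buckets.setdefault(row.get("status", "error"), []).append(row)
    return buckets
```

Rows in the `unparseable` bucket still carry `raw` and `patternMatched`, so they can be reviewed by hand or re-run with a different `defaultCountry`.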

Pricing

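Pricing has a flat actor-start fee plus two per-address events. Pure-CPU parses are cheap. Geocoded rows trigger a separate premium event because Nominatim adds an HTTP round-trip per record. The volume totals below include the start fee and are consistent with a geocoded row billing $0.001 in place of the $0.0005 parse rate, not on top of it.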

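| Event | Price |
| --- | --- |
| Actor start | $0.10 |
| Per parsed address | $0.0005 |
| Per geocoded address | $0.001 |

| Volume | No geocode | Geocoded |
| --- | --- | --- |
| 100 addresses | $0.15 | $0.20 |
| 1,000 addresses | $0.60 | $1.10 |
| 10,000 addresses | $5.10 | $10.10 |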
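
The volume figures can be reproduced with a few lines of arithmetic. The sketch below is an estimate, not a billing formula from Apify. It assumes every row in a geocoded run is billed the geocode rate instead of the parse rate, which is what the published totals imply, and that every geocode succeeds. `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.10")
PER_PARSED = Decimal("0.0005")
PER_GEOCODED = Decimal("0.001")


def estimate_cost(n_addresses: int, geocoded: bool = False) -> Decimal:
    """Start fee plus one per-address event, matching the volume table."""
    # Assumption: a geocoded row bills the geocode rate in place of the parse rate.
    rate = PER_GEOCODED if geocoded else PER_PARSED
    return ACTOR_START + rate * n_addresses
```

`Decimal` is used instead of `float` so that small per-row prices add up exactly. Under these assumptions, `estimate_cost(1000)` equals `Decimal("0.60")` and `estimate_cost(1000, geocoded=True)` equals `Decimal("1.10")`.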

Limits

  • maxItems caps the number of addresses processed per run. Override the schema default of 15 for production batches.
  • The Apify console tester has a 5-minute timeout. Pure-CPU parses finish well inside it, but geocode mode at 1 req/sec gets through only about 300 addresses in that window.
  • Nominatim's public endpoint enforces 1 req/sec. Geocode mode therefore caps at roughly 3,500 addresses per 1-hour run on the default endpoint. Self-host or BYO via nominatimEndpoint for higher throughput.
  • Country detection covers US, Canada, and Cayman. Other ISO regions fall through to the regex fallback and may emit valid=false.
  • Phone normaliser is best-effort — it expects North American formats. Numbers that don't match the regex are silently skipped.
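
The rate-limit figures above translate directly into run planning. The sketch below estimates the geocoder wait time for a batch and splits a large list into runs that stay under the roughly 3,500-per-hour ceiling. Both function names are hypothetical, and the estimate ignores parse time and network latency, so treat it as a lower bound.

```python
def geocode_seconds(n_addresses: int, requests_per_second: float = 1.0) -> float:
    """Lower bound on wall-clock time spent waiting on the geocoder."""
    return n_addresses / requests_per_second


def split_batches(addresses, max_per_run: int = 3500):
    """Split a list into runs that stay under the per-hour geocode ceiling."""
    return [addresses[i:i + max_per_run]
            for i in range(0, len(addresses), max_per_run)]
```

With a self-hosted endpoint, raising `requests_per_second` and `max_per_run` together keeps the same planning logic while reflecting the higher throughput.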

Related Actors

  • DNS Domain Audit — pair with address parser when enriching contact records that include both addresses and email domains.
  • Structured Data Validator Pro — for parsing addresses out of HTML before normalising them.
  • SSL & Security Headers Checker — same utility-actor shape for site-health workflows.

Need More Features?

Need extra countries, alternate state-helper outputs, or a different geocode backend? File an issue or get in touch.

Why Use Address Parser?

  • Cheap on the hot path — $0.0005 per parsed row. Cleaning a million-record CRM costs less than the meeting where you'd discuss it.
  • Sixteen patterns, one row out — the parser handles the realistic mess of US / Canadian / Cayman addresses and tells you which pattern fired, so unparseable rows are easy to triage.
  • Geocode is opt-in and pay-per-hit — Nominatim only fires when you ask for it, and only successful geocodes bill at the premium rate.

Built by OrbTop.