Bulk Address Parser & Normalizer (US / CA)
Pricing
Pay per event
Free-form addresses in, parsed {street, city, state, zip, country} out. 16 patterns cover US, Canada, Cayman. PO Box / unit / suffix detection. Optional OpenStreetMap geocode adds lat/lon. Optional phone normaliser. Built for sales-ops, CRM cleaning, lead enrichment, dataset normalisation.
Developer
BowTiedRaccoon
Bulk Address Parser & Normalizer (US / CA / Cayman)
Parse free-form address strings into structured {street, city, state, zip, country} records. Sixteen parse patterns cover US, Canadian, and Cayman addresses, with optional Nominatim geocode and embedded phone normalisation.
Address Parser Features
- Sixteen parse patterns — US standard, no-comma, multi-location, state-name, state-code, PO Box, unit prefix / suffix, suite, directional, Canadian standard, Canadian postal, Cayman, flex-zip, and a regex fallback.
- State helpers — full name to two-letter code to URL slug, both directions.
- PO Box, unit, suite, and directional detection out of the box.
- Optional OpenStreetMap Nominatim geocode adds `{lat, lon, displayName}`. Self-host the endpoint when you need more than 1 req/sec.
- Optional phone normaliser detects an embedded phone number and emits it in canonical form.
- Pure CPU on the parse path. Geocoded rows trigger a separate premium event so you only pay for what you geocode.
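The state helpers can be pictured as a pair of lookup tables. The following is a minimal sketch, not the actor's code: the function names, the three-entry table, and the lowercase hyphenated slug format are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical lookup table; a real helper would cover every state and province.
STATE_CODES = {
    "Illinois": "IL",
    "New York": "NY",
    "Ontario": "ON",
}
# Reverse map built once so code -> name lookups need no scan.
STATE_NAMES = {code: name for name, code in STATE_CODES.items()}


def name_to_code(name: str) -> Optional[str]:
    """Full name -> two-letter code, ignoring case and outer whitespace."""
    return STATE_CODES.get(name.strip().title())


def code_to_name(code: str) -> Optional[str]:
    """Two-letter code -> full name, ignoring case."""
    return STATE_NAMES.get(code.strip().upper())


def name_to_slug(name: str) -> str:
    """Full name -> URL slug (assumed format: lowercase, hyphen-separated)."""
    return "-".join(name.strip().lower().split())


def slug_to_name(slug: str) -> Optional[str]:
    """URL slug -> full name, or None when the slug is not in the table."""
    candidate = slug.replace("-", " ").title()
    return candidate if candidate in STATE_CODES else None
```

Returning `None` for unknown values keeps the helpers safe to call on unvalidated input.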
Who Uses Address Parser Data?
- CRM / sales-ops teams — normalise free-form addresses before deduping. Catches the records that look identical until you read them carefully.
- Lead enrichment pipelines: promote free text into typed fields such as `state`, `stateName`, lat/lon, and normalised phone.
- Dataset prep engineers: turn scraped seller, agent, or vendor blobs into clean structured rows for warehouse loads.
- Form validation backends — run user-typed addresses through real-world parser logic instead of a regex you'll regret.
- Real estate and logistics — normalise property addresses across MLS exports, county records, and broker CSVs.
How Address Parser Works
- Pass in a list of free-form address strings. Country defaults to US; pass `defaultCountry` for CA or KY.
- Each string runs through the AddressManager pattern ladder. The first pattern that matches wins; the matched label lands in `patternMatched`.
- If `includePhone` is on, the actor also scans the raw blob for a phone-shaped substring and normalises it.
- If `geocode` is on and the parse succeeded, the actor hits Nominatim (1 req/sec by default) and adds `lat`, `lon`, and `displayName`.
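The "first pattern that matches wins" rule can be sketched as an ordered list of labelled regexes. This is an illustration only: the two patterns below are hypothetical stand-ins for the actor's sixteen, and returning `None` for `patternMatched` on a miss is an assumption.

```python
import re

# Hypothetical two-rung ladder, ordered from most to least specific.
LADDER = [
    (
        "us-standard",
        re.compile(
            r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
            r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$"
        ),
    ),
    (
        "canadian-postal",
        re.compile(
            r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
            r"(?P<state>[A-Z]{2})\s+(?P<zip>[A-Z]\d[A-Z]\s?\d[A-Z]\d)$"
        ),
    ),
]


def parse(raw: str) -> dict:
    """Try each pattern in order and stop at the first one that matches."""
    text = raw.strip()
    for label, pattern in LADDER:
        match = pattern.match(text)
        if match:
            return {
                "raw": raw,
                "parsed": match.groupdict(),
                "valid": True,
                "patternMatched": label,
            }
    # No rung matched: report the row as unparseable.
    return {"raw": raw, "parsed": None, "valid": False, "patternMatched": None}
```

Because the loop returns on the first hit, pattern order matters: stricter patterns belong above looser ones so a loose rule never shadows a precise one.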
Input
{"addresses": ["123 Main St, Springfield, IL 62701","500 University Ave, Toronto, ON M5G 1V7","PO Box 1234, George Town, KY1-1107"],"defaultCountry": "US","geocode": false,"returnUnparseable": true,"includePhone": false,"maxItems": 15}
| Field | Type | Default | Description |
|---|---|---|---|
| `addresses` | array | required | Free-form address strings to parse and normalise. |
| `defaultCountry` | enum | US | Country fallback when the parser cannot detect from input. US, CA, or KY. |
| `geocode` | boolean | false | Enable Nominatim lookup. Adds 1 req/sec rate limit. Premium event when a hit lands. |
| `returnUnparseable` | boolean | true | Include rows that failed to parse. When false, only valid=true rows are emitted. |
| `includePhone` | boolean | false | Detect and normalise embedded phone numbers, emit phoneNormalized. |
| `nominatimEndpoint` | string | OSM default | BYO Nominatim host. Required when geocode=true and you need more than 1 req/sec. |
| `maxItems` | integer | 15 | Hard cap on addresses processed per run. |
Geocode + phone example
{"addresses": ["Acme Corp, 123 Main St, Suite 100, Springfield, IL 62701, (415) 555-1234"],"defaultCountry": "US","geocode": true,"includePhone": true,"maxItems": 10}
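To show what `includePhone` could do with the blob above, here is a minimal sketch of a North American phone detector. The regex, the function name, and the E.164-style output (`+1` followed by ten digits) are assumptions; the actor's actual canonical form is not documented here.

```python
import re
from typing import Optional

# Optional "+1" or "1" prefix, then 3-3-4 digits with common separators.
# The lookarounds stop the pattern from biting into longer digit runs.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def normalize_phone(blob: str) -> Optional[str]:
    """Return the first phone-shaped substring in an assumed +1XXXXXXXXXX form."""
    match = PHONE_RE.search(blob)
    if match is None:
        return None  # nothing phone-shaped: skip silently
    area, exchange, line = match.groups()
    return f"+1{area}{exchange}{line}"


blob = "Acme Corp, 123 Main St, Suite 100, Springfield, IL 62701, (415) 555-1234"
print(normalize_phone(blob))
```

A five-digit ZIP or a street number is too short to satisfy the 3-3-4 digit shape, which is why the address portion of the blob is not mistaken for a phone number.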
Address Parser Output Fields
{"raw": "123 Main St, Suite 100, Springfield, IL 62701","parsed": {"street": "123 Main St Suite 100","city": "Springfield","state": "IL","stateName": "Illinois","zip": "62701","country": "US"},"valid": true,"patternMatched": "us-multi-location","geo": "{\"lat\":39.7817,\"lon\":-89.6501,\"displayName\":\"Springfield, ...\"}","phoneNormalized": null,"country": "US","normalizedAt": "2026-04-30T12:00:00Z","status": "success","errorMsg": null}
| Field | Type | Description |
|---|---|---|
| `raw` | string | The original input address string. |
| `parsed` | object | {street, city, state, stateName, zip, country}. Null when valid=false. |
| `valid` | boolean | True when the minimum required fields (city, state, zip) were parsed. |
| `patternMatched` | string | Which AddressManager pattern fired (e.g. us-standard, canadian-postal, fallback). |
| `geo` | string | JSON string {lat, lon, displayName} when geocode=true; null otherwise. |
| `phoneNormalized` | string | Normalised phone (when includePhone=true and a number is present). |
| `country` | string | ISO2 country code (US, CA, KY). |
| `normalizedAt` | string | ISO timestamp when the row was processed. |
| `status` | string | success, unparseable, or error. |
| `errorMsg` | string | Error message when status=error; null on success. |
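Because `geo` arrives as a JSON string rather than a nested object, consumers need a second decode step. A minimal sketch, assuming rows have already been loaded into Python dicts (the helper name is hypothetical):

```python
import json
from typing import Optional, Tuple


def extract_coordinates(row: dict) -> Optional[Tuple[float, float]]:
    """Decode the JSON-encoded geo field into a (lat, lon) pair."""
    geo_raw = row.get("geo")
    if not geo_raw:
        return None  # geocode was off or returned nothing
    geo = json.loads(geo_raw)
    return float(geo["lat"]), float(geo["lon"])


row = {
    "raw": "123 Main St, Suite 100, Springfield, IL 62701",
    "geo": json.dumps({"lat": 39.7817, "lon": -89.6501, "displayName": "Springfield, ..."}),
}
print(extract_coordinates(row))
```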
Pricing
Two per-address events plus a flat actor-start fee. Pure-CPU parses are cheap. Geocoded rows bill at a separate premium rate because Nominatim adds an HTTP round-trip per record.
| Event | Price |
|---|---|
| Actor start | $0.10 |
| Per parsed address | $0.0005 |
| Per geocoded address | $0.001 |
| Volume | No geocode | Geocoded |
|---|---|---|
| 100 addresses | $0.15 | $0.20 |
| 1,000 addresses | $0.60 | $1.10 |
| 10,000 addresses | $5.10 | $10.10 |
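The volume figures can be reproduced with a short calculation. This sketch assumes one actor start per run and, as the table totals imply, that a geocoded address bills at the geocode rate in place of the parse rate rather than on top of it. The function name is hypothetical.

```python
from decimal import Decimal

# Prices from the event table above; Decimal avoids float rounding on cents.
ACTOR_START = Decimal("0.10")
PER_PARSED = Decimal("0.0005")
PER_GEOCODED = Decimal("0.001")


def estimate_cost(parsed_only: int, geocoded: int = 0) -> Decimal:
    """Estimated charge in USD for a single run."""
    return ACTOR_START + parsed_only * PER_PARSED + geocoded * PER_GEOCODED


for volume in (100, 1_000, 10_000):
    print(volume, estimate_cost(volume), estimate_cost(0, volume))
```

Mixed runs work the same way: pass the count of rows that only parse as the first argument and the count of rows that geocode successfully as the second.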
Limits
- `maxItems` caps the number of addresses processed per run. Override the schema default of 15 for production batches.
- The Apify console tester has a 5-minute timeout. Pure-CPU parses are well clear of that, but geocode mode is rate-limited.
- Nominatim's public endpoint enforces 1 req/sec. Geocode mode therefore caps at roughly 3,500 addresses per 1-hour run on the default endpoint. Self-host or BYO via `nominatimEndpoint` for higher throughput.
- Country detection covers US, Canada, and Cayman. Other ISO regions fall through to the regex fallback and may emit `valid=false`.
- Phone normaliser is best-effort: it expects North American formats. Numbers that don't match the regex are silently skipped.
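For planning geocoded runs against these limits, a rough sketch of the arithmetic follows. It assumes geocode time is dominated by the rate limit and uses the roughly 3,500-per-hour figure above as a batch size; both helper names are hypothetical.

```python
from typing import Iterator, List, Sequence

PUBLIC_RATE = 1.0        # requests per second on the public Nominatim endpoint
ROUGH_HOURLY_CAP = 3500  # approximate per-run ceiling quoted above


def geocode_seconds(count: int, requests_per_second: float = PUBLIC_RATE) -> float:
    """Lower bound on geocode time, ignoring parse and network overhead."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return count / requests_per_second


def split_batches(
    addresses: Sequence[str], batch_size: int = ROUGH_HOURLY_CAP
) -> Iterator[List[str]]:
    """Yield consecutive slices small enough for one rate-limited run each."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(addresses), batch_size):
        yield list(addresses[start:start + batch_size])
```

Each batch would also need `maxItems` raised to at least its length, since the schema default is 15.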
Related Actors
- DNS Domain Audit — pair with address parser when enriching contact records that include both addresses and email domains.
- Structured Data Validator Pro — for parsing addresses out of HTML before normalising them.
- SSL & Security Headers Checker — same utility-actor shape for site-health workflows.
Need More Features?
Need extra countries, alternate state-helper outputs, or a different geocode backend? File an issue or get in touch.
Why Use Address Parser?
- Cheap on the hot path — $0.0005 per parsed row. Cleaning a million-record CRM costs less than the meeting where you'd discuss it.
- Sixteen patterns, one row out — the parser handles the realistic mess of US / Canadian / Cayman addresses and tells you which pattern fired, so unparseable rows are easy to triage.
- Geocode is opt-in and pay-per-hit — Nominatim only fires when you ask for it, and only successful geocodes bill at the premium rate.
Built by OrbTop.