Bulk Address Parser & Normalizer (US / CA)
Pricing
Pay per event
Free-form addresses in, parsed {street, city, state, zip, country} out. 16 patterns cover US, Canada, Cayman. PO Box / unit / suffix detection. Optional OpenStreetMap geocode adds lat/lon. Optional phone normaliser. Built for sales-ops, CRM cleaning, lead enrichment, dataset normalisation.
Developer
BowTiedRaccoon
Maintained by Community
16 days ago
Last modified
Bulk Address Parser, Normalizer & Validation API (US / CA / Cayman)
Parse free-form address strings into structured {street, city, state, zip, country} records — an address parser and normalizer API for bulk address standardization and validation. Sixteen parse patterns cover US, Canadian, and Cayman addresses, with optional Nominatim geocode and embedded phone normalisation.
Address Parser Features
- Sixteen parse patterns — US standard, no-comma, multi-location, state-name, state-code, PO Box, unit prefix / suffix, suite, directional, Canadian standard, Canadian postal, Cayman, flex-zip, and a regex fallback.
- State helpers — full name to two-letter code to URL slug, both directions.
- PO Box, unit, suite, and directional detection out of the box.
- Optional OpenStreetMap Nominatim geocode adds {lat, lon, displayName}. Self-host the endpoint when you need more than 1 req/sec.
- Optional phone normaliser detects an embedded phone number and emits it in canonical form.
- Pure CPU on the parse path. Geocoded rows trigger a separate premium event so you only pay for what you geocode.
Who Uses Address Parser & Validation Data?
- CRM / sales-ops teams — normalise free-form addresses before deduping. Catches the records that look identical until you read them carefully.
- Lead enrichment pipelines — promote text into typed fields: state, stateName, lat/lon, normalised phone.
- Dataset prep engineers — turn scraped seller, agent, or vendor blobs into clean structured rows for warehouse loads.
- Form validation backends — run user-typed addresses through real-world parser logic instead of a regex you'll regret.
- Real estate and logistics — normalise property addresses across MLS exports, county records, and broker CSVs.
How Address Parser Works
- Pass in a list of free-form address strings. Country defaults to US; pass defaultCountry for CA or KY.
- Each string runs through the AddressManager pattern ladder. The first pattern that matches wins; the matched label lands in patternMatched.
- If includePhone is on, the actor also scans the raw blob for a phone-shaped substring and normalises it.
- If geocode is on and the parse succeeded, the actor hits Nominatim (1 req/sec by default) and adds lat, lon, and displayName.
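The first-match-wins ladder described above can be sketched roughly as follows. This is a minimal illustration only: the two regexes are simplified stand-ins written for this example, not the actor's actual AddressManager patterns, and only the labels us-standard and canadian-postal are taken from this page.

```python
import re

# Hypothetical, simplified ladder: (label, regex) pairs tried in order.
# The real AddressManager patterns are not published on this page.
LADDER = [
    ("us-standard", re.compile(
        r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
        r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$")),
    ("canadian-postal", re.compile(
        r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*"
        r"(?P<state>[A-Z]{2})\s+(?P<zip>[A-Z]\d[A-Z]\s?\d[A-Z]\d)$")),
]


def parse_address(raw):
    """Return fields from the first pattern that matches, else an unparseable row."""
    text = raw.strip()
    for label, pattern in LADDER:
        match = pattern.match(text)
        if match:
            return {"raw": raw, "parsed": match.groupdict(),
                    "valid": True, "patternMatched": label}
    # No pattern matched: mirror the valid=false shape from the output table.
    return {"raw": raw, "parsed": None, "valid": False, "patternMatched": None}
```

Because the ladder stops at the first hit, pattern order matters: more specific patterns would need to sit above looser ones such as a regex fallback.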
Input
{"addresses": ["123 Main St, Springfield, IL 62701","500 University Ave, Toronto, ON M5G 1V7","PO Box 1234, George Town, KY1-1107"],"defaultCountry": "US","geocode": false,"returnUnparseable": true,"includePhone": false,"maxItems": 15}
| Field | Type | Default | Description |
|---|---|---|---|
| addresses | array | required | Free-form address strings to parse and normalise. |
| defaultCountry | enum | US | Country fallback when the parser cannot detect from input. US, CA, or KY. |
| geocode | boolean | false | Enable Nominatim lookup. Adds 1 req/sec rate limit. Premium event when a hit lands. |
| returnUnparseable | boolean | true | Include rows that failed to parse. When false, only valid=true rows are emitted. |
| includePhone | boolean | false | Detect and normalise embedded phone numbers, emit phoneNormalized. |
| nominatimEndpoint | string | OSM default | BYO Nominatim host. Required when geocode=true and you need more than 1 req/sec. |
| maxItems | integer | 15 | Hard cap on addresses processed per run. |
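As a convenience, the input object can be assembled in code before it is submitted. The sketch below uses the field names and defaults from the table; the function name build_run_input and its client-side checks are assumptions made for illustration, not part of the actor.

```python
ALLOWED_COUNTRIES = {"US", "CA", "KY"}


def build_run_input(addresses, default_country="US", geocode=False,
                    return_unparseable=True, include_phone=False,
                    nominatim_endpoint=None, max_items=15):
    """Assemble an input dict using the field names and defaults from the table.

    The validation here is a hypothetical client-side guard; the actor may
    apply its own rules.
    """
    if not addresses:
        raise ValueError("addresses is required and must not be empty")
    if default_country not in ALLOWED_COUNTRIES:
        raise ValueError(f"defaultCountry must be one of {sorted(ALLOWED_COUNTRIES)}")
    run_input = {
        "addresses": list(addresses),
        "defaultCountry": default_country,
        "geocode": geocode,
        "returnUnparseable": return_unparseable,
        "includePhone": include_phone,
        "maxItems": max_items,
    }
    # Only send nominatimEndpoint when overriding the OSM default.
    if nominatim_endpoint is not None:
        run_input["nominatimEndpoint"] = nominatim_endpoint
    return run_input
```

For a production batch you would typically pass max_items=len(addresses), since the schema default of 15 caps the run.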
Geocode + phone example
{"addresses": ["Acme Corp, 123 Main St, Suite 100, Springfield, IL 62701, (415) 555-1234"],"defaultCountry": "US","geocode": true,"includePhone": true,"maxItems": 10}
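To make the phone option concrete, here is a rough sketch of detecting a North American number inside an address blob. The regex and the E.164-style output (+1 followed by ten digits) are assumptions for illustration; this page does not specify the actor's actual canonical form or matching rules.

```python
import re

# Hypothetical pattern: optional +1 / 1 prefix, then 3-3-4 digits with common separators.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def normalize_phone(blob):
    """Return the first phone-shaped substring as +1XXXXXXXXXX, or None if absent."""
    match = PHONE_RE.search(blob)
    if match is None:
        return None  # numbers that don't match are skipped, not reported
    area, exchange, line = match.groups()
    return f"+1{area}{exchange}{line}"
```

Applied to the example input above, this sketch would pick out (415) 555-1234 and ignore the street number and ZIP code, since neither forms a 3-3-4 digit run.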
Address Parser Output Fields
{"raw": "123 Main St, Suite 100, Springfield, IL 62701","parsed": {"street": "123 Main St Suite 100","city": "Springfield","state": "IL","stateName": "Illinois","zip": "62701","country": "US"},"valid": true,"patternMatched": "us-multi-location","geo": "{\"lat\":39.7817,\"lon\":-89.6501,\"displayName\":\"Springfield, ...\"}","phoneNormalized": null,"country": "US","normalizedAt": "2026-04-30T12:00:00Z","status": "success","errorMsg": null}
| Field | Type | Description |
|---|---|---|
| raw | string | The original input address string. |
| parsed | object | {street, city, state, stateName, zip, country}. Null when valid=false. |
| valid | boolean | True when the minimum required fields (city, state, zip) were parsed. |
| patternMatched | string | Which AddressManager pattern fired (e.g. us-standard, canadian-postal, fallback). |
| geo | string | JSON string {lat, lon, displayName} when geocode=true; null otherwise. |
| phoneNormalized | string | Normalised phone (when includePhone=true and a number is present). |
| country | string | ISO2 country code (US, CA, KY). |
| normalizedAt | string | ISO timestamp when the row was processed. |
| status | string | success, unparseable, or error. |
| errorMsg | string | Error message when status=error; null on success. |
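Because geo arrives as a JSON string rather than a nested object, downstream code needs one extra decode step. The helper names below are hypothetical; the sketch only relies on the fields listed in the table.

```python
import json


def decode_row(row):
    """Return a copy of an output row with `geo` decoded from its JSON string."""
    decoded = dict(row)
    geo = row.get("geo")
    # geo is null unless geocode=true, so guard before decoding.
    decoded["geo"] = json.loads(geo) if isinstance(geo, str) else None
    return decoded


def split_rows(rows):
    """Separate rows that parsed (valid=true) from those that need triage."""
    good = [decode_row(r) for r in rows if r.get("valid")]
    bad = [r for r in rows if not r.get("valid")]
    return good, bad
```

Rows in the second list keep their raw string and status, which is usually enough to decide whether to fix the input or accept the miss.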
Pricing
A flat actor-start fee plus two per-address events. Pure-CPU parses are cheap. Geocoded rows trigger a separate premium event because Nominatim adds an HTTP round-trip per record.
| Event | Price |
|---|---|
| Actor start | $0.10 |
| Per parsed address | $0.0005 |
| Per geocoded address | $0.001 |
| Volume | No geocode | Geocoded |
|---|---|---|
| 100 addresses | $0.15 | $0.20 |
| 1,000 addresses | $0.60 | $1.10 |
| 10,000 addresses | $5.10 | $10.10 |
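The volume figures can be reproduced with a small estimator. One assumption is inferred from the table rather than stated on this page: in the Geocoded column each address appears to bill at the geocoded rate instead of the parse rate, not on top of it. The estimator also assumes every address bills at exactly one rate.

```python
from decimal import Decimal

# Prices copied from the event table above.
ACTOR_START = Decimal("0.10")
PER_PARSED = Decimal("0.0005")
PER_GEOCODED = Decimal("0.001")


def estimate_cost(n_addresses, geocode=False):
    """Estimate run cost in USD: start fee plus one per-address rate."""
    per_row = PER_GEOCODED if geocode else PER_PARSED
    return ACTOR_START + per_row * n_addresses
```

Decimal is used instead of float so the results compare exactly against the listed prices. Actual charges may differ when some rows fail to parse or geocode.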
Limits
- maxItems caps the number of addresses processed per run. Override the schema default of 15 for production batches.
- The Apify console tester has a 5-minute timeout; pure-CPU parses are well clear of that, but geocode mode is rate-limited.
- Nominatim's public endpoint enforces 1 req/sec. Geocode mode therefore caps at roughly 3,500 addresses per 1-hour run on the default endpoint. Self-host or BYO via nominatimEndpoint for higher throughput.
- Country detection covers US, Canada, and Cayman. Other ISO regions fall through to the regex fallback and may emit valid=false.
- Phone normaliser is best-effort: it expects North American formats. Numbers that don't match the regex are silently skipped.
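When a geocoded job is larger than one run can handle, it helps to split the list up front. The helpers below are a hypothetical planning aid, not part of the actor; the time estimate is a lower bound that ignores parse time and network latency.

```python
import math


def plan_batches(addresses, batch_size):
    """Split addresses into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [addresses[i:i + batch_size]
            for i in range(0, len(addresses), batch_size)]


def min_geocode_seconds(n_addresses, requests_per_second=1.0):
    """Lower bound on wall-clock seconds at one geocode request per address."""
    return math.ceil(n_addresses / requests_per_second)
```

With the public endpoint's 1 req/sec limit, a batch size near 3,500 keeps each geocoded run within roughly an hour; set maxItems to the batch length for each run.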
FAQ
How do I parse a free-form address string into structured fields?
Pass your raw strings in the addresses array. Each one runs through the sixteen-pattern AddressManager ladder and comes back as {street, city, state, stateName, zip, country} plus a patternMatched label.
Is there an API to bulk normalize US and Canadian addresses?
Yes. This actor is a bulk address normalizer for US, Canadian, and Cayman addresses, callable via the Apify API or run on a schedule. It accepts a list and emits one structured row per input.
Can I validate addresses and convert state names to two-letter codes?
The valid flag reports whether city, state, and zip parsed. State helpers convert full names to two-letter codes to URL slugs in both directions, so Illinois becomes IL and back.
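The name, code, and slug conversions can be pictured as lookups over one table. The sketch below uses a three-entry subset and assumes slugs are lowercase and hyphenated; the actor's real tables and slug format are not shown on this page.

```python
# Illustrative subset only; a real table would cover every state and province.
STATES = {"Illinois": "IL", "New York": "NY", "Ontario": "ON"}

_CODE_BY_NAME = {name.casefold(): code for name, code in STATES.items()}
_NAME_BY_CODE = {code: name for name, code in STATES.items()}


def to_slug(name):
    """Assumed slug format: lowercase words joined by hyphens."""
    return "-".join(name.casefold().split())


_NAME_BY_SLUG = {to_slug(name): name for name in STATES}


def name_to_code(name):
    return _CODE_BY_NAME.get(name.strip().casefold())


def code_to_name(code):
    return _NAME_BY_CODE.get(code.strip().upper())


def slug_to_code(slug):
    name = _NAME_BY_SLUG.get(slug.strip().casefold())
    return STATES[name] if name else None
```

Unknown inputs return None rather than raising, which keeps bulk jobs running when a row carries an unexpected region.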
How do I geocode addresses to latitude and longitude?
Set geocode to true. Parsed rows are sent to OpenStreetMap Nominatim and come back with lat, lon, and displayName. Self-host the endpoint via nominatimEndpoint for more than 1 req/sec.
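For readers who self-host Nominatim, the lookup amounts to one search request per parsed row. The sketch below builds the request URL and maps a search response to the {lat, lon, displayName} shape; how the actor itself composes its query is not documented here, so the query construction is an assumption. The HTTP call is left out. Note that the public endpoint's usage policy expects an identifying User-Agent and at most 1 request per second.

```python
from urllib.parse import urlencode

DEFAULT_ENDPOINT = "https://nominatim.openstreetmap.org"


def build_search_url(parsed, endpoint=DEFAULT_ENDPOINT):
    """Build a Nominatim /search URL from a parsed address (assumed query layout)."""
    parts = [parsed[k] for k in ("street", "city", "state", "zip", "country")
             if parsed.get(k)]
    params = {"q": ", ".join(parts), "format": "jsonv2", "limit": 1}
    return f"{endpoint.rstrip('/')}/search?" + urlencode(params)


def extract_geo(results):
    """Map the first Nominatim search result to {lat, lon, displayName}, or None."""
    if not results:
        return None
    top = results[0]
    # Nominatim returns lat/lon as strings and the label as display_name.
    return {"lat": float(top["lat"]), "lon": float(top["lon"]),
            "displayName": top["display_name"]}
```

Passing a self-hosted base URL as endpoint mirrors what nominatimEndpoint does in the actor input.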
Related Actors
- DNS Domain Audit — pair with address parser when enriching contact records that include both addresses and email domains.
- Structured Data Validator Pro — for parsing addresses out of HTML before normalising them.
- SSL & Security Headers Checker — same utility-actor shape for site-health workflows.
Need More Features?
Need extra countries, alternate state-helper outputs, or a different geocode backend? File an issue or get in touch.
Why Use Address Parser?
- Cheap on the hot path — $0.0005 per parsed row. Cleaning a million-record CRM costs less than the meeting where you'd discuss it.
- Sixteen patterns, one row out — the parser handles the realistic mess of US / Canadian / Cayman addresses and tells you which pattern fired, so unparseable rows are easy to triage.
- Geocode is opt-in and pay-per-hit — Nominatim only fires when you ask for it, and only successful geocodes bill at the premium rate.
Built by OrbTop.