US Building Permit Leads Scraper (Socrata)
Pricing
Pay per event
Pull fresh building and construction permit records from any US city or county Socrata open-data portal. We normalize permit ID, type, status, address, valuation, and contact info into one typed row per permit — ready to feed your contractor CRM or lead pipeline.
Developer: DevilScrapes (maintained by Community)
🎯 What this scrapes
US municipal and county building-permit registries published via the Socrata SODA API (over 300 US portals at last count, including Chicago, New York, Seattle, Austin, Los Angeles, and dozens of smaller cities). The SODA API returns permit records as structured JSON rows. This Actor normalizes the common fields that matter for contractor lead-gen and passes every city-specific field through in an `extra` dict, so you never lose data even when the schemas diverge.
Verified against: Chicago (data.cityofchicago.org / ydr8-5enu), with the same logic applicable to any Socrata-hosted permit dataset.
🔥 What we handle for you
- 🛡️ Proxy rotation via Apify Proxy: we absorb transient 429s and routing failures so your run completes even when a portal throttles a single IP.
- 🔁 Retries with exponential backoff on `429` / `503` and network errors: up to 5 attempts per request, `Retry-After` header honored.
- 🧊 Normalized, typed dataset rows: Pydantic-validated, ISO-8601 dates, stable field names across all city schemas, JSON / CSV / Excel export from the Apify Console.
- 🗺️ Multi-city in a single run: pass an array of `{domain, datasetId}` objects and we fan out across all portals, respecting `maxItemsPerDataset` per source.
- 💰 Pay-Per-Event pricing: you pay only for permit records that land in your dataset. No data, no charge (beyond the small actor-start fee).
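The retry behavior in the list above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: the base delay, the delay ceiling, the jitter range, and the helper names `backoff_delay` and `should_retry` are all assumptions; only the 5-attempt cap, the `429` / `503` statuses, and the `Retry-After` handling come from the description.

```python
import random
from typing import Optional

MAX_ATTEMPTS = 5                 # from the feature list above
RETRYABLE_STATUSES = {429, 503}  # from the feature list above
BASE_DELAY_S = 1.0               # assumed starting delay
MAX_DELAY_S = 60.0               # assumed ceiling


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  jitter: bool = True) -> float:
    """Seconds to wait before retrying after failed attempt `attempt` (1-based)."""
    if retry_after is not None:
        try:
            # Retry-After given as a number of seconds takes priority.
            return min(max(float(retry_after), 0.0), MAX_DELAY_S)
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch; fall through
    delay = min(BASE_DELAY_S * 2 ** (attempt - 1), MAX_DELAY_S)
    if jitter:
        # Spread retries out so parallel requests do not retry in lockstep.
        delay *= random.uniform(0.5, 1.0)
    return delay


def should_retry(status: int, attempt: int) -> bool:
    """True if the status is retryable and attempts remain."""
    return status in RETRYABLE_STATUSES and attempt < MAX_ATTEMPTS
```

With jitter disabled the delays double from 1 s up to the ceiling; a numeric `Retry-After` value replaces the computed delay.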
💡 Use cases
- HVAC / roofing / solar contractor lead lists: filter by permit type (new construction, re-roof, HVAC installation) and harvest the applicant or contractor contact details for outbound sales.
- Contractor-lead resellers — run nightly on multiple city datasets, normalize to a single schema, and deliver fresh permit leads to clients via the Apify API or webhook.
- Valuation monitoring — pull permits above a dollar threshold in a target ZIP code to watch development activity in a neighborhood.
- Construction analytics — aggregate permit counts by type, date, and district to report on building-activity trends for real-estate investors or municipalities.
- CRM enrichment pipeline — match permit applicant names or addresses against your existing contact records to surface warm leads before the competition does.
⚙️ How to use it
- Click Try for free at the top of the page.
- In the Socrata datasets field, enter the list of portal + dataset pairs you want to scrape. The default pre-fills Chicago building permits as a working example.
- Optionally set Issued after to a date (e.g. `2024-01-01`) to only pull recent permits.
- Set Max items per dataset: `1000` is the default; use `0` for all available records (very large datasets can run for several minutes).
- Click Start. Results stream into the run's dataset in real time.
- Export from Storage → Dataset as JSON, CSV, or Excel — or call the dataset via the Apify API.
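For the last step, fetching results over the Apify API comes down to a GET on the dataset-items endpoint. The sketch below only builds that URL; the helper name `dataset_items_url` and the short list of accepted formats are assumptions made for illustration. Note that `apify_dataset_id` is the ID of the run's Apify dataset, not the Socrata dataset ID.

```python
from typing import Optional
from urllib.parse import urlencode

APIFY_API_BASE = "https://api.apify.com/v2"
EXPORT_FORMATS = {"json", "csv", "xlsx"}  # subset chosen for this sketch


def dataset_items_url(apify_dataset_id: str, fmt: str = "json",
                      limit: Optional[int] = None) -> str:
    """Build the URL that returns a run's dataset items in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format in this sketch: {fmt}")
    params = {"format": fmt, "clean": "true"}  # clean=true skips empty items
    if limit is not None:
        params["limit"] = str(limit)
    return f"{APIFY_API_BASE}/datasets/{apify_dataset_id}/items?{urlencode(params)}"
```

If the dataset is private, the request will likely also need your Apify API token, for example as a bearer token in the `Authorization` header.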
Finding a new city's dataset ID: go to data.cityofchicago.org (or your target city's Socrata domain), search for "building permits", open the dataset, and grab the four-by-four ID from the URL (e.g. ydr8-5enu).
📥 Input
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `datasets` | array | yes | Chicago example | List of `{domain, datasetId, label?}` objects |
| `issuedAfter` | string | no | `""` | ISO-8601 date floor on `issue_date` |
| `soqlWhere` | string | no | `""` | Raw SoQL `$where` override (overrides `issuedAfter`) |
| `maxItemsPerDataset` | integer | no | `1000` | Per-dataset record cap; `0` = unlimited |
| `proxyConfiguration` | object | no | Apify Proxy enabled | Proxy settings; Apify Proxy by default |
Example input
    {
      "datasets": [
        {
          "domain": "data.cityofchicago.org",
          "datasetId": "ydr8-5enu",
          "label": "Chicago"
        }
      ],
      "issuedAfter": "2024-01-01",
      "maxItemsPerDataset": 1000,
      "proxyConfiguration": {
        "useApifyProxy": true
      }
    }
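To make the input fields concrete, here is one plausible way they could translate into a SODA request URL. The `/resource/<id>.json` endpoint and the `$where`, `$limit`, and `$offset` parameters are standard SODA; the helper `build_soda_url` and the exact clause it generates for `issuedAfter` are assumptions, and the Actor may build its queries differently.

```python
from urllib.parse import quote, urlencode


def build_soda_url(domain: str, dataset_id: str, issued_after: str = "",
                   soql_where: str = "", limit: int = 1000, offset: int = 0) -> str:
    """Build a SODA query URL from Actor-style input values."""
    # soqlWhere overrides issuedAfter, as the input table states.
    if soql_where:
        where = soql_where
    elif issued_after:
        where = f"issue_date >= '{issued_after}'"  # assumed clause shape
    else:
        where = ""

    params = {}
    if where:
        params["$where"] = where
    params["$limit"] = limit
    params["$offset"] = offset

    # Keep "$" readable in parameter names; percent-encode everything else.
    query = urlencode(params, quote_via=quote, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


url = build_soda_url("data.cityofchicago.org", "ydr8-5enu",
                     issued_after="2024-01-01")
```

The date column is called `issue_date` in the Chicago dataset; other portals may name it differently, in which case a `soqlWhere` expression with the portal's own column name is the safer route.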
📤 Output
Each row is one building permit record. Common fields are normalized; city-specific extras land in extra.
| Field | Type | Notes |
|---|---|---|
| `source_domain` | string | Portal hostname |
| `dataset_id` | string | Socrata dataset ID |
| `source_label` | string\|null | Human label from input |
| `permit_id` | string\|null | Permit number |
| `permit_type` | string\|null | e.g. "PERMIT - NEW CONSTRUCTION" |
| `status` | string\|null | e.g. "permit issued" |
| `work_description` | string\|null | Free-text work description |
| `issue_date` | string\|null | ISO-8601 or raw string |
| `application_date` | string\|null | Application submission date |
| `address` | string\|null | Best-effort assembled full address |
| `street_number` | string\|null | Street number |
| `street_name` | string\|null | Street name |
| `city` | string\|null | City name |
| `state` | string\|null | Two-letter state code |
| `postal_code` | string\|null | ZIP code |
| `valuation` | float\|null | Project valuation in USD |
| `fee_paid` | float\|null | Building fee paid in USD |
| `contractor_name` | string\|null | Licensed contractor name |
| `contractor_license` | string\|null | Contractor license number |
| `applicant_name` | string\|null | Applicant name |
| `latitude` | float\|null | WGS-84 latitude |
| `longitude` | float\|null | WGS-84 longitude |
| `extra` | object\|null | All unmapped city-specific fields |
Example output row
    {
      "source_domain": "data.cityofchicago.org",
      "dataset_id": "ydr8-5enu",
      "source_label": "Chicago",
      "permit_id": "100987654",
      "permit_type": "PERMIT - NEW CONSTRUCTION",
      "status": "permit issued",
      "work_description": "ERECT A 2-STORY SINGLE FAMILY RESIDENCE",
      "issue_date": "2024-03-15",
      "application_date": "2024-01-10",
      "address": "1234 N MAIN ST, Chicago, IL 60614",
      "street_number": "1234",
      "street_name": "MAIN",
      "city": "Chicago",
      "state": "IL",
      "postal_code": "60614",
      "valuation": 350000.0,
      "fee_paid": 1850.0,
      "contractor_name": "ABC CONSTRUCTION LLC",
      "contractor_license": "LIC-2024-001",
      "applicant_name": "Jane Smith",
      "latitude": 41.9021,
      "longitude": -87.6346,
      "extra": {
        "review_type": "STANDARD PLAN EXAMINATION",
        "reported_cost": "350000"
      }
    }
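The "normalize common fields, keep the rest in `extra`" idea can be illustrated with an alias table. Everything in the sketch below is hypothetical: the alias lists, the `normalize` helper, and the choice to leave mapped columns out of `extra` are assumptions for illustration, not the Actor's real mapping.

```python
from typing import Any, Dict, List

# Hypothetical alias table: normalized field -> raw column names to try, in order.
FIELD_ALIASES: Dict[str, List[str]] = {
    "permit_id": ["permit_", "permit_number", "permitnum"],
    "permit_type": ["permit_type", "permittype"],
    "issue_date": ["issue_date", "issued_date"],
    "valuation": ["valuation", "estimated_cost"],
}
FLOAT_FIELDS = {"valuation"}


def normalize(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a raw portal row onto stable field names; unmapped columns go to extra."""
    row: Dict[str, Any] = {}
    used = set()
    for field, aliases in FIELD_ALIASES.items():
        value = None
        for alias in aliases:
            if raw.get(alias) not in (None, ""):
                value = raw[alias]
                used.add(alias)
                break
        if field in FLOAT_FIELDS and value is not None:
            try:
                value = float(value)
            except (TypeError, ValueError):
                value = None  # unparseable number becomes null
        row[field] = value
    extra = {k: v for k, v in raw.items() if k not in used}
    row["extra"] = extra or None
    return row


sample = normalize({
    "permit_": "100987654",
    "permit_type": "PERMIT - NEW CONSTRUCTION",
    "estimated_cost": "350000",
    "review_type": "STANDARD PLAN EXAMINATION",
})
```

Fields with no matching raw column come back as `None`, which matches the nullable types in the output table.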
💰 Pricing
$2.00 per 1 000 permit records.
A typical city dataset has 50 000–500 000 permit rows. A run pulling 10 000 records costs about $20.03 (10 000 × $0.002 per record, plus the $0.03 actor-start event), cheaper than manually downloading and normalizing a CSV. You pay only for records that land in your dataset; a failed or empty run charges only the small actor-start event.
| Event | Cost |
|---|---|
| Actor start | $0.03 (one-time per run) |
| Per result emitted | $0.002 |
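A run's cost follows directly from the two event prices in the table. The helper below is a small sketch of that arithmetic; the function name `estimate_run_cost` is made up for illustration, and `Decimal` is used to avoid floating-point rounding in currency amounts.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.03")   # one-time per run
PER_RESULT_USD = Decimal("0.002")   # per record emitted


def estimate_run_cost(records: int) -> Decimal:
    """Estimated cost in USD of one run that emits `records` rows."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return ACTOR_START_USD + PER_RESULT_USD * records


cost_10k = estimate_run_cost(10_000)  # 0.03 + 10 000 × 0.002 = 20.03
```

An empty run costs only the $0.03 start event, and the default cap of 1 000 records per dataset works out to $2.03 for a single-dataset run.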
🚧 Limitations
- Socrata schema divergence: every city publishes slightly different column names. The Actor maps the most common field names (Chicago, Seattle, LA, etc.) but may miss aliases in less common portals. Unknown columns land in `extra`.
- Rate limits: Socrata portals are publicly throttled. Very large pulls (`maxItemsPerDataset=0` on a 500k-row dataset) may be paced across multiple pages; expect runs of 5–15 minutes for large datasets.
- Authentication not supported: all supported portals require no API key on the public endpoints. Portals behind Socrata auth tokens are not currently supported.
- Not all US cities use Socrata: some municipalities use custom permit portals, Tyler Munis, or Accela. This Actor covers Socrata-hosted datasets only.
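The paging mentioned under rate limits can be pictured as a loop over `$limit` / `$offset` windows that stops at the record cap or at the first short page. This is a sketch under assumptions: `fetch_all` and the `fetch_page` callback are hypothetical names, the page size of 1000 is assumed, and the real Actor may pace or order its requests differently.

```python
from typing import Callable, List


def fetch_all(fetch_page: Callable[[int, int], List[dict]],
              max_items: int = 1000, page_size: int = 1000) -> List[dict]:
    """Collect rows page by page; max_items=0 means no cap."""
    rows: List[dict] = []
    offset = 0
    while True:
        # With a cap, never request more than what is still missing.
        remaining = max_items - len(rows) if max_items > 0 else page_size
        limit = min(page_size, remaining)
        if limit <= 0:
            break
        page = fetch_page(limit, offset)  # caller performs the HTTP request
        rows.extend(page)
        offset += len(page)
        if len(page) < limit:
            break  # a short page means the dataset is exhausted
    return rows


# Stand-in for a portal with 2 500 rows, used to exercise the loop offline.
fake_rows = [{"i": i} for i in range(2500)]
all_rows = fetch_all(lambda limit, offset: fake_rows[offset:offset + limit],
                     max_items=0)
```

When paging against a live portal, adding a stable `$order` to each request is generally advisable so rows do not shift between pages.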
❓ FAQ
How do I find a city's Socrata dataset ID?
Navigate to the city's open-data portal (e.g. data.cityofchicago.org), search "building permits", and open the dataset. The four-by-four ID (e.g. ydr8-5enu) is the last ID-shaped segment of the dataset page URL; in the dataset's API endpoint it appears right after /resource/.
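If you collect dataset URLs in bulk, the four-by-four ID can be pulled out programmatically. The helper below is an illustrative sketch (the name `extract_dataset_id` is made up); it assumes IDs are two groups of four lowercase letters or digits joined by a hyphen, as in `ydr8-5enu`.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

FOUR_BY_FOUR = re.compile(r"[a-z0-9]{4}-[a-z0-9]{4}")


def extract_dataset_id(url: str) -> Optional[str]:
    """Return the four-by-four ID found in a Socrata URL path, or None."""
    # Walk path segments from the end; the ID usually sits near the tail.
    for segment in reversed(urlsplit(url).path.split("/")):
        candidate = segment.split(".", 1)[0]  # drop ".json" / ".csv" suffixes
        if FOUR_BY_FOUR.fullmatch(candidate):
            return candidate
    return None


api_id = extract_dataset_id(
    "https://data.cityofchicago.org/resource/ydr8-5enu.json")
page_id = extract_dataset_id(
    "https://data.cityofchicago.org/Buildings/Building-Permits/ydr8-5enu/about_data")
```

Matching whole path segments avoids false hits on hyphenated title slugs such as `Building-Permits`.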
Can I scrape multiple cities at once?
Yes — add multiple objects to the datasets array. The Actor fans out and writes all results into a single dataset, tagged with source_domain and source_label so you can filter by city downstream.
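Downstream, splitting a combined result set by city is a simple grouping on those two tags. A minimal sketch, assuming rows are plain dicts shaped like the output table (`group_by_source` is a hypothetical helper):

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_source(rows: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group rows by source_label, falling back to source_domain when unset."""
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        key = row.get("source_label") or row["source_domain"]
        groups[key].append(row)
    return dict(groups)


# Example rows; the second domain and both permit IDs are invented.
grouped = group_by_source([
    {"source_domain": "data.cityofchicago.org", "source_label": "Chicago", "permit_id": "1"},
    {"source_domain": "data.example.gov", "source_label": None, "permit_id": "2"},
    {"source_domain": "data.cityofchicago.org", "source_label": "Chicago", "permit_id": "3"},
])
```

The fallback matters because `source_label` is optional in the input and nullable in the output.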
Does this require a Socrata API key?
No. All supported portals serve permit data without authentication on the public SODA endpoint.
Can I filter by permit type or contractor?
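Use the soqlWhere field with a SoQL expression. Two separate examples: `permit_type='PERMIT - NEW CONSTRUCTION'` and `contractor_name IS NOT NULL`. Because the expression is passed as a raw `$where` clause, the column names must exist in the target dataset. The SoQL syntax is documented at dev.socrata.com.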
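When a `soqlWhere` value is assembled from user-supplied text, the string literals need quoting. The helpers below are a sketch, and their names are made up; they assume SoQL's convention of single-quoted text literals with embedded single quotes escaped by doubling.

```python
def soql_string(value: str) -> str:
    """Quote a text literal for SoQL, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def soql_equals(column: str, value: str) -> str:
    """Build a `column = 'value'` comparison."""
    return f"{column} = {soql_string(value)}"


def soql_and(*clauses: str) -> str:
    """Join non-empty clauses with AND, parenthesizing each one."""
    return " AND ".join(f"({clause})" for clause in clauses if clause)


where = soql_and(
    soql_equals("permit_type", "PERMIT - NEW CONSTRUCTION"),
    "contractor_name IS NOT NULL",
)
```

The resulting string can be pasted into the `soqlWhere` input field as-is. Column names are not escaped here, so they should come from a trusted list rather than from user input.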
What if a column I need isn't in the output fields?
It will be in the extra dict. Every raw field the portal returns is preserved there — nothing is dropped.
📬 Your feedback
Found a city whose permit schema we're not mapping well, or a Socrata portal that isn't working? Open an issue or leave a review on the Apify Store listing. We read every message and typically ship fixes within a week.
Changelog
- 0.0.1 — initial scaffold (boot test only; real Socrata fetch coming in 0.1.0)