US Building Permit Leads Scraper (Socrata)

Pull fresh building and construction permit records from any US city or county Socrata open-data portal. We normalize permit ID, type, status, address, valuation, and contact info into one typed row per permit — ready to feed your contractor CRM or lead pipeline.

Pricing

Pay per event

Developer

DevilScrapes

Maintained by Community


🎯 What this scrapes

US municipal and county building-permit registries published via the Socrata SODA API (over 300 US portals at last count, including Chicago, New York, Seattle, Austin, Los Angeles, and dozens of smaller cities). The SODA API returns permit records as structured JSON rows. This Actor normalizes the common fields that matter for contractor lead-gen and passes through every city-specific field in an extra dict, so you never lose data even when the schemas diverge.

Verified against: Chicago (data.cityofchicago.org / ydr8-5enu), with the same logic applicable to any Socrata-hosted permit dataset.
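
As a rough illustration of the alias-mapping idea described above, the sketch below maps a few source column names onto stable output names and routes everything else into extra. The FIELD_ALIASES table and the normalize_row helper are hypothetical; the Actor's real mapping is not shown in this listing, and dropping mapped columns from extra is an assumption of this sketch.

```python
from typing import Any

# Hypothetical alias table: normalized field -> candidate source column names.
# These column names are illustrative, not the Actor's actual mapping.
FIELD_ALIASES: dict[str, tuple[str, ...]] = {
    "permit_id": ("permit_", "permit_number", "permitnum"),
    "permit_type": ("permit_type", "permittype"),
    "issue_date": ("issue_date", "issued_date"),
    "contractor_name": ("contractor_name", "contractor"),
}


def normalize_row(raw: dict[str, Any]) -> dict[str, Any]:
    """Map known aliases to stable names and keep the rest under 'extra'."""
    row: dict[str, Any] = {}
    used: set[str] = set()
    for name, aliases in FIELD_ALIASES.items():
        # Take the first alias that is present and non-empty in the raw row.
        alias = next((a for a in aliases if raw.get(a) not in (None, "")), None)
        row[name] = raw[alias] if alias else None
        if alias:
            used.add(alias)
    # Assumption: columns that were mapped are not repeated in 'extra'.
    row["extra"] = {k: v for k, v in raw.items() if k not in used} or None
    return row
```

A row from a portal with unfamiliar column names would come back with the normalized fields set to null and the raw columns intact under extra, which is the behaviour the paragraph above describes.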

🔥 What we handle for you

  • 🛡️ Proxy rotation via Apify Proxy — we absorb transient 429s and routing failures so your run completes even when a portal throttles a single IP.
  • 🔁 Retries with exponential backoff on 429 / 503 and network errors: up to 5 attempts per request, honoring the Retry-After header.
  • 🧊 Normalized, typed dataset rows — Pydantic-validated, ISO-8601 dates, stable field names across all city schemas, JSON / CSV / Excel export from the Apify Console.
  • 🗺️ Multi-city in a single run — pass an array of {domain, datasetId} objects and we fan out across all portals, respecting maxItemsPerDataset per source.
  • 💰 Pay-Per-Event pricing — you pay only for permit records that land in your dataset. No data, no charge (beyond the small actor-start fee).
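
The retry behaviour in the list above can be sketched as a small delay calculator. This is a minimal illustration, not the Actor's code: the retry_delay and should_retry helpers, the 1-second base, the 60-second cap, and the 10% jitter are all assumptions. Only the 5-attempt limit, the 429 / 503 statuses, and the Retry-After handling come from the description.

```python
import random
from typing import Optional

MAX_ATTEMPTS = 5                 # "up to 5 attempts per request"
RETRYABLE_STATUSES = {429, 503}  # statuses named in the list above


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait after failed attempt number `attempt` (1-based)."""
    if retry_after is not None:
        try:
            # Honor a numeric Retry-After header, clamped to [0, cap].
            return min(max(float(retry_after), 0.0), cap)
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not parsed in this sketch
    delay = min(base * 2 ** (attempt - 1), cap)    # 1s, 2s, 4s, 8s, ...
    return delay + random.uniform(0, delay * 0.1)  # up to 10% jitter (assumed)


def should_retry(status: int, attempt: int) -> bool:
    """Retry only retryable statuses while attempts remain."""
    return status in RETRYABLE_STATUSES and attempt < MAX_ATTEMPTS
```

The jitter spreads out retries from parallel requests so they do not all hit a throttled portal at the same instant.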

💡 Use cases

  • HVAC / roofing / solar contractor lead lists — filter by permit type (new construction, re-roof, HVAC installation) and harvest the applicant or contractor contact fields for outbound sales.
  • Contractor-lead resellers — run nightly on multiple city datasets, normalize to a single schema, and deliver fresh permit leads to clients via the Apify API or webhook.
  • Valuation monitoring — pull permits above a dollar threshold in a target ZIP code to watch development activity in a neighborhood.
  • Construction analytics — aggregate permit counts by type, date, and district to report on building-activity trends for real-estate investors or municipalities.
  • CRM enrichment pipeline — match permit applicant names or addresses against your existing contact records to surface warm leads before the competition does.
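
For the valuation-monitoring use case above, a downstream filter over exported rows might look like the sketch below. It assumes rows shaped like the Output section (valuation as a float or null, postal_code as a string); the high_value_permits helper is hypothetical and runs on your side, not inside the Actor.

```python
from typing import Any, Iterable, Iterator


def high_value_permits(rows: Iterable[dict[str, Any]],
                       min_valuation: float,
                       postal_codes: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield rows at or above a dollar threshold within the target ZIP codes."""
    wanted = set(postal_codes)
    for row in rows:
        valuation = row.get("valuation")
        # Rows with no valuation cannot be compared, so they are skipped.
        if valuation is None or valuation < min_valuation:
            continue
        if row.get("postal_code") not in wanted:
            continue
        yield row
```

Because the function is a generator, it can run over a large export without loading every row into memory.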

⚙️ How to use it

  1. Click Try for free at the top of the page.
  2. In the Socrata datasets field, enter the list of portal + dataset pairs you want to scrape. The default pre-fills Chicago building permits as a working example.
  3. Optionally set Issued after to a date (e.g. 2024-01-01) to only pull recent permits.
  4. Set Max items per dataset. 1000 is the default; use 0 for all available records (very large datasets can run for several minutes).
  5. Click Start. Results stream into the run's dataset in real time.
  6. Export from Storage → Dataset as JSON, CSV, or Excel — or call the dataset via the Apify API.

Finding a new city's dataset ID: go to data.cityofchicago.org (or your target city's Socrata domain), search for "building permits", open the dataset, and grab the four-by-four ID from the URL (e.g. ydr8-5enu).
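
If you collect dataset URLs in bulk, the four-by-four ID can be pulled out programmatically. The sketch below is a heuristic under stated assumptions: extract_dataset_id and soda_endpoint are hypothetical helpers, the ID is taken to be the last path segment that looks like xxxx-xxxx, and the endpoint uses the common /resource/<id>.json SODA convention.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Four lowercase alphanumerics, a hyphen, four more (e.g. ydr8-5enu).
FOUR_BY_FOUR = re.compile(r"[a-z0-9]{4}-[a-z0-9]{4}")


def extract_dataset_id(url: str) -> Optional[str]:
    """Return the last path segment shaped like a four-by-four ID, if any."""
    for segment in reversed(urlparse(url).path.split("/")):
        candidate = segment.split(".")[0]  # drop a trailing .json / .csv
        if FOUR_BY_FOUR.fullmatch(candidate):
            return candidate
    return None


def soda_endpoint(domain: str, dataset_id: str) -> str:
    """Build the JSON resource URL for a Socrata dataset."""
    return f"https://{domain}/resource/{dataset_id}.json"
```

Scanning segments from the end avoids picking up an earlier path component that happens to match the same pattern.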

📥 Input

Field               Type     Required  Default              Notes
datasets            array    yes       Chicago example      List of {domain, datasetId, label?} objects
issuedAfter         string   no        ""                   ISO-8601 date floor on issue_date
soqlWhere           string   no        ""                   Raw SoQL $where override (overrides issuedAfter)
maxItemsPerDataset  integer  no        1000                 Per-dataset record cap; 0 = unlimited
proxyConfiguration  object   no        Apify Proxy enabled  Proxy settings; Apify Proxy by default

Example input

{
  "datasets": [
    {
      "domain": "data.cityofchicago.org",
      "datasetId": "ydr8-5enu",
      "label": "Chicago"
    }
  ],
  "issuedAfter": "2024-01-01",
  "maxItemsPerDataset": 1000,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
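
One plausible way the input fields could translate into SODA query parameters is sketched below. The $where, $limit, $offset, and $order parameters are standard SoQL; everything else is an assumption: build_where and page_params are hypothetical helpers, the 1000-row page size is arbitrary, and the issue_date column name varies by city.

```python
from typing import Iterator


def build_where(issued_after: str = "", soql_where: str = "") -> str:
    """soqlWhere takes precedence; otherwise derive a date floor from issuedAfter."""
    if soql_where:
        return soql_where
    if issued_after:
        # Assumption: the dataset's issue-date column is called issue_date.
        return f"issue_date >= '{issued_after}'"
    return ""


def page_params(where: str, max_items: int,
                page_size: int = 1000) -> Iterator[dict[str, str]]:
    """Yield one parameter dict per page; max_items == 0 means no cap."""
    offset = 0
    while max_items == 0 or offset < max_items:
        limit = page_size if max_items == 0 else min(page_size, max_items - offset)
        # Ordering by :id keeps paging stable between requests.
        params = {"$limit": str(limit), "$offset": str(offset), "$order": ":id"}
        if where:
            params["$where"] = where
        yield params
        offset += limit
```

With max_items set to 0 the generator never ends on its own, so the caller would stop once a page comes back with fewer rows than requested.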

📤 Output

Each row is one building permit record. Common fields are normalized; city-specific extras land in extra.

Field               Type         Notes
source_domain       string       Portal hostname
dataset_id          string       Socrata dataset ID
source_label        string|null  Human label from input
permit_id           string|null  Permit number
permit_type         string|null  e.g. "PERMIT - NEW CONSTRUCTION"
status              string|null  e.g. "permit issued"
work_description    string|null  Free-text work description
issue_date          string|null  ISO-8601 or raw string
application_date    string|null  Application submission date
address             string|null  Best-effort assembled full address
street_number       string|null  Street number
street_name         string|null  Street name
city                string|null  City name
state               string|null  Two-letter state code
postal_code         string|null  ZIP code
valuation           float|null   Project valuation in USD
fee_paid            float|null   Building fee paid in USD
contractor_name     string|null  Licensed contractor name
contractor_license  string|null  Contractor license number
applicant_name      string|null  Applicant name
latitude            float|null   WGS-84 latitude
longitude           float|null   WGS-84 longitude
extra               object|null  All unmapped city-specific fields

Example output row

{
  "source_domain": "data.cityofchicago.org",
  "dataset_id": "ydr8-5enu",
  "source_label": "Chicago",
  "permit_id": "100987654",
  "permit_type": "PERMIT - NEW CONSTRUCTION",
  "status": "permit issued",
  "work_description": "ERECT A 2-STORY SINGLE FAMILY RESIDENCE",
  "issue_date": "2024-03-15",
  "application_date": "2024-01-10",
  "address": "1234 N MAIN ST, Chicago, IL 60614",
  "street_number": "1234",
  "street_name": "MAIN",
  "city": "Chicago",
  "state": "IL",
  "postal_code": "60614",
  "valuation": 350000.0,
  "fee_paid": 1850.0,
  "contractor_name": "ABC CONSTRUCTION LLC",
  "contractor_license": "LIC-2024-001",
  "applicant_name": "Jane Smith",
  "latitude": 41.9021,
  "longitude": -87.6346,
  "extra": {
    "review_type": "STANDARD PLAN EXAMINATION",
    "reported_cost": "350000"
  }
}

💰 Pricing

$2.00 per 1 000 permit records.

A typical city dataset has 50 000–500 000 permit rows. A run pulling 10 000 records costs roughly $20.03 (10 000 × $0.002 plus the $0.03 actor-start event), cheaper than manually downloading and normalizing a CSV. You pay only for records that land in your dataset; a failed or empty run charges only the small actor-start event.

Event               Cost
Actor start         $0.03 (one-time per run)
Per result emitted  $0.002
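
To budget a run ahead of time, the two event prices in the table can be combined as in the sketch below. The estimate_cost helper is hypothetical; it uses Decimal so that the per-record price does not pick up floating-point rounding.

```python
from decimal import Decimal

ACTOR_START = Decimal("0.03")  # one-time per run
PER_RESULT = Decimal("0.002")  # per record emitted to the dataset


def estimate_cost(records: int) -> Decimal:
    """Estimated USD cost of one run that emits `records` rows."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return ACTOR_START + PER_RESULT * records
```

Under these prices, a 1 000-record run comes to about $2.03 and a 10 000-record run to about $20.03.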

🚧 Limitations

  • Socrata schema divergence — every city publishes slightly different column names. The Actor maps the most common field names (Chicago, Seattle, LA, etc.) but may miss aliases in less common portals. Unknown columns land in extra.
  • Rate limits — Socrata portals are publicly throttled. Very large pulls (maxItemsPerDataset=0 on a 500k-row dataset) may be paced across multiple pages; expect runs of 5–15 minutes for large datasets.
  • Authentication not supported — all supported portals require no API key on the public endpoints. Portals behind Socrata auth tokens are not currently supported.
  • Not all US cities use Socrata — some municipalities use custom permit portals, Tyler Munis, or Accela. This Actor covers Socrata-hosted datasets only.

❓ FAQ

How do I find a city's Socrata dataset ID? Navigate to the city's open-data portal (e.g. data.cityofchicago.org), search "building permits", and open the dataset. The four-by-four ID (e.g. ydr8-5enu) is the last segment of the dataset page URL; in API endpoint URLs it appears after /resource/.

Can I scrape multiple cities at once? Yes — add multiple objects to the datasets array. The Actor fans out and writes all results into a single dataset, tagged with source_domain and source_label so you can filter by city downstream.
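
Since every row carries source_domain and source_label, splitting a combined dataset by city downstream is straightforward. A minimal sketch, assuming rows shaped like the Output section; count_by_source is a hypothetical helper that falls back to the domain when no label was supplied:

```python
from collections import Counter
from typing import Any, Iterable


def count_by_source(rows: Iterable[dict[str, Any]]) -> Counter:
    """Count permits per city label, falling back to the portal domain."""
    return Counter(row.get("source_label") or row["source_domain"] for row in rows)
```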

Does this require a Socrata API key? No. All supported portals serve permit data without authentication on the public SODA endpoint.

Can I filter by permit type or contractor? Use the soqlWhere field with a SoQL expression, e.g. permit_type='PERMIT - NEW CONSTRUCTION' or contractor_name IS NOT NULL. The SoQL syntax is documented at dev.socrata.com.

What if a column I need isn't in the output fields? It will be in the extra dict. Every raw field the portal returns is preserved there — nothing is dropped.

📬 Your feedback

Found a city whose permit schema we're not mapping well, or a Socrata portal that isn't working? Open an issue or leave a review on the Apify Store listing. We read every message and typically ship fixes within a week.


Changelog

  • 0.0.1 — initial scaffold (boot test only; real Socrata fetch coming in 0.1.0)