Los Angeles Open Data Scraper

Pricing

from $26.02 / 1,000 results

Scrape any Los Angeles Open Data dataset via Socrata SODA API. Crime, business taxes, building permits, parking, 311 service requests and more. No API key required.

Developer

ParseForge

Maintained by Community

🌴 Los Angeles Open Data Scraper

🚀 Export any Los Angeles Open Data dataset in seconds. Tap 361 published datasets including crime data, building and safety permits, business registrations, MyLA311 service requests, parking citations, traffic collisions, payroll, and more, via the official Socrata SODA API. No API key, no registration.

🕒 Last updated: 2026-05-13 · 📊 Native dataset schema per record · 🗂️ 361 datasets · 🌴 City of Los Angeles · 🔌 Socrata SODA API

The Los Angeles Open Data Scraper is a universal export tool for every dataset on data.lacity.org. The City of Los Angeles publishes 361 datasets covering public safety, transportation, finance, planning, economic development, and recreation. This Actor lets you pull any of them by passing the Socrata 4x4 dataset ID, optionally adding SoQL filters ($where, $select, $order, $q), and downloading the result as CSV, Excel, JSON, or XML.

The catalog spans every major LA civic data set, including crime data from 2020 to present (2nrs-mtv8), the 2010-2019 crime archive (63jg-8b9z), building and safety permits (nbyu-2ha9), MyLA311 service requests (7my7-7vrt), parking citations (pvwu-3di3), traffic collisions (6rrh-rzua), active business registrations (nxs9-385f), city payroll, and many more. Output preserves the dataset's native schema and appends three metadata fields: _datasetId, _datasetUrl, and _scrapedAt.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| Civic researchers, journalists, prop-tech startups, GIS engineers, data scientists, public safety analysts, real-estate firms, urban planners, students | Civic dashboards, FOIA-style export, permit/business feeds, crime and 311 monitoring, journalism investigations, ML training data on municipal events |

📋 What the LA Open Data Scraper does

One dataset selector plus four filtering knobs that map straight to Socrata SoQL:

  • 🆔 Dataset selector. Pick any of 361 datasets by 4x4 ID. Find IDs in the URL of any dataset page on data.lacity.org.
  • 🔍 WHERE clause. Standard SoQL $where, e.g. crm_cd_desc='BURGLARY' AND date_occ>'2026-01-01'.
  • 📋 SELECT clause. Limit returned columns via $select.
  • 📈 ORDER clause. Sort with $order, e.g. date_occ DESC.
  • 🔎 Full-text search. Free-text $q across all string columns.

Each record returns the dataset's native columns verbatim (with Socrata's internal :@computed_region_* lookup columns stripped to keep the output clean), plus three appended metadata fields: _datasetId, _datasetUrl, and _scrapedAt. Pagination is automatic and capped at 1,000,000 rows.
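The clean-up described above can be sketched in a few lines of Python. This is an illustration only, not the Actor's actual code: the function name `clean_record` is hypothetical, and the `_datasetUrl` pattern and timestamp format are inferred from the example values shown in this README.

```python
from datetime import datetime, timezone
from typing import Optional

COMPUTED_PREFIX = ":@computed_region_"


def clean_record(raw: dict, dataset_id: str, scraped_at: Optional[datetime] = None) -> dict:
    """Drop Socrata computed-region columns and append the three metadata fields."""
    scraped_at = scraped_at or datetime.now(timezone.utc)
    # Keep every native column except the internal lookup columns.
    record = {k: v for k, v in raw.items() if not k.startswith(COMPUTED_PREFIX)}
    record["_datasetId"] = dataset_id
    # URL pattern assumed from the schema example in this README.
    record["_datasetUrl"] = f"https://data.lacity.org/d/{dataset_id}"
    # Millisecond-precision UTC timestamp with a trailing "Z".
    stamp = scraped_at.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    record["_scrapedAt"] = stamp.replace("+00:00", "Z")
    return record
```

Passing `scraped_at` explicitly keeps the function deterministic, which helps when comparing two pulls of the same dataset.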

💡 Why it matters: Los Angeles publishes one of the largest municipal open-data catalogs in the country, but the SODA API has its own query language, paging quirks, and computed-region noise. This Actor turns that into a clean, paginated export with no Socrata code on your side.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded LA dataset.


⚙️ Input

| Input | Type | Default | Behavior |
| --- | --- | --- | --- |
| datasetId | enum (4x4) | "vygi-vxyg" | Socrata 4x4 ID. Required. Enumerates all 361 datasets published on data.lacity.org. |
| maxItems | integer | 10 | Records to return. Free plan caps at 10, paid plan at 1,000,000. |
| where | string (SoQL) | "" | Socrata $where filter. |
| select | string (SoQL) | "" | Comma-separated columns to return. |
| order | string (SoQL) | "" | Sort, e.g. date_occ DESC. |
| query | string | "" | Free-text full-text search (Socrata $q). |

Example: 500 most recent burglaries in 2026.

{
  "datasetId": "2nrs-mtv8",
  "maxItems": 500,
  "where": "crm_cd_desc='BURGLARY' AND date_occ>'2026-01-01'",
  "order": "date_occ DESC"
}

Example: up to 1,000 building and safety permits for new construction that mention Hollywood in any text column.

{
  "datasetId": "nbyu-2ha9",
  "maxItems": 1000,
  "query": "Hollywood",
  "where": "permit_type='Bldg-New'"
}
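To see how these inputs relate to a raw SODA request, here is a rough sketch that turns an input object into a query URL. It assumes the standard Socrata `/resource/<4x4>.json` endpoint and the `$limit` / `$offset` paging parameters; the helper name `build_soda_url` is hypothetical, and the Actor's real request logic may differ.

```python
from urllib.parse import urlencode

# Actor input field -> SoQL query parameter (as listed in the input table).
FIELD_TO_PARAM = {"where": "$where", "select": "$select", "order": "$order", "query": "$q"}


def build_soda_url(actor_input: dict, limit: int = 1000, offset: int = 0) -> str:
    """Build a SODA query URL for one page of results."""
    params = {}
    for field, soql_param in FIELD_TO_PARAM.items():
        value = actor_input.get(field)
        if value:  # skip empty strings, which are the defaults
            params[soql_param] = value
    params["$limit"] = limit
    params["$offset"] = offset
    base = f"https://data.lacity.org/resource/{actor_input['datasetId']}.json"
    return f"{base}?{urlencode(params)}"


url = build_soda_url(
    {
        "datasetId": "2nrs-mtv8",
        "maxItems": 500,
        "where": "crm_cd_desc='BURGLARY' AND date_occ>'2026-01-01'",
        "order": "date_occ DESC",
    },
    limit=500,
)
print(url)
```

`urlencode` percent-encodes the `$`, quotes, and comparison operators, so the SoQL text can be written in the input exactly as it appears in the examples.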

⚠️ Good to Know: the input dataset list contains all 361 datasets currently exposed on data.lacity.org. A small number are private (require Socrata authentication) and will return an HTTP 401 / 403 error record. Browse the full catalog and find the right 4x4 ID at data.lacity.org.


📊 Output

Each record returns the dataset's native schema verbatim (Socrata internal :@computed_region_* columns are stripped) plus three metadata fields. Download as CSV, Excel, JSON, or XML.

🧾 Schema (illustrative for crime dataset 2nrs-mtv8)

| Field | Type | Example |
| --- | --- | --- |
| 🆔 dr_no | string | "211507896" |
| 📅 date_rptd | ISO 8601 | "2021-04-11T00:00:00.000" |
| 📅 date_occ | ISO 8601 | "2020-11-07T00:00:00.000" |
| 🕒 time_occ | string | "0845" |
| 🚓 area / area_name | string | "15" / "N Hollywood" |
| 🔢 crm_cd / crm_cd_desc | string | "354" / "THEFT OF IDENTITY" |
| 👤 vict_age / vict_sex / vict_descent | string | "31" / "M" / "H" |
| 🏠 premis_desc | string | "SINGLE FAMILY DWELLING" |
| 📋 status_desc | string | "Invest Cont" |
| 📍 location | string | "7800 BEEMAN AV" |
| 📍 lat / lon | string | "34.2124" / "-118.4092" |
| 🆔 _datasetId | string | "2nrs-mtv8" |
| 🔗 _datasetUrl | string | "https://data.lacity.org/d/2nrs-mtv8" |
| 🕒 _scrapedAt | ISO 8601 | "2026-05-13T10:00:00.000Z" |

Every dataset has its own column set. The Actor passes through whatever Socrata returns for the dataset you picked.

📦 Sample record (crime data)
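The example values from the schema above can be assembled into one illustrative record (not an actual API response). The sketch also shows one way to turn the string fields into typed values. It assumes `time_occ` is a 24-hour `HHMM` string, which the example `"0845"` suggests but this README does not state; the helper names are hypothetical.

```python
from datetime import datetime

# Illustrative record built from the schema's example values.
sample_record = {
    "dr_no": "211507896",
    "date_rptd": "2021-04-11T00:00:00.000",
    "date_occ": "2020-11-07T00:00:00.000",
    "time_occ": "0845",
    "area": "15",
    "area_name": "N Hollywood",
    "crm_cd": "354",
    "crm_cd_desc": "THEFT OF IDENTITY",
    "vict_age": "31",
    "vict_sex": "M",
    "vict_descent": "H",
    "premis_desc": "SINGLE FAMILY DWELLING",
    "status_desc": "Invest Cont",
    "location": "7800 BEEMAN AV",
    "lat": "34.2124",
    "lon": "-118.4092",
    "_datasetId": "2nrs-mtv8",
    "_datasetUrl": "https://data.lacity.org/d/2nrs-mtv8",
    "_scrapedAt": "2026-05-13T10:00:00.000Z",
}


def occurred_at(record: dict) -> datetime:
    """Combine the date_occ midnight timestamp with the HHMM time_occ string."""
    day = datetime.fromisoformat(record["date_occ"])
    hhmm = record["time_occ"].zfill(4)  # assumption: 24-hour HHMM
    return day.replace(hour=int(hhmm[:2]), minute=int(hhmm[2:]))


def coordinates(record: dict) -> tuple:
    """Return (lat, lon) as floats; the dataset delivers them as strings."""
    return float(record["lat"]), float(record["lon"])
```

Because every native column arrives as the dataset publishes it, casting like this belongs in your own pipeline rather than in the export step.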


✨ Why choose this Actor

|  | Capability |
| --- | --- |
| 🗂️ | 361 datasets, one Actor. Every public dataset on data.lacity.org enumerated in the input schema. |
| 🔍 | Full SoQL filtering. $where, $select, $order, $q exposed as input fields. |
| 🧹 | Cleaned output. Socrata :@computed_region_* internal columns stripped automatically. |
| 🔗 | Dataset provenance. Every record stamped with _datasetId, _datasetUrl, _scrapedAt. |
| ⚡ | Fast. 1,000-row pages, automatic pagination up to 1,000,000 rows. |
| 🚫 | No API key. The Socrata SODA API is public and unauthenticated for all public datasets. |
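The 1,000-row paging with a 1,000,000-row cap could look roughly like the loop below. This is a sketch under assumptions: `fetch_page` is a hypothetical callable standing in for one SODA request with `$limit` and `$offset`, and the Actor's own implementation is not published here.

```python
from typing import Callable, Iterator, List

PAGE_SIZE = 1000
HARD_CAP = 1_000_000


def paginate(fetch_page: Callable[[int, int], List[dict]], max_items: int) -> Iterator[dict]:
    """Yield up to max_items rows, requesting at most PAGE_SIZE rows per call."""
    remaining = min(max_items, HARD_CAP)
    offset = 0
    while remaining > 0:
        limit = min(PAGE_SIZE, remaining)
        page = fetch_page(limit, offset)
        for row in page:
            yield row
        if len(page) < limit:
            break  # short page: the dataset has no more rows
        offset += len(page)
        remaining -= len(page)
```

Offset paging only returns each row exactly once when the sort order is stable, so it is worth setting an explicit `order` on large pulls.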

📊 The City of Los Angeles is one of the most active public-sector data publishers in the country, and its open-data catalog powers everything from civic-tech projects to academic research.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
| --- | --- | --- | --- | --- | --- |
| ⭐ LA Open Data Scraper (this Actor) | $5 free credit, then pay-per-use | All 361 LA datasets | Live per run | Full SoQL ($where, $select, $order, $q) | ⚡ 2 min |
| Manual CSV download from data.lacity.org | Free | One dataset at a time | Snapshot | None | 🐢 Manual |
| Raw Socrata SODA queries | Free | Full | Live | SoQL | 🛠️ Code required |
| Third-party civic-data aggregators | $99+/month | Mixed | Daily | Vendor-defined | ⏳ Hours |

Pick this Actor when you want a clean, filtered export of any LA dataset without writing a single line of Socrata code.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the Los Angeles Open Data Scraper page on the Apify Store.
  3. 🎯 Pick a dataset. Find the 4x4 ID on data.lacity.org (it's in every dataset URL) and paste it in.
  4. 🔍 Add optional filters. Type a SoQL $where, $order, $select, or full-text $q if you want a slice.
  5. 🚀 Run it. Click Start and let the Actor collect your data.
  6. 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.


💼 Business use cases

🏒 Real Estate and Construction

  • Track every LA Department of Building and Safety permit
  • Monitor business registrations by SIC code and ZIP
  • Comparable cost analysis for development bids
  • Tenant-mix research with active business filings

🚓 Public Safety and Insurance

  • Build crime-density dashboards by LAPD area
  • Underwrite policies with live incident data
  • Risk-score parcels with traffic-collision history
  • Track MyLA311 service-request volume per district

🚗 Mobility and Transportation

  • Parking-citation density for curbside-pricing models
  • Traffic-collision hot spots for route safety
  • Permit data for film-shoot road closures
  • DOT and street services intelligence

🗞️ Journalism and Civic Tech

  • Investigate crime trends and reporting gaps
  • Quantify business-formation trends year over year
  • Build live-updating civic dashboards
  • Power newsroom data-explainer features

🔌 Automating LA Open Data Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟒 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly, daily, or weekly refreshes keep downstream databases in sync automatically.
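A minimal Python sketch of a programmatic run using the `apify-client` package is shown below. Two things are assumptions: `ACTOR_ID` is a placeholder (copy the real ID from the Actor's Store page), and the token is read from an `APIFY_TOKEN` environment variable. The helper names are hypothetical.

```python
import os

# Placeholder, not the real identifier: copy it from the Actor's Store page.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(dataset_id, max_items=10, where="", select="", order="", query=""):
    """Build the Actor input, omitting filters that were left empty."""
    run_input = {"datasetId": dataset_id, "maxItems": max_items}
    for key, value in (("where", where), ("select", select), ("order", order), ("query", query)):
        if value:
            run_input[key] = value
    return run_input


def run_actor(run_input: dict) -> list:
    """Start a run, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var name
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `run_actor(build_run_input("2nrs-mtv8", 500, order="date_occ DESC"))` would request the 500 most recent crime records once the placeholder ID and token are filled in.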


🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

πŸŽ“ Research and academia

  • Urban-studies papers on policing, transit, displacement
  • Public-health theses with MyLA311 and collision data
  • Reproducible policy-impact studies with versioned pulls
  • GIS coursework on real municipal datasets

🎨 Personal and creative

  • Neighborhood dashboards for your LAPD division
  • Side projects mapping every film shoot in LA
  • Civic-art and visualization exhibitions
  • Hobby trackers for permit pipeline or 311 timing

🀝 Non-profit and civic

  • Public-safety reform orgs analyzing arrest data
  • Mutual-aid networks monitoring MyLA311 categories
  • Civic-tech hackathons with structured datasets
  • Investigative journalism on city-government performance

🧪 Experimentation

  • Train classification ML models on 311 narratives
  • Prototype agent pipelines that summarize city activity
  • Test geocoding and address-normalization toolchains
  • Validate civic-tech product hypotheses with live data

❓ Frequently Asked Questions

🧩 How does it work?

Paste the Socrata 4x4 ID of any LA dataset, optionally add SoQL filters and maxItems, click Start, and the Actor pages through the SODA API and emits the records verbatim with three appended metadata fields. No browser automation, no captchas, no setup.

🆔 How do I find a dataset ID?

Browse the catalog at data.lacity.org. Every dataset URL ends in a 4x4 ID like 2nrs-mtv8 (crime 2020-present) or 7my7-7vrt (MyLA311). Paste that ID into the input form.

🗂️ How many datasets are supported?

All 361 datasets currently exposed on data.lacity.org are enumerated in the input dropdown. New datasets are added by the City regularly; reach out if you need a specific one that isn't yet in the list.

🔍 What is SoQL?

SoQL is Socrata's SQL-like query language for the SODA API. The Actor exposes $where, $select, $order, and $q as input fields. Reference docs: dev.socrata.com. A short cheat sheet: $where=col='value', $order=col DESC, $select=col1,col2, $q=search text.
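When a `where` value is built from user-supplied text, the string literal needs quoting. The sketch below assumes SoQL follows the SQL convention of doubling a single quote inside a literal; check dev.socrata.com before relying on it. Both helper names are hypothetical.

```python
def soql_literal(value: str) -> str:
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def where_all(**equals: str) -> str:
    """Join column=value equality tests with AND, in the order given."""
    return " AND ".join(f"{col}={soql_literal(val)}" for col, val in equals.items())


print(where_all(crm_cd_desc="BURGLARY", area_name="N Hollywood"))
```

The resulting string can be pasted into the Actor's `where` input as-is.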

🧹 Why are some columns missing from the output?

Socrata appends internal :@computed_region_* lookup columns to most datasets. These are noise for downstream analytics, so the Actor strips them automatically. Everything else in the dataset's native schema is passed through verbatim.

🔄 How fresh is the data?

The City of Los Angeles updates each dataset on its own cadence (some daily, some weekly, some monthly). Every run of this Actor fetches the latest data available on data.lacity.org as of run time.

🚫 Why did I get a 401 or 403 error?

A small number of datasets are private and require Socrata authentication. The Actor will return a clean {error: ...} record indicating which one. Public datasets work without any credentials.
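Downstream code may want to separate such error records from real rows. The sketch below assumes an error record is an item with a top-level `error` key, as the `{error: ...}` shape above suggests; a dataset whose native schema happens to include a column named `error` would be misclassified. The function name is hypothetical.

```python
from typing import Iterable, List, Tuple


def split_errors(items: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (data_rows, error_rows) based on the presence of an 'error' key."""
    data_rows, error_rows = [], []
    for item in items:
        (error_rows if "error" in item else data_rows).append(item)
    return data_rows, error_rows
```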

⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor on any cron interval (hourly, daily, weekly) and keep a downstream database in sync.

💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you access to scheduling, higher concurrency, and larger datasets.

⚖️ Can I reuse the data?

Yes. LA Open Data is published under the City of Los Angeles Open Data Policy and is generally free to reuse with attribution. Specific datasets may carry additional notes on their landing page; check before commercial redistribution.

🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.


🔌 Integrate with any app

LA Open Data Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get notified when a new record matches your filters
  • Airbyte - Pipe LA datasets into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push fresh LA civic data into your CRM or analytics backend.


💡 Pro Tip: browse the complete ParseForge collection for more public-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the City of Los Angeles or Tyler Technologies / Socrata. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.