OpenDataSoft Dataset Catalog Scraper
Pricing
from $3.00 / 1,000 results
Discover and browse 400+ public open datasets from the OpenDataSoft platform. Search by keyword, filter by theme or record count, and export dataset metadata or records.
Developer
Crawler Bros
Maintained by Community
Browse and download from 400+ public open datasets hosted on the OpenDataSoft platform — covering transportation, environment, health, tourism, urban data, and more. No API key or account required.
What it does
Scrapes the OpenDataSoft public catalog (public.opendatasoft.com), a curated collection of open government and community datasets. The scraper operates in two modes:
- catalog — discover datasets by keyword or theme
- records — export actual data rows from any specific dataset
Input
| Field | Type | Description | Default |
|---|---|---|---|
| mode | string | catalog (browse datasets) or records (export rows from a dataset) | catalog |
| query | string | Search datasets by keyword (title, description, publisher) | — |
| theme | string | Filter by theme (e.g. Environment, Transport, Health) | — |
| datasetId | string | Dataset ID for mode=records (e.g. osm-australia-cinema) | — |
| recordsFilter | string | ODS filter expression for records (e.g. country="Australia") | — |
| maxItems | integer | Maximum number of items to emit (1–5,000) | 100 |
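The constraints in the table can be checked locally before a run is started. The following is a minimal sketch, not the actor's own validation: `validate_input` is a hypothetical helper, and the rule that `datasetId` is required in records mode is an assumption inferred from the field description.

```python
from typing import Any

VALID_MODES = {"catalog", "records"}


def validate_input(run_input: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of run_input with defaults applied, or raise ValueError."""
    result = dict(run_input)
    # Defaults taken from the input table.
    result.setdefault("mode", "catalog")
    result.setdefault("maxItems", 100)

    if result["mode"] not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    max_items = result["maxItems"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 1 <= max_items <= 5000:
        raise ValueError("maxItems must be between 1 and 5000")

    # Assumption: records mode cannot run without a dataset to read from.
    if result["mode"] == "records" and not result.get("datasetId"):
        raise ValueError("datasetId is required when mode is 'records'")
    return result
```

Calling `validate_input({})` yields the documented defaults, while an out-of-range `maxItems` or a records-mode input without `datasetId` raises `ValueError`.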
Modes
catalog — Searches the OpenDataSoft public dataset registry. Returns dataset metadata including title, publisher, record count, themes, keywords, license, and update frequency.
records — Fetches actual data rows from a specified dataset. Use the catalog mode first to find dataset IDs, then switch to records mode to extract the data.
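The two-step workflow (catalog first, then records) can be sketched as a small helper that turns catalog results into records-mode inputs. This is an illustrative sketch only: `records_inputs` and its `min_records` parameter are hypothetical names, and the catalog items are assumed to be plain dictionaries shaped like the catalog output of this scraper.

```python
from typing import Any, Iterable


def records_inputs(
    catalog_items: Iterable[dict[str, Any]],
    min_records: int = 1,
    max_items: int = 100,
) -> list[dict[str, Any]]:
    """Build one records-mode input per catalog item that has enough rows."""
    inputs = []
    for item in catalog_items:
        dataset_id = item.get("datasetId")
        # Fields may be absent from an item, so read them defensively.
        if not dataset_id or not item.get("hasRecords", False):
            continue
        if item.get("recordsCount", 0) < min_records:
            continue
        inputs.append(
            {"mode": "records", "datasetId": dataset_id, "maxItems": max_items}
        )
    return inputs
```

Each returned dictionary can then be used as the input for a separate records-mode run.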
Output
Catalog mode output
{
  "datasetId": "osm-australia-cinema",
  "datasetUid": "da-xyz789",
  "title": "Cinemas - Australia - OSM data",
  "description": "Cinema locations from OpenStreetMap data for Australia.",
  "publisher": "OpenDataSoft",
  "recordsCount": 408,
  "themes": ["Leisure & Tourism"],
  "keywords": ["cinema", "entertainment", "osm"],
  "language": "en",
  "license": "Open Database License",
  "licenseUrl": "https://opendatacommons.org/licenses/odbl/",
  "modifiedAt": "2024-01-15",
  "updateFrequency": "weekly",
  "features": ["geo"],
  "hasRecords": true,
  "dataVisible": true,
  "recordType": "dataset",
  "scrapedAt": "2024-01-15T14:30:00+00:00"
}
Records mode output
{
  "datasetId": "osm-australia-cinema",
  "recordType": "record",
  "name": "United Cinemas Opera Quays",
  "opening_hours": "Mo-Su 10:00-23:00",
  "wheelchair": "yes",
  "meta_name_sub": "Sydney",
  "meta_name_state": "New South Wales",
  "meta_geo_pointLat": -33.859496,
  "meta_geo_pointLon": 151.213046,
  "scrapedAt": "2024-01-15T14:30:00+00:00"
}
Output Fields (catalog)
| Field | Type | Description |
|---|---|---|
| datasetId | string | Unique dataset identifier |
| title | string | Human-readable dataset title |
| description | string | Dataset description |
| publisher | string | Organization that published the dataset |
| recordsCount | integer | Number of data rows in the dataset |
| themes | array | Thematic categories (e.g. Transport, Environment) |
| keywords | array | Keywords describing the dataset |
| language | string | Primary language |
| license | string | Data usage license |
| licenseUrl | string | Link to license document |
| modifiedAt | string | Last modification date |
| updateFrequency | string | How often the dataset updates |
| features | array | Dataset capabilities (e.g. geo, timeserie) |
| hasRecords | boolean | Whether the dataset has accessible records |
| dataVisible | boolean | Whether data is publicly visible |
| recordType | string | Always dataset in catalog mode |
| scrapedAt | string | ISO 8601 UTC timestamp |
All fields are omit-empty: a field whose value is null or empty is left out of the output item, so a given item may contain only a subset of the fields listed above.
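Because absent fields are simply missing from an item, downstream code that expects a fixed set of columns (for example a CSV export) may want to fill the gaps itself. A minimal sketch, assuming items are plain dictionaries; `CATALOG_FIELDS` and `normalize` are hypothetical names introduced here, with the field list copied from the catalog table.

```python
from typing import Any

# Field names from the catalog output table, in table order.
CATALOG_FIELDS = [
    "datasetId", "title", "description", "publisher", "recordsCount",
    "themes", "keywords", "language", "license", "licenseUrl",
    "modifiedAt", "updateFrequency", "features", "hasRecords",
    "dataVisible", "recordType", "scrapedAt",
]


def normalize(item: dict[str, Any]) -> dict[str, Any]:
    """Return a row with every catalog field present, using None for omitted ones."""
    return {field: item.get(field) for field in CATALOG_FIELDS}
```

Keys that are not in `CATALOG_FIELDS` are dropped by this sketch, which keeps every row the same width.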
Use Cases
- Data discovery — Find relevant public datasets for research or integration projects.
- Urban analytics — Explore transportation, parking, cycling, or POI datasets.
- Environmental research — Discover air quality, water, and nature datasets.
- Government data tracking — Monitor what open data is available from public institutions.
- Dataset inventory — Build a searchable catalog of available open data sources.
Example Inputs
Find transport datasets
{ "mode": "catalog", "query": "transport", "maxItems": 50 }
Find environment datasets
{ "mode": "catalog", "theme": "Environment", "maxItems": 100 }
Export records from a cinema dataset
{ "mode": "records", "datasetId": "osm-australia-cinema", "maxItems": 500 }
Data Source
Data is from the OpenDataSoft public platform (public.opendatasoft.com), a curated catalog of open datasets contributed by governments, municipalities, and research institutions. Access is free and requires no authentication.
FAQs
Do I need an API key?
No. The OpenDataSoft public platform (public.opendatasoft.com) is freely accessible with no credentials required.
How many datasets are available?
More than 400 public datasets as of 2024, covering topics from OSM point-of-interest data to electoral results and urban infrastructure.
How do I find a dataset ID?
Run the scraper in catalog mode with a keyword search. The datasetId field in the output is what you pass to records mode.
Can I filter records by specific field values?
Yes, use the recordsFilter field in records mode. For example: name="Paris" or type="restaurant". This uses the OpenDataSoft filter expression syntax.
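A simple equality filter in the style of the examples above can be assembled with a small helper. This is a sketch under assumptions: `equals_filter` is a hypothetical function, field names are assumed to be plain identifiers, and because the escaping rules of the filter syntax are not described here, values containing double quotes are rejected rather than escaped.

```python
import re


def equals_filter(field: str, value: str) -> str:
    """Build a field="value" expression for the recordsFilter input."""
    # Assumption: field names consist of letters, digits, and underscores.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", field):
        raise ValueError(f"unexpected field name: {field!r}")
    # Escaping is not documented here, so refuse quotes instead of guessing.
    if '"' in value:
        raise ValueError("value must not contain double quotes")
    return f'{field}="{value}"'


run_input = {
    "mode": "records",
    "datasetId": "osm-australia-cinema",
    "recordsFilter": equals_filter("name", "Paris"),
}
```

Here `run_input["recordsFilter"]` holds the string `name="Paris"`.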
Why do some records have geo coordinates?
Datasets with geographic data include a meta_geo_point field that the scraper automatically flattens into meta_geo_pointLat and meta_geo_pointLon fields.
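The flattening described above can be illustrated as follows. This is not the scraper's actual code: `flatten_geo_point` is a hypothetical function, and the nested value is assumed to be an object with `lat` and `lon` keys.

```python
from typing import Any


def flatten_geo_point(
    record: dict[str, Any], key: str = "meta_geo_point"
) -> dict[str, Any]:
    """Replace a nested lat/lon point with <key>Lat and <key>Lon fields."""
    flat = dict(record)
    point = flat.get(key)
    # Assumption: the source value is an object with "lat" and "lon" keys.
    # Anything else is left untouched.
    if isinstance(point, dict) and "lat" in point and "lon" in point:
        del flat[key]
        flat[f"{key}Lat"] = point["lat"]
        flat[f"{key}Lon"] = point["lon"]
    return flat
```

Records without a geographic point pass through unchanged, which is why only some output rows carry the two coordinate fields.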
What is the difference between catalog and records mode?
Catalog mode returns one row per dataset (metadata only). Records mode returns one row per data item within a specific dataset. Use catalog to discover datasets, then records to extract their data.