OpenDataSoft Dataset Catalog Scraper

Pricing

from $3.00 / 1,000 results

Discover and browse 400+ public open datasets from the OpenDataSoft platform. Search by keyword, filter by theme or record count, and export dataset metadata or records.

Developer

Crawler Bros

Maintained by Community

Browse and download from 400+ public open datasets hosted on the OpenDataSoft platform — covering transportation, environment, health, tourism, urban data, and more. No API key or account required.

What it does

Scrapes the OpenDataSoft public catalog (public.opendatasoft.com), a curated collection of open government and community datasets. The scraper operates in two modes:

  • catalog — discover datasets by keyword or theme
  • records — export actual data rows from any specific dataset

Input

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| mode | string | catalog (browse datasets) or records (export rows from a dataset) | catalog |
| query | string | Search datasets by keyword (title, description, publisher) | |
| theme | string | Filter by theme (e.g. Environment, Transport, Health) | |
| datasetId | string | Dataset ID for mode=records (e.g. osm-australia-cinema) | |
| recordsFilter | string | ODS filter expression for records (e.g. country=Australia) | |
| maxItems | integer | Maximum number of items to emit (1–5,000) | 100 |
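
The constraints in the input table can be checked before a run is started. The following is a minimal sketch, not the actor's own validation: the function name `validate_input` is hypothetical, and the rule that records mode requires a datasetId is an assumption inferred from the field description.

```python
from typing import Any

VALID_MODES = {"catalog", "records"}
MAX_ITEMS_RANGE = (1, 5000)  # bounds taken from the input table


def validate_input(run_input: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of run_input with documented defaults applied.

    Raises ValueError when a value falls outside the documented constraints.
    """
    cleaned = dict(run_input)
    cleaned.setdefault("mode", "catalog")  # documented default
    cleaned.setdefault("maxItems", 100)    # documented default

    if cleaned["mode"] not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    low, high = MAX_ITEMS_RANGE
    max_items = cleaned["maxItems"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not low <= max_items <= high:
        raise ValueError(f"maxItems must be between {low} and {high}")

    # Assumption: records mode has nothing to read without a dataset ID.
    if cleaned["mode"] == "records" and not cleaned.get("datasetId"):
        raise ValueError("datasetId is required when mode is 'records'")

    return cleaned
```

Calling `validate_input({})` returns the two documented defaults, while `validate_input({"mode": "records"})` raises because no dataset is named.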

Modes

catalog — Searches the OpenDataSoft public dataset registry. Returns dataset metadata including title, publisher, record count, themes, keywords, license, and update frequency.

records — Fetches actual data rows from a specified dataset. Use the catalog mode first to find dataset IDs, then switch to records mode to extract the data.
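
The two-step workflow described above (catalog first, then records) can be sketched as a small helper that turns catalog output items into records-mode inputs. This is an illustration only: `records_inputs_from_catalog` and its `min_records` parameter are hypothetical names, and the record-count filtering happens on the client side, after the catalog run.

```python
from typing import Any, Iterable


def records_inputs_from_catalog(
    catalog_items: Iterable[dict[str, Any]],
    min_records: int = 1,
    max_items: int = 100,
) -> list[dict[str, Any]]:
    """Build one records-mode input per catalog item worth exporting."""
    inputs = []
    for item in catalog_items:
        dataset_id = item.get("datasetId")
        count = item.get("recordsCount", 0)
        # Fields can be missing because empty values are omitted from output,
        # so every lookup uses .get() with a safe default.
        if not dataset_id or not item.get("hasRecords", False):
            continue
        if count < min_records:
            continue
        inputs.append({
            "mode": "records",
            "datasetId": dataset_id,
            # Request no more rows than the dataset holds, and at least 1.
            "maxItems": max(1, min(max_items, count)),
        })
    return inputs
```

Each returned dictionary has the same shape as the records-mode example input, so it can be passed to a second run unchanged.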

Output

Catalog mode output

{
  "datasetId": "osm-australia-cinema",
  "datasetUid": "da-xyz789",
  "title": "Cinemas - Australia - OSM data",
  "description": "Cinema locations from OpenStreetMap data for Australia.",
  "publisher": "OpenDataSoft",
  "recordsCount": 408,
  "themes": ["Leisure & Tourism"],
  "keywords": ["cinema", "entertainment", "osm"],
  "language": "en",
  "license": "Open Database License",
  "licenseUrl": "https://opendatacommons.org/licenses/odbl/",
  "modifiedAt": "2024-01-15",
  "updateFrequency": "weekly",
  "features": ["geo"],
  "hasRecords": true,
  "dataVisible": true,
  "recordType": "dataset",
  "scrapedAt": "2024-01-15T14:30:00+00:00"
}

Records mode output

{
  "datasetId": "osm-australia-cinema",
  "recordType": "record",
  "name": "United Cinemas Opera Quays",
  "opening_hours": "Mo-Su 10:00-23:00",
  "wheelchair": "yes",
  "meta_name_sub": "Sydney",
  "meta_name_state": "New South Wales",
  "meta_geo_pointLat": -33.859496,
  "meta_geo_pointLon": 151.213046,
  "scrapedAt": "2024-01-15T14:30:00+00:00"
}

Output Fields (catalog)

| Field | Type | Description |
| --- | --- | --- |
| datasetId | string | Unique dataset identifier |
| title | string | Human-readable dataset title |
| description | string | Dataset description |
| publisher | string | Organization that published the dataset |
| recordsCount | integer | Number of data rows in the dataset |
| themes | array | Thematic categories (e.g. Transport, Environment) |
| keywords | array | Keywords describing the dataset |
| language | string | Primary language |
| license | string | Data usage license |
| licenseUrl | string | Link to license document |
| modifiedAt | string | Last modification date |
| updateFrequency | string | How often the dataset updates |
| features | array | Dataset capabilities (e.g. geo, timeserie) |
| hasRecords | boolean | Whether the dataset has accessible records |
| dataVisible | boolean | Whether data is publicly visible |
| recordType | string | Always dataset in catalog mode |
| scrapedAt | string | ISO 8601 UTC timestamp |

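Empty fields are omitted: any field whose value is null or empty is excluded from the output, so consumers should not assume every field in the table is present on every item.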
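
The omit-empty behaviour can be illustrated with a short sketch. The helper name `omit_empty` is hypothetical, and the exact definition of "empty" is an assumption: here None, empty strings, empty lists, and empty objects are dropped, while 0 and false are kept. The scraper's own rule may differ.

```python
from typing import Any


def omit_empty(item: dict[str, Any]) -> dict[str, Any]:
    """Drop keys whose values are None, '', [] or {} (assumed notion of 'empty').

    Falsy but meaningful values such as 0 and False are kept.
    """
    return {
        key: value
        for key, value in item.items()
        if value is not None and value != "" and value != [] and value != {}
    }


raw = {
    "datasetId": "osm-australia-cinema",
    "description": "",
    "keywords": [],
    "license": None,
    "recordsCount": 408,
}
cleaned = omit_empty(raw)  # keeps only datasetId and recordsCount
```

On the consuming side, this means reading optional fields with `item.get("license")` rather than `item["license"]`.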

Use Cases

  • Data discovery — Find relevant public datasets for research or integration projects.
  • Urban analytics — Explore transportation, parking, cycling, or POI datasets.
  • Environmental research — Discover air quality, water, and nature datasets.
  • Government data tracking — Monitor what open data is available from public institutions.
  • Dataset inventory — Build a searchable catalog of available open data sources.

Example Inputs

Find transport datasets

{ "mode": "catalog", "query": "transport", "maxItems": 50 }

Find environment datasets

{ "mode": "catalog", "theme": "Environment", "maxItems": 100 }

Export records from a cinema dataset

{
  "mode": "records",
  "datasetId": "osm-australia-cinema",
  "maxItems": 500
}

Data Source

Data is from the OpenDataSoft public platform (public.opendatasoft.com), a curated catalog of open datasets contributed by governments, municipalities, and research institutions. Access is free and requires no authentication.

FAQs

Do I need an API key? No. The OpenDataSoft public platform (public.opendatasoft.com) is freely accessible with no credentials required.

How many datasets are available? More than 400 public datasets as of 2024, covering topics from OSM point-of-interest data to electoral results and urban infrastructure.

How do I find a dataset ID? Run the scraper in catalog mode with a keyword search. The datasetId field in the output is what you pass to records mode.

Can I filter records by specific field values? Yes. In records mode, set the recordsFilter field to an expression in the OpenDataSoft filter syntax. Two separate examples: name="Paris" and type="restaurant".

Why do some records have geo coordinates? Datasets with geographic data include a meta_geo_point field that the scraper automatically flattens into meta_geo_pointLat and meta_geo_pointLon fields.
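
The flattening described in the previous answer can be sketched as follows. This is not the scraper's code: the function name `flatten_geo_point` is hypothetical, and the shape of the raw meta_geo_point value is an assumption (either an object with lat and lon keys, or a two-element [lat, lon] list).

```python
from typing import Any


def flatten_geo_point(record: dict[str, Any], key: str = "meta_geo_point") -> dict[str, Any]:
    """Replace a nested geo point with <key>Lat and <key>Lon fields."""
    point = record.get(key)
    if isinstance(point, dict):
        # Assumed shape: {"lat": ..., "lon": ...}
        lat, lon = point.get("lat"), point.get("lon")
    elif isinstance(point, (list, tuple)) and len(point) == 2:
        # Assumed shape: [lat, lon]
        lat, lon = point
    else:
        # Missing or unrecognised value: return an unchanged copy.
        return dict(record)

    flat = {k: v for k, v in record.items() if k != key}
    if lat is not None:
        flat[f"{key}Lat"] = lat
    if lon is not None:
        flat[f"{key}Lon"] = lon
    return flat
```

Applied to a record whose meta_geo_point is `{"lat": -33.859496, "lon": 151.213046}`, the result carries the meta_geo_pointLat and meta_geo_pointLon fields shown in the records mode output sample.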

What is the difference between catalog and records mode? Catalog mode returns one row per dataset (metadata only). Records mode returns one row per data item within a specific dataset. Use catalog to discover datasets, then records to extract their data.