data.gouv.fr Scraper
Pricing
from $3.00 / 1,000 results
Scrape the French government open-data portal (data.gouv.fr). Search datasets by keyword, fetch full dataset details by ID/slug, list datasets by organization, and search organizations and reuses - with titles, descriptions, resources/formats, licenses, temporal coverage and metrics.
Developer
Crawler Bros
Maintained by Community
Scrape data.gouv.fr, the official French government open-data portal — no account, no API key, no cookies. Search tens of thousands of public datasets, pull full dataset metadata, browse everything an organization publishes, and discover the reuses (apps, APIs, visualizations, articles) built on top of French public data. Fast, structured JSON straight from the public data.gouv.fr API.
Ideal for open-data research, data journalism, civic tech, competitive analysis, dataset discovery, and building data catalogs.
What this actor does
- Five modes: searchDatasets, datasetDetails, byOrganization, searchOrganizations, searchReuses
- Flexible lookup: search by keyword, fetch by dataset ID / slug / full URL, or list every dataset of an organization
- Reuse discovery: find apps, APIs, visualizations and articles, filterable by topic and type
- Rich metadata: resources & file formats, licenses, temporal coverage, tags, and engagement metrics (views, reuses, followers, downloads)
- Automatic pagination with cross-page de-duplication, so a request for N records returns N unique records
- Empty fields are omitted, so you never get nulls
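The pagination, de-duplication and empty-field behaviour listed above can be sketched roughly as follows. This is an illustration, not the actor's actual code: `fetch_page`, `strip_empty` and `collect_unique` are hypothetical names, and the sketch assumes every raw record carries an `id`.

```python
from typing import Callable, Iterator

# Hypothetical signature: takes a 1-based page number, returns that page's raw records.
FetchPage = Callable[[int], list[dict]]


def strip_empty(record: dict) -> dict:
    """Drop keys whose value is None, an empty string, or an empty list/dict."""
    return {k: v for k, v in record.items() if v not in (None, "", [], {})}


def collect_unique(fetch_page: FetchPage, max_items: int) -> Iterator[dict]:
    """Yield up to max_items records, skipping ids already seen on earlier pages."""
    seen: set[str] = set()
    page = 1
    while len(seen) < max_items:
        batch = fetch_page(page)
        if not batch:  # matching results are exhausted
            return
        for raw in batch:
            if raw["id"] in seen:
                continue  # same record surfaced again on a later page
            seen.add(raw["id"])
            yield strip_empty(raw)
            if len(seen) >= max_items:
                return
        page += 1
```

Tracking seen ids across pages, rather than within one page, is what lets the cap count unique records only.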
Output per record
Every record is a flat JSON object. recordType, id, url, sourceUrl and scrapedAt appear on every record.
Dataset (searchDatasets, datasetDetails, byOrganization)
- id, title, slug, description
- license, frequency, accessType
- qualityScore (0–1 metadata-quality score), featured (present when true), badges[] (e.g. hvd, spd)
- organization, organizationId, organizationUrl
- temporalCoverageStart, temporalCoverageEnd, spatialGranularity, spatialZones[]
- tags[]
- createdAt, lastModified, lastUpdate
- resourceCount, formats[], resources[]: each resource has title, description, format, url, latestUrl, mime, type, fileType, filesize, checksum, createdAt, lastModified
- views, followers, reuses, downloads
- url (portal page), sourceUrl (API endpoint)
- recordType: "dataset", scrapedAt
Organization (searchOrganizations)
- id, name, slug, acronym, description
- businessNumberId (SIREN), badges[]
- datasetsCount, reusesCount, dataservicesCount, followers, members, views
- createdAt, lastModified
- logoUrl, url, sourceUrl
- recordType: "organization", scrapedAt
Reuse (searchReuses)
- id, title, slug, description
- type, topic
- organization, organizationId
- datasetsCount, tags[]
- views, followers
- createdAt, lastModified
- imageUrl (reuse cover image)
- externalUrl (link to the reuse itself), url (portal page), sourceUrl (API endpoint)
- recordType: "reuse", scrapedAt
Sample dataset record
{
  "recordType": "dataset",
  "id": "53ba5b91a3a729219b7beae9",
  "title": "Transport",
  "slug": "transport",
  "description": "Réseau de transport en commun …",
  "license": "cc-zero",
  "frequency": "unknown",
  "accessType": "open",
  "qualityScore": 0.444,
  "organization": "Mairie de Monacia d'Aullène",
  "organizationId": "5b9e...",
  "organizationUrl": "https://www.data.gouv.fr/organizations/...",
  "tags": ["transport", "mobilité"],
  "createdAt": "2014-07-07T09:00:00+00:00",
  "lastUpdate": "2014-09-02T15:44:46.643000+00:00",
  "resourceCount": 2,
  "formats": ["csv", "gtfs"],
  "resources": [
    {
      "title": "Arrêts",
      "format": "csv",
      "url": "https://…/stops.csv",
      "latestUrl": "https://www.data.gouv.fr/api/1/datasets/r/…",
      "filesize": 102400,
      "checksum": "f0a7cc05…",
      "createdAt": "2014-07-07T10:34:25+00:00"
    }
  ],
  "views": 3993,
  "downloads": 488,
  "url": "https://www.data.gouv.fr/datasets/transport",
  "sourceUrl": "https://www.data.gouv.fr/api/1/datasets/transport/",
  "scrapedAt": "2026-07-02T13:25:30+00:00"
}
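Because empty fields are omitted, code that consumes a record should not assume any optional key is present. The sketch below pulls download links for one file format out of a dataset record; `resource_urls` is a hypothetical helper name, and preferring `latestUrl` over `url` is an assumption that `latestUrl` is the more stable link.

```python
def resource_urls(record: dict, wanted_format: str) -> list[str]:
    """Return download URLs of the record's resources matching wanted_format."""
    wanted = wanted_format.lower()
    urls = []
    # "resources" may be missing entirely, since empty fields are omitted.
    for res in record.get("resources", []):
        if res.get("format", "").lower() == wanted:
            # Assumption: latestUrl is the stable link; fall back to url.
            link = res.get("latestUrl") or res.get("url")
            if link:
                urls.append(link)
    return urls
```

Applied to a record shaped like the sample above, `resource_urls(record, "csv")` would return the `latestUrl` of the "Arrêts" resource.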
Input
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | searchDatasets | searchDatasets / datasetDetails / byOrganization / searchOrganizations / searchReuses |
| query | string | transport | Free-text keyword for the search modes (leave empty to browse all) |
| datasetIds | array | – | Dataset IDs, slugs or full URLs (mode=datasetDetails) |
| organization | string | – | Organization slug, ID or full URL (mode=byOrganization) |
| topic | string | – | Reuse topic filter (mode=searchReuses) |
| reuseType | string | – | Reuse type filter: API, Application, Visualization, … (mode=searchReuses) |
| format | string | – | Only datasets offering this file format (CSV, JSON, GeoJSON, …) (dataset modes) |
| license | string | – | Only datasets under this license (CC-BY, ODbL, Licence Ouverte, …) (dataset modes) |
| badge | string | – | Only datasets with a quality badge: HVD (High Value Dataset) or SPD (dataset modes) |
| tag | string | – | Only datasets/reuses carrying this exact tag |
| featured | boolean | false | Only editorially featured datasets/reuses |
| sort | string | -reuses | Newest, recently updated, most reused, most followed, or most viewed |
| maxItems | integer | 50 | Hard cap on emitted records (1–2000) |
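If you assemble the input programmatically, a small pre-flight check can catch typos before a run is started. The following is a minimal sketch: `build_input` is a hypothetical local helper, not part of the actor, and treating `datasetIds` and `organization` as required for their modes is an assumption drawn from the table above.

```python
MODES = {
    "searchDatasets",
    "datasetDetails",
    "byOrganization",
    "searchOrganizations",
    "searchReuses",
}


def build_input(mode: str = "searchDatasets", max_items: int = 50, **filters) -> dict:
    """Assemble an input object, rejecting values the table above rules out."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= max_items <= 2000:
        raise ValueError("maxItems must be between 1 and 2000")
    # Assumption: these two modes cannot do anything useful without their key field.
    if mode == "datasetDetails" and not filters.get("datasetIds"):
        raise ValueError("datasetDetails needs a non-empty datasetIds list")
    if mode == "byOrganization" and not filters.get("organization"):
        raise ValueError("byOrganization needs an organization")
    run_input = {"mode": mode, "maxItems": max_items}
    # Unset filters are left out entirely rather than sent as null.
    run_input.update({k: v for k, v in filters.items() if v is not None})
    return run_input
```

For example, `build_input(query="transport", sort="-reuses")` produces the same object as the keyword-search example in this README.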
Example: search datasets by keyword
{
  "mode": "searchDatasets",
  "query": "transport",
  "sort": "-reuses",
  "maxItems": 50
}
Example: fetch specific datasets
{
  "mode": "datasetDetails",
  "datasetIds": [
    "transport",
    "base-sirene-des-entreprises-et-de-leurs-etablissements-siren-siret"
  ]
}
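How the actor resolves an ID, slug or full URL internally is not documented here. One plausible approach, inferred only from the `url` and `sourceUrl` values in the sample record, is to reduce every reference to its last path segment and append it to the dataset API base. `dataset_api_url` and `API_BASE` are hypothetical names.

```python
from urllib.parse import urlparse

# Taken from the sourceUrl of the sample record in this README.
API_BASE = "https://www.data.gouv.fr/api/1/datasets/"


def dataset_api_url(ref: str) -> str:
    """Map a dataset ID, slug, or portal URL to an API endpoint URL."""
    ref = ref.strip()
    if ref.startswith(("http://", "https://")):
        # Keep only the last non-empty path segment, e.g. ".../datasets/transport/".
        segments = [s for s in urlparse(ref).path.split("/") if s]
        if not segments:
            raise ValueError(f"no dataset identifier in URL: {ref!r}")
        ref = segments[-1]
    return f"{API_BASE}{ref}/"
```

Under this assumption a bare slug, a 24-character ID and a portal URL all converge on the same kind of endpoint.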
Example: every dataset from an organization
{
  "mode": "byOrganization",
  "organization": "institut-national-de-la-statistique-et-des-etudes-economiques-insee",
  "sort": "-last_modified",
  "maxItems": 100
}
Example: discover transport apps (reuses)
{
  "mode": "searchReuses",
  "topic": "transport_and_mobility",
  "reuseType": "application",
  "maxItems": 50
}
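Any of the inputs above can also be submitted from code. The sketch below assumes the official `apify-client` Python package, whose client exposes `actor(...).call(run_input=...)` and `dataset(...).iterate_items()`. `run_and_collect` is a hypothetical helper, and the token and actor ID in the usage comments are placeholders, since this page does not state the actor's ID.

```python
def run_and_collect(client, actor_id: str, run_input: dict) -> list[dict]:
    """Start the actor, wait for it to finish, and return its dataset items."""
    # call() blocks until the run ends and returns the run's metadata.
    run = client.actor(actor_id).call(run_input=run_input)
    # Results land in the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (needs `pip install apify-client`; both values below are placeholders):
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_APIFY_TOKEN>")
# reuses = run_and_collect(
#     client,
#     "<ACTOR_ID>",
#     {"mode": "searchReuses", "topic": "transport_and_mobility", "maxItems": 50},
# )
```

Passing the client in as an argument keeps the helper easy to exercise with a stub when no network access is available.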
Use cases
- Data catalogs — build a searchable catalog of French public datasets in a specific domain
- Organization monitoring — track everything a ministry or agency publishes
- Format discovery — find every dataset available as CSV, JSON, GeoJSON, …
- Civic tech — discover apps and visualizations reusing transport, health, or environment data
- Freshness tracking — monitor dataset update timestamps and coverage periods
- Data journalism — surface newly published or most-reused open datasets
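For the freshness-tracking use case, the timestamps in each dataset record can be compared against a cutoff. This is a minimal sketch with a hypothetical helper name, `stale_datasets`. It assumes timestamps are ISO 8601 strings with a UTC offset, as in the sample record, and it skips records that carry neither `lastUpdate` nor `lastModified`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_datasets(
    records: list[dict], max_age_days: int, now: Optional[datetime] = None
) -> list[dict]:
    """Return the records whose last update is older than max_age_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for rec in records:
        # Either field may be absent because empty fields are omitted.
        stamp = rec.get("lastUpdate") or rec.get("lastModified")
        if stamp is None:
            continue
        # Assumption: the string carries an offset, so the result is timezone-aware.
        if datetime.fromisoformat(stamp) < cutoff:
            stale.append(rec)
    return stale
```

Running this over the output of a scheduled byOrganization run would flag datasets an agency has stopped updating.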
FAQ
Do I need an API key or account? No. The data.gouv.fr API is fully public and free.
Can I get every dataset from a specific ministry or agency? Yes. Set mode to byOrganization and pass the organization's slug or ID (a full profile URL also works).
Can I filter datasets by file format? Yes. Set the format input (e.g. csv, json, geojson, gtfs) to only return datasets that publish a resource in that format. You can further narrow results with license, badge (HVD / SPD), tag and featured. Every dataset record also lists its available formats.
What are reuses? Reuses are the apps, APIs, visualizations and articles that people have built on top of datasets. Search them by keyword, topic and type.
Which sort options apply to which mode? Datasets and organizations support all sort options. Reuses can be sorted by newest, recently updated, most followed or most viewed; "most reused" doesn't apply to reuses and falls back to the default relevance order.
Why do I sometimes get slightly fewer records than the reported total? The portal's relevance ordering can surface the same record on more than one page. The actor removes those duplicates, so you always get unique records.
Is the content in French? Yes — data.gouv.fr is the French national portal, so titles and descriptions are primarily in French. All text is returned as proper UTF-8.
How many results can I get? Set maxItems up to 2000 per run. The actor paginates automatically until it reaches maxItems or the matching results are exhausted.
Data source
Data comes from the public data.gouv.fr API, operated by the French government (Etalab / DINUM). All content is open data published under open licenses. This is a third-party actor and is not affiliated with data.gouv.fr, Etalab or DINUM.