Canada Open Data Catalog Scraper

Pricing

from $15.00 / 1,000 result items

Export Canadian government open datasets from open.canada.ca. Browse 36k+ datasets across federal departments. Pull dataset metadata, resources, organization, license, tags, and publication dates. Catalog mode lists all datasets; dataset mode fetches one by ID.

Developer

ParseForge

Maintained by Community

🍁 Canada Open Data Catalog Scraper

🚀 Export the Canadian federal open data catalog in seconds. Pull 36,000+ datasets across every federal department with metadata, downloadable resources, organization, license, and update history. No API key, no scraping pipeline, no manual CSV stitching.

🕒 Last updated: 2026-05-22 · 📊 13 fields per record · 🍁 36,000+ datasets · 🏛️ 100+ federal organizations · 🇨🇦 English + French metadata

The Canada Open Data Scraper taps the official open.canada.ca catalog and returns 13 structured fields per dataset, including title, description, organization, license, file resources (CSV, GeoJSON, XLSX, ZIP, etc.), tags, and creation/modification timestamps. The underlying portal is the federal government's flagship open data initiative and aggregates publications from Statistics Canada, Health Canada, Environment Canada, Transport Canada, Natural Resources Canada, and 100+ other departments and agencies.

The catalog covers datasets from federal departments and agencies, with bilingual English and French metadata, downloadable file resources, and consistent licensing fields. This Actor turns it into a clean dataset downloadable as CSV, Excel, JSON, or XML, in under a minute for runs of around 100 datasets. Filters and pagination run server-side, so you skip the parser engineering entirely.

🎯 Target Audience: Canadian SMEs, civic-tech builders, researchers, journalists, GIS engineers, policy analysts, NGOs

💡 Primary Use Cases: Open-data discovery, federal dataset inventory, license auditing, journalism research, GIS source feeds, ML training data

📋 What the Canada Open Data Scraper does

Two collection workflows in a single run:

  • 🍁 Catalog walk. Paginate the full federal dataset registry, optionally filtered by a search query.
  • 🎯 Single dataset fetch. Retrieve one dataset by UUID or slug for targeted enrichment.
  • 📁 Resource expansion. Every dataset includes its file resources with name, format, URL, language, size, and last-modified timestamp.
  • 🏷️ Tag and keyword merge. English and French tags consolidated into one deduplicated list.
  • 🏛️ Organization attribution. Publishing department or agency captured for every record.

Each record includes identifiers (UUID, slug), descriptive metadata (title, notes, tags), governance (organization, license, frequency), and a full list of downloadable resources.
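
To make the tag and keyword merge concrete, here is a minimal sketch of one way to consolidate English and French tags into a single deduplicated list. The function name `merge_tags` and the case-insensitive matching rule are assumptions for illustration; the Actor's actual merge logic is not documented here.

```python
def merge_tags(english, french):
    """Merge two tag lists, dropping case-insensitive duplicates.

    Hypothetical helper: keeps the first spelling seen and preserves order.
    """
    seen = set()
    merged = []
    for tag in [*english, *french]:
        cleaned = tag.strip()
        key = cleaned.casefold()
        # Skip blanks and anything already collected under another casing.
        if cleaned and key not in seen:
            seen.add(key)
            merged.append(cleaned)
    return merged


print(merge_tags(["census", "Population"], ["recensement", "population"]))
# → ['census', 'Population', 'recensement']
```

Matching on a case-folded key treats "Population" and "population" as one tag while leaving genuinely different French terms such as "recensement" in place.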

💡 Why it matters: Canada's open data portal is a goldmine of free, high-trust data, but the portal UI is built for browsing one dataset at a time. This Actor turns it into a single downloadable spreadsheet so you can audit license terms, sort by recency, or build a research inventory in minutes.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

| Input | Type | Default | Behavior |
|---|---|---|---|
| maxItems | integer | 10 | Records to return. Free plan caps at 10, paid plan at 1,000,000. |
| mode | enum | "catalog" | catalog walks the registry, dataset fetches one record by ID. |
| datasetId | string | empty | UUID or slug of a single dataset. Required in dataset mode. |
| searchQuery | string | empty | Optional keyword filter applied in catalog mode. |

Example: catalog walk of 50 datasets matching "climate".

{
    "maxItems": 50,
    "mode": "catalog",
    "searchQuery": "climate"
}

Example: fetch one dataset by slug.

{
    "mode": "dataset",
    "datasetId": "statcan-pumf-census-2021"
}
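
If you build inputs in code, a small pre-flight check can catch mistakes before a run starts. The sketch below encodes only the rules stated in the input table (allowed modes, `datasetId` required in dataset mode, `maxItems` as a positive integer). The helper `validate_input` is hypothetical and is not part of the Actor.

```python
def validate_input(run_input):
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []

    mode = run_input.get("mode", "catalog")  # documented default
    if mode not in ("catalog", "dataset"):
        problems.append(f"mode must be 'catalog' or 'dataset', got {mode!r}")

    # dataset mode needs a UUID or slug to fetch.
    if mode == "dataset" and not run_input.get("datasetId"):
        problems.append("datasetId is required in dataset mode")

    max_items = run_input.get("maxItems", 10)  # documented default
    is_int = isinstance(max_items, int) and not isinstance(max_items, bool)
    if not is_int or max_items < 1:
        problems.append("maxItems must be a positive integer")

    return problems


print(validate_input({"mode": "dataset"}))
# → ['datasetId is required in dataset mode']
```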

⚠️ Good to Know: dataset frequency, license, and tag completeness vary by publishing department. Statistics Canada datasets are richly tagged; older legacy datasets from smaller agencies sometimes leave tags or update frequency blank. The Actor always returns the official portal URL so you can cross-check anything ambiguous.


📊 Output

Each dataset record contains up to 13 fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🆔 id | string | "a8db9ea0-7e1b-4cce-a4f3-8e7e6e58f6d5" |
| 🏷️ name | string | "statcan-pumf-census-2021" |
| 📰 title | string | "Census of Population, 2021" |
| 📝 notes | string | "The Public Use Microdata File..." |
| 🏛️ organization | string | "Statistics Canada" |
| ⚖️ license | string | "Open Government Licence - Canada" |
| 🏷️ tags | array | ["census","demographics","population"] |
| 📁 resourcesCount | number | 4 |
| 📦 resources | array | [{name, format, url, language, size, lastModified}] |
| 🕒 metadataCreated | ISO 8601 | "2022-07-13T13:01:22.547Z" |
| 🔁 metadataModified | ISO 8601 | "2024-11-04T09:21:18.000Z" |
| 📅 frequency | string or null | "annually" |
| 🔗 portalUrl | string | "https://open.canada.ca/data/en/dataset/..." |
| 🕒 scrapedAt | ISO 8601 | "2026-05-22T00:00:00.000Z" |
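
Because `resources` is a nested array, a common first step is pulling out the download URLs for one file format. This is a rough sketch that assumes each resource carries the `format` and `url` keys shown in the schema. The helper name `resource_urls` and the `example.org` URLs are made up for illustration.

```python
def resource_urls(record, fmt):
    """Return download URLs of a record's resources matching a file format."""
    wanted = fmt.casefold()
    return [
        resource["url"]
        for resource in record.get("resources", [])
        # Compare case-insensitively, since departments may write "csv" or "CSV".
        if str(resource.get("format", "")).casefold() == wanted and resource.get("url")
    ]


# Illustrative record; the URLs are placeholders, not real portal links.
record = {
    "name": "statcan-pumf-census-2021",
    "resources": [
        {"name": "Data", "format": "CSV", "url": "https://example.org/census.csv"},
        {"name": "Boundaries", "format": "GeoJSON", "url": "https://example.org/census.geojson"},
    ],
}

print(resource_urls(record, "csv"))
# → ['https://example.org/census.csv']
```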


✨ Why choose this Actor

  • 🍁 Full federal coverage. Every department on open.canada.ca is in scope, from Statistics Canada to Parks Canada.
  • 🎯 Two-mode workflow. Bulk catalog walk for inventory, single-dataset fetch for enrichment.
  • 📁 Resource expansion. Each record carries its downloadable files (CSV, GeoJSON, ZIP, etc.) with size, language, and last-modified date.
  • 🏷️ Bilingual tag merge. English and French keywords consolidated into one deduplicated list.
  • Fast. 100 datasets in under a minute, 10,000 in under 15 minutes.
  • 🔁 Always fresh. Every run pulls the latest catalog state, no caching.
  • 🚫 No authentication. Works on public open government data. No API key needed.

📊 The Canadian federal open data portal is the most comprehensive single source of free Canadian government data. This Actor makes the entire catalog queryable in minutes.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ Canada Open Data Scraper (this Actor) | $5 free credit, then pay-per-use | 36,000+ federal datasets | Live per run | Search keyword, single ID | ⚡ 2 min |
| Hand-built CKAN client | Free | Up to you | Manual | Up to you | 🐢 Days |
| Commercial data marketplaces | $$$ | Curated subset | Variable | Vendor-defined | ⏳ Hours |
| Manual portal browsing | Free | One at a time | Manual | UI search | 🕒 Endless |

Pick this Actor when you want a clean structured dataset of the entire Canadian federal open data registry without writing a CKAN client.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the Canada Open Data Catalog Scraper page on the Apify Store.
  3. 🎯 Set input. Pick catalog mode for a full walk or dataset mode with an ID, plus maxItems.
  4. 🚀 Run it. Click Start and let the Actor collect your data.
  5. 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.


💼 Business use cases

📊 Market & Industry Research

  • Map publicly available indicators by sector
  • Build sector reports with cited Canadian sources
  • Identify gaps in published government data
  • Benchmark Canadian SME and industry coverage

🗺️ GIS, Mapping & Environment

  • Inventory GeoJSON, Shapefile, KML resources
  • Pull environmental and climate file feeds
  • Layer federal land-use data into maps
  • Source baseline maps for impact assessments

📰 Journalism & Investigations

  • Audit which departments publish what
  • Track dataset additions and refresh cadence
  • Discover lesser-known transparency releases
  • Surface licensing terms across ministries

🏛️ Policy & Advocacy

  • Aggregate evidence from multiple departments
  • Build cross-ministry briefings with citations
  • Track open-data publishing commitments
  • Pull baseline figures for policy proposals

🔌 Automating Canada Open Data Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.

The Apify Schedules feature lets you trigger this Actor on any cron interval. Weekly or monthly refreshes keep downstream catalogs in sync automatically.
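
As a starting point for pipeline integration, here is a hedged sketch using the `apify-client` Python package. The Actor ID in `ACTOR_ID` is an assumption (copy the real one from the Actor's page), and `run_catalog_search` is a hypothetical wrapper. The client is passed in as an argument so the function can be exercised with a stub.

```python
# Assumed Actor ID for illustration; replace it with the ID shown on the Actor page.
ACTOR_ID = "parseforge/canada-open-data-catalog-scraper"


def run_catalog_search(client, search_query, max_items=50):
    """Start a catalog-mode run, wait for it to finish, and return its records."""
    run_input = {
        "mode": "catalog",
        "searchQuery": search_query,
        "maxItems": max_items,
    }
    # call() starts the Actor and blocks until the run finishes.
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    # Each run writes its records to a default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and an API token):
#   import os
#   from apify_client import ApifyClient
#   client = ApifyClient(os.environ["APIFY_TOKEN"])
#   for item in run_catalog_search(client, "climate"):
#       print(item["title"], item["portalUrl"])
```

Check the call against the version of `apify-client` you have installed, since return types can differ between major releases.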


🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Public-policy theses with reproducible data citations
  • Open-data scholarship and dataset inventories
  • Cross-country open-government comparisons
  • Course exercises on federal data discovery

🎨 Personal and creative

  • Dataviz portfolios sourced from Canadian data
  • Indie civic dashboards and search interfaces
  • Hobby projects on local geography or transit
  • Content research for explainer blogs

🤝 Non-profit and civic

  • NGO research on social and environmental files
  • Civic-tech open-data discovery tools
  • Investigative journalism around federal releases
  • Community map layers from official sources

🧪 Experimentation

  • Train ML models on federally published datasets
  • Prototype agents that resolve dataset slugs
  • Validate civic-tech product ideas with real metadata
  • Build dataset search experiences over open feeds


❓ Frequently Asked Questions

🧩 How does it work?

Configure your mode (catalog walk or single dataset) and optional keyword filter, click Start, and the Actor paginates the open.canada.ca catalog server-side and emits one clean structured record per dataset.
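
For context, this is roughly what a hand-built paginated walk looks like. The sketch assumes the portal exposes the standard CKAN `package_search` action at the URL below, with `q`, `start`, and `rows` parameters. The endpoint path and both function names are assumptions for illustration, not a description of the Actor's internals.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm the path against the portal's API documentation.
CKAN_SEARCH_URL = "https://open.canada.ca/data/api/3/action/package_search"


def fetch_page(query, start, rows):
    """Fetch one page of search results from the CKAN API."""
    params = urlencode({"q": query, "start": start, "rows": rows})
    with urlopen(f"{CKAN_SEARCH_URL}?{params}", timeout=30) as response:
        return json.load(response)["result"]


def walk_catalog(query="", max_items=10, page_size=100, fetch=fetch_page):
    """Yield up to max_items datasets, requesting one page at a time."""
    collected = 0
    start = 0
    while collected < max_items:
        rows = min(page_size, max_items - collected)
        result = fetch(query, start, rows)
        packages = result["results"]
        if not packages:
            return  # empty page: nothing more to read
        for package in packages:
            yield package
            collected += 1
        start += len(packages)
        if start >= result["count"]:
            return  # reached the end of the matching datasets
```

Passing `fetch` as a parameter keeps the paging loop testable without network access.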

🍁 How many datasets are in the catalog?

More than 36,000 federal datasets, including bilingual metadata in English and French. The exact count changes as departments publish, update, or retire records.

🔁 How often is the catalog refreshed?

Departments publish on their own schedules. Every run of this Actor fetches the latest catalog state in real time, so your dataset always reflects the current portal.

📁 Does it download the data files themselves?

You get the resource metadata for every file (name, format, URL, language, size, last-modified). Download the files themselves directly from the URLs in the resources field.

⏰ Can I schedule regular runs?

Yes. Use Apify Schedules to run this Actor on any cron interval (daily, weekly, monthly) and keep a downstream dataset inventory in sync.
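
Once runs are scheduled, comparing two snapshots shows what changed between them. This is a minimal sketch that keys records on `id` and compares `metadataModified`. The function `diff_snapshots` is hypothetical.

```python
def diff_snapshots(previous, current):
    """Compare two runs and return (added, updated, removed) lists of dataset ids."""
    prev = {record["id"]: record["metadataModified"] for record in previous}
    curr = {record["id"]: record["metadataModified"] for record in current}

    added = sorted(curr.keys() - prev.keys())
    removed = sorted(prev.keys() - curr.keys())
    # Present in both runs, but the portal reports a different modification time.
    updated = sorted(i for i in curr.keys() & prev.keys() if curr[i] != prev[i])
    return added, updated, removed


last_week = [
    {"id": "a", "metadataModified": "2024-11-04T09:21:18.000Z"},
    {"id": "b", "metadataModified": "2024-01-01T00:00:00.000Z"},
]
this_week = [
    {"id": "a", "metadataModified": "2025-02-01T00:00:00.000Z"},
    {"id": "c", "metadataModified": "2025-02-02T00:00:00.000Z"},
]

print(diff_snapshots(last_week, this_week))
# → (['c'], ['a'], ['b'])
```

The `removed` list is only meaningful when both runs used the same search query and a `maxItems` high enough to cover every match. Otherwise a missing id may simply be a record the run did not reach.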

⚖️ What license does the data use?

The Canadian federal open data portal publishes under the Open Government Licence - Canada, which permits commercial use with attribution. Always review the per-dataset license field returned by this Actor.

💼 Can I use this data commercially?

Yes. The default license is permissive for commercial use. A small number of legacy datasets carry custom terms, which you can read in the license field for each record.
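
A quick license audit over a finished run might look like the sketch below. It assumes the default license string matches the example in the schema exactly. `license_report` is a hypothetical helper, and the sample records are illustrative.

```python
from collections import Counter

# Assumed to match the schema example verbatim; adjust if the portal's wording differs.
DEFAULT_LICENSE = "Open Government Licence - Canada"


def license_report(records):
    """Count records per license and list portal URLs that need a manual review."""
    counts = Counter(record.get("license") or "(missing)" for record in records)
    flagged = [
        record.get("portalUrl")
        for record in records
        if record.get("license") != DEFAULT_LICENSE
    ]
    return counts, flagged


records = [
    {"license": "Open Government Licence - Canada", "portalUrl": "https://example.org/1"},
    {"license": "Custom terms", "portalUrl": "https://example.org/2"},
    {"license": None, "portalUrl": "https://example.org/3"},
]

counts, flagged = license_report(records)
print(flagged)
# → ['https://example.org/2', 'https://example.org/3']
```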

💳 Do I need a paid Apify plan to use this Actor?

No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you access to scheduling, higher concurrency, and larger datasets.

🔁 What happens if a run fails or gets interrupted?

Apify automatically retries transient errors. If a run still fails, you can inspect the log in the Runs tab, fix the input, and re-run. Partial datasets from failed runs are preserved so you never lose progress.

🇫🇷 Does it include French-language metadata?

Yes. Tags and keywords are deduplicated across English and French. Resource language is captured per file when the publishing department flagged it.

🆘 What if I need help?

Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.


🔌 Integrate with any app

Canada Open Data Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications in your channels
  • Airbyte - Pipe dataset metadata into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes. Push new datasets into your product backend, or alert your team in Slack.
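
On the receiving end of a webhook, the first job is usually finding the dataset the run produced. This sketch assumes Apify's default webhook payload, with an `eventType` field and the run object under `resource`. Verify the shape against your webhook's payload template, since it can be customized. The function name is hypothetical.

```python
import json


def dataset_id_from_webhook(body):
    """Return the run's default dataset ID for a successful run, else None."""
    payload = json.loads(body)
    # Ignore failures, timeouts, and other event types.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return (payload.get("resource") or {}).get("defaultDatasetId")


sample = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},
})

print(dataset_id_from_webhook(sample))
# → abc123
```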


💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the Government of Canada or any federal department or agency. All trademarks mentioned are the property of their respective owners. Only publicly available open government data is collected.