🇸🇬 Singapore data.gov.sg Scraper — HDB, COE, Singstat

Pricing

from $20.00 / 1,000 Singapore data.gov.sg records

Query any of 1000+ Singapore government open datasets (data.gov.sg) - transport, environment, economy, demographics, education, health, housing. One actor, one resource_id, official CKAN API.

Rating: 0.0 (0 ratings)

Developer: NexGenData (maintained by Community)

Actor stats: 0 bookmarks, 2 total users, 1 monthly active user, last modified 3 days ago

🇸🇬 Singapore data.gov.sg Multi-Dataset Scraper — HDB Resale, COE, Singstat, MAS

A universal wrapper for data.gov.sg, the official Singapore Government open-data platform. One actor and one required input field (dataset_id) give you access to 1,000+ official datasets spanning transport, environment, economy, demographics, education, health, housing, and finance.

This is the fastest path to programmatic, paginated, schema-tagged access for journalists, researchers, policy analysts, fintech builders, prop-tech founders, and data scientists working with Singapore public data. The actor wraps the CKAN datastore_search API so you don't have to build pagination, filtering, full-text search, or rate-limit handling yourself.

No authentication. No API keys. No anti-bot risk. This is an official Government of Singapore API, free for public use under the Singapore Open Data Licence.


Why this beats the data.gov.sg portal, Singstat Table Builder, EDB & paid statistics vendors

  • The data.gov.sg portal is browse-only. It gives you a CSV download per dataset and a manual UI. No bulk pulls, no scheduled refresh, no programmatic filter, no pipeline-ready output. This actor fixes that.
  • SingStat Table Builder is a UI, not an API. You click through facets to compose a table, then export. There is no clean CKAN-style endpoint behind it. This actor instead queries the data.gov.sg CKAN datastore_search endpoint, which also carries SingStat series.
  • World Bank, Statista, Knoema, Trading Economics, CEIC all republish Singapore data behind $200–$5,000/month paywalls — the same data the Government of Singapore publishes free. You're paying them to copy a free firehose.
  • Pay-per-row, not per-month. $0.01 per run start + $0.02 per record. A 500-row HDB Tampines pull is $10.01. A 10,000-row environmental scan is $200.01. Compare to the $129–$799/month "Singapore data" SKUs from paid statistics vendors.
  • CKAN filtering, full-text search, and 1,000-row chunked pagination are all handled. Pass filters: {"town": "TAMPINES"} to filter HDB resale. Pass search_query: "Bukit Timah" for free-text search across all columns. Set limit: 10000 (the per-run maximum) to pull up to 10,000 rows in a single run.

What You Get

Every record returned by the actor is flattened and tagged with provenance fields, then unioned with whatever the underlying dataset's native columns are.

| Field | Purpose |
| --- | --- |
| dataset_id | The CKAN resource_id you queried (e.g. d_8b84c4ee58e3cfc0ece0d773c8ca6abc). |
| dataset_name | Friendly label stamped on every row (e.g. HDB Resale Prices 2017+). |
| category | One of Transport, Environment, Economy, Demographics, Education, Health, Housing, Finance, Other. |
| data_source | Always data.gov.sg (CKAN datastore_search); preserved for attribution. |
| last_updated | Provenance timestamp from CKAN _links.start. |
| _id | CKAN row primary key. |
| …native columns… | Every column the underlying dataset publishes (e.g. for HDB resale: town, flat_type, block, street_name, storey_range, floor_area_sqm, flat_model, lease_commence_date, remaining_lease, resale_price, month). |

The native columns vary per dataset — that's the trade-off of a universal wrapper. Run with limit: 5 first to inspect the exact shape before wiring a downstream pipeline.
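
To make that inspection step concrete, here is a minimal sketch that counts which native columns appear in a small preview. The helper name native_column_counts and the sample rows are hypothetical; only the provenance field names come from the table above.

```python
from collections import Counter

# Provenance fields listed in the table above; every other key is treated as native.
PROVENANCE_FIELDS = {
    "dataset_id", "dataset_name", "category",
    "data_source", "last_updated", "_id",
}


def native_column_counts(records):
    """Count how often each native (non-provenance) column appears in a sample."""
    counts = Counter()
    for record in records:
        for key in record:
            if key not in PROVENANCE_FIELDS:
                counts[key] += 1
    return dict(counts)


# Hypothetical rows shaped like a limit: 5 preview of the HDB resale dataset.
sample = [
    {"_id": 1, "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
     "town": "TAMPINES", "flat_type": "5 ROOM", "resale_price": "650000"},
    {"_id": 2, "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
     "town": "BEDOK", "flat_type": "4 ROOM"},
]
print(native_column_counts(sample))
```

A column whose count is lower than the sample size is missing from some rows, which is worth knowing before you declare it NOT NULL downstream.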

Example datasets by category:

  • Housing (HDB / URA): HDB Resale Flat Prices (1990 to present, split across 5 vintage tables), HDB Property Info, HDB Carpark Info.
  • Transport (LTA): COE Bidding Results, Annual Motor Vehicle Population, Monthly Motor Vehicle Registrations, Public Transport Ridership, Bus Stops, MRT/LRT Station Codes, ERP Rates.
  • Environment (NEA): Monthly Rainfall, Daily Rainfall, Mean Air Temperature, Relative Humidity, Wind Speed, PSI (24-hour, by region), Dengue Cases, Beach Water Quality.
  • Economy & Finance (MAS, MTI, SingStat): Consumer Price Index (CPI), GDP by Industry, Unemployment Rate, Foreign Reserves, SIBOR rates, MAS Exchange Rates (end-of-period), Total Trade Imports & Exports.
  • Demographics (SingStat, ICA, DOS): Residents by Planning Area & Sex, Residents by Age Group, Births & Deaths, Marriages & Divorces, Population Trends.
  • Education (MOE): General Information of Schools, Subjects Offered by Schools, CCAs Offered.
  • Health (MOH): Weekly Infectious Disease Bulletin, Hospital Bed Capacity.

Browse the full catalogue at data.gov.sg/datasets and paste any resource_id into the actor's dataset_id field.


Use Cases

  • Prop-tech founders & PropertyGuru / 99.co competitors: pull the full HDB Resale Flat Prices history (230K+ rows), join it to your listings, and compute $/sqft, lease-decay-adjusted price, and town-level price indices without scraping URA's PDF reports.
  • Singapore journalists & data desks (ST, CNA, Today, Mothership): paginate the COE bidding history, monthly CPI prints, dengue weekly bulletins, or housing transaction records for stories on cost of living, transport policy, and public health.
  • Fintech & wealth-tech (StashAway, Endowus, Syfe, Saxo): refresh MAS Exchange Rates, SIBOR, and CPI directly from the source feed for in-app charts and macro panels.
  • Academic researchers & think-tanks (IPS, LKYSPP, NUS, NTU, SUTD): reproducible, dated, version-controlled pulls for working papers on housing affordability, transport demand, demographic shifts, and public health.
  • Policy analysts (URA, MTI, EDB partners, consultancies): bulk-pull dataset slices for ministry presentations and regression input without manually exporting CSVs from the portal.
  • Civic-tech & data-journalism hackathons: fast onboarding to Singapore open data for hack projects on dengue, MRT crowding, school admissions, or carpark availability.
  • BI / analytics consultancies: drop rows straight into BigQuery, Snowflake, DuckDB, or Postgres; categorize and tag each row at ingest time using category and dataset_name.
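
As an illustration of the $/sqft calculation mentioned in the prop-tech use case, the sketch below converts floor_area_sqm to square feet and divides resale_price by it. The function name price_per_sqft and the sample row are hypothetical, and the code assumes numeric values may arrive as strings.

```python
SQFT_PER_SQM = 10.7639  # 1 square metre is about 10.7639 square feet


def price_per_sqft(record):
    """Resale price divided by floor area, with the area converted from sqm to sqft."""
    area_sqft = float(record["floor_area_sqm"]) * SQFT_PER_SQM
    return float(record["resale_price"]) / area_sqft


# Hypothetical row using the HDB resale column names listed earlier.
row = {"floor_area_sqm": "100", "resale_price": "538195"}
print(round(price_per_sqft(row), 2))
```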

Quick Start

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# HDB Resale Prices in Tampines, 5-room flats, latest 500 transactions
run = client.actor("nexgendata/sg-datagov-multi-dataset").call(run_input={
    "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
    "filters": {"town": "TAMPINES", "flat_type": "5 ROOM"},
    "limit": 500,
    "dataset_name": "HDB Resale Prices (2017+)",
    "category": "Housing",
})

# Read the records back from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["town"], item["flat_type"], item["floor_area_sqm"], item["resale_price"])

COE bidding history

run = client.actor("nexgendata/sg-datagov-multi-dataset").call(run_input={
    "dataset_id": "d_2d493bbc98a652d0009dd58172e2da80",
    "limit": 1000,
    "category": "Transport",
    "dataset_name": "COE Bidding Results",
})

Returns the 1,000 most recent COE bidding rounds (categories A/B/C/D/E) with premium, bidding_no, vehicle_class, quota, bids_success, bids_received.

Free-text search

run = client.actor("nexgendata/sg-datagov-multi-dataset").call(run_input={
    "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
    "search_query": "Bukit Timah",
    "limit": 200,
    "category": "Housing",
})

The CKAN q= parameter matches across every column of the dataset, so this returns every HDB resale row that mentions Bukit Timah anywhere (town, street, block, etc.).
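
For readers curious what such a request looks like at the CKAN level, here is a rough sketch that builds a datastore_search URL by hand. The endpoint URL is an assumption about data.gov.sg's public API, and build_search_url is a hypothetical helper, not the actor's code; resource_id, q, filters, limit, and offset are standard CKAN datastore_search parameters.

```python
import json
from urllib.parse import urlencode

# Assumed public endpoint; confirm against the data.gov.sg developer docs.
DATASTORE_SEARCH_URL = "https://data.gov.sg/api/action/datastore_search"


def build_search_url(resource_id, q=None, filters=None, limit=100, offset=0):
    """Build a CKAN datastore_search URL for one page of results."""
    params = {"resource_id": resource_id, "limit": limit, "offset": offset}
    if q:
        params["q"] = q  # full-text search across all columns
    if filters:
        # CKAN expects filters as a JSON-encoded object of column -> value.
        params["filters"] = json.dumps(filters)
    return f"{DATASTORE_SEARCH_URL}?{urlencode(params)}"


url = build_search_url(
    "d_8b84c4ee58e3cfc0ece0d773c8ca6abc", q="Bukit Timah", limit=200
)
print(url)
```

The sketch only constructs the URL; sending the request, paging through offsets, and flattening the response are the parts the actor is described as handling for you.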

Inputs

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| dataset_id | string | yes | CKAN resource_id. Format d_<32-hex>. |
| filters | object | no | Column → value equality filters (e.g. {"town": "ANG MO KIO"}). |
| search_query | string | no | Free-text search across all columns. |
| limit | int 1–10,000 | no | Max records (paginated internally in chunks of 1,000). Default 100. |
| dataset_name | string | no | Friendly label stamped on every output row. |
| category | enum | no | One of Transport / Environment / Economy / Demographics / Education / Health / Housing / Finance / Other. |
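
If you assemble run inputs programmatically, a small client-side check can catch malformed values before a run is started and billed. The sketch below mirrors the constraints in the table above; build_run_input is a hypothetical helper, and the assumption that the 32 hex digits are lowercase is based only on the example IDs in this README.

```python
import re

# Assumes lowercase hex, as in the example IDs shown in this README.
RESOURCE_ID_PATTERN = re.compile(r"d_[0-9a-f]{32}")
CATEGORIES = {
    "Transport", "Environment", "Economy", "Demographics",
    "Education", "Health", "Housing", "Finance", "Other",
}


def build_run_input(dataset_id, *, filters=None, search_query=None,
                    limit=100, dataset_name=None, category=None):
    """Validate the documented input fields and return a run_input dict."""
    if not RESOURCE_ID_PATTERN.fullmatch(dataset_id):
        raise ValueError(f"dataset_id must look like d_<32-hex>, got {dataset_id!r}")
    if not 1 <= limit <= 10_000:
        raise ValueError("limit must be between 1 and 10,000")
    if category is not None and category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")

    run_input = {"dataset_id": dataset_id, "limit": limit}
    optional = {
        "filters": filters,
        "search_query": search_query,
        "dataset_name": dataset_name,
        "category": category,
    }
    # Only send the optional fields that were actually provided.
    run_input.update({k: v for k, v in optional.items() if v is not None})
    return run_input


print(build_run_input("d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
                      filters={"town": "ANG MO KIO"}, category="Housing"))
```

The returned dict can be passed as run_input to the ApifyClient call shown in Quick Start.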

Pricing

Pay-per-event, no subscription.

| Event | Price |
| --- | --- |
| Actor start | $0.01 / run |
| data.gov.sg record | $0.02 / record |

Examples:

  • 100-row preview: $2.01
  • 500-row HDB Tampines pull: $10.01
  • 1,000-row COE bidding history: $20.01
  • 10,000-row environmental scan: $200.01
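
The examples above follow directly from the two event prices. A minimal sketch of the arithmetic, using Decimal to avoid floating-point rounding (estimate_cost is a hypothetical helper, not part of the actor):

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")  # charged once per run
PER_RECORD_FEE = Decimal("0.02")   # charged per returned record


def estimate_cost(records, runs=1):
    """Estimated charge in USD for `records` rows spread over `runs` runs."""
    return ACTOR_START_FEE * runs + PER_RECORD_FEE * records


for rows in (100, 500, 1_000, 10_000):
    print(f"{rows:>6} rows: ${estimate_cost(rows)}")
```

Because limit is capped at 10,000 per run, larger pulls need several runs, and each run adds its own start fee to the total.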

For very-high-volume historical pulls (full HDB resale ≈ 230K rows), reach out and we'll quote a bulk arrangement.


Comparison vs the obvious alternatives

| Vendor | Singapore data coverage | Bulk / programmatic | Cost | Schema-stable JSON |
| --- | --- | --- | --- | --- |
| NexGenData sg-datagov-multi-dataset | 1,000+ datasets via CKAN | Yes: JSON, CSV, XLSX out of the box | $0.01 + $0.02/row | Yes |
| data.gov.sg portal (official) | All of them | UI only: CSV per dataset, no bulk endpoint exposed in the UI | Free | N/A (manual) |
| Singstat Table Builder (official) | SingStat subset (economy / demo) | UI faceted, manual export | Free | N/A (manual) |
| EDB Singapore data hub | Curated economic snapshots | Mostly PDF / dashboard | Free | No |
| World Bank Singapore | Macro indicators only | API present, but Singapore-specific micro datasets absent | Free | Partial |
| Statista (Singapore) | Republishes a subset behind a paywall | Limited API | $1,950+/year Pro | Yes |
| CEIC / Trading Economics | Macro + some micro | API with seat licence | ~$5,000+/year Enterprise | Yes |
| Knoema / TheGlobalEconomy | Republished macro | API with quota | $200–$1,000+/mo | Partial |

If you need just one Singapore macro indicator with a 10-year tail, the World Bank API is free. If you need 1,000+ datasets with bulk pulls, filters, and pipeline-ready JSON output — this actor is the right tool.


Sister Actors in the NexGenData Singapore Fleet

Pair this universal wrapper with our specialized Singapore actors when you want richer, opinionated schemas:

| Use case | Actor |
| --- | --- |
| MAS register of licensed banks, FIs, capital-markets services, and payment institutions | sg-mas-financial-institutions |
| HDB resale with town-level enrichment, $/sqft, lease decay | sg-hdb-resale-prices |
| URA private residential transactions (condos, landed) | sg-ura-property-transactions |
| SGX IPO calendar and prospectus links | sg-sgx-ipo-calendar |
| ACRA Singapore company registry lookup | singapore-acra-companies |
| SGX listed-equity screener (Straits Times fundamentals, REITs, banks) | sgx-singapore-stock-screener |
| HK Companies Registry, for cross-border SG↔HK corporate research | hk-companies-registry |

FAQ

Is this legal? Can I redistribute the data? Yes. All data is republished from data.gov.sg under the Singapore Open Data Licence. Attribution is required when you publish derivative work; the actor preserves data_source on every record so you can comply automatically. This actor is not endorsed by or affiliated with the Government of Singapore — it is an independent CKAN client.

How fresh is the data? Each request hits data.gov.sg live, no cached layer. Update cadence depends on the owning agency — HDB resale updates monthly, COE bidding twice a month, weather datasets daily, CPI monthly. The last_updated field on each row carries CKAN's own provenance timestamp.

The columns are different across datasets — how do I write stable ETL? That's the trade-off of a universal wrapper. Pin one dataset per Apify schedule, tag it with dataset_name + category, and treat each schedule as a separate ETL job. The five provenance fields (dataset_id, dataset_name, category, data_source, last_updated) are stable across all datasets and are safe to model as a parent table; the native columns go in a per-dataset child table.
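
One way to read the parent/child advice is sketched below: each record is split into a provenance row and a native-column row that carries dataset_id and _id as join keys. This is only one possible layout, and split_record and the sample record are hypothetical.

```python
# The five provenance fields named in the answer above.
PROVENANCE_FIELDS = (
    "dataset_id", "dataset_name", "category", "data_source", "last_updated",
)


def split_record(record):
    """Return (parent_row, child_row) for one actor output record."""
    parent = {field: record.get(field) for field in PROVENANCE_FIELDS}
    child = {k: v for k, v in record.items() if k not in PROVENANCE_FIELDS}
    # Keep dataset_id on the child row so it can be joined back to its parent.
    child["dataset_id"] = record.get("dataset_id")
    return parent, child


# Hypothetical output record for illustration only.
record = {
    "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
    "dataset_name": "HDB Resale Prices (2017+)",
    "category": "Housing",
    "data_source": "data.gov.sg",
    "last_updated": None,
    "_id": 42,
    "town": "TAMPINES",
    "resale_price": "650000",
}
parent, child = split_record(record)
print(parent)
print(child)
```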

Why does my filter return zero rows? CKAN filters is an exact-match equality predicate. Capitalization and whitespace must match the dataset's actual values. Run {"limit": 5} first with no filter to inspect actual column values, then add the filter. Some datasets use "ANG MO KIO" (with spaces) for town names; others use codes like AMK.
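
A small helper can make that inspection step less error-prone: given an unfiltered preview, it returns the dataset's own spelling of the value you have in mind, ignoring case and extra whitespace. find_exact_values and the sample rows are hypothetical.

```python
def find_exact_values(sample_rows, column, wanted):
    """Return the dataset's own spellings of `wanted` found in `column`."""
    def normalise(value):
        # Collapse runs of whitespace and ignore case for comparison only.
        return " ".join(str(value).split()).casefold()

    target = normalise(wanted)
    matches = {
        row[column]
        for row in sample_rows
        if column in row and normalise(row[column]) == target
    }
    return sorted(matches)


# Hypothetical preview rows from an unfiltered limit: 5 run.
sample_rows = [{"town": "ANG MO KIO"}, {"town": "BEDOK"}, {"town": "ANG MO KIO"}]
print(find_exact_values(sample_rows, "town", "ang mo  kio"))  # ['ANG MO KIO']
```

Use the returned string verbatim as the filter value. An empty list means the sample does not contain the value at all, for example because the dataset uses a code such as AMK.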

Resource IDs sometimes change: what happens then? Singapore agencies occasionally re-publish datasets with a new resource_id. If a query returns zero rows or a 404, look up the dataset's current resource_id at data.gov.sg/datasets and update dataset_id. The dataset name and content survive; only the ID changes.

Are there rate limits? data.gov.sg does not publish a strict rate limit on its CKAN endpoint; in practice it tolerates the actor's 1,000-row chunked pagination comfortably. Each run is capped at 10,000 rows, so larger pulls span several runs; for more than 50K rows in total we recommend spacing runs out by a few minutes.
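
To illustrate what 1,000-row chunked pagination means, the sketch below lists the (offset, page size) pairs needed to cover a given limit. It shows the general technique under the chunk size stated above, not the actor's actual implementation, and plan_chunks is a hypothetical name.

```python
def plan_chunks(limit, chunk_size=1_000):
    """Yield (offset, page_size) pairs that together cover `limit` rows."""
    for offset in range(0, limit, chunk_size):
        # The final page may be shorter than chunk_size.
        yield offset, min(chunk_size, limit - offset)


print(list(plan_chunks(2_500)))  # [(0, 1000), (1000, 1000), (2000, 500)]
```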

Can I get data older than data.gov.sg publishes? No. The data.gov.sg portal is the source of record. For deeper history (e.g. HDB resale transactions from before the official series begins in 1990), there is no programmatic source; those rows simply do not exist in the public CKAN feed.

What output formats are supported? Apify datasets export natively to JSON, CSV, XLSX, RSS, and HTML table. Stream directly into webhooks, Snowflake, BigQuery, DuckDB, Postgres, or any S3-compatible store. Apify publishes apify-client packages for Python and JavaScript/Node.js; any other HTTP client can call the Apify REST API directly.


About NexGenData

NexGenData publishes 200+ buyer-intent actors covering SEC filings, YC alumni, Delaware DOC, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, IPO calendars, ETF holdings, treasury yields, commodity futures, FX rates, crypto, and a deep Singapore / APAC government-data cluster. All pay-per-result, no subscriptions. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b.