🇸🇬 Singapore data.gov.sg Scraper — HDB, COE, Singstat
Pricing
from $20.00 / 1,000 singapore data.gov.sg records
Query any of 1000+ Singapore government open datasets (data.gov.sg) - transport, environment, economy, demographics, education, health, housing. One actor, one resource_id, official CKAN API.
Developer: NexGenData (maintained by Community)
🇸🇬 Singapore data.gov.sg Multi-Dataset Scraper — HDB Resale, COE, Singstat, MAS
A universal wrapper for data.gov.sg — the official Singapore Government open-data platform. One actor, one input field (dataset_id), and you can pull from 1,000+ official datasets spanning transport, environment, economy, demographics, education, health, housing, and finance.
This is the fastest path to programmatic, paginated, schema-tagged access for journalists, researchers, policy analysts, fintech builders, prop-tech founders, and data scientists working with Singapore public data. The actor wraps the CKAN datastore_search API so you don't have to build pagination, filtering, full-text search, or rate-limit handling yourself.
No authentication. No API keys. No anti-bot risk. This is an official Government of Singapore API, free for public use under the Singapore Open Data Licence.
Why this beats the data.gov.sg portal, Singstat Table Builder, EDB & paid statistics vendors
- The data.gov.sg portal is browse-only. It gives you a CSV download per dataset and a manual UI. No bulk pulls, no scheduled refresh, no programmatic filter, no pipeline-ready output. This actor fixes that.
- Singstat Table Builder is a UI, not an API. You click through facets to compose a table, then export. There is no clean CKAN-style endpoint behind it. Our actor goes straight to the underlying CKAN datastore_search.
- World Bank, Statista, Knoema, Trading Economics, CEIC all republish Singapore data behind $200–$5,000/month paywalls — the same data the Government of Singapore publishes free. You're paying them to copy a free firehose.
- Pay-per-row, not per-month. $0.01 per run start + $0.02 per record. A 500-row HDB Tampines pull is $10.01. A 10,000-row environmental scan is $200.01. Compare to the $129–$799/month "Singapore data" SKUs from paid statistics vendors.
- CKAN filtering, full-text search, and 1,000-row chunked pagination are all handled. Pass {"town": "TAMPINES"} to filter HDB resale. Pass search_query: "Bukit Timah" for free-text search across all columns. Set limit: 10000 to pull the full historical slice.
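The chunked pagination mentioned in the last bullet can be pictured roughly as follows. This is a minimal sketch, not the actor's actual code: the endpoint URL, the `build_params` and `iter_records` names, and the injected `fetch_page` callable are all assumptions made for illustration. Only the datastore_search action, the `filters` and `q` parameters, and the 1,000-row chunk size come from the description above.

```python
import json

# Assumed public endpoint for the CKAN-style API; shown for reference only.
DATASTORE_SEARCH_URL = "https://data.gov.sg/api/action/datastore_search"
CHUNK_SIZE = 1000  # page size described in the bullet above


def build_params(resource_id, offset, page_limit, filters=None, search_query=None):
    """Build the query parameters for one datastore_search page."""
    params = {"resource_id": resource_id, "offset": offset, "limit": page_limit}
    if filters:
        params["filters"] = json.dumps(filters)  # CKAN takes a JSON-encoded object
    if search_query:
        params["q"] = search_query  # free-text search across columns
    return params


def iter_records(fetch_page, resource_id, limit, filters=None, search_query=None):
    """Yield up to `limit` records, asking for at most CHUNK_SIZE rows per page.

    `fetch_page(params)` is a hypothetical callable that performs the HTTP GET
    against DATASTORE_SEARCH_URL and returns the list under result.records.
    """
    offset = 0
    while offset < limit:
        page_limit = min(CHUNK_SIZE, limit - offset)
        records = fetch_page(
            build_params(resource_id, offset, page_limit, filters, search_query)
        )
        yield from records
        if len(records) < page_limit:  # short page means the dataset is exhausted
            break
        offset += len(records)
```

Passing the HTTP call in as `fetch_page` keeps the paging logic testable offline; in real use it could wrap any HTTP client.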
What You Get
Every record returned by the actor is flattened and tagged with provenance fields, then unioned with whatever the underlying dataset's native columns are.
| Field | Purpose |
|---|---|
| dataset_id | The CKAN resource_id you queried (e.g. d_8b84c4ee58e3cfc0ece0d773c8ca6abc). |
| dataset_name | Friendly label stamped on every row (e.g. HDB Resale Prices 2017+). |
| category | One of Transport, Environment, Economy, Demographics, Education, Health, Housing, Finance, Other. |
| data_source | Always data.gov.sg (CKAN datastore_search); preserved for attribution. |
| last_updated | Provenance timestamp from CKAN _links.start. |
| _id | CKAN row primary key. |
| …native columns… | Every column the underlying dataset publishes (e.g. for HDB resale: town, flat_type, block, street_name, storey_range, floor_area_sqm, flat_model, lease_commence_date, remaining_lease, resale_price, month). |
The native columns vary per dataset — that's the trade-off of a universal wrapper. Run with limit: 5 first to inspect the exact shape before wiring a downstream pipeline.
Popular datasets you can plug straight in
- Housing (HDB / URA) — HDB Resale Flat Prices (1990 → present, split across 5 vintage tables), HDB Property Info, HDB Carpark Info.
- Transport (LTA) — COE Bidding Results, Annual Motor Vehicle Population, Monthly Motor Vehicle Registrations, Public Transport Ridership, Bus Stops, MRT/LRT Station Codes, ERP Rates.
- Environment (NEA) — Monthly Rainfall, Daily Rainfall, Mean Air Temperature, Relative Humidity, Wind Speed, PSI (24-hour, by region), Dengue Cases, Beach Water Quality.
- Economy & Finance (MAS, MTI, SingStat) — Consumer Price Index (CPI), GDP by Industry, Unemployment Rate, Foreign Reserves, SIBOR rates, MAS Exchange Rates (end-of-period), Total Trade Imports & Exports.
- Demographics (SingStat, ICA, DOS) — Residents by Planning Area & Sex, Residents by Age Group, Births & Deaths, Marriages & Divorces, Population Trends.
- Education (MOE) — General Information of Schools, Subjects Offered by Schools, CCAs Offered.
- Health (MOH) — Weekly Infectious Disease Bulletin, Hospital Bed Capacity.
Browse the full catalogue at data.gov.sg/datasets and paste any resource_id into the actor's dataset_id field.
Use Cases
- Prop-tech founders & PropertyGuru / 99.co competitors — pull the full HDB Resale Flat Prices history (230K+ rows) and join to your listings, compute $/sqft, lease-decay-adjusted price, and town-level price indices without scraping URA's PDF reports.
- Singapore journalists & data desks (ST, CNA, Today, MOTHERSHIP) — paginate the COE bidding history, quarterly CPI prints, dengue weekly bulletins, or housing transaction records for stories on cost of living, transport policy, and public health.
- Fintech & wealth-tech (StashAway, Endowus, Syfe, Saxo) — refresh MAS Exchange Rates, SIBOR, and CPI directly from the source feed for in-app charts and macro panels.
- Academic researchers & think-tanks (IPS, LKYSPP, NUS, NTU, SUTD) — reproducible, dated, version-controlled pulls for working papers on housing affordability, transport demand, demographic shifts, and public health.
- Policy analysts (URA, MTI, EDB partners, consultancies) — bulk-pull dataset slices for ministry presentations and regression input without manually exporting CSVs from the portal.
- Civic-tech & data-journalism hackathons — fast onboarding to Singapore open data for hack projects on dengue, MRT crowding, school admissions, or carpark availability.
- BI / analytics consultancies: drop straight into BigQuery, Snowflake, DuckDB, or Postgres; categorize and tag each row at ingest time using category and dataset_name.
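As an illustration of the $/sqft calculation in the prop-tech use case, here is a minimal sketch. It assumes the HDB resale columns floor_area_sqm and resale_price listed earlier, and that values may arrive as strings. The `price_per_sqft` helper and the sample row are made up for the example.

```python
SQFT_PER_SQM = 10.7639  # 1 square metre is roughly 10.7639 square feet


def price_per_sqft(record):
    """Resale price divided by floor area, converted from sqm to sqft."""
    # Values may arrive as strings, so coerce before dividing.
    area_sqft = float(record["floor_area_sqm"]) * SQFT_PER_SQM
    return float(record["resale_price"]) / area_sqft


# Made-up row for illustration; real rows come from the actor's output dataset.
sample = {"town": "TAMPINES", "floor_area_sqm": "100", "resale_price": "500000"}
psf = price_per_sqft(sample)
print(f"{psf:.2f} per sqft")
```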
Quick Start
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# HDB Resale Prices in Tampines, 5-room flats, latest 500 transactions
run = client.actor("nexgendata/sg-datagov-multi-dataset").call(run_input={
    "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
    "filters": {"town": "TAMPINES", "flat_type": "5 ROOM"},
    "limit": 500,
    "dataset_name": "HDB Resale Prices (2017+)",
    "category": "Housing",
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["town"], item["flat_type"], item["floor_area_sqm"], item["resale_price"])
```
COE bidding history
```python
run = client.actor("nexgendata/sg-datagov-multi-dataset").call(run_input={
    "dataset_id": "d_2d493bbc98a652d0009dd58172e2da80",
    "limit": 1000,
    "category": "Transport",
    "dataset_name": "COE Bidding Results",
})
```
Returns the 1,000 most recent COE bidding records (one row per vehicle category A/B/C/D/E per bidding exercise) with premium, bidding_no, vehicle_class, quota, bids_success, bids_received.
Free-text search
```python
run = client.actor("nexgendata/sg-datagov-multi-dataset").call(run_input={
    "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
    "search_query": "Bukit Timah",
    "limit": 200,
    "category": "Housing",
})
```
The CKAN q= parameter matches across every column of the dataset, so this returns every HDB resale row that mentions Bukit Timah anywhere (town, street, block, etc.).
Inputs
| Field | Type | Required | Description |
|---|---|---|---|
| dataset_id | string | yes | CKAN resource_id. Format d_<32-hex>. |
| filters | object | no | Column → value equality filters (e.g. {"town": "ANG MO KIO"}). |
| search_query | string | no | Free-text search across all columns. |
| limit | int 1–10,000 | no | Max records (paginated internally in chunks of 1,000). Default 100. |
| dataset_name | string | no | Friendly label stamped on every output row. |
| category | enum | no | One of Transport / Environment / Economy / Demographics / Education / Health / Housing / Finance / Other. |
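Because every record is billed, it can be worth checking the input locally before starting a run. The sketch below mirrors the constraints in the table. The `validate_run_input` helper is hypothetical and not part of the actor, and the lowercase-hex pattern is an assumption based on the example IDs shown in this README.

```python
import re

# Assumption: IDs are "d_" followed by 32 lowercase hex characters, as in the examples.
DATASET_ID_RE = re.compile(r"d_[0-9a-f]{32}")
CATEGORIES = {
    "Transport", "Environment", "Economy", "Demographics",
    "Education", "Health", "Housing", "Finance", "Other",
}


def validate_run_input(run_input):
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []

    dataset_id = run_input.get("dataset_id")
    if not isinstance(dataset_id, str) or not DATASET_ID_RE.fullmatch(dataset_id):
        problems.append("dataset_id must look like d_<32 hex characters>")

    limit = run_input.get("limit", 100)  # documented default is 100
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 10_000:
        problems.append("limit must be an integer between 1 and 10,000")

    filters = run_input.get("filters")
    if filters is not None and not isinstance(filters, dict):
        problems.append("filters must be an object of column -> value pairs")

    category = run_input.get("category")
    if category is not None and category not in CATEGORIES:
        problems.append("category must be one of: " + ", ".join(sorted(CATEGORIES)))

    return problems
```

For example, `validate_run_input({"dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc", "limit": 500})` returns an empty list, while a limit of 50000 produces one problem message.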
Pricing
Pay-per-event, no subscription.
| Event | Price |
|---|---|
| Actor start | $0.01 / run |
| data.gov.sg record | $0.02 / record |
Examples:
- 100-row preview: $2.01
- 500-row HDB Tampines pull: $10.01
- 1,000-row COE bidding history: $20.01
- 10,000-row environmental scan: $200.01
For very-high-volume historical pulls (full HDB resale ≈ 230K rows), reach out and we'll quote a bulk arrangement.
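The example figures above follow directly from the two event prices. A small sketch of that arithmetic, using Decimal to avoid floating-point rounding in currency values; the `estimate_cost` helper is illustrative only and covers a single run:

```python
from decimal import Decimal

START_FEE = Decimal("0.01")   # per actor start
PER_RECORD = Decimal("0.02")  # per data.gov.sg record


def estimate_cost(records):
    """Estimated USD cost of one run that returns `records` rows."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return START_FEE + PER_RECORD * records


for n in (100, 500, 1000, 10_000):
    print(f"{n:>6} rows: ${estimate_cost(n)}")
```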
Comparison vs the obvious alternatives
| Vendor | Singapore data coverage | Bulk / programmatic | Cost | Schema-stable JSON |
|---|---|---|---|---|
| NexGenData sg-datagov-multi-dataset | 1,000+ datasets via CKAN | Yes — JSON, CSV, XLSX out of the box | $0.01 + $0.02/row | Yes |
| data.gov.sg portal (official) | All of them | UI only — CSV per dataset, no bulk endpoint exposed in the UI | Free | N/A (manual) |
| Singstat Table Builder (official) | SingStat subset (economy / demo) | UI faceted, manual export | Free | N/A (manual) |
| EDB Singapore data hub | Curated economic snapshots | Mostly PDF / dashboard | Free | No |
| World Bank Singapore | Macro indicators only | API present, but Singapore-specific micro datasets absent | Free | Partial |
| Statista — Singapore | Republishes a subset behind a paywall | Limited API | $1,950+/year Pro | Yes |
| CEIC / Trading Economics | Macro + some micro | API with seat licence | ~$5,000+/year Enterprise | Yes |
| Knoema / TheGlobalEconomy | Republished macro | API with quota | $200–$1,000+/mo | Partial |
If you need just one Singapore macro indicator with a 10-year tail, the World Bank API is free. If you need 1,000+ datasets with bulk pulls, filters, and pipeline-ready JSON output — this actor is the right tool.
Sister Actors in the NexGenData Singapore Fleet
Pair this universal wrapper with our specialized Singapore actors when you want richer, opinionated schemas:
| Use case | Actor |
|---|---|
| MAS register of licensed banks, FIs, capital-markets services, and payment institutions | sg-mas-financial-institutions |
| HDB resale with town-level enrichment, $/sqft, lease decay | sg-hdb-resale-prices |
| URA private residential transactions (condos, landed) | sg-ura-property-transactions |
| SGX IPO calendar and prospectus links | sg-sgx-ipo-calendar |
| ACRA Singapore company registry lookup | singapore-acra-companies |
| SGX listed-equity screener (Straits Times fundamentals, REITs, banks) | sgx-singapore-stock-screener |
| HK Companies Registry — cross-border SG↔HK corporate research | hk-companies-registry |
FAQ
Is this legal? Can I redistribute the data?
Yes. All data is republished from data.gov.sg under the Singapore Open Data Licence. Attribution is required when you publish derivative work; the actor preserves data_source on every record so you can comply automatically. This actor is not endorsed by or affiliated with the Government of Singapore — it is an independent CKAN client.
How fresh is the data?
Each request hits data.gov.sg live, no cached layer. Update cadence depends on the owning agency — HDB resale updates monthly, COE bidding twice a month, weather datasets daily, CPI monthly. The last_updated field on each row carries CKAN's own provenance timestamp.
The columns are different across datasets — how do I write stable ETL?
That's the trade-off of a universal wrapper. Pin one dataset per Apify schedule, tag it with dataset_name + category, and treat each schedule as a separate ETL job. The five provenance fields (dataset_id, dataset_name, category, data_source, last_updated) are stable across all datasets and are safe to model as a parent table; the native columns go in a per-dataset child table.
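One way to picture the parent/child split described above is shown below. This is a sketch: the `split_record` helper and the sample row are hypothetical. The five field names are the provenance fields listed in this README, and everything else, including _id, is treated as dataset-specific.

```python
PROVENANCE_FIELDS = (
    "dataset_id", "dataset_name", "category", "data_source", "last_updated",
)


def split_record(item):
    """Split one output row into (provenance, native) dictionaries."""
    parent = {key: item.get(key) for key in PROVENANCE_FIELDS}
    child = {key: value for key, value in item.items() if key not in PROVENANCE_FIELDS}
    return parent, child


# Made-up row for illustration only.
row = {
    "dataset_id": "d_8b84c4ee58e3cfc0ece0d773c8ca6abc",
    "dataset_name": "HDB Resale Prices (2017+)",
    "category": "Housing",
    "data_source": "data.gov.sg (CKAN datastore_search)",
    "last_updated": None,
    "_id": 1,
    "town": "TAMPINES",
    "resale_price": "500000",
}
parent, child = split_record(row)
```

The parent dictionary has the same keys for every dataset, so it can map onto one shared table; the child dictionary carries whatever columns the pinned dataset publishes.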
Why does my filter return zero rows?
The CKAN filters parameter is an exact-match equality predicate. Capitalization and whitespace must match the dataset's actual values. Run {"limit": 5} first with no filter to inspect actual column values, then add the filter. Some datasets use "ANG MO KIO" (with spaces) for town names; others use codes like AMK.
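A quick way to apply that advice is to compare the intended filter against values seen in an unfiltered preview. The helpers below are hypothetical and only look at the rows you pass in, so a value missing from a small preview may still exist further into the dataset.

```python
from collections import Counter


def observed_values(records, column):
    """Count the distinct values seen for `column` in the preview rows."""
    return Counter(str(row[column]) for row in records if column in row)


def check_filter(records, filters):
    """Map each filter column to True if its value appears verbatim in the preview."""
    return {
        column: str(value) in observed_values(records, column)
        for column, value in filters.items()
    }


# Made-up preview rows for illustration.
preview = [{"town": "ANG MO KIO"}, {"town": "BEDOK"}, {"town": "ANG MO KIO"}]
print(check_filter(preview, {"town": "Ang Mo Kio"}))  # casing differs from the data
print(check_filter(preview, {"town": "ANG MO KIO"}))
```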
Resource IDs sometimes change — what happens then?
Singapore agencies occasionally re-publish datasets with a new resource_id. If a query returns zero rows or a 404, confirm against data.gov.sg/datasets. The dataset name and content survive; only the ID changes.
Are there rate limits?
data.gov.sg does not publish a strict rate limit on its CKAN endpoint; in practice it tolerates the actor's 1,000-row chunked pagination comfortably. For pulls > 50K rows we recommend spacing runs out by a few minutes.
Can I get data older than data.gov.sg publishes?
No. The data.gov.sg portal is the source of record. For deeper history (e.g. HDB resale transactions from before the official series begins in 1990), there is no programmatic source; those rows do not exist in the public CKAN feed.
What output formats are supported?
Apify dataset native formats: JSON, CSV, XLSX, RSS, HTML table. Stream directly into webhooks, Snowflake, BigQuery, DuckDB, Postgres, or any S3-compatible store. Apify client libraries are available for Python (apify_client) and Node.js, and the Apify REST API works from any HTTP client.
About NexGenData
NexGenData publishes 200+ buyer-intent actors covering SEC filings, YC alumni, Delaware DOC, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, IPO calendars, ETF holdings, treasury yields, commodity futures, FX rates, crypto, and a deep Singapore / APAC government-data cluster. All pay-per-result, no subscriptions. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b.