GBIF Species & Occurrence API Scraper
Pricing
Pay per event
Extract biodiversity data from the Global Biodiversity Information Facility (GBIF). Dual-mode: species mode retrieves taxonomy, vernacular names, synonyms, and distributions; occurrence mode streams georeferenced observation records for species distribution modelling and conservation analysis.
Developer: BowTiedRaccoon (maintained by Community)
Extract biodiversity data from the Global Biodiversity Information Facility (GBIF) — the world's largest aggregator of species taxonomy and georeferenced occurrence records.
This actor exposes a dual-mode interface:
- Species mode: paginate `species/search`, then fan out to 4 enrichment endpoints per taxon: vernacular names, synonyms, geographic distributions, and descriptions. Produces fully denormalised taxonomy rows.
- Occurrence mode: stream `occurrence/search` results with geo coordinates, collection metadata, and observer info. Year-range chunking automatically bypasses GBIF's 100,000-offset cap for deep pulls.
No API key required. No proxy required. Pure public REST JSON.
Why GBIF?
GBIF indexes 1.4 million Plantae species and billions of georeferenced occurrence records contributed by natural-history museums, herbaria, and citizen-science platforms worldwide. The data is openly licensed (the exact license depends on the contributing dataset; see Notes) and updated continuously.
Who uses this actor:
- Ecological niche / species distribution modellers (SDM/MaxEnt workflows)
- Conservation NGOs building species-range databases
- AI training-data builders (botanical vision datasets, biodiversity LLMs)
- University labs needing bulk taxonomy or occurrence point extracts
- Agtech and GIS / land-use analytics teams
Modes
Species Mode (mode: "species")
Paginates https://api.gbif.org/v1/species/search and emits one row per taxon. With enrich: true (default), each taxon gets 4 additional API calls:
| Sub-resource | What it adds |
|---|---|
| `/species/{key}/vernacularNames` | Common names in all languages |
| `/species/{key}/synonyms` | Synonym names |
| `/species/{key}/distributions` | Geographic range with establishment status |
| `/species/{key}/descriptions` | Habitat, ecology, and conservation notes |
Set enrich: false to skip enrichment for faster bulk taxonomy extraction.
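The per-taxon fan-out can be pictured with a short Python sketch. It is illustrative only and not the actor's actual code: the helper names `enrichment_urls` and `enrich` are hypothetical, the `key` field is assumed to be the taxon key returned by `species/search`, and the HTTP call is passed in as a plain function so that no particular client library is implied.

```python
from typing import Callable, Dict

GBIF_API = "https://api.gbif.org/v1"

# The four sub-resources listed in the table above.
SUB_RESOURCES = ("vernacularNames", "synonyms", "distributions", "descriptions")


def enrichment_urls(taxon_key: int) -> Dict[str, str]:
    """Map each sub-resource name to its URL for one taxon key."""
    return {name: f"{GBIF_API}/species/{taxon_key}/{name}" for name in SUB_RESOURCES}


def enrich(taxon: dict, fetch: Callable[[str], dict]) -> dict:
    """Return a copy of `taxon` with one extra entry per sub-resource.

    `fetch` is any callable that takes a URL and returns parsed JSON.
    It is injected here (an assumption of this sketch) so the example
    does not depend on a specific HTTP library.
    """
    row = dict(taxon)
    for name, url in enrichment_urls(taxon["key"]).items():
        row[name] = fetch(url)
    return row
```

With `enrich: false` the equivalent of `enrich()` is simply skipped, which removes four requests per taxon.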
Occurrence Mode (mode: "occurrence")
Paginates https://api.gbif.org/v1/occurrence/search and emits one row per observation. Each row includes coordinates, collection metadata, observer, date, and basis of record.
100k offset cap: GBIF limits paging on any single occurrence search to the first 100,000 records. For deeper pulls, set yearFrom and yearTo; the actor then chunks by year, running one paginated query per year and concatenating the results. Each year is still subject to the cap, so a 10-year range yields at most about 1M records (10 × 100,000).
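The year-chunking idea can be sketched in Python as a query plan: one chunk per year, and within each year a series of `offset`/`limit` pages that stops at the cap. This is a sketch under stated assumptions, not the actor's implementation. `year`, `offset`, and `limit` are GBIF occurrence search parameters; the function names, the inclusive year range, and the choice to trim the last page so that `offset + limit` never exceeds 100,000 are assumptions made for illustration.

```python
from typing import Dict, Iterator, Optional

PAGE_SIZE = 300        # page size listed under "Rate Limits & Polite Use"
OFFSET_CAP = 100_000   # paging cap described above


def year_chunks(year_from: int, year_to: int) -> Iterator[int]:
    """Yield every year in the inclusive range, one chunk per year."""
    if year_from > year_to:
        raise ValueError("yearFrom must not exceed yearTo")
    yield from range(year_from, year_to + 1)


def page_params(
    year: int,
    total_for_year: int,
    base: Optional[Dict[str, object]] = None,
) -> Iterator[Dict[str, object]]:
    """Yield query parameters for each page of one year's chunk.

    `total_for_year` is the record count reported for that year;
    paging stops at OFFSET_CAP even when more records exist.
    """
    reachable = min(total_for_year, OFFSET_CAP)
    for offset in range(0, reachable, PAGE_SIZE):
        # Trim the final page so offset + limit stays within the cap.
        limit = min(PAGE_SIZE, reachable - offset)
        yield {**(base or {}), "year": year, "offset": offset, "limit": limit}
```

For a year holding 250,000 matching records, this plan produces 334 pages covering exactly the first 100,000; narrowing the query further (for example by country) is the only way to reach the rest through search.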
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | string | `"species"` | `"species"` or `"occurrence"` |
| `query` | string | - | Free-text search (e.g. `"Acer saccharum"`, `"Quercus"`) |
| `higherTaxonKey` | integer | - | GBIF backbone key for a higher taxon (6 = Plantae, 1 = Animalia) |
| `rank` | string | - | Species-mode filter: `SPECIES`, `GENUS`, `FAMILY`, etc. |
| `taxonomicStatus` | string | - | Species-mode filter: `ACCEPTED`, `SYNONYM`, `DOUBTFUL` |
| `datasetKey` | string | - | Restrict to a specific GBIF dataset UUID |
| `countryCode` | string | - | Occurrence-mode filter: ISO 3166-1 alpha-2 country code (e.g. `"US"`) |
| `yearFrom` | integer | - | Occurrence-mode: earliest year; enables year-chunking |
| `yearTo` | integer | - | Occurrence-mode: latest year |
| `hasCoordinate` | boolean | `false` | Occurrence-mode: only return records with coordinates |
| `enrich` | boolean | `true` | Species-mode: fetch vernacular names, synonyms, distributions, descriptions |
| `maxItems` | integer | `15` | Maximum records to return (0 = unlimited, requires a filter) |
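A client-side pre-flight check can catch obviously invalid input before a run is started. The Python sketch below applies the defaults from the table and enforces the "0 = unlimited, requires a filter" rule. It is a hypothetical helper, not the actor's own validation: the function name `validate_input` and the exact set of fields counted as a "filter" are assumptions.

```python
from typing import Any, Dict

# Defaults taken from the input table above.
DEFAULTS: Dict[str, Any] = {
    "mode": "species",
    "hasCoordinate": False,
    "enrich": True,
    "maxItems": 15,
}

# Assumption: any of these fields counts as a "filter" for unlimited runs.
FILTER_FIELDS = (
    "query", "higherTaxonKey", "rank", "taxonomicStatus",
    "datasetKey", "countryCode", "yearFrom", "yearTo",
)


def validate_input(run_input: Dict[str, Any]) -> Dict[str, Any]:
    """Merge defaults into `run_input` and raise ValueError on bad input."""
    merged = {**DEFAULTS, **run_input}
    if merged["mode"] not in ("species", "occurrence"):
        raise ValueError('mode must be "species" or "occurrence"')
    if merged["maxItems"] < 0:
        raise ValueError("maxItems must be 0 (unlimited) or positive")
    has_filter = any(merged.get(f) not in (None, "") for f in FILTER_FIELDS)
    if merged["maxItems"] == 0 and not has_filter:
        raise ValueError("maxItems = 0 (unlimited) requires at least one filter")
    year_from, year_to = merged.get("yearFrom"), merged.get("yearTo")
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("yearFrom must not exceed yearTo")
    return merged
```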
Output
Each record is emitted as a flat JSON object. Fields unused by the current mode are null.
Species record example
```json
{
  "record_type": "species",
  "gbif_key": 3189834,
  "nub_key": 3189834,
  "scientific_name": "Acer saccharum Marshall",
  "canonical_name": "Acer saccharum",
  "authorship": "Marshall",
  "rank": "SPECIES",
  "taxonomic_status": "ACCEPTED",
  "kingdom": "Plantae",
  "phylum": "Tracheophyta",
  "class": "Magnoliopsida",
  "order": "Sapindales",
  "family": "Sapindaceae",
  "genus": "Acer",
  "species": "Acer saccharum",
  "dataset_key": "d7dddbf4-2cf0-4f39-9b2a-bb099caae36c",
  "vernacular_names": "Sugar Maple [en] | Érable à sucre [fr] | Zuckerahorn [de]",
  "synonyms": "Acer saccharophorum K.Koch | Saccharodendron saccharum (Marshall) Nieuwl.",
  "distributions": "United States (NATIVE) | Canada (NATIVE)",
  "descriptions": "[habitat] Mesic deciduous forests, often with beech and yellow birch",
  "gbif_url": "https://www.gbif.org/species/3189834",
  "scraped_at": "2026-05-18T10:00:00.000Z"
}
```
Occurrence record example
```json
{
  "record_type": "occurrence",
  "gbif_key": 3189834,
  "scientific_name": "Acer saccharum Marshall",
  "canonical_name": "Acer saccharum",
  "rank": "SPECIES",
  "kingdom": "Plantae",
  "family": "Sapindaceae",
  "occurrence_key": 4530178441,
  "decimal_latitude": 44.3601,
  "decimal_longitude": -72.6519,
  "country_code": "US",
  "state_province": "Vermont",
  "event_date": "2023-09-24",
  "basis_of_record": "HUMAN_OBSERVATION",
  "recorded_by": "John Doe",
  "institution_code": "iNaturalist",
  "coordinate_uncertainty_m": 12,
  "gbif_url": "https://www.gbif.org/occurrence/4530178441",
  "scraped_at": "2026-05-18T10:00:00.000Z"
}
```
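Because multi-valued fields are flattened into pipe-delimited strings and coordinates arrive as plain numbers, downstream code usually needs a small amount of unpacking. The Python sketch below shows one way to do that, assuming the `name [lang] | name [lang]` layout and the field names seen in the two example records. The helper names `split_vernacular` and `to_geojson_feature` are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Matches "Sugar Maple [en]" into name="Sugar Maple", lang="en".
_NAME_LANG = re.compile(r"^(?P<name>.*?)\s*\[(?P<lang>[^\]]+)\]$")


def split_vernacular(field: Optional[str]) -> List[Tuple[str, Optional[str]]]:
    """Split a pipe-delimited vernacular_names string into (name, lang) pairs."""
    if not field:
        return []
    pairs: List[Tuple[str, Optional[str]]] = []
    for part in field.split(" | "):
        part = part.strip()
        match = _NAME_LANG.match(part)
        # Keep entries without a language tag rather than dropping them.
        pairs.append((match["name"], match["lang"]) if match else (part, None))
    return pairs


def to_geojson_feature(record: dict) -> Optional[dict]:
    """Convert one occurrence record into a GeoJSON Point feature.

    Returns None when either coordinate is missing. GeoJSON orders
    coordinates as [longitude, latitude].
    """
    lat = record.get("decimal_latitude")
    lon = record.get("decimal_longitude")
    if lat is None or lon is None:
        return None
    props = {
        k: v for k, v in record.items()
        if k not in ("decimal_latitude", "decimal_longitude")
    }
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": props,
    }
```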
Usage Examples
All Plantae species (ACCEPTED, no enrichment, fast)
```json
{
  "mode": "species",
  "higherTaxonKey": 6,
  "rank": "SPECIES",
  "taxonomicStatus": "ACCEPTED",
  "enrich": false,
  "maxItems": 10000
}
```
Sugar Maple occurrences in the US with coordinates (2020-2024)
```json
{
  "mode": "occurrence",
  "query": "Acer saccharum",
  "countryCode": "US",
  "hasCoordinate": true,
  "yearFrom": 2020,
  "yearTo": 2024,
  "maxItems": 0
}
```
Quercus genus with full enrichment
```json
{
  "mode": "species",
  "query": "Quercus",
  "rank": "SPECIES",
  "enrich": true,
  "maxItems": 500
}
```
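Any of the inputs above can also be submitted programmatically. The Python sketch below assumes the official `apify-client` package and that `call()` returns the run as a dict with a `defaultDatasetId` key. The actor ID is a placeholder because this page does not state it, and `run_and_collect` is a hypothetical helper name. The client is passed in as an argument so the function itself has no import-time dependency.

```python
from typing import Any, Dict, List

# Placeholder only: copy the real actor ID from the actor's Apify page.
ACTOR_ID = "<username>/<actor-name>"


def run_and_collect(
    client: Any,
    run_input: Dict[str, Any],
    actor_id: str = ACTOR_ID,
) -> List[dict]:
    """Start the actor, wait for it to finish, and return its dataset items.

    `client` is expected to behave like `apify_client.ApifyClient`.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example usage (requires `pip install apify-client` and an API token):
#
#   from apify_client import ApifyClient
#   items = run_and_collect(
#       ApifyClient("<APIFY_TOKEN>"),
#       {"mode": "species", "query": "Quercus", "rank": "SPECIES", "maxItems": 500},
#   )
```

For unlimited runs (`maxItems: 0`), iterating the dataset lazily instead of building a list keeps memory use flat.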
Rate Limits & Polite Use
GBIF does not publish a hard rate limit but asks heavy users to be considerate. This actor uses:
- 300 records per page (GBIF maximum for occurrence search)
- 200 ms delay between pages
- Up to 3 concurrent enrichment calls per species batch
- Identifies itself via `User-Agent: OrbLabs/gbif-species-occurrence-api-scraper`
For very large runs (millions of records), consider using the GBIF download API for a pre-packaged export instead.
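The pacing settings listed above (300 records per page, a 200 ms pause between pages) amount to a simple paging loop. The Python sketch below shows that loop in isolation. It is an illustration rather than the actor's code: `paged` and `fetch_page` are hypothetical names, and both the page fetcher and the sleep function are injected so the loop can be exercised without network access.

```python
import time
from typing import Callable, Iterator, List


def paged(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 300,
    delay_s: float = 0.2,
    max_items: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield records page by page, pausing `delay_s` between requests.

    `fetch_page(offset, limit)` returns one page of records.
    `max_items = 0` means no limit, mirroring the input table.
    """
    emitted = 0
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        for record in page:
            yield record
            emitted += 1
            if max_items and emitted >= max_items:
                return
        # A short page means the result set is exhausted.
        if len(page) < page_size:
            return
        offset += page_size
        sleep(delay_s)
```

The pause happens only between pages, so a run that fits in a single page incurs no delay at all.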
Pricing
Pay per event (PPE), charged per result. You pay only for records actually extracted. No charge for idle time spent waiting on GBIF's API.
Notes
- GBIF data is licensed under CC0, CC BY 4.0, or CC BY-NC 4.0, depending on the contributing dataset. Always check individual dataset licenses before commercial use.
- The GBIF backbone (`datasetKey: d7dddbf4-2cf0-4f39-9b2a-bb099caae36c`) is the authoritative taxonomy for 49M+ taxa.
- Occurrence records are georeferenced observations contributed by iNaturalist, eBird, natural history collections, and citizen science platforms.