Paleobiology Database Fossils Scraper
Pricing
from $6.00 / 1,000 results
Search the Paleobiology Database by taxon name and pull every fossil occurrence beneath it. Returns taxon, rank, occurrence and collection IDs, geologic interval, early and late age in millions of years, country, coordinates, and formation. Filter by interval or country.
Developer
ParseForge
Maintained by Community
🦴 Paleobiology Database Fossils Scraper
🚀 Pull fossil occurrence records in seconds. Search any taxon and get back where it was found, how old it is, and which collection it came from, straight from the Paleobiology Database (PBDB).
🕒 Last updated: 2026-06-05 · 📊 24 fields per record · Global coverage · Keyless public data service
The Paleobiology Database is a public, scientist-curated archive of the fossil record, holding millions of occurrence records contributed by paleontologists worldwide. This Actor queries the PBDB data service by taxonomic name and returns clean, structured fossil occurrence records ready for analysis.
Coverage: Every taxonomic level is searchable, from a single genus like Tyrannosaurus up to an entire class like Mammalia. Each occurrence carries its taxon identification, geologic age in millions of years, country and coordinates, and the rock formation it was recovered from. Filter by geologic interval (period, epoch, or age) and by country to focus a query.
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Paleontologists and geologists | Build occurrence datasets for a taxon or clade |
| Researchers and students | Map fossil distribution across time and place |
| Data scientists and educators | Feed biodiversity and macroevolution models |
| Museum and collection staff | Cross-reference specimens with literature |
📋 What the Paleobiology Database Fossils Scraper does
It calls the PBDB occurrences endpoint for the taxon name you provide and returns one row per fossil occurrence. For every occurrence it captures the taxon name and rank, the occurrence and collection identifiers, the geologic interval and early/late age in millions of years (Ma), the reference identifier, the country, latitude and longitude, and the rock formation. Optional interval and country filters narrow the search.
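To make the lookup described above concrete, here is a minimal sketch of how such a query URL could be assembled. The exact request this Actor sends is not documented here, so the endpoint path, the parameter names (`base_name`, `interval`, `cc`, `limit`, `show`, `vocab`), and the chosen `show` blocks are assumptions based on the public PBDB data service, and `build_occurrence_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL of the public PBDB data service (version 1.2).
PBDB_OCCS_URL = "https://paleobiodb.org/data1.2/occs/list.json"


def build_occurrence_url(base_name, interval=None, country=None, limit=None):
    """Build an occurrence query URL for a taxon name and optional filters.

    Hypothetical helper: parameter names are assumptions about the PBDB
    data service, not a description of this Actor's internals.
    """
    params = {
        "base_name": base_name,            # hierarchical: includes all child taxa
        "show": "class,coords,loc,strat",  # assumed extra output blocks
        "vocab": "pbdb",                   # assumed: request full field names
    }
    if interval:
        params["interval"] = interval      # e.g. "Cretaceous"
    if country:
        params["cc"] = country             # ISO country code, e.g. "US"
    if limit is not None:
        params["limit"] = limit            # cap on returned records
    return f"{PBDB_OCCS_URL}?{urlencode(params)}"


url = build_occurrence_url("Dinosauria", interval="Cretaceous", country="US", limit=200)
```

Filters that are left unset are simply omitted from the query string, which mirrors how the optional inputs behave.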
🎬 Full Demo (🚧 Coming soon)
⚙️ Input
| Field | Type | Required | Description |
|---|---|---|---|
| `baseName` | string | No (defaults to `Tyrannosaurus`) | Taxonomic name to search. Hierarchical, so a genus, family, or class returns every occurrence beneath it. |
| `interval` | string | No | Named geologic interval such as Cretaceous, Maastrichtian, or Miocene. |
| `country` | select | No | Restrict to one country by ISO code (for example US, CA, MN). |
| `maxItems` | integer | No | Maximum number of records to return. Free users are limited to 10; paid users can request up to 1,000,000. |
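The input fields in the table above can be assembled and sanity-checked before a run. The sketch below is one way to do that; `make_run_input` is a hypothetical helper, and the two-letter country check is an assumption drawn from the ISO code examples in the table rather than a rule the Actor is known to enforce.

```python
def make_run_input(base_name="Tyrannosaurus", interval=None, country=None, max_items=None):
    """Build an input object using the field names from the input table.

    Hypothetical helper: the validation rules are illustrative assumptions.
    """
    name = base_name.strip()
    if not name:
        raise ValueError("baseName must not be empty")
    run_input = {"baseName": name}

    if interval:
        run_input["interval"] = interval.strip()

    if country:
        code = country.strip().upper()
        # Assumption: country codes are two-letter ISO codes such as US or MN.
        if len(code) != 2 or not code.isalpha():
            raise ValueError(f"country must be a two-letter code, got {country!r}")
        run_input["country"] = code

    if max_items is not None:
        # The documented upper bound for paid users is 1,000,000.
        if not 1 <= max_items <= 1_000_000:
            raise ValueError("maxItems must be between 1 and 1,000,000")
        run_input["maxItems"] = max_items

    return run_input


example = make_run_input("Dinosauria", interval="Cretaceous", country="us", max_items=200)
```

Optional fields that are not set are left out of the object entirely instead of being sent as empty strings.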
Example 1 — all Tyrannosaurus occurrences:
{"baseName": "Tyrannosaurus","maxItems": 50}
Example 2 — Cretaceous dinosaurs from the United States:
{"baseName": "Dinosauria","interval": "Cretaceous","country": "US","maxItems": 200}
⚠️ Good to Know: PBDB queries are hierarchical. A broad name like `Mammalia` or `Trilobita` can match tens of thousands of occurrences, so set `maxItems` to keep runs focused. Geographic fields such as formation, group, member, and county are filled only when the original collection recorded them, so they may be empty on some occurrences.
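Because fields such as formation, group, member, and county can be missing, downstream code should not assume every key is present. The following sketch, which assumes records arrive as plain dictionaries keyed by the output field names, writes them to CSV and leaves absent fields as empty cells. `records_to_csv` and the column selection are illustrative.

```python
import csv
import io

# Illustrative column selection; any subset of the output fields works.
COLUMNS = ["taxonName", "occurrenceId", "country", "state",
           "county", "formation", "geologicGroup", "member"]


def records_to_csv(records, columns=COLUMNS):
    """Serialize records to CSV text, leaving missing fields blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        restval="",             # value written when a record lacks a column
        extrasaction="ignore",  # skip keys that are not in `columns`
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = records_to_csv([
    {"taxonName": "Tyrannosaurus rex", "occurrenceId": "occ:139292",
     "country": "CA", "state": "Alberta", "formation": "Scollard", "lat": 51.906399},
])
```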
📊 Output
| 🏷 Field | Description |
|---|---|
| 🦴 `taxonName` | Accepted taxonomic name of the occurrence |
| 🏷 `taxonRank` | Rank of the name (species, genus, family, and so on) |
| 🔎 `identifiedName` | Name as originally identified, when it differs |
| 🆔 `occurrenceId` | PBDB occurrence identifier |
| 🗂 `collectionId` | PBDB collection identifier |
| 🧬 `taxonId` | PBDB taxon identifier |
| 🧫 `phylum` | Phylum classification |
| 🐾 `class` | Class classification |
| 📚 `order` | Order classification |
| 👪 `family` | Family classification |
| 🦕 `genus` | Genus classification |
| ⏳ `earlyInterval` | Earliest geologic interval for the occurrence |
| ⌛ `lateInterval` | Latest geologic interval, when bounded |
| ⬆ `earlyAgeMa` | Early age boundary in millions of years |
| ⬇ `lateAgeMa` | Late age boundary in millions of years |
| 🌍 `country` | Country code where the fossil was found |
| 🗺 `state` | State or province |
| 🏘 `county` | County, when recorded |
| 📍 `lat` | Latitude in decimal degrees |
| 📍 `lng` | Longitude in decimal degrees |
| 🪨 `formation` | Geologic formation |
| 🗻 `geologicGroup` | Geologic group, when recorded |
| 🧱 `member` | Geologic member, when recorded |
| 📖 `referenceId` | PBDB bibliographic reference identifier |
| 🕒 `scrapedAt` | Timestamp the record was collected |
Real sample records:
[
{"taxonName": "Tyrannosaurus rex", "taxonRank": "species", "occurrenceId": "occ:139292", "collectionId": "col:11917", "phylum": "Chordata", "class": "Reptilia", "family": "Tyrannosauridae", "earlyInterval": "Late Maastrichtian", "earlyAgeMa": 72.2, "lateAgeMa": 66, "country": "CA", "state": "Alberta", "formation": "Scollard", "lat": 51.906399, "lng": -113.0289, "referenceId": "ref:4218"},
{"taxonName": "Tyrannosaurus rex", "taxonRank": "species", "occurrenceId": "occ:139293", "collectionId": "col:11918", "phylum": "Chordata", "class": "Reptilia", "family": "Tyrannosauridae", "earlyInterval": "Late Maastrichtian", "earlyAgeMa": 72.2, "lateAgeMa": 66, "country": "CA", "state": "Alberta", "formation": "Scollard", "lat": 51.933334, "lng": -113.23333, "referenceId": "ref:4205"},
{"taxonName": "Tyrannosaurus rex", "taxonRank": "species", "occurrenceId": "occ:220009", "collectionId": "col:22657", "phylum": "Chordata", "class": "Reptilia", "family": "Tyrannosauridae", "earlyInterval": "Late Campanian", "earlyAgeMa": 83.6, "lateAgeMa": 72.2, "country": "CA", "state": "Alberta", "formation": "Dinosaur Park", "lat": 50.727234, "lng": -111.524582, "referenceId": "ref:5721"}
]
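As a small worked example, the sample records above can be grouped by formation to get a count and an age range per rock unit. This is a sketch, assuming the dataset is loaded as a list of dictionaries; only the fields needed for the summary are repeated here, and `summarize_by_formation` is a hypothetical helper.

```python
from collections import defaultdict

# Trimmed copies of the three sample records shown above.
records = [
    {"occurrenceId": "occ:139292", "formation": "Scollard", "earlyAgeMa": 72.2, "lateAgeMa": 66},
    {"occurrenceId": "occ:139293", "formation": "Scollard", "earlyAgeMa": 72.2, "lateAgeMa": 66},
    {"occurrenceId": "occ:220009", "formation": "Dinosaur Park", "earlyAgeMa": 83.6, "lateAgeMa": 72.2},
]


def summarize_by_formation(records):
    """Return count, oldest/youngest age, and age midpoint per formation."""
    groups = defaultdict(list)
    for rec in records:
        # Formation may be absent, so fall back to a placeholder label.
        groups[rec.get("formation") or "(unrecorded)"].append(rec)

    summary = {}
    for formation, recs in groups.items():
        oldest = max(r["earlyAgeMa"] for r in recs)    # larger Ma = older
        youngest = min(r["lateAgeMa"] for r in recs)   # smaller Ma = younger
        summary[formation] = {
            "count": len(recs),
            "oldestMa": oldest,
            "youngestMa": youngest,
            "midpointMa": round((oldest + youngest) / 2, 2),
        }
    return summary


summary = summarize_by_formation(records)
```

For these three records the Scollard group holds two occurrences spanning 72.2 to 66 Ma, and the Dinosaur Park group holds one spanning 83.6 to 72.2 Ma.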
✨ Why choose this Actor
- Direct line to the PBDB data service, the reference archive for the fossil record.
- One clean row per occurrence, with classification, age, location, and formation already separated into fields.
- Hierarchical taxon search, so one query can pull a genus or an entire class.
- Interval and country filters to scope a study without post-processing.
- Ages returned numerically in millions of years, ready for plotting and modeling.
📈 How it compares to alternatives
| Approach | Effort | Structured output | Filters |
|---|---|---|---|
| This Actor | Enter a taxon name | Yes, 24 fields | Interval and country |
| Manual PBDB web download | Build query strings by hand | Partial | Manual |
| Copying from publications | Very high | No | None |
🚀 How to use
- Create a free Apify account using this sign-up link.
- Open the Paleobiology Database Fossils Scraper.
- Enter a taxon name in `baseName`, for example `Triceratops`, `Canis`, or `Trilobita`.
- Optionally set an `interval` and a `country`, then choose `maxItems`.
- Run the Actor and collect your fossil occurrence dataset.
💼 Business use cases
Research and academia
| Need | How it helps |
|---|---|
| Build a taxon occurrence dataset | Pull every record for a clade in one run |
| Study diversity over time | Age fields support range and turnover analysis |
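One simple way to use the age fields for a diversity-over-time study is to count the occurrences whose age range overlaps each time bin. The sketch below assumes records are dictionaries with `earlyAgeMa` and `lateAgeMa`. The bin boundaries are taken from the ages in the sample records and are illustrative, and `count_by_bin` is a hypothetical helper. Real analyses often use more careful methods than raw overlap counts.

```python
# Each bin is (label, older boundary in Ma, younger boundary in Ma).
BINS = [
    ("83.6-72.2 Ma", 83.6, 72.2),
    ("72.2-66 Ma", 72.2, 66.0),
]

records = [
    {"occurrenceId": "occ:139292", "earlyAgeMa": 72.2, "lateAgeMa": 66},
    {"occurrenceId": "occ:139293", "earlyAgeMa": 72.2, "lateAgeMa": 66},
    {"occurrenceId": "occ:220009", "earlyAgeMa": 83.6, "lateAgeMa": 72.2},
]


def count_by_bin(records, bins=BINS):
    """Count occurrences whose age range overlaps each time bin."""
    counts = {label: 0 for label, _, _ in bins}
    for rec in records:
        early, late = rec.get("earlyAgeMa"), rec.get("lateAgeMa")
        if early is None or late is None:
            continue  # skip records without a usable age range
        for label, bin_old, bin_young in bins:
            # Strict comparisons: a range that only touches a bin
            # boundary is not counted in the neighbouring bin.
            if early > bin_young and late < bin_old:
                counts[label] += 1
    return counts


counts = count_by_bin(records)
```

The strict inequalities are a design choice: an occurrence dated 72.2 to 66 Ma shares only a boundary with the older bin, so it is counted once instead of twice.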
Education
| Need | How it helps |
|---|---|
| Teach the fossil record | Real occurrences with ages and locations |
| Student projects | Ready data for maps and timelines |
Collections and museums
| Need | How it helps |
|---|---|
| Cross-reference holdings | Match specimens to PBDB collections |
| Trace literature | Reference identifiers link back to sources |
Data and analytics
| Need | How it helps |
|---|---|
| Feed biodiversity models | Clean, typed fields per occurrence |
| Build dashboards | Coordinates and ages drive maps and charts |
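For map-based dashboards, the `lat` and `lng` fields can be converted to GeoJSON, which most mapping tools accept. The following is a minimal sketch, assuming records are dictionaries using the output field names. `to_geojson` is a hypothetical helper, and records without coordinates are skipped.

```python
def to_geojson(records):
    """Convert occurrence records to a GeoJSON FeatureCollection of points."""
    features = []
    for rec in records:
        lat, lng = rec.get("lat"), rec.get("lng")
        if lat is None or lng is None:
            continue  # no coordinates, so nothing to plot
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
            # Everything except the coordinates becomes a property.
            "properties": {k: v for k, v in rec.items() if k not in ("lat", "lng")},
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_geojson([
    {"occurrenceId": "occ:139292", "formation": "Scollard",
     "lat": 51.906399, "lng": -113.0289},
    {"occurrenceId": "occ:000000"},  # made-up record without coordinates
])
```

The resulting dictionary can be written to a file with `json.dump` and loaded into a mapping tool.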
🔌 Automating Paleobiology Database Fossils Scraper
Connect runs to Make, Zapier, Slack, Airbyte, GitHub Actions, or Google Drive through the Apify API and integrations to schedule queries and route fresh occurrence data wherever your team works.
🌟 Beyond business use cases
- Research: assemble occurrence sets for macroevolution and biogeography studies.
- Personal: explore where your favorite prehistoric animals once lived.
- Non-profit: support science outreach and museum education programs.
- Experimentation: prototype paleo data visualizations and maps.
🤖 Ask an AI assistant
Paste your dataset into ChatGPT, Claude, Perplexity, or Microsoft Copilot and ask it to summarize age ranges, cluster occurrences by formation, or draft a methods paragraph.
❓ Frequently Asked Questions
Is the data official? Yes, it comes from the Paleobiology Database public data service, curated by paleontologists.
Do I need an API key? No. The PBDB data service is public and keyless.
What does baseName accept? Any taxonomic name. The search is hierarchical and includes everything below the name.
How do the age fields work? earlyAgeMa and lateAgeMa give the older and younger boundaries of the occurrence in millions of years.
Why are some location fields empty? Formation, group, member, and county appear only when the original collection recorded them.
Can I filter by time? Yes, set interval to a period, epoch, or age such as Jurassic or Maastrichtian.
Can I filter by place? Yes, set country to an ISO code such as US, CN, or MN.
How many records can I get? Free runs return up to 10 records. Paid plans return up to 1,000,000.
What if a taxon has no occurrences? The run finishes with no records and a note to broaden the search.
Can I run it on a schedule? Yes, use Apify scheduling or any connected automation platform.
🔌 Integrate with any app
Use the Apify API, webhooks, and native integrations to push occurrence data into your own pipelines, databases, and notebooks.
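As an illustration of the API route, the sketch below builds a request to the Apify API's synchronous run endpoint using only the Python standard library. The endpoint path and bearer-token header reflect our understanding of the Apify API v2 and should be checked against its documentation. The actor ID passed in is a placeholder, since this Actor's real ID is not stated here, and `build_run_request` and `run_actor` are hypothetical helpers.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Prepare a POST request that runs an Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id, token, run_input, timeout=300):
    """Send the request and parse the JSON response (performs a network call)."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder actor ID and token; replace both with real values before running.
request = build_run_request(
    "username~actor-name", "YOUR_API_TOKEN",
    {"baseName": "Tyrannosaurus", "maxItems": 50},
)
```

Building the request separately from sending it keeps the payload easy to inspect and test without touching the network.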
🔗 Recommended Actors
- Macrostrat Geology Units Scraper for stratigraphic units by location.
- GAIA Star Catalog Scraper for astronomical catalog data.
- JPL Small-Body Database Scraper for asteroid and comet records.
💡 Pro Tip: browse the complete ParseForge collection.
🆘 Need Help? Open our contact form
⚠️ Disclaimer: independent tool, not affiliated with the Paleobiology Database. Only publicly available data is collected.