# All About Birds Scraper - Cornell Lab Bird Species (`lulzasaur/allaboutbirds-scraper`) Actor

Scrape bird species data from AllAboutBirds.org (Cornell Lab). Extract names, taxonomy, habitat, food, nesting, conservation status, measurements, cool facts, sounds, and images for 700+ North American species.

- **URL**: https://apify.com/lulzasaur/allaboutbirds-scraper.md
- **Developed by:** [lulz bot](https://apify.com/lulzasaur) (community)
- **Categories:** E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $10.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## All About Birds Scraper

Scrape bird species data from [AllAboutBirds.org](https://www.allaboutbirds.org/guide/), Cornell Lab of Ornithology's comprehensive bird guide covering 700+ North American species.

### What does this scraper do?

This Actor scrapes species data from All About Birds, including:

- **Taxonomy**: Common name, scientific name, order, family
- **Life History**: Habitat, food, nesting behavior, behavior patterns
- **Conservation**: Conservation status and detailed conservation description
- **Measurements**: Length, weight, wingspan, relative size, size category
- **Appearance**: Size & shape description, color pattern description
- **Cool Facts**: Curated interesting facts about each species
- **Media**: Hero image URL, thumbnail image, sound/call recording URL
- **Nesting Facts**: Clutch size, egg length/width, incubation period, nestling period, egg description, hatching condition

### Input

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `searchQueries` | string[] | `["Bald Eagle"]` | Bird names to search for (partial match on common or scientific name). Use `*` or `all` for all species. |
| `maxListings` | integer | `100` | Maximum number of species to scrape. Set to 0 for unlimited. |
| `scrapeDetails` | boolean | `true` | Visit detail pages for full data (measurements, nesting, conservation, cool facts, sounds). |
| `proxyConfiguration` | object | `{"useApifyProxy": false}` | Proxy settings. Not required for basic scraping but recommended for large-scale runs. |

#### Example Input

```json
{
    "searchQueries": ["Bald Eagle", "Owl"],
    "maxListings": 10,
    "scrapeDetails": true
}
````
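The input fields above can be assembled and sanity-checked before starting a run. The following is a minimal sketch: `build_input` is a hypothetical helper (it is not part of the Actor or of the Apify client), and its checks only mirror what the input table states, namely that `searchQueries` is a list of names and `maxListings` is `0` or greater.

```python
from typing import Any


def build_input(
    search_queries: list[str],
    max_listings: int = 100,
    scrape_details: bool = True,
) -> dict[str, Any]:
    """Build an input object for the Actor (hypothetical helper)."""
    # Drop blank entries so a stray empty string does not reach the Actor.
    queries = [q.strip() for q in search_queries if q.strip()]
    if not queries:
        raise ValueError("searchQueries needs at least one non-empty name")
    if max_listings < 0:
        raise ValueError("maxListings must be 0 (unlimited) or a positive integer")
    return {
        "searchQueries": queries,
        "maxListings": max_listings,
        "scrapeDetails": scrape_details,
    }


# Reproduces the example input shown above.
example = build_input(["Bald Eagle", "Owl"], max_listings=10)
```

The resulting dictionary can be passed as `run_input` to the Python client call shown in the [API](#api) section.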

### Output

Each result includes:

```json
{
    "commonName": "Bald Eagle",
    "scientificName": "Haliaeetus leucocephalus",
    "order": "ACCIPITRIFORMES",
    "family": "Hawks, Eagles, and Kites",
    "familyScientific": "Accipitridae",
    "habitat": "Forests",
    "food": "Fish",
    "nesting": "Tree",
    "behavior": "Soaring",
    "conservationStatus": "Low Concern",
    "description": "The Bald Eagle has been the national emblem of the United States since 1782...",
    "coolFacts": ["Rather than do their own fishing...", "Had Benjamin Franklin prevailed..."],
    "measurements": {
        "length": "27.9-37.8 in (71-96 cm)",
        "weight": "105.8-222.2 oz (3000-6300 g)",
        "wingspan": "80.3 in (204 cm)"
    },
    "relativeSize": "One of the largest birds in North America...",
    "sizeCategory": "goose-sized or larger",
    "sizeAndShape": "...",
    "colorPattern": "...",
    "habitatDescription": "...",
    "foodDescription": "...",
    "nestingDescription": "...",
    "nestingFacts": {
        "Clutch Size": "1-3 eggs",
        "Number of Broods": "1 brood",
        "Egg Length": "2.3-3.3 in (5.8-8.4 cm)",
        "Incubation Period": "34-36 days"
    },
    "behaviorDescription": "...",
    "conservationDescription": "...",
    "soundUrl": "https://www.allaboutbirds.org/guide/assets/sound/549106.mp3",
    "heroImageUrl": "https://www.allaboutbirds.org/guide/assets/photo/649805587-1280px.jpg",
    "imageUrl": "https://www.allaboutbirds.org/guide/assets/photo/649805587-480px.jpg",
    "url": "https://www.allaboutbirds.org/guide/Bald_Eagle/overview",
    "scrapedAt": "2026-04-25T12:00:00.000Z"
}
```
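Measurement values arrive as display strings such as `"27.9-37.8 in (71-96 cm)"`. If numeric values are needed, something like the following sketch could extract the metric part. It assumes every value follows the pattern of the sample above (an imperial figure followed by a metric figure in parentheses); other species may use a different layout, in which case the hypothetical `parse_metric` helper returns `None`.

```python
import re
from typing import Optional

# Matches the metric part in parentheses, e.g. "(71-96 cm)" or "(204 cm)".
_METRIC = re.compile(r"\(\s*([\d.]+)(?:\s*-\s*([\d.]+))?\s*([a-zA-Z]+)\s*\)")


def parse_metric(value: str) -> Optional[tuple[float, float, str]]:
    """Return (low, high, unit) from a measurement string, or None if no match."""
    match = _METRIC.search(value)
    if match is None:
        return None
    low = float(match.group(1))
    # A single value such as "(204 cm)" is treated as a range of zero width.
    high = float(match.group(2)) if match.group(2) else low
    return low, high, match.group(3)


length_cm = parse_metric("27.9-37.8 in (71-96 cm)")
wingspan_cm = parse_metric("80.3 in (204 cm)")
```

The same helper applies to `nestingFacts` entries that share this layout, such as `"Egg Length"`.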

### Example Queries

- `["Bald Eagle"]` - Scrape one specific species
- `["Eagle"]` - All species with "Eagle" in the name
- `["Owl"]` - All owl species
- `["Hummingbird", "Woodpecker"]` - Multiple bird groups
- `["Accipitridae"]` - Match by scientific family name
- `["*"]` - Scrape all 700+ species

### How it works

1. Fetches the full taxonomy browse page at `allaboutbirds.org/guide/browse/taxonomy`
2. Parses all 700+ species with their order, family, and scientific names
3. Filters by your search queries (partial match on common or scientific name)
4. If `scrapeDetails` is enabled, visits three pages per species:
   - **Overview page**: Description, conservation status, habitat/food/nesting icons, cool facts, sound URL
   - **ID page**: Measurements (length, weight, wingspan), size & shape, color pattern
   - **Life History page**: Detailed habitat, food, nesting facts, behavior, conservation text
5. Outputs structured JSON for each species
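
Step 3 and the `maxListings` cap can be sketched as follows. This is an illustration, not the Actor's source code: the function names are hypothetical, and case-insensitive substring matching is an assumption (the description above only says "partial match").

```python
def matches(query: str, common_name: str, scientific_name: str) -> bool:
    """Return True if the query selects this species."""
    q = query.strip().lower()
    # "*" and "all" are the documented wildcards for every species.
    if q in ("*", "all"):
        return True
    # Assumption: "partial match" means a case-insensitive substring test.
    return q in common_name.lower() or q in scientific_name.lower()


def select_species(species: list[dict], queries: list[str], max_listings: int = 100) -> list[dict]:
    """Keep species matching any query, then apply the maxListings cap."""
    selected = [
        s for s in species
        if any(matches(q, s["commonName"], s["scientificName"]) for q in queries)
    ]
    # maxListings of 0 means unlimited.
    return selected if max_listings == 0 else selected[:max_listings]


sample = [
    {"commonName": "Bald Eagle", "scientificName": "Haliaeetus leucocephalus"},
    {"commonName": "Golden Eagle", "scientificName": "Aquila chrysaetos"},
    {"commonName": "Great Horned Owl", "scientificName": "Bubo virginianus"},
]
eagles = select_species(sample, ["eagle"])
```

Here `eagles` contains the two eagle entries, while `select_species(sample, ["*"], max_listings=0)` returns the whole sample.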

### Cost

With detailed scraping enabled, the Actor makes one request for the taxonomy page plus 3 requests per species (overview, ID, and life history). A full scrape of all 700+ species therefore uses approximately 2,100 requests. Without details, only 1 request is needed for the taxonomy page.
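
For a rough budget, the request count can be estimated from the steps in "How it works": the taxonomy page is fetched once per run, and each species adds three detail pages. The sketch below uses hypothetical helper names. The price function assumes the listed "from $10.00 / 1,000 results" rate applies to every result, which may not hold for all pricing events.

```python
DETAIL_PAGES_PER_SPECIES = 3  # overview, ID, life history


def estimate_requests(species_count: int, scrape_details: bool = True) -> int:
    """Estimate HTTP requests for one run."""
    taxonomy_requests = 1  # the browse page is fetched once per run
    if not scrape_details:
        return taxonomy_requests
    return taxonomy_requests + DETAIL_PAGES_PER_SPECIES * species_count


def estimate_price_usd(result_count: int, usd_per_1000: float = 10.0) -> float:
    """Estimate the pay-per-result price (assumed flat rate)."""
    return result_count * usd_per_1000 / 1000


full_run_requests = estimate_requests(700)
full_run_price = estimate_price_usd(700)
```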

### Tips

- Start with a small `maxListings` to test (e.g., 5)
- Use `scrapeDetails: false` for quick taxonomy-only data
- The `*` query combined with `maxListings: 0` will scrape the entire database
- Sound URLs are direct MP3 links that can be downloaded

***

### Run on Apify

This scraper runs on the [Apify platform](https://apify.com/?fpr=lulzasaur), a full-stack web scraping and automation cloud. Sign up for a free account to get started with a 30-day trial of all features.

[Try Apify free ->](https://apify.com/?fpr=lulzasaur)

### Related Scrapers

More data scrapers and tools by [lulzasaur](https://apify.com/lulzasaur):

- [AbeBooks Scraper](https://apify.com/lulzasaur/abebooks-scraper) - Rare and used books
- [Bonanza Scraper](https://apify.com/lulzasaur/bonanza-scraper) - Online marketplace listings
- [Goodreads Scraper](https://apify.com/lulzasaur/goodreads-scraper) - Book ratings and reviews
- [IMDb Scraper](https://apify.com/lulzasaur/imdb-scraper) - Movie and TV show data
- [PSA Population Report](https://apify.com/lulzasaur/psa-pop-scraper) - Card grading data
- [Reverb Scraper](https://apify.com/lulzasaur/reverb-scraper) - Music gear marketplace
- [TCGPlayer Scraper](https://apify.com/lulzasaur/tcgplayer-scraper) - Trading card prices

# Actor input Schema

## `searchQueries` (type: `array`):

List of bird names to search for (partial match on common or scientific name). Use '\*' or 'all' to scrape all 700+ species. Examples: 'Bald Eagle', 'Owl', 'Haliaeetus'.

## `maxListings` (type: `integer`):

Maximum number of species to scrape. Set to 0 for unlimited.

## `scrapeDetails` (type: `boolean`):

If enabled, visits overview, ID, and life history pages for each species to get full data (description, measurements, nesting facts, conservation details, cool facts, sounds). If disabled, returns basic taxonomy data only.

## `proxyConfiguration` (type: `object`):

Proxy settings. Not required for basic scraping but recommended for large-scale runs.

## Actor input object example

```json
{
  "searchQueries": [
    "Bald Eagle"
  ],
  "maxListings": 100,
  "scrapeDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQueries": [
        "Bald Eagle"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("lulzasaur/allaboutbirds-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "searchQueries": ["Bald Eagle"] }

# Run the Actor and wait for it to finish
run = client.actor("lulzasaur/allaboutbirds-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchQueries": [
    "Bald Eagle"
  ]
}' |
apify call lulzasaur/allaboutbirds-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=lulzasaur/allaboutbirds-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "All About Birds Scraper - Cornell Lab Bird Species",
        "description": "Scrape bird species data from AllAboutBirds.org (Cornell Lab). Extract names, taxonomy, habitat, food, nesting, conservation status, measurements, cool facts, sounds, and images for 700+ North American species.",
        "version": "1.0",
        "x-build-id": "pwMtnnHM77BJdhUJb"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/lulzasaur~allaboutbirds-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-lulzasaur-allaboutbirds-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/lulzasaur~allaboutbirds-scraper/runs": {
            "post": {
                "operationId": "runs-sync-lulzasaur-allaboutbirds-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/lulzasaur~allaboutbirds-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-lulzasaur-allaboutbirds-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "searchQueries"
                ],
                "properties": {
                    "searchQueries": {
                        "title": "Bird Names",
                        "type": "array",
                        "description": "List of bird names to search for (partial match on common or scientific name). Use '*' or 'all' to scrape all 700+ species. Examples: 'Bald Eagle', 'Owl', 'Haliaeetus'.",
                        "default": [
                            "Bald Eagle"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxListings": {
                        "title": "Max Species",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of species to scrape. Set to 0 for unlimited.",
                        "default": 100
                    },
                    "scrapeDetails": {
                        "title": "Scrape Full Details",
                        "type": "boolean",
                        "description": "If enabled, visits overview, ID, and life history pages for each species to get full data (description, measurements, nesting facts, conservation details, cool facts, sounds). If disabled, returns basic taxonomy data only.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings. Not required for basic scraping but recommended for large-scale runs.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
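
For projects without an Apify client, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library. The server URL, path, `token` query parameter, and JSON request body come from the specification; the helper names are hypothetical, and parsing the response as JSON is an assumption, since the specification describes the `200` response only as "OK".

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/lulzasaur~allaboutbirds-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request described by the OpenAPI specification."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300):
    """Run the Actor synchronously and return the parsed response body."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the dataset items are returned as JSON.
        return json.loads(response.read().decode("utf-8"))


request = build_request("<YOUR_API_TOKEN>", {"searchQueries": ["Bald Eagle"]})
```

Long scrapes may exceed what a single synchronous HTTP call can wait for; in that case the `/runs` path, which returns run information immediately, is likely the better fit.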
