# Library of Congress Search Scraper (`crawlerbros/library-of-congress-search-scraper`) Actor

Searches the Library of Congress digital collections (loc.gov) - millions of digitized books, photos, maps, manuscripts. Free, no API key.

- **URL**: https://apify.com/crawlerbros/library-of-congress-search-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Agents, Automation, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 results

This Actor is billed per event plus usage: you pay a fixed price for specific events and, separately, for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
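
As a rough illustration of the per-result arithmetic, the sketch below estimates the event charge for a run from the listed starting price of $3.00 per 1,000 results. The function name `estimate_result_charge` is hypothetical, and the figure excludes Apify platform usage and any plan discounts, so treat it as an approximation rather than an invoice.

```python
from decimal import Decimal

# Listed starting price from the Pricing section; the actual price may vary by plan.
PRICE_PER_1000_RESULTS = Decimal("3.00")


def estimate_result_charge(result_count: int,
                           price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate the per-result event charge in USD (platform usage not included)."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    # Scale the per-1,000 price to the result count and round to whole cents.
    return (price_per_1000 * result_count / 1000).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimate_result_charge(50))    # a run at the default maxItems
    print(estimate_result_charge(2000))  # a run at the maximum maxItems
```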

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Library of Congress Search Scraper

Search and retrieve items from the **Library of Congress** digital collections (loc.gov) — millions of digitized books, photographs, maps, manuscripts, audio recordings, and more. Free, no API key required.

### What does this Actor do?

This Actor lets you:
- **Search** across all LOC digital collections by keyword with optional format and date filters.
- **Browse** a specific collection (maps, photos, manuscripts, books, newspapers, etc.).
- **Fetch specific items** by their LOC item IDs for detailed metadata.

### Data Source

All data is retrieved from the [Library of Congress](https://www.loc.gov) public JSON API:
- **Search API**: `https://www.loc.gov/search/?q={query}&fo=json`
- **Collection API**: `https://www.loc.gov/collections/{collection}/?fo=json`
- **Item API**: `https://www.loc.gov/item/{id}/?fo=json`

All endpoints are freely accessible without authentication.

### Input

| Field | Type | Description |
|-------|------|-------------|
| `mode` | Select | `search`, `browseCollection`, or `byIds` |
| `query` | String | Search keywords (e.g. "american history photographs", "civil war maps") |
| `collection` | Select | Collection to browse: maps, photos, manuscripts, books, newspapers, audio, films, etc. |
| `onlineFormat` | Select | Filter by format: image, audio, video, pdf, online-text, web-page, map |
| `dateFrom` | Integer | Filter from this year (e.g. 1800) |
| `dateTo` | Integer | Filter to this year (e.g. 1950) |
| `itemIds` | Array | Specific LOC item IDs (for byIds mode) |
| `maxItems` | Integer | Max items to return (default: 50, max: 2000) |

#### Example Inputs

**Search for Civil War photographs:**
```json
{
  "mode": "search",
  "query": "civil war photographs",
  "onlineFormat": "image",
  "dateFrom": 1861,
  "dateTo": 1865,
  "maxItems": 50
}
````

**Browse the maps collection:**

```json
{
  "mode": "browseCollection",
  "collection": "maps",
  "dateFrom": 1800,
  "dateTo": 1900,
  "maxItems": 100
}
```

**Fetch specific items by ID:**

```json
{
  "mode": "byIds",
  "itemIds": ["2002719523", "2017769894"],
  "maxItems": 10
}
```
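
Inputs like the examples above can be checked locally before a run is started. The following is a minimal sketch, not the Actor's own validation: the helper name `validate_input` is hypothetical, the allowed values and ranges are copied from the input schema in this document, and the `dateFrom <= dateTo` check is an added assumption.

```python
ALLOWED_MODES = {"search", "browseCollection", "byIds"}
ALLOWED_FORMATS = {"", "online-text", "image", "audio", "video", "pdf", "web-page", "map"}


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of problems found in an Actor input dict (empty if none)."""
    problems = []

    # `mode` is the only required field in the schema.
    if actor_input.get("mode") not in ALLOWED_MODES:
        problems.append(f"mode must be one of {sorted(ALLOWED_MODES)}")

    if actor_input.get("onlineFormat", "") not in ALLOWED_FORMATS:
        problems.append("onlineFormat is not a supported value")

    # Year filters: integers between 1000 and 2100, per the schema.
    for key in ("dateFrom", "dateTo"):
        year = actor_input.get(key)
        if year is not None and not (isinstance(year, int) and 1000 <= year <= 2100):
            problems.append(f"{key} must be an integer between 1000 and 2100")

    # Assumption (not stated in the schema): the year range should not be inverted.
    date_from, date_to = actor_input.get("dateFrom"), actor_input.get("dateTo")
    if isinstance(date_from, int) and isinstance(date_to, int) and date_from > date_to:
        problems.append("dateFrom must not be later than dateTo")

    # maxItems defaults to 50 and is capped at 2000.
    max_items = actor_input.get("maxItems", 50)
    if not (isinstance(max_items, int) and 1 <= max_items <= 2000):
        problems.append("maxItems must be an integer between 1 and 2000")

    return problems
```

Returning a list of messages instead of raising on the first error lets a caller report every problem in one pass.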

### Output

Each item in the dataset contains:

| Field | Description |
|-------|-------------|
| `id` | LOC item identifier |
| `url` | Direct URL to the item page |
| `title` | Item title |
| `description` | Description or summary |
| `subject` | Subject terms (up to 10) |
| `creator` | Primary creator or author |
| `contributor` | Additional contributors (up to 5) |
| `date` | Creation or publication date |
| `language` | Language code(s) |
| `type` | Item type (map, photograph, book, etc.) |
| `format` | File format(s) (up to 5) |
| `rights` | Rights and access information |
| `image_url` | Thumbnail or primary image URL |
| `online_formats` | Available online formats |
| `collection_name` | Name of the collection it belongs to |
| `location` | Geographic location(s) (up to 3) |
| `coordinates` | Geographic coordinates (if available) |
| `scrapedAt` | ISO 8601 timestamp of when data was scraped |

#### Example Output

```json
{
  "id": "2002719523",
  "url": "https://www.loc.gov/item/2002719523/",
  "title": "Map of Washington DC, 1861",
  "description": "Detailed topographical map of the nation's capital during the Civil War",
  "subject": ["maps", "Washington DC", "Civil War", "1861"],
  "creator": "U.S. Army Corps of Engineers",
  "date": "1861",
  "language": ["eng"],
  "type": "map",
  "format": ["image/jpeg"],
  "rights": "Public Domain",
  "image_url": "https://tile.loc.gov/image-services/iiif/map.jpg",
  "online_formats": ["image"],
  "collection_name": "Maps Collection",
  "location": ["Washington DC"],
  "coordinates": "38.89511,-77.03637",
  "scrapedAt": "2026-01-15T10:30:00+00:00"
}
```
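
Downstream code usually needs typed values rather than strings. The sketch below converts two fields from the example record, `coordinates` and `scrapedAt`. It assumes that `coordinates` is a `"latitude,longitude"` string as in the example above and that the field may be missing or empty; both helper names are hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_coordinates(value: Optional[str]) -> Optional[tuple[float, float]]:
    """Split a "lat,lon" string into two floats; return None if absent or malformed."""
    if not value:
        return None
    parts = value.split(",")
    if len(parts) != 2:
        return None
    try:
        return float(parts[0]), float(parts[1])
    except ValueError:
        return None


def parse_scraped_at(value: str) -> datetime:
    """Parse the ISO 8601 `scrapedAt` timestamp into a datetime object."""
    return datetime.fromisoformat(value)


if __name__ == "__main__":
    item = {"coordinates": "38.89511,-77.03637", "scrapedAt": "2026-01-15T10:30:00+00:00"}
    print(parse_coordinates(item["coordinates"]))
    print(parse_scraped_at(item["scrapedAt"]).isoformat())
```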

### Frequently Asked Questions

**Is this free to use?**
The data source is free: the Library of Congress JSON API requires no API key or authentication. Running this Actor on Apify is billed as described under Pricing.

**What collections can I browse?**
Maps, Photos, Manuscripts, Books, Notated Music, Newspapers, Audio Recordings, Films & Videos, Legislation, Prints & Photographs, American Memory, and the general LOC Collection.

**How many items can I retrieve?**
Up to 2,000 items per run using the `maxItems` parameter.
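
When a list of item IDs is longer than the 2,000-item limit, one option is to split it across several `byIds` runs. The sketch below only builds the run inputs; `build_by_ids_inputs` is a hypothetical helper, and it assumes that a `byIds` run returns at most `maxItems` items, which this document does not spell out.

```python
def build_by_ids_inputs(item_ids: list[str], batch_size: int = 2000) -> list[dict]:
    """Split item IDs into Actor inputs holding at most `batch_size` IDs each."""
    if not 1 <= batch_size <= 2000:
        raise ValueError("batch_size must be between 1 and 2000")

    inputs = []
    for start in range(0, len(item_ids), batch_size):
        batch = item_ids[start:start + batch_size]
        # One input object per run; maxItems matches the number of IDs in the batch.
        inputs.append({"mode": "byIds", "itemIds": batch, "maxItems": len(batch)})
    return inputs
```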

**Can I filter by date?**
Yes — use `dateFrom` and `dateTo` with year values (e.g. 1800, 1950).

**Can I filter by format?**
Yes. Use the `onlineFormat` filter to narrow results to images, audio, video, PDFs, online text, web pages, or maps.

**What are item IDs?**
Every LOC item has a unique identifier visible in its URL (e.g. `loc.gov/item/2002719523/`). Use these IDs in `byIds` mode to retrieve specific items.
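
Extracting the ID from an item URL is a small string operation. The sketch below pulls the ID out of a `loc.gov/item/...` URL and builds the matching Item API URL from the template in the Data Source section. The helper names and the regular expression are illustrative assumptions, and real IDs may contain characters other than digits.

```python
import re
from typing import Optional

# Captures the path segment after /item/ in URLs such as loc.gov/item/2002719523/
_ITEM_ID_PATTERN = re.compile(r"loc\.gov/item/([^/?#]+)")


def extract_item_id(url: str) -> Optional[str]:
    """Return the item ID from a loc.gov item URL, or None if the URL has none."""
    match = _ITEM_ID_PATTERN.search(url)
    return match.group(1) if match else None


def item_json_url(item_id: str) -> str:
    """Build the JSON endpoint URL for one item, following the Item API template."""
    return f"https://www.loc.gov/item/{item_id}/?fo=json"
```

The extracted IDs can then be passed as the `itemIds` array of a `byIds` run.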

**Does this work for newspaper searches?**
Yes — browse the `newspapers` collection or search with relevant keywords. For full newspaper content, consider using the [Chronicling America API](https://chroniclingamerica.loc.gov/about/api/).

### Use Cases

- Digital humanities research on LOC collections
- Building photo/map databases from historical records
- Tracking legislative documents and bill records
- Academic research using primary source materials
- Finding digitized manuscripts and rare books
- Geographic research using historical maps with coordinates
- Building educational datasets from public domain materials

# Actor input Schema

## `mode` (type: `string`):

What to fetch: full-text search, browse a collection, or fetch specific items by ID.

## `query` (type: `string`):

Keywords to search for (e.g. "american history photographs", "civil war maps"). Used in mode=search.

## `collection` (type: `string`):

Collection to browse in mode=browseCollection. Select from the list or enter a custom slug.

## `onlineFormat` (type: `string`):

Filter by available online format.

## `dateFrom` (type: `integer`):

Filter items from this year (e.g. 1800).

## `dateTo` (type: `integer`):

Filter items up to this year (e.g. 1950).

## `itemIds` (type: `array`):

Specific LOC item IDs to fetch (e.g. `["2002719523", "2017769894"]`).

## `maxItems` (type: `integer`):

Maximum number of items to return.

## Actor input object example

```json
{
  "mode": "search",
  "query": "american history photographs",
  "collection": "photos",
  "onlineFormat": "",
  "itemIds": [],
  "maxItems": 50
}
```

# Actor output Schema

## `items` (type: `string`):

Dataset containing all scraped Library of Congress items.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "search",
    "query": "american history photographs",
    "collection": "photos",
    "onlineFormat": "",
    "itemIds": [],
    "maxItems": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/library-of-congress-search-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "search",
    "query": "american history photographs",
    "collection": "photos",
    "onlineFormat": "",
    "itemIds": [],
    "maxItems": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/library-of-congress-search-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "search",
  "query": "american history photographs",
  "collection": "photos",
  "onlineFormat": "",
  "itemIds": [],
  "maxItems": 50
}' |
apify call crawlerbros/library-of-congress-search-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/library-of-congress-search-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Library of Congress Search Scraper",
        "description": "Searches the Library of Congress digital collections (loc.gov) - millions of digitized books, photos, maps, manuscripts. Free, no API key.",
        "version": "1.0",
        "x-build-id": "JfRxFSoeuPuLbs7Cz"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~library-of-congress-search-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-library-of-congress-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~library-of-congress-search-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-library-of-congress-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~library-of-congress-search-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-library-of-congress-search-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "search",
                            "browseCollection",
                            "byIds"
                        ],
                        "type": "string",
                        "description": "What to fetch: full-text search, browse a collection, or fetch specific items by ID.",
                        "default": "search"
                    },
                    "query": {
                        "title": "Search query",
                        "type": "string",
                        "description": "Keywords to search for (e.g. \"american history photographs\", \"civil war maps\"). Used in mode=search.",
                        "default": "american history photographs"
                    },
                    "collection": {
                        "title": "Collection",
                        "enum": [
                            "maps",
                            "photos",
                            "manuscripts",
                            "books",
                            "notated-music",
                            "newspapers",
                            "audio",
                            "films",
                            "legislation",
                            "prints-photographs",
                            "american-memory",
                            "loc-collection"
                        ],
                        "type": "string",
                        "description": "Collection to browse in mode=browseCollection. Select from the list or enter a custom slug.",
                        "default": "photos"
                    },
                    "onlineFormat": {
                        "title": "Online format",
                        "enum": [
                            "",
                            "online-text",
                            "image",
                            "audio",
                            "video",
                            "pdf",
                            "web-page",
                            "map"
                        ],
                        "type": "string",
                        "description": "Filter by available online format.",
                        "default": ""
                    },
                    "dateFrom": {
                        "title": "Year from",
                        "minimum": 1000,
                        "maximum": 2100,
                        "type": "integer",
                        "description": "Filter items from this year (e.g. 1800)."
                    },
                    "dateTo": {
                        "title": "Year to",
                        "minimum": 1000,
                        "maximum": 2100,
                        "type": "integer",
                        "description": "Filter items up to this year (e.g. 1950)."
                    },
                    "itemIds": {
                        "title": "Item IDs (mode=byIds)",
                        "type": "array",
                        "description": "Specific LOC item IDs to fetch (e.g. [\"2002719523\", \"2017769894\"]).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 2000,
                        "type": "integer",
                        "description": "Maximum number of items to return.",
                        "default": 50
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
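
For projects without an official client, the synchronous endpoint in the specification above can be called directly over HTTP. The following Python sketch only constructs the request with the standard library and does not send it; `build_run_request` is a hypothetical helper, and the token value is a placeholder.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/crawlerbros~library-of-congress-search-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    # The specification passes the API token as a required `token` query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_run_request(
        "<YOUR_API_TOKEN>",
        {"mode": "search", "query": "civil war maps", "maxItems": 10},
    )
    print(request.get_method(), request.full_url)
    # To execute the run, pass `request` to urllib.request.urlopen(...)
    # and decode the response body as JSON.
```

Keeping request construction separate from sending makes the URL and body easy to inspect or log before any billable run is started.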
