Pricing

from $14.00 / 1,000 result items

Library of Congress Scraper

Export records from the US Library of Congress catalog of 170M+ items. Search books, audio, film, maps, manuscripts, newspapers, photos, sheet music, and web archives. Pull titles, contributors, dates, subjects, languages, image URLs, and direct catalog links.

Developer

ParseForge

Actor stats

Last modified

6 days ago

🏛️ Library of Congress Scraper

🚀 Export the world's largest cultural archive in minutes. Search 170,000,000+ catalog items at the US Library of Congress across 11 format types, including books, audio, film, maps, manuscripts, newspapers, photos, sheet music, and web archives. No login, no manual harvesting.

The Library of Congress Scraper queries the LOC digital catalog and returns 21 structured fields per record, including title, contributors, date, subjects, languages, format, mediums, rights, repository info, and direct links to resource files and image derivatives. The LOC has been digitizing its holdings since the 1990s and exposes one of the most comprehensive open cultural catalogs in the world.

The catalog spans books and printed material, audio recordings, films, maps, manuscripts, historical newspapers, photographs, sheet music, notated music, web archives, and curated collections. This Actor returns the data as CSV, Excel, JSON, or XML in under five minutes, with year-range, language, and collection filters applied server-side.

🎯 Target Audience: Historians, archivists, journalists, educators, documentary producers, genealogists, digital humanities researchers, museum curators
💡 Primary Use Cases: Source primary documents, build classroom packs, enrich research databases, locate rights-cleared media, map historical newspapers, source public-domain images

📋 What the Library of Congress Scraper does

Five search controls that combine in a single run:

📚 Format-scoped search. Pick one of 11 LOC formats (books, audio, film, maps, manuscripts, newspapers, photos, sheet music, web archives, notated music, collections).
🔎 Keyword search. Free-text search across the chosen format.
🌐 Language filter. Restrict to a single language slug (e.g. english, spanish, french, chinese, arabic).
📅 Date range. Earliest and latest year inclusive, for time-bounded research.
🗂️ Collection filter. Restrict to a curated LOC collection slug (e.g. wpa-life-histories, civil-war-maps).
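
The five controls above could be assembled into a run input along the following lines. This is a minimal sketch only: the key names (format, keyword, language, yearFrom, yearTo, collection) and the hyphenated format slugs are assumptions made for illustration, not the Actor's documented input schema, so check the Actor's Input tab for the real names before relying on them.

```python
from typing import Optional

# The 11 formats named in the list above. The exact slug spellings
# (e.g. "sheet-music") are assumptions, not confirmed by the Actor.
LOC_FORMATS = {
    "books", "audio", "film", "maps", "manuscripts", "newspapers",
    "photos", "sheet-music", "web-archives", "notated-music", "collections",
}


def build_run_input(
    fmt: str,
    keyword: str = "",
    language: Optional[str] = None,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
    collection: Optional[str] = None,
) -> dict:
    """Assemble a hypothetical input object; every key name is illustrative."""
    if fmt not in LOC_FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    # The date range is inclusive, so equal years are allowed.
    if year_from is not None and year_to is not None and year_from > year_to:
        raise ValueError("year_from must not be later than year_to")

    run_input = {"format": fmt, "keyword": keyword.strip()}
    # Optional filters are only included when they are set.
    if language:
        run_input["language"] = language.lower()
    if year_from is not None:
        run_input["yearFrom"] = year_from
    if year_to is not None:
        run_input["yearTo"] = year_to
    if collection:
        run_input["collection"] = collection
    return run_input


example = build_run_input(
    "maps", "civil war",
    language="English", year_from=1861, year_to=1865,
    collection="civil-war-maps",
)
print(example)
```

Validating the format and the year order before starting a run keeps a typo from costing a paid run that returns nothing.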

Each record includes the LOC item ID, title, description, contributor list, date, subject tags, language list, format and medium, parent collection, repository, rights statement, every resource URL (manifests, audio, video, IIIF images), and a primary image thumbnail.

💡 Why it matters: the LOC catalog is the foundational reference for American cultural and political history. Building your own harvester means navigating multiple catalog endpoints, parsing nested metadata, and chasing pagination across millions of records. This Actor turns the entire catalog into a download.

📊 Data fields

Each record includes 21 fields: contributors, createdPublished, date, description, digitalIds, format, imageUrl, imageUrls, itemId, languages, mediums, notes, partof, researchCenters, resourceUrls, scrapedAt, sourceCollection, subjects, title, topics, url. These field names come straight from the Actor's dataset schema, so what you see here is what lands in your dataset.
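
If you post-process the JSON export yourself, the field list above can drive a simple flattening step. The sketch below assumes that multi-valued fields such as contributors, subjects, and languages arrive as lists (the field names are from the list above, but their types are an assumption), and the sample record is made up for illustration.

```python
import csv
import io

# Field names copied from the list above, in the same order.
FIELDS = [
    "contributors", "createdPublished", "date", "description", "digitalIds",
    "format", "imageUrl", "imageUrls", "itemId", "languages", "mediums",
    "notes", "partof", "researchCenters", "resourceUrls", "scrapedAt",
    "sourceCollection", "subjects", "title", "topics", "url",
]


def flatten_record(record: dict) -> dict:
    """Turn one record into a flat row of strings, one cell per field."""
    row = {}
    for name in FIELDS:
        value = record.get(name)
        if value is None:
            row[name] = ""  # missing fields become empty cells
        elif isinstance(value, (list, tuple)):
            row[name] = "; ".join(str(item) for item in value)
        else:
            row[name] = str(value)
    return row


def records_to_csv(records: list) -> str:
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(flatten_record(record))
    return buffer.getvalue()


# Made-up sample record, not real catalog data.
sample = {
    "itemId": "example-0001",
    "title": "Example map",
    "contributors": ["Doe, Jane", "Roe, Richard"],
    "languages": ["english"],
    "date": "1862",
}
csv_text = records_to_csv([sample])
print(csv_text)
```

Joining list values with "; " keeps one row per record, which is usually easier to filter in a spreadsheet than one row per contributor.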

🚀 How to use

📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
🌐 Open the Actor. Go to the Library of Congress Scraper page on the Apify Store.
🎯 Set input. Pick a format, add a keyword, set optional language, year range, or collection.
🚀 Run it. Click Start and let the Actor collect catalog records.
📥 Download. Grab your results from the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.
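
For readers who would rather script the download step, a finished run's dataset can also be fetched over the Apify API. The sketch below only builds the export URL for the dataset items endpoint. The dataset ID shown is a placeholder, the helper name is hypothetical, and a non-public dataset would additionally need an API token, which is left out here.

```python
from typing import Optional
from urllib.parse import urlencode

# Maps the four export names used on this page to Apify API format values.
EXPORT_FORMATS = {"csv": "csv", "excel": "xlsx", "json": "json", "xml": "xml"}


def dataset_export_url(
    dataset_id: str,
    export_format: str = "json",
    limit: Optional[int] = None,
) -> str:
    """Build a dataset items URL (hypothetical helper, not part of the Actor)."""
    fmt = EXPORT_FORMATS.get(export_format.lower())
    if fmt is None:
        raise ValueError(f"unsupported export format: {export_format!r}")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = int(limit)  # cap the number of items returned
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# "YOUR_DATASET_ID" is a placeholder; copy the real ID from the run's Dataset tab.
print(dataset_export_url("YOUR_DATASET_ID", "excel"))
print(dataset_export_url("YOUR_DATASET_ID", "csv", limit=100))
```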

🔗 Recommended Actors

📚 LibriVox Audiobooks Scraper - Public-domain audiobooks with reader credits
🗣️ Tatoeba Sentence Corpus Scraper - 12M+ multilingual example sentences
🎨 Met Museum Scraper - Open-access artworks from The Met
📰 ArXiv Scraper - Academic preprints with metadata
📖 Figshare Scraper - Open research datasets and figures

💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.

⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the US Library of Congress. All trademarks mentioned are the property of their respective owners. Only publicly available catalog data is collected. Honor each item's individual rights statement.

🆘 Need Help?

If you hit a bug, have questions about setup, or need a scraper we haven't built yet, open our contact form or write to parseforge@protonmail.com. We also take on paid custom data projects.

For faster answers, join our Discord. It's the best place to get support and suggest new Actors.

Library of Congress Scraper - Collection Search

benthepythondev/library-of-congress-scraper

Search Library of Congress collections and export titles, dates, contributors, descriptions, subjects, formats, images and links.

Ben

Library of Congress Search Scraper

crawlerbros/library-of-congress-search-scraper

Searches the Library of Congress digital collections (loc.gov) - millions of digitized books, photos, maps, manuscripts. Free, no API key.

Crawler Bros

Library of Congress Search Scraper

crawlergang/library-of-congress-search-scraper

Searches the Library of Congress digital collections (loc.gov) - millions of digitized books, photos, maps, manuscripts. Free, no API key.

Crawler Gang

Loc Scraper

fortuitous_pirate/loc-scraper

Search and scrape digitized collections from the Library of Congress: books, maps, photographs, newspapers, manuscripts, and audio recordings.

Fortuitous Pirate

US Congress Members Scraper

crawlerbros/us-congress-members-scraper

Browse US Congress members, bills, and votes via the free Congress.gov API - no auth or proxy required.

Crawler Bros

Open Library Books Scraper

klondikeking/open-library-books-scraper

Pierrick McD0nald

Congress Bill Search

ryanclinton/congress-bill-search

Search and retrieve US congressional bills and legislation from the official Congress.gov API v3. This Apify actor provides structured, machine-readable data on every bill introduced in the United States Congress, from the 1st Congress in 1789 through the current 119th session.

Ryan Clinton

Congress.gov Bill Tracker - Bills, Votes, Sponsors & Subjects

jungle_synthesizer/congress-gov-bill-tracker

Track U.S. Congress bills via the official Congress.gov API. Extracts bill details, sponsors, cosponsors with party/state breakdown, committee assignments, policy subjects, and latest actions. Filter by congress number, bill type, or updatedSince for incremental runs.

BowTiedRaccoon

Congress.gov Bills Scraper | US Federal Legislation Export

parseforge/congress-gov-bills-scraper

Export US House and Senate bills from congress.gov: number, title, chamber, latest action, update date and direct URL. Filter by Congress number and bill type (HR, S, HJRES, SJRES and resolutions). CSV, Excel, JSON or XML for legislative tracking and policy research.

ParseForge

Open Library Books Scraper

scrapers_lat/openlibrary-scraper

Scrape books with title, authors, first publish year, edition count, ISBN, cover image, subjects and a direct link. Search by keyword. Export to JSON, CSV or Excel.