Internet Archive Search Scraper

Pricing

from $10.00 / 1,000 results

Searches the Internet Archive for digital items matching a query and extracts metadata including identifier, title, creator, date, media type, download count, and direct URL.

Rating: 0.0 (0)

Developer: Donny (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 0
  • Last modified: 6 hours ago

What it does

This Apify actor collects item metadata from the Internet Archive's public search API and saves it as structured records in an Apify dataset. It handles pagination automatically, supports a configurable result limit, and applies a timeout to every HTTP request. The actor is designed for reliability: it validates inputs, applies sensible defaults, and writes a fallback record when no results are found, so your downstream workflows never receive an empty dataset.
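The pagination and fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual source: the `collect_results` function, the `fetch_page` callback, the page size of 50, and the shape of the fallback record are all assumptions made for the example. The HTTP call (and its timeout) is left to whatever `fetch_page` implementation is supplied.

```python
from typing import Callable, Dict, List

# Hypothetical fetcher signature: (query, page_number, page_size) -> raw items.
# A real fetcher would issue an HTTP request with a timeout.
FetchPage = Callable[[str, int, int], List[Dict]]


def collect_results(
    query: str,
    max_results: int,
    fetch_page: FetchPage,
    page_size: int = 50,  # assumed page size, not taken from the actor
) -> List[Dict]:
    """Walk result pages until max_results items are collected or pages run out."""
    items: List[Dict] = []
    page = 1
    while len(items) < max_results:
        batch = fetch_page(query, page, page_size)
        if not batch:
            break  # no more results
        # Take only as many items as still fit under the limit.
        items.extend(batch[: max_results - len(items)])
        if len(batch) < page_size:
            break  # a short page means this was the last one
        page += 1

    if not items:
        # Assumed fallback shape: one placeholder record so the dataset is never empty.
        items.append({"identifier": None, "note": f"No results found for: {query}"})
    return items
```

Passing the fetcher in as a callback keeps the loop testable without network access; an in-memory function that slices a list can stand in for the real API.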

Why use it

Manually collecting data from web APIs is tedious and error-prone. This actor removes that burden by running in the cloud on the Apify platform, where it can be scheduled, integrated with webhooks, or chained with other actors. Whether you are conducting research, building a knowledge base, or feeding data into an analytics pipeline, it gives you structured, ready-to-use JSON output. It uses lightweight HTTP requests instead of a full browser, which makes it fast and cost-effective.

Input parameters

Parameter     Type      Required   Default                 Description
searchQuery   string    Yes        "public domain books"   The search term to use when querying the API.
maxResults    integer   No         100                     Maximum number of results to return (1 to 1000).
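As an illustration of how these two parameters might be validated, here is a minimal sketch. The `resolve_input` helper is hypothetical; in particular, falling back to the default query when `searchQuery` is missing or blank, and clamping out-of-range `maxResults` values instead of rejecting them, are assumptions rather than documented behaviour.

```python
DEFAULT_QUERY = "public domain books"
DEFAULT_MAX_RESULTS = 100
MIN_RESULTS, MAX_RESULTS = 1, 1000


def resolve_input(raw: dict) -> dict:
    """Return a cleaned copy of the actor input with defaults applied."""
    query = raw.get("searchQuery")
    if not isinstance(query, str) or not query.strip():
        query = DEFAULT_QUERY  # assumed: fall back instead of failing
    query = query.strip()

    limit = raw.get("maxResults", DEFAULT_MAX_RESULTS)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int):
        limit = DEFAULT_MAX_RESULTS
    # Assumed: clamp into the documented 1 to 1000 range.
    limit = max(MIN_RESULTS, min(MAX_RESULTS, limit))

    return {"searchQuery": query, "maxResults": limit}
```

For example, an input of `{"searchQuery": "jazz recordings", "maxResults": 250}` passes through unchanged, while an empty input resolves to the defaults listed in the table.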

Output data

Each item in the output dataset contains the following fields:

  • identifier - Internet Archive item identifier.
  • title - Item title.
  • creator - Creator or author name.
  • date - Date associated with the item.
  • mediatype - Media type (texts, audio, video, etc.).
  • downloads - Number of downloads.
  • description - Item description.
  • url - Direct URL to the item on Internet Archive.

All string fields are null-checked; missing values are stored as null rather than undefined.

Example output

{
  "identifier": "aliceinwonderland00carriala",
  "title": "Alice's Adventures in Wonderland",
  "creator": "Carroll, Lewis",
  "date": "1866",
  "mediatype": "texts",
  "downloads": 50000,
  "description": "Alice's Adventures in Wonderland by Lewis Carroll",
  "url": "https://archive.org/details/aliceinwonderland00carriala"
}
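To show how a raw search hit could be mapped onto the record shape above, with missing values stored as null, here is a small sketch. The `to_record` helper is hypothetical, and the URL pattern is inferred from the example output rather than taken from the actor's source.

```python
from typing import Any, Dict, Optional

# Fields copied straight from the raw item; "url" is derived separately.
FIELDS = ("identifier", "title", "creator", "date", "mediatype", "downloads", "description")


def to_record(doc: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Build an output record, using None (JSON null) for any missing field."""
    record: Dict[str, Optional[Any]] = {name: doc.get(name) for name in FIELDS}
    identifier = record["identifier"]
    # Assumed URL pattern, matching the example output above.
    record["url"] = f"https://archive.org/details/{identifier}" if identifier else None
    return record
```

Because every key is always present, downstream code can rely on a fixed schema and only needs to check values for null.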

Pricing

This actor is priced on a usage basis:

  • $0.01 per result returned in the dataset.
  • $0.005 per actor start (fixed platform fee).

For example, scraping 500 results would cost approximately $5.005. Apify provides free monthly credits for new users, so you can try the actor at no charge.
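The arithmetic behind that estimate can be reproduced with a short sketch. The `estimate_cost` helper is hypothetical and simply applies the two rates listed above; `Decimal` is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

PRICE_PER_RESULT = Decimal("0.01")   # $0.01 per result
START_FEE = Decimal("0.005")         # $0.005 per actor start


def estimate_cost(result_count: int) -> Decimal:
    """Estimated cost in USD for one run returning result_count results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return START_FEE + PRICE_PER_RESULT * result_count


print(estimate_cost(500))  # 5.005
```

At the maximum of 1,000 results per run, the same formula gives $10.005, in line with the listed rate of $10.00 per 1,000 results plus the start fee.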
