Epstein Files Scraper, Downloader & Search API

Pricing

from $1.00 / 1,000 results

Quickly search, extract, and structure Epstein files with keyword-based discovery, automatic PDF text parsing, and AI-ready output.

Rating

0.0

(0)

Developer

Lofomachines

Maintained by Community

Actor stats

1

Bookmarked

58

Total users

8

Monthly active users

0.62 hours

Issues response

6 days ago

Last modified

Epstein Files Search, Count & Extraction API

Search public Epstein-related files by keyword, count matching files in seconds, or extract detailed file records with AI-ready text.

This Apify Actor is built for no-code users, researchers, journalists, OSINT teams, and automation builders who want a simple way to query the Justice.gov Epstein document index without writing scraping logic.

Why People Use This Actor

  • Find relevant Epstein files by one or more keywords
  • Get a fast count of total matching files before launching larger workflows
  • Extract structured file records with highlights and text previews
  • Feed clean results into AI tools, spreadsheets, databases, and automations

Two Simple Modes

1. Fetch matching files and text

Use this mode when you want the actual dataset rows for matching files, including metadata, snippets, and extracted text when available.

Best for:

  • investigative research
  • legal review
  • AI summarization pipelines
  • bulk export to Airtable, Sheets, Notion, or BI tools

2. Count total files per keyword

Use this mode when you only want to know how many files are available for each keyword.

Best for:

  • quick demand checks
  • low-cost keyword validation
  • pre-flight checks before large extraction runs
  • dashboards and monitoring workflows

When this mode is used, each returned keyword count is pushed with the pricing event name count-keyword.

What You Get

  • Multi-keyword search in one run
  • Fast count-only mode with one result row per keyword
  • Detailed extraction mode with file metadata and text previews
  • Automatic PDF text parsing when readable text is available
  • Structured dataset output ready for no-code tools and AI workflows
  • Proxy-ready setup for better reliability against blocking

Input Example: Detailed Extraction

{
  "mode": "fetch-details",
  "keywords": [
    "dentist",
    "table"
  ],
  "maxItems": 20,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": []
  }
}

maxItems behavior in detailed mode:

  • 20 = up to 20 detailed results for each keyword
  • 0 = fetch all detailed results for each keyword

Input Example: Count Only

{
  "mode": "count-only",
  "keywords": [
    "pinocchio",
    "massage"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": []
  }
}

In count-only mode, maxItems is ignored because the Actor returns one count row per keyword.
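If you build the input from a script rather than the Apify form, the two examples above can be produced by one small helper. This is only a sketch: the function name build_run_input and its validation rules are assumptions for illustration, while the field names and mode values are taken from the input examples.

```python
from typing import Any

# Mode values as they appear in the input examples above.
VALID_MODES = {"fetch-details", "count-only"}


def build_run_input(
    mode: str,
    keywords: list[str],
    max_items: int = 0,
    use_apify_proxy: bool = True,
) -> dict[str, Any]:
    """Build an input object shaped like the examples above (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    # Drop empty keywords and surrounding whitespace (an assumed convenience).
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty keyword is required")

    run_input: dict[str, Any] = {
        "mode": mode,
        "keywords": cleaned,
        "proxyConfiguration": {
            "useApifyProxy": use_apify_proxy,
            "apifyProxyGroups": [],
        },
    }

    # maxItems only matters in detailed mode; 0 means "fetch all" per keyword.
    if mode == "fetch-details":
        if max_items < 0:
            raise ValueError("max_items must be 0 (all) or a positive number")
        run_input["maxItems"] = max_items

    return run_input
```

For example, build_run_input("fetch-details", ["dentist", "table"], max_items=20) reproduces the detailed extraction input, and build_run_input("count-only", ["pinocchio", "massage"]) reproduces the count-only input without a maxItems field.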

Output Example: Detailed Extraction

{
  "mode": "fetch-details",
  "eventName": null,
  "keyword": "epstein flight logs",
  "page": 1,
  "documentId": "doc_123",
  "chunkIndex": 0,
  "originFileName": "EFTA01638670.pdf",
  "originFileUri": "https://www.justice.gov/epstein/files/DataSet%2010/EFTA01638670.pdf",
  "sourceContentType": "application/pdf",
  "extractedText": "This is a parsed text preview from the original PDF...",
  "highlight": [
    "...keyword match snippet..."
  ],
  "processedAt": "2026-01-01T10:00:00Z",
  "indexedAt": "2026-01-01T10:05:00Z"
}
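Because each row carries a chunkIndex, one file may plausibly appear in several rows. The sketch below regroups detailed rows by originFileUri and joins the text in chunk order. It assumes that rows for the same file share the same originFileUri and that chunkIndex orders the chunks within a file; neither is confirmed by the example above, and group_rows_by_file is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Any, Iterable


def group_rows_by_file(rows: Iterable[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Merge detailed rows into one record per source file (illustrative only)."""
    rows_by_uri: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for row in rows:
        rows_by_uri[row["originFileUri"]].append(row)

    files: dict[str, dict[str, Any]] = {}
    for uri, file_rows in rows_by_uri.items():
        # The same chunk can match several keywords; keep one row per chunk index.
        by_chunk: dict[int, dict[str, Any]] = {}
        for row in file_rows:
            by_chunk.setdefault(row.get("chunkIndex", 0), row)
        ordered = [by_chunk[index] for index in sorted(by_chunk)]

        files[uri] = {
            "originFileName": ordered[0].get("originFileName"),
            # Every keyword that matched any chunk of this file.
            "keywords": sorted({r["keyword"] for r in file_rows if r.get("keyword")}),
            # extractedText may be missing when no readable text was available.
            "text": "\n".join(r.get("extractedText") or "" for r in ordered),
        }
    return files
```

The result is keyed by file URI, which is convenient when the next step is summarization or export of one record per document.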

Output Example: Count Only

{
  "mode": "count-only",
  "eventName": "count-keyword",
  "keyword": "pinocchio",
  "totalAvailableFiles": 17,
  "totalMatchingChunks": 17,
  "countedAt": "2026-03-18T16:48:11Z"
}

totalAvailableFiles is the figure to rely on in this mode. It comes from the unique-file aggregation returned by the source, so whenever that aggregation is available it counts matching files rather than individual text chunks.
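A count-only run can act as a pre-flight check before a detailed extraction. The sketch below picks the keywords whose file count falls inside a range you choose. The thresholds, the helper name keywords_worth_extracting, and the fallback to totalMatchingChunks when totalAvailableFiles is missing are all assumptions, not documented Actor behavior.

```python
from typing import Any, Iterable, Optional


def keywords_worth_extracting(
    count_rows: Iterable[dict[str, Any]],
    min_files: int = 1,
    max_files: Optional[int] = None,
) -> list[str]:
    """Return keywords whose file count lies within [min_files, max_files]."""
    selected: list[str] = []
    for row in count_rows:
        if row.get("mode") != "count-only":
            continue  # ignore detailed rows if datasets were mixed

        total = row.get("totalAvailableFiles")
        if total is None:
            # Assumed fallback: use the chunk count when the file count is absent.
            total = row.get("totalMatchingChunks", 0)

        if total < min_files:
            continue
        if max_files is not None and total > max_files:
            continue
        selected.append(row["keyword"])
    return selected
```

The returned list can be passed as the keywords of a later fetch-details run, so the larger run only covers keywords that actually have matching files.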

Great Fit For No-Code Workflows

You can use this Actor as a drop-in data source for:

  • n8n
  • Make
  • Zapier
  • Google Sheets
  • Airtable
  • Notion
  • custom webhook pipelines
  • LLM and RAG workflows

Typical flow:

  1. Run the Actor with one or more keywords
  2. Read the dataset output
  3. Filter by keyword, file name, or total count
  4. Send the results into your app, sheet, or AI workflow
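For builders who prefer a script to a no-code tool, the flow above can be sketched with the official apify-client package for Python. The ACTOR_ID placeholder, the APIFY_TOKEN environment variable name, and the helper names are assumptions; replace the placeholder with the Actor ID shown on its Apify Store page. The filter fields come from the output examples.

```python
import os
from typing import Any, Iterable, Iterator, Optional

# Placeholder, not the real ID: copy the Actor ID from its Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def run_actor(run_input: dict[str, Any], actor_id: str = ACTOR_ID) -> Iterator[dict]:
    """Steps 1 and 2: run the Actor and read its default dataset."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env variable name
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return run details")
    return client.dataset(run["defaultDatasetId"]).iterate_items()


def filter_rows(
    rows: Iterable[dict[str, Any]],
    keyword: Optional[str] = None,
    file_name_contains: Optional[str] = None,
) -> Iterator[dict[str, Any]]:
    """Step 3: keep rows matching a keyword and/or a file-name fragment."""
    for row in rows:
        if keyword is not None and row.get("keyword") != keyword:
            continue
        name = row.get("originFileName") or ""
        if file_name_contains is not None and file_name_contains not in name:
            continue
        yield row
```

A typical call would be filter_rows(run_actor(run_input), keyword="dentist"), with the resulting rows sent on to a sheet, webhook, or AI workflow as in step 4.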

High-Value Use Cases

  • Validate whether a keyword has enough matching files before buying or running a large extraction
  • Build recurring monitors for specific names, places, organizations, or phrases
  • Enrich investigations with file names, highlights, and extracted text
  • Create keyword intelligence dashboards with low-friction count lookups

Quick Start

  1. Open the Actor on Apify
  2. Choose your mode
  3. Enter one or more keywords
  4. Run the Actor
  5. Use the dataset output directly in your workflow

Data Source Note

This Actor is designed to process publicly accessible document sources and return structured outputs for analysis, research, and automation.

Discover More Actors

If you want more ready-to-use scrapers and automation tools, explore the rest of the catalog here:

Discover more actors by Lofomachines