Pricing

from $1.00 / 1,000 results

Epstein Files Scraper, Downloader & Search API

Quickly search, extract, and structure Epstein files with keyword-based discovery, automatic PDF text parsing, and AI-ready output.

Developer

Lofomachines

Actor stats

Issues response: 0.42 hours

Last modified: 3 months ago

Epstein Files Search, Count & Extraction API

Search public Epstein-related files by keyword, count matching files in seconds, or extract detailed file records with AI-ready text.

This Apify Actor is built for no-code users, researchers, journalists, OSINT teams, and automation builders who want a simple way to query the Justice.gov Epstein document index without writing scraping logic.

Why People Use This Actor

Find relevant Epstein files by one or more keywords
Get a fast count of total matching files before launching larger workflows
Extract structured file records with highlights and text previews
Feed clean results into AI tools, spreadsheets, databases, and automations

Two Simple Modes

1. Fetch matching files and text

Use this mode when you want the actual dataset rows for matching files, including metadata, snippets, and extracted text when available.

Best for:

investigative research
legal review
AI summarization pipelines
bulk export to Airtable, Sheets, Notion, or BI tools

2. Count total files per keyword

Use this mode when you only want to know how many files are available for each keyword.

Best for:

quick demand checks
low-cost keyword validation
pre-flight checks before large extraction runs
dashboards and monitoring workflows

When this mode is used, each returned keyword count is pushed with the pricing event name count-keyword.

What You Get

Multi-keyword search in one run
Fast count-only mode with one result row per keyword
Detailed extraction mode with file metadata and text previews
Automatic PDF text parsing when readable text is available
Structured dataset output ready for no-code tools and AI workflows
Proxy-ready setup for better reliability against blocking

Input Example: Detailed Extraction

{
  "mode": "fetch-details",
  "keywords": [
    "dentist",
    "table"
  ],
  "maxItems": 20,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": []
  }
}

maxItems behavior in detailed mode:

20 = up to 20 detailed results for each keyword
0 = fetch all detailed results for each keyword

Input Example: Count Only

{
  "mode": "count-only",
  "keywords": [
    "pinocchio",
    "massage"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": []
  }
}

In count-only mode, maxItems is ignored because the Actor returns one count row per keyword.
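
The two input shapes above can be produced by one small helper. The following is a minimal sketch, not part of the Actor itself: the function name build_run_input and its validation rules (for example, requiring at least one non-blank keyword) are assumptions made for illustration. Only the field names and the maxItems semantics come from the examples above.

```python
from typing import Any

VALID_MODES = {"fetch-details", "count-only"}


def build_run_input(mode: str, keywords: list[str], max_items: int = 0) -> dict[str, Any]:
    """Build an input object shaped like the documented examples (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: blank keywords are not useful, so strip and drop them.
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        raise ValueError("at least one keyword is required")
    run_input: dict[str, Any] = {
        "mode": mode,
        "keywords": cleaned,
        "proxyConfiguration": {"useApifyProxy": True, "apifyProxyGroups": []},
    }
    # maxItems only applies in detailed mode; 0 means "fetch all" for each keyword.
    if mode == "fetch-details":
        if max_items < 0:
            raise ValueError("max_items must be 0 (all) or a positive number")
        run_input["maxItems"] = max_items
    return run_input
```

Leaving maxItems out of the count-only input mirrors the documented behavior that the field is ignored in that mode.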

Output Example: Detailed Extraction

{
  "mode": "fetch-details",
  "eventName": null,
  "keyword": "epstein flight logs",
  "page": 1,
  "documentId": "doc_123",
  "chunkIndex": 0,
  "originFileName": "EFTA01638670.pdf",
  "originFileUri": "https://www.justice.gov/epstein/files/DataSet%2010/EFTA01638670.pdf",
  "sourceContentType": "application/pdf",
  "extractedText": "This is a parsed text preview from the original PDF...",
  "highlight": [
    "...keyword match snippet..."
  ],
  "processedAt": "2026-01-01T10:00:00Z",
  "indexedAt": "2026-01-01T10:05:00Z"
}
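
The chunkIndex field suggests that one file may arrive as several rows. Assuming that is the case, a sketch like the one below could stitch the text back together per file before passing it to a summarizer. The helper name merge_chunks is hypothetical, and the assumption that rows sharing originFileName and chunkIndex are duplicates (for example, the same chunk matched by two keywords) is not confirmed by the Actor's documentation.

```python
from collections import defaultdict
from typing import Any


def merge_chunks(rows: list[dict[str, Any]]) -> dict[str, str]:
    """Join extractedText per originFileName, ordered by chunkIndex (hypothetical helper)."""
    by_file: dict[str, dict[int, str]] = defaultdict(dict)
    for row in rows:
        text = row.get("extractedText")
        if not text:
            # Extracted text is only present when the PDF had readable text.
            continue
        index = row.get("chunkIndex") or 0
        # Assumption: the same (file, chunk) pair seen twice is a duplicate; keep one copy.
        by_file[row["originFileName"]][index] = text
    return {
        name: "\n".join(chunks[i] for i in sorted(chunks))
        for name, chunks in by_file.items()
    }
```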

Output Example: Count Only

{
  "mode": "count-only",
  "eventName": "count-keyword",
  "keyword": "pinocchio",
  "totalAvailableFiles": 17,
  "totalMatchingChunks": 17,
  "countedAt": "2026-03-18T16:48:11Z"
}

totalAvailableFiles is the primary count in this mode. It comes from the unique-file aggregation returned by the source, so when that aggregation is available it reflects the number of matching files rather than the number of individual text chunks.
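
As a pre-flight check, count-only rows can be used to decide which keywords deserve a detailed run. This is a rough sketch under stated assumptions: the helper name keywords_worth_fetching and the min_files threshold are illustrative and not part of the Actor. Only the mode, keyword, and totalAvailableFiles fields come from the output example.

```python
from typing import Any


def keywords_worth_fetching(count_rows: list[dict[str, Any]], min_files: int = 1) -> list[str]:
    """Return keywords whose totalAvailableFiles meets the threshold, highest count first."""
    eligible = [
        (row.get("totalAvailableFiles") or 0, row["keyword"])
        for row in count_rows
        if row.get("mode") == "count-only"  # ignore any detailed rows mixed in
    ]
    ranked = sorted(eligible, key=lambda pair: -pair[0])
    return [keyword for total, keyword in ranked if total >= min_files]
```

The returned list can be passed as the keywords field of a fetch-details input.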

Great Fit For No-Code Workflows

You can use this Actor as a drop-in data source for:

n8n
Make
Zapier
Google Sheets
Airtable
Notion
custom webhook pipelines
LLM and RAG workflows

Typical flow:

1. Run the Actor with one or more keywords
2. Read the dataset output
3. Filter by keyword, file name, or total count
4. Send the results into your app, sheet, or AI workflow
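
For the filter-and-send steps, one option is to flatten dataset rows into CSV text that a spreadsheet tool can import. The sketch below assumes the rows have already been read from the dataset as a list of dictionaries. The helper name rows_to_csv and the choice to join highlight snippets with " | " are illustrative assumptions.

```python
import csv
import io
from typing import Any, Optional


def rows_to_csv(rows: list[dict[str, Any]], fields: list[str], keyword: Optional[str] = None) -> str:
    """Render selected fields as CSV, optionally keeping only one keyword's rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        if keyword is not None and row.get("keyword") != keyword:
            continue
        flat = {}
        for field in fields:
            value = row.get(field, "")
            # List fields such as highlight are joined into a single cell.
            flat[field] = " | ".join(value) if isinstance(value, list) else value
        writer.writerow(flat)
    return buffer.getvalue()
```

Calling rows_to_csv(rows, ["keyword", "originFileName", "highlight"], keyword="dentist") would keep only the rows for that keyword.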

High-Value Use Cases

Validate whether a keyword has enough matching files before buying or running a large extraction
Build recurring monitors for specific names, places, organizations, or phrases
Enrich investigations with file names, highlights, and extracted text
Create keyword intelligence dashboards with low-friction count lookups

Quick Start

1. Open the Actor on Apify
2. Choose your mode
3. Enter one or more keywords
4. Run the Actor
5. Use the dataset output directly in your workflow

Data Source Note

This Actor is designed to process publicly accessible document sources and return structured outputs for analysis, research, and automation.

Discover More Actors

If you want more ready-to-use scrapers and automation tools, explore the rest of the catalog here:

Discover more actors by Lofomachines
