Pricing

from $1.00 / 1,000 pages extracted

Hyper Reader

High-fidelity web extraction for AI agents. Clean Markdown optimized for Claude, GPT-4 & Gemini. 3-level stealth, Vision screenshots, Deep Read link following. Standby Mode for 1-second responses.

Developer

Jason Pellerin

Last modified

14 days ago

🚀 Hyper-Reader: The Agentic Web Bridge

Stop feeding your LLM messy HTML. Hyper-Reader delivers high-fidelity, ad-free content optimized for Claude, GPT-4, and Gemini, with responses in about one second when Standby Mode is enabled.

Built by Jason Pellerin AI Solutionist — the same engineering behind enterprise AI voice agents and automation systems.

Why Hyper-Reader?

| Problem | Hyper-Reader Solution |
| --- | --- |
| Raw HTML is noisy and token-expensive | Clean Markdown with smart content extraction |
| Anti-bot systems block your scrapers | 3-level stealth with fingerprint randomization |
| Different LLMs need different formats | Agent-optimized presets (Claude, GPT, Gemini) |
| Cold starts kill your agent's speed | Standby Mode for 1-second responses |
| Single pages lack context | Deep Read follows links for comprehensive data |

🎯 Agent Presets

Choose your target LLM for optimized output:

Claude (Default)

<document>
  <metadata>
    <title>Article Title</title>
    <author>John Doe</author>
    <published>2024-01-15</published>
    <source>https://example.com/article</source>
  </metadata>
  <content>
    # Main Heading
    
    Clean, structured Markdown content...
  </content>
</document>

GPT-4

# Article Title

> Source: https://example.com/article
> Author: John Doe | Published: 2024-01-15

Content with inline citations [1] and reference links...

---
## References
[1]: https://example.com/article "Original Source"

Gemini

Compact Markdown for Gemini, with aggressive token reduction to make the most of the context window.

SearchGPT

Web-search optimized format with prominent source attribution and fact-checkable structure.
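
As a rough illustration of picking a preset programmatically, here is a minimal sketch. The preset names come from the list above; the `presetForModel` and `buildInput` helpers and their matching rules are hypothetical and not part of the Actor.

```javascript
// Hypothetical helper: map a model identifier to one of the documented presets.
// The prefix rules below are illustrative assumptions.
function presetForModel(modelName) {
  const name = String(modelName).toLowerCase();
  if (name.includes('searchgpt')) return 'SearchGPT';
  if (name.startsWith('gpt')) return 'GPT-4';
  if (name.startsWith('gemini')) return 'Gemini';
  // Claude is the documented default, so unknown models fall back to it.
  return 'Claude';
}

// Build Actor input for a URL and a target model.
function buildInput(url, modelName) {
  return { url, agentPreset: presetForModel(modelName) };
}

console.log(buildInput('https://example.com/article', 'gemini-1.5-pro'));
```

Falling back to `Claude` mirrors the documented default, so an unrecognized model name still yields valid input.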

⚡ Standby Mode

Enable Standby Mode for instant API responses. Your Actor stays warm and ready:

# Response time: ~1 second vs 30+ seconds cold start
curl -X POST "https://YOUR_ACTOR_STANDBY_URL/extract" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "agentPreset": "Claude"}'

Perfect for:

Real-time AI assistants
MCP tool integrations
Cursor/Claude Desktop extensions
n8n and automation workflows
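
For JavaScript callers, the curl request above could be sketched as follows. The `/extract` path and JSON body come from the curl example; the helper names, the client-side timeout, and the base URL are assumptions, and your endpoint may also require an authorization header that is not shown here.

```javascript
// Build the request for the Standby endpoint shown in the curl example.
// baseUrl is a placeholder for your Actor's Standby URL.
function buildStandbyRequest(baseUrl, input) {
  return {
    url: new URL('/extract', baseUrl).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(input),
    },
  };
}

// Send the request with a client-side timeout.
// timeoutMs is an assumption of this sketch, not an Actor setting.
async function extractViaStandby(baseUrl, input, timeoutMs = 10000) {
  const { url, options } = buildStandbyRequest(baseUrl, input);
  const response = await fetch(url, { ...options, signal: AbortSignal.timeout(timeoutMs) });
  if (!response.ok) {
    throw new Error(`Standby request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the request builder separate from the network call makes it easy to inspect or log the exact payload before sending it. The sketch relies on the global `fetch` and `AbortSignal.timeout` available in Node.js 18 and later.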

🛡️ Stealth Levels

Level 1: Basic

Standard datacenter proxies
Basic header rotation
Best for: Blogs, news sites, documentation

Level 2: Standard (Default)

Residential proxy rotation
Browser fingerprint randomization
WebGL/Canvas spoofing
Best for: E-commerce, social media, most protected sites

Level 3: Elite

Premium residential proxies
Human-like mouse movements
Session persistence
Full anti-fingerprinting
Best for: LinkedIn, Amazon, heavily protected sites
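
Because higher levels cost more, one possible pattern is to start at Level 1 and escalate only when a page is blocked. The sketch below assumes a caller-supplied `runExtract` function (hypothetical) that rejects on failure; it is an illustration of the idea, not how the Actor behaves internally.

```javascript
// Try the cheapest stealth level first and escalate only when extraction fails.
// runExtract is a caller-supplied async function (hypothetical) that takes Actor
// input and resolves with the result, or rejects when the page is blocked.
async function extractWithEscalation(runExtract, url, maxLevel = 3) {
  let lastError = null;
  for (let stealthLevel = 1; stealthLevel <= maxLevel; stealthLevel += 1) {
    try {
      const result = await runExtract({ url, stealthLevel });
      return { stealthLevel, result };
    } catch (error) {
      lastError = error; // remember the failure and try the next level
    }
  }
  const reason = lastError ? lastError.message : 'no attempts were made';
  throw new Error(`All stealth levels up to ${maxLevel} failed: ${reason}`);
}
```

The returned `stealthLevel` tells you which level finally worked, which can be cached per domain to skip the cheaper attempts next time.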

🔍 Deep Read

Gather comprehensive context by following internal links:

{
  "url": "https://example.com/product",
  "deepReadDepth": 2,
  "deepReadMaxPages": 10
}

Returns aggregated content from the main page plus related pages (About, FAQ, Reviews, etc.) in a single, structured document.
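
To make the two limits concrete, here is a minimal sketch of a breadth-first walk bounded by depth and page count. It is only a model of what the parameters mean: `planDeepRead` and `getLinks` are hypothetical names, and counting the main page toward `deepReadMaxPages` is an assumption.

```javascript
// Breadth-first walk over internal links, bounded by depth and page count.
// getLinks(url) is a caller-supplied function (hypothetical) returning the
// absolute URLs found on a page; it is synchronous here to keep the sketch short.
function planDeepRead(startUrl, getLinks, deepReadDepth, deepReadMaxPages) {
  const origin = new URL(startUrl).origin;
  const visited = new Set([startUrl]);
  const order = [startUrl]; // assumption: the main page counts toward the limit
  let frontier = [startUrl];

  for (let depth = 0; depth < deepReadDepth && order.length < deepReadMaxPages; depth += 1) {
    const next = [];
    for (const page of frontier) {
      for (const link of getLinks(page)) {
        if (order.length >= deepReadMaxPages) break;
        // Follow only same-origin (internal) links that have not been seen yet.
        if (new URL(link).origin !== origin || visited.has(link)) continue;
        visited.add(link);
        order.push(link);
        next.push(link);
      }
    }
    frontier = next;
  }
  return order;
}
```

With a depth of 2, pages linked from the main page and pages linked from those are included, while anything further away or on another origin is skipped.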

📸 Vision Screenshots

Capture page screenshots for Vision model analysis:

{
  "url": "https://example.com",
  "useVision": true
}

Returns a 1280x720 optimized PNG stored in Apify's Key-Value Store, perfect for GPT-4V, Claude Vision, or Gemini Pro Vision.
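
One way to hand the screenshot to a vision model is to download it and encode it as base64. The sketch below assumes the `screenshotUrl` from the output is directly downloadable; the helper names are hypothetical, and some APIs expect the raw base64 string rather than a data URL.

```javascript
// The first eight bytes of every PNG file, per the PNG specification.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Check that a byte array starts with the PNG signature.
function isPng(bytes) {
  return bytes.length >= 8 && PNG_SIGNATURE.every((byte, i) => bytes[i] === byte);
}

// Encode PNG bytes as a data URL.
function toPngDataUrl(bytes) {
  if (!isPng(bytes)) throw new Error('Not a PNG image');
  return `data:image/png;base64,${Buffer.from(bytes).toString('base64')}`;
}

// Download the screenshot referenced by an output item's screenshotUrl.
async function loadScreenshot(screenshotUrl) {
  const response = await fetch(screenshotUrl);
  if (!response.ok) throw new Error(`Download failed with HTTP ${response.status}`);
  return toPngDataUrl(new Uint8Array(await response.arrayBuffer()));
}
```

Checking the signature before encoding catches the case where the URL returns an error page instead of an image.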

Input Schema

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `url` | string | - | Target URL to extract |
| `urls` | array | - | Multiple URLs (batch mode) |
| `agentPreset` | enum | `Claude` | Output optimization target |
| `outputFormat` | enum | `markdown` | `markdown`, `json`, or `html_cleaned` |
| `stealthLevel` | integer | `2` | 1-3 (Basic to Elite) |
| `useVision` | boolean | `false` | Capture screenshot |
| `deepReadDepth` | integer | `0` | Link following depth (0-3) |
| `waitForSelector` | string | - | CSS selector to wait for |
| `excludeSelectors` | string | - | Elements to remove (comma-separated) |
| `maxContentLength` | integer | `0` | Truncate output (0 = unlimited) |
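
Validating input on the client side can save a paid run. The following is a minimal sketch that checks the enums and ranges listed above; `validateInput` is a hypothetical helper, the preset names are taken from the Agent Presets section, and the rule that either `url` or `urls` must be present is an assumption.

```javascript
const PRESETS = ['Claude', 'GPT-4', 'Gemini', 'SearchGPT'];
const FORMATS = ['markdown', 'json', 'html_cleaned'];

function isIntegerInRange(value, min, max) {
  return Number.isInteger(value) && value >= min && value <= max;
}

// Returns a list of problems; an empty list means the input looks valid.
function validateInput(input) {
  const errors = [];
  const hasUrl = typeof input.url === 'string' && input.url.length > 0;
  const hasUrls = Array.isArray(input.urls) && input.urls.length > 0;
  // Assumption: at least one of url / urls must be provided.
  if (!hasUrl && !hasUrls) errors.push('Provide url or a non-empty urls array');
  if (input.agentPreset !== undefined && !PRESETS.includes(input.agentPreset)) {
    errors.push(`agentPreset must be one of: ${PRESETS.join(', ')}`);
  }
  if (input.outputFormat !== undefined && !FORMATS.includes(input.outputFormat)) {
    errors.push(`outputFormat must be one of: ${FORMATS.join(', ')}`);
  }
  if (input.stealthLevel !== undefined && !isIntegerInRange(input.stealthLevel, 1, 3)) {
    errors.push('stealthLevel must be an integer from 1 to 3');
  }
  if (input.deepReadDepth !== undefined && !isIntegerInRange(input.deepReadDepth, 0, 3)) {
    errors.push('deepReadDepth must be an integer from 0 to 3');
  }
  return errors;
}

console.log(validateInput({ url: 'https://example.com', stealthLevel: 4 }));
```

Omitted fields are skipped so that the Actor's own defaults apply.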

Output Structure

{
  "url": "https://example.com/article",
  "finalUrl": "https://example.com/article/",
  "format": "markdown",
  "agentPreset": "Claude",
  "content": "# Article Title\n\nClean markdown content...",
  "metadata": {
    "title": "Article Title",
    "author": "John Doe",
    "publishDate": "2024-01-15",
    "description": "Article description...",
    "wordCount": 1500,
    "readingTimeMinutes": 7
  },
  "screenshotUrl": "https://api.apify.com/v2/key-value-stores/.../screenshot.png",
  "processingTimeMs": 2340,
  "charCount": 8500,
  "extractedAt": "2024-01-15T10:30:00.000Z"
}
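
The `charCount` field can drive a quick context-budget check before the content is sent to a model. This is a rough sketch: the four-characters-per-token ratio is a common rule of thumb for English text rather than anything the Actor reports, the helper names are hypothetical, and treating `maxContentLength` as a character count is an assumption.

```javascript
// Rough token estimate from charCount (heuristic, not an exact tokenizer).
function estimateTokens(item, charsPerToken = 4) {
  return Math.ceil(item.charCount / charsPerToken);
}

// Decide whether an item fits a token budget; if not, suggest a
// maxContentLength (assumed to be in characters) for the next run.
function fitToBudget(item, tokenBudget, charsPerToken = 4) {
  const tokens = estimateTokens(item, charsPerToken);
  if (tokens <= tokenBudget) return { fits: true, tokens };
  return { fits: false, tokens, maxContentLength: tokenBudget * charsPerToken };
}

console.log(fitToBudget({ charCount: 8500 }, 1000));
```

For exact counts, use the tokenizer of the model you are targeting.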

Use Cases

🤖 AI Agent Research

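// Feed clean web data to your AI agent
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('ai_solutionist/hyper-reader').call({
  url: 'https://docs.example.com/api',
  agentPreset: 'Claude'
});
// Assumes the Actor writes its output to the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();
// items[0].content is ready for your LLM context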

📊 Competitive Intelligence

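// Extract competitor pages with deep context
// (client is an ApifyClient instance from the apify-client package)
const run = await client.actor('ai_solutionist/hyper-reader').call({
  url: 'https://competitor.com/pricing',
  deepReadDepth: 2,
  agentPreset: 'GPT-4'
});
// Assumes the Actor writes its output to the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();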

🔗 MCP Tool Integration

{
  "mcpServers": {
    "hyper-reader": {
      "command": "npx",
      "args": ["-y", "@anthropic-ai/mcp-apify"],
      "env": {
        "APIFY_TOKEN": "your_token",
        "ACTOR_ID": "ai_solutionist/hyper-reader"
      }
    }
  }
}

📰 News Aggregation

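// Batch extract multiple articles
// (client is an ApifyClient instance from the apify-client package)
const run = await client.actor('ai_solutionist/hyper-reader').call({
  urls: [
    'https://news.site/article1',
    'https://news.site/article2',
    'https://news.site/article3'
  ],
  agentPreset: 'Gemini',
  outputFormat: 'json'
});
// Assumes the Actor writes its output to the run's default dataset
const { items } = await client.dataset(run.defaultDatasetId).listItems();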

Pricing

| Tier | Price | Features |
| --- | --- | --- |
| Standard | $1 / 1,000 pages | Full extraction, all presets, Stealth Levels 1-2 |
| Elite | $5 / 1,000 pages | Stealth Level 3, premium residential proxies |
| Pro Monthly | $49 / month | Standby Mode, unlimited standard proxy usage |
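
The per-page rates above translate into a simple estimate. In this minimal sketch, `estimateCost` is a hypothetical helper; it assumes Stealth Level 3 runs are billed at the Elite rate and everything else at the Standard rate, and it ignores the Pro Monthly flat fee and any platform usage charges.

```javascript
// Per-1,000-page rates from the pricing table, in US dollars.
const RATE_PER_1000 = { standard: 1, elite: 5 };

// Assumption: Level 3 is billed at the Elite rate, Levels 1-2 at the Standard rate.
function estimateCost(pages, stealthLevel = 2) {
  const rate = stealthLevel === 3 ? RATE_PER_1000.elite : RATE_PER_1000.standard;
  return (pages * rate) / 1000;
}

console.log(estimateCost(2500));    // Standard rate
console.log(estimateCost(2500, 3)); // Elite rate
```

At these rates, 2,500 pages come to $2.50 on Standard and $12.50 on Elite.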

Support

Documentation: jasonpellerin.com/hyper-reader
Issues: Open an issue on GitHub
Enterprise: Contact Jason Pellerin

Built with 🔥 by Jason Pellerin AI Solutionist

Transforming web chaos into agent-ready intelligence.

Build timestamp: Sun Jan 18 16:29:53 MST 2026
