Pricing

from $20.00 / 1,000 tool calls

🕷️ Web Scraping MCP — AI Content Extraction

MCP server letting AI agents (Claude Desktop, Cursor, n8n, OpenAI Agents SDK) scrape any website, run Google searches, query Wikipedia, crawl pages, and parse HTML at LLM tool-call time. Universal pay-per-result web extraction — drop-in for RAG pipelines and research agents.

Developer

NexGenData

Last modified: 21 days ago

🕸️ Web Scraping MCP Server — AI-Native Crawl, Google Search & URL Extraction

MCP (Model Context Protocol) server that gives any AI agent a generic web-scraping + Google-search tool surface. Crawl any URL, run a Google query, fetch + parse a page into clean markdown, or run a multi-page site crawl — all surfaced as MCP tools for Claude Desktop, Cursor, Cline, OpenAI custom GPTs, and any MCP-compatible client. Built as a drop-in alternative to Firecrawl, Browserbase, Bright Data Web Unlocker, and base-LLM web-search (which is rate-capped and shallow).

Why Web Scraping MCP Beats Firecrawl, Browserbase, Bright Data & Generic LLM Search

Feature	NexGenData Web Scraping MCP	Firecrawl	Browserbase	Bright Data Web Unlocker	Generic LLM (built-in search)
Cost	$0.002 / event, pay-per-event	$19+ / month base	$39+ / month base	$$$ enterprise contract	Free (shallow, rate-capped)
MCP-native	Yes — Claude / Cursor / Cline	Yes (separate offering)	Partial	No	No
Generic crawl any URL	Yes — Apify proxy pool	Yes	Yes	Yes	Limited
Google search results	Yes	Plan-gated	No	Yes	Capped + shallow
Markdown extraction	Yes	Yes	No (raw HTML)	No (raw HTML)	Limited
Site crawl (depth + sitemap)	Yes	Yes	Build it yourself	Build it yourself	None
Cloudflare / Captcha handling	Yes	Plan-gated	Plan-gated	Yes	None
AI-agent integration	Native MCP — any client	Native MCP	SDK only	SDK only	Built into client
Auth	Apify token	Firecrawl key	Browserbase key	Bright Data account	None
Monthly minimum	None	$19+	$39+	$$$	None

Agent teams typically pick this MCP server for three reasons: it is cheaper than Firecrawl or Browserbase for ad-hoc agent traffic because there is no monthly minimum, it bundles scraping, Google search, and site crawling in one server instead of three separately integrated tools, and it returns clean full-page markdown where the built-in web search in base Claude or GPT-4 returns capped snippets. For example, a research agent can answer "summarize the top 10 Google results for 'GPT-5 release date'" from full-page extracts rather than snippets.

Tools Exposed via MCP

crawl_url — fetch + render a URL, return clean markdown + metadata
google_search — programmable Google search with location / language / SafeSearch
crawl_site — multi-page site crawl with depth + sitemap support
extract_links — pull all outbound + internal links from a URL
screenshot_url — render + return PNG screenshot (full page or viewport)
extract_structured — schema-guided field extraction from a URL
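
For readers wiring up a custom client, the sketch below shows roughly what a call to one of the tools above looks like on the wire. MCP clients invoke tools with a JSON-RPC 2.0 request whose method is `tools/call`. The argument names `url` and `format` are taken from the Quick Start input and are assumed, not confirmed, to match the tool's schema; `build_tool_call` is a hypothetical helper.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects each request to carry its own id.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Return a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names below are assumptions borrowed from the Quick Start example.
request = build_tool_call(
    "crawl_url",
    {"url": "https://example.com/article", "format": "markdown"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds this envelope for you; the sketch is only meant to make the tool names above concrete.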

Use Cases

Research agents — go beyond LLM training cutoff with live web crawl
Competitive intel — daily competitor blog / pricing page diff via tool calls
RAG ingest pipelines — turn a URL list into clean markdown for embedding
Content monitoring — flag changes to a URL on a schedule via agent
News research — Google search + crawl-the-top-N pattern as one agent flow
SEO audits — programmatic crawl + audit of a competitor sitemap
Knowledge-base sync — pull external help docs into your own KB regularly
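
The "Google search + crawl-the-top-N" flow from the list above can be sketched as a small orchestration function. This is a minimal illustration, not the server's implementation: `search` and `crawl` are hypothetical callables standing in for the `google_search` and `crawl_url` tools, and the stubs return made-up data so the sketch runs offline.

```python
from typing import Callable


def search_then_crawl(
    query: str,
    search: Callable[[str], list],
    crawl: Callable[[str], str],
    top_n: int = 10,
) -> list:
    """Run one search, then crawl the first `top_n` result URLs."""
    pages = []
    for result in search(query)[:top_n]:
        url = result["url"]
        try:
            pages.append({"url": url, "markdown": crawl(url)})
        except Exception as exc:
            # Keep going if one page fails; record the error for the agent.
            pages.append({"url": url, "error": str(exc)})
    return pages


# Offline stubs (hypothetical) standing in for the real tool calls.
def fake_search(query: str) -> list:
    return [{"url": f"https://example.com/{i}"} for i in range(20)]


def fake_crawl(url: str) -> str:
    return f"# Page at {url}"


pages = search_then_crawl("GPT-5 release date", fake_search, fake_crawl, top_n=3)
for page in pages:
    print(page["url"])
```

Passing the tool calls in as functions keeps the flow easy to test and lets the same loop work with any MCP client.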

Connect to Claude Desktop

{
  "mcpServers": {
    "nexgendata-scrape": {
      "url": "https://nexgendata--web-scraping-mcp-server.apify.actor/mcp",
      "headers": { "Authorization": "Bearer YOUR_APIFY_TOKEN" }
    }
  }
}
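
If you maintain the client configuration by script, the entry above can be merged into an existing config without hardcoding the token. The sketch below assumes the token lives in an `APIFY_TOKEN` environment variable (an assumed name) and that the config is already loaded as a dictionary; `add_scrape_server` is a hypothetical helper, and where the config file lives depends on your client.

```python
import json
import os

SERVER_URL = "https://nexgendata--web-scraping-mcp-server.apify.actor/mcp"


def add_scrape_server(config: dict, token: str) -> dict:
    """Return a copy of `config` with the nexgendata-scrape entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["nexgendata-scrape"] = {
        "url": SERVER_URL,
        "headers": {"Authorization": f"Bearer {token}"},
    }
    merged["mcpServers"] = servers
    return merged


# Example existing config with one unrelated (made-up) server entry.
existing = {"mcpServers": {"other-server": {"url": "https://example.com/mcp"}}}

# APIFY_TOKEN is an assumed variable name; fall back to the placeholder.
token = os.environ.get("APIFY_TOKEN", "YOUR_APIFY_TOKEN")
updated = add_scrape_server(existing, token)
print(json.dumps(updated, indent=2))
```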

Quick Start (Python)

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and block until the run finishes.
run = client.actor("nexgendata/web-scraping-mcp-server").call(run_input={
    "tool": "crawl_url",
    "params": {"url": "https://example.com/article", "format": "markdown"}
})

# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

Pricing — Pay Per Tool Call

Actor start: $0.0001
Tool call: $0.0020

500 tool calls (crawl or search) cost $1.00 (500 × $0.0020), plus $0.0001 for each actor start. No monthly minimum.
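
As a quick check of the arithmetic, here is a small cost estimator using the two per-event prices listed above. `estimate_cost` is a hypothetical helper, and it assumes every tool call is billed at the same flat rate.

```python
from decimal import Decimal

# Per-event prices from the pricing list above.
ACTOR_START_FEE = Decimal("0.0001")
TOOL_CALL_FEE = Decimal("0.0020")


def estimate_cost(tool_calls: int, actor_starts: int = 1) -> Decimal:
    """Estimated charge in USD for a number of tool calls and actor starts."""
    return actor_starts * ACTOR_START_FEE + tool_calls * TOOL_CALL_FEE


print(estimate_cost(500, actor_starts=0))  # tool calls only
print(estimate_cost(500))                  # including one actor start
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly.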

Related Actors

Use case	Actor
AI web scraper (LLM-formatted output)	ai-web-scraper
SEO web analysis MCP (Lighthouse + tech stack)	seo-web-analysis-mcp-server
Domain intelligence MCP (DNS / WHOIS / SSL)	domain-intelligence-mcp-server
Developer tools MCP (NPM + PyPI + StackOverflow)	developer-tools-mcp-server
News MCP (headline search across publishers)	news-mcp-server
Reddit MCP (post + comment search)	reddit-mcp-server
Academic research MCP (papers + citations)	academic-research-mcp-server
26-server gateway (scraping + 25 more)	enterprise-mcp-gateway
Google CSE replacement (programmable search)	google-cse-replacement
Google cache viewer	google-cache-viewer
Page speed analyzer (Lighthouse bulk)	page-speed-analyzer

FAQ

Q: Does it handle JavaScript-rendered pages? A: Yes — by default crawl_url runs a headless browser that executes JS. Static-only mode is available for speed.

Q: How does it deal with Cloudflare / captchas? A: Apify's anti-bot infrastructure + residential proxy pool absorbs most challenges transparently.

Q: Is there a rate limit? A: Per-actor concurrency is high; for very large crawls (10k+ pages) split into parallel runs for better throughput.

Q: Can my agent run a deep crawl with depth=5? A: Yes — crawl_site supports configurable depth, max pages, sitemap-driven discovery, and include / exclude URL patterns.
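
To make the depth, max-pages, and include / exclude settings concrete, the sketch below runs a breadth-first traversal over an in-memory link graph. It illustrates the general technique only and is not `crawl_site`'s actual implementation; `plan_crawl`, its parameter names, and the sample site are all hypothetical.

```python
import re
from collections import deque


def plan_crawl(start, links, max_depth=2, max_pages=100, include=None, exclude=None):
    """Return URLs in breadth-first order, honoring depth and pattern limits."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue and len(order) < max_pages:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            if include_re and not include_re.search(nxt):
                continue
            if exclude_re and exclude_re.search(nxt):
                continue
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return order


# Made-up link graph: page URL -> links found on that page.
site = {
    "https://example.com/": ["https://example.com/blog", "https://example.com/login"],
    "https://example.com/blog": [
        "https://example.com/blog/post-1",
        "https://example.com/blog/post-2",
    ],
    "https://example.com/blog/post-1": ["https://example.com/blog/post-1/comments"],
}

pages = plan_crawl("https://example.com/", site, max_depth=2, exclude=r"/login$")
print(pages)
```

With `max_depth=2` the comments page (three clicks from the start URL) is never visited, and the exclude pattern drops the login page.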

Q: How does this compare with Firecrawl? A: Firecrawl is a great dedicated crawler-MCP; this server is cheaper than Firecrawl for low-volume agent traffic and uses Apify's broader proxy pool. Pick whichever fits your traffic curve.

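Q: Is scraping legal? A: In the US, hiQ v. LinkedIn held that scraping publicly accessible pages is unlikely to violate the Computer Fraud and Abuse Act, but terms-of-service, copyright, and privacy claims can still apply. We respect robots.txt and surface the upstream ToS to you; you are responsible for downstream usage of scraped content.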
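
If you want to run your own robots.txt check before handing URLs to an agent, Python's standard library includes a parser. The sketch below is a client-side illustration and says nothing about how the server performs its check; `is_allowed`, the user-agent string, and the sample rules are made up.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules (made up) that block one directory for all agents.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(ROBOTS, "example-agent", "https://example.com/article"))
print(is_allowed(ROBOTS, "example-agent", "https://example.com/private/report"))
```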

Input Example

This MCP server requires no run input — it is started by your MCP-compatible AI client (Claude Desktop, Cursor, Cline, Windsurf, n8n) and tools are invoked at reasoning time. If you do want to launch it directly via the Apify API for a test ping, pass an empty input:

{}

Optional configuration properties (all have sensible defaults):

{
  "debug": false,
  "maxToolCallSeconds": 90
}
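
For the test ping mentioned above, a run can be started through the Apify REST API's run-actor endpoint. The sketch below only builds the request and does not send it; `build_run_request` is a hypothetical helper, and the token is a placeholder.

```python
import json
import urllib.request


def build_run_request(actor: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    # The Apify API addresses actors as "username~actor-name" in URLs.
    actor_id = actor.replace("/", "~")
    return urllib.request.Request(
        url=f"https://api.apify.com/v2/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Empty input, as described above for a test ping.
request = build_run_request("nexgendata/web-scraping-mcp-server", "YOUR_APIFY_TOKEN", {})
print(request.get_method(), request.full_url)
```

To actually start the run, pass the request to `urllib.request.urlopen(request)` with a real token.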

Output Example

When invoked through an MCP client, each tool returns structured JSON that the LLM consumes directly. A representative tool response looks like:

{
  "tool": "search",
  "status": "ok",
  "items": [
    {
      "id": "abc-123",
      "title": "Sample result",
      "url": "https://example.com/listing/123",
      "summary": "One-line LLM-ready summary of the item",
      "fields": { "price": 749000, "location": "Austin, TX" }
    }
  ],
  "meta": { "count": 1, "ms": 812 }
}

The same items are stored in the underlying Apify dataset and can be exported as CSV, JSONL, or Excel for downstream pipelines.
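
A downstream pipeline would usually check the status and pull out the fields it needs. The sketch below parses a response shaped like the representative example above; the field names come from that example and may differ per tool, and `extract_items` is a hypothetical helper.

```python
import json


def extract_items(raw: str) -> list:
    """Parse a tool response and return title, URL, and summary per item."""
    response = json.loads(raw)
    if response.get("status") != "ok":
        raise ValueError(f"tool call failed with status {response.get('status')!r}")
    return [
        {
            "title": item.get("title"),
            "url": item["url"],
            "summary": item.get("summary", ""),
        }
        for item in response.get("items", [])
    ]


# Trimmed copy of the representative response shown above.
raw = """
{
  "tool": "search",
  "status": "ok",
  "items": [
    {
      "id": "abc-123",
      "title": "Sample result",
      "url": "https://example.com/listing/123",
      "summary": "One-line LLM-ready summary of the item"
    }
  ],
  "meta": { "count": 1, "ms": 812 }
}
"""

items = extract_items(raw)
print(items)
```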

How NexGenData Pricing Works

Every NexGenData actor uses pay-per-event pricing — you only pay for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.

Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
Result / tool call: charged per item written to the default dataset or per MCP tool call
No charge for retries, internal proxy rotation, or failed sub-requests — those are absorbed by the platform

Apify Platform Bonus

New to Apify? Sign up with the NexGenData referral link — you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.

Integration Surface

Every actor in the NexGenData catalog can be triggered from:

Apify console — point-and-click run
Apify API — REST + webhooks
Apify Python / JS SDKs — programmatic batch
Zapier, Make.com, n8n — official integrations
MCP — many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
Schedules — built-in cron for daily / weekly / monthly runs
Webhooks — POST results to any HTTPS endpoint on dataset write

Support

NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome — high-demand features ship in the next version.

Home: thenextgennexus.com
Full catalog: apify.com/nexgendata
