Pricing

Pay per usage

Api Surface Mapper

An Apify Actor that discovers a website’s API surface by capturing browser network traffic (`fetch`/`xhr`), grouping similar requests into endpoint candidates, scoring them, and generating ready-to-run replay snippets (curl + TypeScript fetch).

Developer

Nikita Chapovskii

Last modified

2 months ago

API Surface Mapper (Crawlee + Playwright)

This is API discovery, not HTML scraping: point it at a site, optionally perform a few interactions, and get a ranked list of endpoints the UI is calling.

What it does

For each visited page, the Actor:

1. Navigates using PlaywrightCrawler (Crawlee performs navigation automatically).
2. Attaches a network tap before navigation to capture early fetch/xhr requests.
3. Optionally runs a flow (steps) to trigger pagination, infinite scroll, filters, “Load more”, etc.
4. Waits until the network becomes quiet (no new fetch/xhr for quietMs).
5. Builds endpoint candidates:
   - normalizes URLs
   - patternizes volatile segments (IDs, tokens, etc.)
   - groups exchanges by endpoint pattern + method + kind
6. Classifies candidates as REST / GraphQL / Other.
7. Scores candidates and outputs the top-N with replay snippets.
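
The candidate-building step can be pictured roughly as follows. This is a minimal sketch, not the Actor's actual code: the names `CapturedExchange`, `patternizeUrl` and `groupExchanges`, the heuristics for what counts as a volatile segment (numeric IDs, UUIDs, long hex strings), and the choice to wildcard every query value are all assumptions made for illustration.

```typescript
// Hypothetical shape of one captured request; field names are illustrative.
interface CapturedExchange {
  url: string;
  method: string;
  kind: "REST" | "GraphQL" | "Other";
}

// Assumed heuristics for volatile path segments: numeric IDs, UUIDs, long hex tokens.
const VOLATILE_SEGMENT_PATTERNS: RegExp[] = [
  /^\d+$/,
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i,
  /^[0-9a-f]{16,}$/i,
];

// Replace volatile path segments and all query values with "*".
function patternizeUrl(rawUrl: string): string {
  const parsed = new URL(rawUrl);
  const path = parsed.pathname
    .split("/")
    .map((segment) =>
      VOLATILE_SEGMENT_PATTERNS.some((re) => re.test(segment)) ? "*" : segment,
    )
    .join("/");
  // Keep the query keys (deduplicated, sorted for stability) but wildcard the values.
  const keys = Array.from(new Set(parsed.searchParams.keys())).sort();
  const query = keys.length > 0 ? "?" + keys.map((key) => `${key}=*`).join("&") : "";
  return `${parsed.origin}${path}${query}`;
}

// Group exchanges by method + kind + URL pattern.
function groupExchanges(exchanges: CapturedExchange[]): Map<string, CapturedExchange[]> {
  const groups = new Map<string, CapturedExchange[]>();
  for (const exchange of exchanges) {
    const key = `${exchange.method.toUpperCase()} ${exchange.kind} ${patternizeUrl(exchange.url)}`;
    const bucket = groups.get(key) ?? [];
    bucket.push(exchange);
    groups.set(key, bucket);
  }
  return groups;
}
```

With these assumptions, `https://api.example.com/v1/items/1?page=1` and `https://api.example.com/v1/items/2?page=7` fall into the same group, while a POST to the same path forms a separate candidate.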

Key features

- Captures fetch and xhr requests (configurable).
- Optional JSON response sampling (size-limited).
- GraphQL detection from request body (query / operationName) even if endpoint is not /graphql.
- Endpoint grouping via URL patternization so you get “unique endpoints”, not a dump of raw URLs.
- Generates replay snippets:
  - curl
  - fetch (TypeScript)
- Optional link crawling via enqueueLinks().
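
To illustrate the body-based GraphQL detection mentioned above, here is a hedged sketch. The function name `looksLikeGraphQL` and the handling of batched (array) bodies are assumptions; the README only states that `query` / `operationName` in the request body are used as signals.

```typescript
// Returns true when a request body looks like a GraphQL operation,
// regardless of the URL it was sent to.
function looksLikeGraphQL(requestBody: string | null): boolean {
  if (!requestBody) return false;
  let parsed: unknown;
  try {
    parsed = JSON.parse(requestBody);
  } catch {
    return false; // body is not JSON, so it cannot be a JSON-encoded GraphQL request
  }
  // Assumption: batched GraphQL requests arrive as an array of operations.
  const operations: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
  return operations.some((operation) => {
    if (typeof operation !== "object" || operation === null) return false;
    const record = operation as Record<string, unknown>;
    return typeof record.query === "string" || typeof record.operationName === "string";
  });
}
```

A POST to `/api/data` with a body such as `{"query":"{ tracks { id } }"}` would be flagged, while a body like `{"page":1}` would not.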

Output (Dataset)

For each processed page, the Actor stores an item like:

{
    "pageUrl": "https://example.com/",
    "timestamp": "2026-01-07T15:49:37.257Z",
    "stats": { "exchangesCaptured": 12, "candidates": 7 },
    "candidates": [
        {
            "patternUrl": "https://api.example.com/v1/items?page=*",
            "method": "GET",
            "kind": "REST",
            "score": 55,
            "sample": {
                "url": "https://api.example.com/v1/items?page=1",
                "status": 200,
                "contentType": "application/json",
                "requestHeaders": { "...": "..." },
                "requestBody": null,
                "responseBody": { "...": "..." }
            },
            "generated": {
                "curl": "curl ...",
                "tsFetch": "const res = await fetch(...)"
            }
        }
    ]
}
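
The `generated.curl` field holds a replay command built from the sample. The sketch below shows one way such a command could be assembled; it is an assumption about the approach, not the Actor's real generator. `ReplaySample`, `shellQuote` and `toCurl` are hypothetical names, and the request body is assumed to be a string or `null`.

```typescript
// Hypothetical subset of the "sample" object needed to replay a request.
interface ReplaySample {
  url: string;
  method: string;
  requestHeaders: Record<string, string>;
  requestBody: string | null;
}

// Quote a value for a POSIX shell: wrap in single quotes and escape embedded ones.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build a curl command line from a captured sample.
function toCurl(sample: ReplaySample): string {
  const parts: string[] = ["curl", "-X", sample.method.toUpperCase(), shellQuote(sample.url)];
  for (const [headerName, headerValue] of Object.entries(sample.requestHeaders)) {
    parts.push("-H", shellQuote(`${headerName}: ${headerValue}`));
  }
  if (sample.requestBody !== null) {
    parts.push("--data-raw", shellQuote(sample.requestBody));
  }
  return parts.join(" ");
}
```

Headers listed in `redactHeaders` would presumably appear redacted in the generated command, so a replay against an authenticated endpoint may need the real values filled back in by hand.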

Input

The input is intentionally flat and simple.

startUrls (required): array of start URLs. Accepts both:

* "https://example.com"
* { "url": "https://example.com" }
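
Since both shapes are accepted, a consumer that builds inputs programmatically might normalize them first. A small sketch, with `StartUrlInput` and `normalizeStartUrls` as hypothetical names:

```typescript
// The two accepted shapes for an entry in startUrls.
type StartUrlInput = string | { url: string };

// Convert every entry to a normalized absolute URL string.
// new URL() throws a TypeError for malformed input, which surfaces bad entries early.
function normalizeStartUrls(input: StartUrlInput[]): string[] {
  return input.map((entry) => {
    const raw = typeof entry === "string" ? entry : entry.url;
    return new URL(raw).href;
  });
}
```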

Crawling

maxRequests (default: 20): maximum number of pages to process.
enqueueLinks (default: false): whether to discover and enqueue links from each page.
strategy (default: "same-hostname"): crawling strategy for links:
- "same-hostname" | "same-domain" | "all"
globs (optional): allowlist patterns for links to enqueue.
linkSelector (default: "a[href]"): selector used by enqueueLinks().

Capture

captureTypes (default: ["xhr","fetch"]): which request types to capture.
maxExchangesPerPage (default: 250): hard cap of captured exchanges per page.
includeResponseBodies (default: false): if true, attempts to parse JSON responses and store a sample.
maxBodyKb (default: 256): JSON body size limit (best-effort).
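
How `includeResponseBodies` and `maxBodyKb` interact can be sketched as below. This is an illustration under assumptions: 1 KB is taken as 1024 bytes, oversized bodies are skipped rather than truncated, and `BodySample` / `sampleJsonBody` are hypothetical names.

```typescript
// Outcome of trying to store a response body sample.
type BodySample =
  | { stored: true; json: unknown }
  | { stored: false; reason: "not-json" | "too-large" | "parse-error" };

// Parse a response body as JSON only if it is declared as JSON and within the size limit.
function sampleJsonBody(bodyText: string, contentType: string, maxBodyKb: number): BodySample {
  if (!contentType.toLowerCase().includes("json")) {
    return { stored: false, reason: "not-json" };
  }
  // Measure the UTF-8 byte length, not the character count.
  const sizeBytes = new TextEncoder().encode(bodyText).length;
  if (sizeBytes > maxBodyKb * 1024) {
    return { stored: false, reason: "too-large" };
  }
  try {
    return { stored: true, json: JSON.parse(bodyText) as unknown };
  } catch {
    return { stored: false, reason: "parse-error" };
  }
}
```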

Settle / timing

quietMs (default: 800): quiet period (no new fetch/xhr) before we consider capture “settled”.
quietTimeoutMs (default: 15000): hard timeout for settling. The quiet period only starts counting once the first request has been captured, which prevents reporting “quiet” too early when a page triggers fetch/xhr slightly later.
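
The interplay between quietMs and quietTimeoutMs can be expressed as a small decision function that a polling loop might call on every tick. This is a sketch of the described behavior, not the Actor's code; `SettleState`, `SettleVerdict` and `checkSettled` are hypothetical names, and all timestamps are assumed to be in milliseconds.

```typescript
interface SettleState {
  startedAt: number; // when settling began
  lastRequestAt: number | null; // most recent captured fetch/xhr, or null if none yet
}

type SettleVerdict = "waiting" | "settled" | "timed-out";

function checkSettled(
  state: SettleState,
  now: number,
  quietMs: number,
  quietTimeoutMs: number,
): SettleVerdict {
  // Quiet only counts once at least one request has been captured.
  if (state.lastRequestAt !== null && now - state.lastRequestAt >= quietMs) {
    return "settled";
  }
  // The hard timeout applies whether or not anything was captured.
  if (now - state.startedAt >= quietTimeoutMs) {
    return "timed-out";
  }
  return "waiting";
}
```

With the defaults, a page whose last request arrived 1000 ms after settling began would be considered settled at 1800 ms, while a page that never fires a request would wait the full 15000 ms.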

Page interaction flow

steps (default: []): page interaction flow (see below).
continueOnError (default: true): if a step fails, log a warning and continue.

Filtering (optional)

allowDomains: only capture requests to these domains (if set).
denyDomains: ignore requests to these domains.
denyUrlRegex: regex patterns to ignore requests.
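
One plausible way these three filters combine is sketched below. The matching rules are assumptions: a domain entry is taken to match the hostname itself and any subdomain, and deny rules are applied after the allowlist. `CaptureFilter`, `hostMatches` and `shouldCapture` are hypothetical names.

```typescript
interface CaptureFilter {
  allowDomains?: string[];
  denyDomains?: string[];
  denyUrlRegex?: string[];
}

// Assumed rule: "example.com" matches "example.com" and "api.example.com".
function hostMatches(hostname: string, domain: string): boolean {
  const host = hostname.toLowerCase();
  const wanted = domain.toLowerCase();
  return host === wanted || host.endsWith(`.${wanted}`);
}

function shouldCapture(rawUrl: string, filter: CaptureFilter): boolean {
  const { hostname } = new URL(rawUrl);
  const allow = filter.allowDomains ?? [];
  // An empty allowlist means "no restriction".
  if (allow.length > 0 && !allow.some((domain) => hostMatches(hostname, domain))) return false;
  if ((filter.denyDomains ?? []).some((domain) => hostMatches(hostname, domain))) return false;
  if ((filter.denyUrlRegex ?? []).some((source) => new RegExp(source).test(rawUrl))) return false;
  return true;
}
```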

Safety / privacy

redactHeaders: request/response headers to redact (defaults include auth/cookies).
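
Header redaction could look roughly like the following. The README says the defaults include auth and cookie headers but does not list them, so the default list and the `[REDACTED]` placeholder below are assumptions, as is the function name `redactHeaderValues`.

```typescript
// Assumed defaults; the Actor's real default list is not shown in this README.
const ASSUMED_DEFAULT_REDACT_HEADERS: string[] = ["authorization", "cookie", "set-cookie"];

// Return a copy of the headers with sensitive values replaced by a placeholder.
function redactHeaderValues(
  headers: Record<string, string>,
  redactHeaders: string[] = ASSUMED_DEFAULT_REDACT_HEADERS,
): Record<string, string> {
  // Header names are case-insensitive, so compare in lower case.
  const blocked = new Set(redactHeaders.map((headerName) => headerName.toLowerCase()));
  const result: Record<string, string> = {};
  for (const [headerName, headerValue] of Object.entries(headers)) {
    result[headerName] = blocked.has(headerName.toLowerCase()) ? "[REDACTED]" : headerValue;
  }
  return result;
}
```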

Flow steps (steps)

Supported step types:

wait
waitForSelector
click
type
scroll

Example:

{
  "steps": [
    { "type": "wait", "ms": 1200 },
    { "type": "scroll", "to": "bottom", "times": 1, "pauseMs": 700 },
    { "type": "click", "selector": "button:has-text(\"Load more\")" },
    { "type": "waitForSelector", "selector": "main", "timeoutMs": 5000 }
  ]
}
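
When building `steps` programmatically, it can help to type and validate them before starting a run. The sketch below covers only the four step types whose fields appear in the example above; the `type` step is left out because its fields are not shown here. Which fields are optional is an assumption, and `FlowStep` / `parseFlowStep` are hypothetical names.

```typescript
type FlowStep =
  | { type: "wait"; ms: number }
  | { type: "waitForSelector"; selector: string; timeoutMs?: number }
  | { type: "click"; selector: string }
  | { type: "scroll"; to: string; times?: number; pauseMs?: number };

function requireString(step: Record<string, unknown>, key: string): string {
  const value = step[key];
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`${String(step.type)}: "${key}" must be a non-empty string`);
  }
  return value;
}

function optionalNumber(step: Record<string, unknown>, key: string): number | undefined {
  const value = step[key];
  if (value === undefined) return undefined;
  if (typeof value !== "number" || value < 0) {
    throw new Error(`${String(step.type)}: "${key}" must be a non-negative number`);
  }
  return value;
}

// Validate one raw step object and return it as a typed FlowStep.
function parseFlowStep(raw: unknown): FlowStep {
  if (typeof raw !== "object" || raw === null) throw new Error("step must be an object");
  const step = raw as Record<string, unknown>;
  switch (step.type) {
    case "wait": {
      const ms = optionalNumber(step, "ms");
      if (ms === undefined) throw new Error('wait: "ms" is required');
      return { type: "wait", ms };
    }
    case "waitForSelector":
      return {
        type: "waitForSelector",
        selector: requireString(step, "selector"),
        timeoutMs: optionalNumber(step, "timeoutMs"),
      };
    case "click":
      return { type: "click", selector: requireString(step, "selector") };
    case "scroll":
      return {
        type: "scroll",
        to: requireString(step, "to"),
        times: optionalNumber(step, "times"),
        pauseMs: optionalNumber(step, "pauseMs"),
      };
    default:
      throw new Error(`step type not covered by this sketch: ${String(step.type)}`);
  }
}
```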

Example inputs

Apify website (crawl a few pages)

{
  "startUrls": [{ "url": "https://apify.com/" }],
  "enqueueLinks": true,
  "maxRequests": 12,
  "strategy": "same-hostname",
  "globs": ["https://apify.com/**"],
  "linkSelector": "a[href]",

  "includeResponseBodies": false,
  "maxExchangesPerPage": 300,

  "quietMs": 800,
  "quietTimeoutMs": 20000,

  "steps": [{ "type": "wait", "ms": 1200 }]
}

GraphQL demo (Catstronauts)

{
  "startUrls": [{ "url": "https://catstronauts.netlify.app/" }],
  "enqueueLinks": false,
  "maxRequests": 1,

  "includeResponseBodies": true,
  "maxBodyKb": 128,
  "maxExchangesPerPage": 300,

  "quietMs": 800,
  "quietTimeoutMs": 20000,

  "steps": [{ "type": "wait", "ms": 1800 }]
}

Scoring (how candidates are ranked)

Each captured exchange gets a numeric score. Exchanges are grouped into endpoint candidates, and the highest-scoring exchange becomes the representative example for that candidate.

Scoring rules (current)

Noise filter: if URL looks like analytics/telemetry → score = -1000.
+30 if response content-type includes json.
+10 if request hints include pagination keywords:
- cursor | offset | limit | page | perpage | nexttoken
- (checked across URL.search and request body text)
+10 if response size is known and content-length > 20k.
-50 if HTTP status is >= 400.
-30 if path looks like auth/session/token/csrf:
- /auth | /session | /csrf | /token
+15 if parsed JSON response looks list-like:
- an array of objects: [{...}, {...}]
- or an object with items: [...]

Notes

If includeResponseBodies is disabled, the “list-like response” boost cannot apply.
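
The rules above translate fairly directly into code. The following is a sketch rather than the Actor's actual implementation: the noise pattern, the reading of “20k” as 20,000 bytes, the requirement that a list-like array be non-empty, and the names `ScoringInput`, `isListLike` and `scoreExchange` are assumptions.

```typescript
// Hypothetical view of one exchange, limited to what the scoring rules need.
interface ScoringInput {
  url: string;
  status: number;
  contentType: string | null;
  contentLength: number | null; // null when the response size is unknown
  requestBodyText: string | null;
  responseJson?: unknown; // only present when includeResponseBodies is enabled
}

const NOISE_URL = /analytics|telemetry/i; // assumed noise pattern
const PAGINATION_HINT = /cursor|offset|limit|page|perpage|nexttoken/i;
const AUTH_PATH = /\/(auth|session|csrf|token)/i;

function isListLike(json: unknown): boolean {
  if (Array.isArray(json)) {
    return json.length > 0 && json.every((item) => typeof item === "object" && item !== null);
  }
  if (typeof json === "object" && json !== null) {
    return Array.isArray((json as Record<string, unknown>).items);
  }
  return false;
}

function scoreExchange(exchange: ScoringInput): number {
  if (NOISE_URL.test(exchange.url)) return -1000;
  const parsed = new URL(exchange.url);
  let score = 0;
  if ((exchange.contentType ?? "").toLowerCase().includes("json")) score += 30;
  if (PAGINATION_HINT.test(parsed.search) || PAGINATION_HINT.test(exchange.requestBodyText ?? "")) {
    score += 10;
  }
  if (exchange.contentLength !== null && exchange.contentLength > 20_000) score += 10;
  if (exchange.status >= 400) score -= 50;
  if (AUTH_PATH.test(parsed.pathname)) score -= 30;
  if (exchange.responseJson !== undefined && isListLike(exchange.responseJson)) score += 15;
  return score;
}
```

Under these assumptions, a 200 JSON response from `https://api.example.com/v1/items?page=1` whose body contains an `items` array scores 30 + 10 + 15 = 55.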
