Graph · Vibe Scraping
Pricing
from $1.00 / 1,000 results
Graph · Vibe Scraping
Map any website's structure. Our AI analyzes the site and builds a crawler that extracts page relationships, perfect for understanding architecture before running extraction jobs.
Graph · Vibe Scraping
Map any website's structure
Part of our Vibe Scraping series:
AI-built crawlers for site structure analysis.
⚠️ Beta Release
This actor is currently in beta. We're actively improving based on feedback. Report issues on the issues tab.
🔥 Why This Actor
Understand any website before you scrape it. Map pages, links, and structure.
- Site structure mapping: Discover how pages connect — home, categories, listings, items
- Any website: Works on most sites out of the box
- Visual output: Export graph data for visualization tools like Cosmograph
- Fast & cheap: Custom Rust engine — no AI during extraction, orders of magnitude faster than LLM-based scraping
✨ Use Cases
- Pre-scrape Discovery: Understand site structure before running extraction jobs
- Category Analysis: Identify high-value sections by analyzing inbound link counts
- URL Discovery: Find all item pages, then feed them into structured extraction actors
- Site Audits: Map internal linking for SEO or content analysis
💡 About Extralt
We're rethinking web scraping. Our crawlers are generated by AI but run as compiled code — giving you enterprise-scale performance without the brittleness of traditional scrapers or the cost of pure AI solutions.
🕸️ Graph
The crawler saves the graph as a JSON file, as well as an html file that allows visualizing and interacting with the graph directly in Apify (see Outputs section).
Here is an example of the graph generated by the crawler for nike.com (US site), where we see the homepage at the center, then a landing page, then a category page, then a product page:

This visualization of the graph was generated using Cosmograph using the JSON files.
🪙 Pricing
This actor uses a pay-per-event pricing model — you only pay for successfully extracted pages:
| Subscription | Discount | Price per dataset item |
|---|---|---|
| Starter | Bronze | $0.002 |
| Scale | Silver | $0.0015 |
| Business | Gold | $0.001 |
Example: Extracting 1,000 pages on a Business plan costs: 1,000 × $0.001 = $1.00
All-inclusive pricing: We only use premium residential proxies, with no hidden costs or add-ons.
Why paid plans only? Apify excludes free plan users from revenue calculations (see docs), so we restrict this actor to paying customers only.
Concurrent runs: You can run up to 3 Extralt actors simultaneously. If you need more concurrent runs, please wait for one to finish before starting a new one. This number will increase as we scale up our infrastructure.
⬇️ Input
| Parameter | Required | Description |
|---|---|---|
| Start URLs | Yes | One or more URLs to begin crawling |
| Country | Yes | Proxy location for regional content |
| Budget | No | Maximum number of pages to extract |
Start URLs
The crawler explores the site structure:
- Any page: Parses sitemap and follows internal links to map page relationships
Constraints:
- All URLs must be from the same host
- URLs should match your target country (e.g.,
example.frorexample.com/frfor France)
⬆️ Output
1. Dataset
Each extracted item contains metadata and extracted data:
| Field | Description |
|---|---|
extracted_at | Unix timestamp of extraction |
url | Page URL |
page_kind | Page type: home, list, item, or other |
title | Page title |
data | Always null (included for schema consistency) |
outbound_links | Links found on the page |
Use the Overview view in Apify to browse results as a formatted table, or download in JSON, CSV, HTML, or Excel.
Page Kind
The crawler classifies each page into one of four kinds:
| Page Kind | Description | Example |
|---|---|---|
home | Main entry point of the site | https://www.nike.com |
list | Category or collection pages listing multiple items | https://www.nike.com/w/womens-running-shoes-37v7jz5e1x6zy7ok |
item | Pages containing 1 main item | https://www.nike.com/t/vomero-plus-womens-road-running-shoes-8AH6updi/HV8154-501 |
other | Landing, navigation, help, or non-content pages | https://www.nike.com/women |
Data
The data field is always null for this actor. We include it for schema consistency across our Vibe Scraping series, making it easy to combine outputs from multiple actors in a single pipeline.
Outbound Links
The crawler extracts different links depending on the page type:
| Page Kind | Links Extracted |
|---|---|
home | Header navigation links only |
list | Item pages in the catalog (ignores navigation, header, footer) |
item | Related / recommended listings |
other | All content links (excludes header/footer navigation) |
Example Output
Results from crawling https://www.nike.com:
[{"extracted_at": 1769653129596,"url": "https://www.nike.com/w/nikeskims-shoes-b2asdzy7ok","page_kind": "list","title": "NikeSKIMS Shoes. Nike.com","data": null,"outbound_links": ["https://www.nike.com/t/nikeskims-rift-mesh-womens-shoes-nEhvuNqj/IO7694-001","https://www.nike.com/t/nikeskims-rift-satin-womens-shoes-uxrT1kE0/IQ7158-600"]},{"extracted_at": 1769653130192,"url": "https://www.nike.com/t/shox-tl-womens-shoes-TH65kqnj/AR3566-002","page_kind": "item","title": "Nike Shox TL Women's Shoes. Nike.com","data": null,"outbound_links": ["https://www.nike.com/t/shox-tl-fade-womens-shoes-TH65kqnj/IH1336-601","https://www.nike.com/t/shox-tl-womens-shoes-TH65kqnj/IO1912-060","https://www.nike.com/t/shox-tl-fade-womens-shoes-TH65kqnj/IH1336-600","https://www.nike.com/t/shox-tl-womens-shoes-with-reflective-accents-TH65kqnj/IB1087-002","https://www.nike.com/t/shox-tl-womens-sheos-TH65kqnj/IQ5091-663"]}]
2. Key-Value Store
After all pages have been extracted, the crawler saves the graph as a JSON file in the key-value store, under the key graph.json.
We also save the nodes and edges as separate JSON files, under the keys nodes.json and edges.json.
Finally, we save the graph as an HTML file, under the key graph.html.

Click the eye icon, the graph visualization will open in a new tab. On the graph, you can click on a node to see more information about it, and you can search for a page by title, rearrange the nodes or change the colors. Feel free to experiment!
You can see an example of the graph generated for www.nike.com (US site) here
JSON Files
The graph contains:
node: page information withid: page IDurl: URLtitle: titlepage_kind: page type: home, list, item, other.out: number of outbound linksin: number of inbound links
edge: link between two pagesstats: statistics about the graph
Here is an example of the graph with 10 pages:
{"nodes": [{"id": 0,"url": "https://www.nike.com","title": "Nike. Just Do It. Nike.com","page_kind": "home","out": 6,"in": 0},{"id": 1,"url": "https://www.nike.com/w/golf-accessories-equipment-23q9wzawwpw","title": "Golf Accessories and Equipment. Nike.com","page_kind": "list","out": 0,"in": 1},{"id": 2,"url": "https://www.nike.com/w/little-kids-jordan-37eefz6dacezv4dh","title": "Little Kids Jordan. Nike.com","page_kind": "list","out": 0,"in": 1},{"id": 3,"url": "https://www.nike.com/w/nikeskims-dark-roast-40t7v","title": "NikeSKIMS Dark Roast. Nike.com","page_kind": "list","out": 0,"in": 1},{"id": 4,"url": "https://www.nike.com/w/womens-joggers-sweatpants-5e1x6zaepf0","title": "Womens Joggers & Sweatpants. Nike.com","page_kind": "list","out": 0,"in": 1},{"id": 5,"url": "https://www.nike.com/w/womens-jordan-training-gym-37eefz58jtoz5e1x6","title": "Womens Jordan Training & Gym. Nike.com","page_kind": "list","out": 0,"in": 1},{"id": 6,"url": "https://www.nike.com/w/womens-sports-bras-40qgmz5e1x6","title": "Women's Sports Bras. Nike.com","page_kind": "list","out": 3,"in": 1},{"id": 7,"url": "https://www.nike.com/t/zenvy-strappy-womens-light-support-padded-sports-bra-sXpuylYy/IB9847-010","title": "Nike Zenvy Strappy Women's Light-Support Padded Sports Bra. Nike.com","page_kind": "item","out": 1,"in": 2},{"id": 8,"url": "https://www.nike.com/t/zenvy-strappy-womens-light-support-padded-sports-bra-sXpuylYy/IB9847-503","title": "Nike Zenvy Strappy Women's Light-Support Padded Sports Bra. Nike.com","page_kind": "item","out": 1,"in": 2},{"id": 9,"url": "https://www.nike.com/t/zenvy-womens-light-support-sports-bra-tank-TiCxTuL6/IB9872-503","title": "Nike Zenvy Women's Light-Support Sports Bra Tank. Nike.com","page_kind": "item","out": 0,"in": 1}],"edges": [{"source": 8,"target": 7},{"source": 7,"target": 8},{"source": 6,"target": 8},{"source": 6,"target": 9},{"source": 6,"target": 7},{"source": 0,"target": 5},{"source": 0,"target": 2},{"source": 0,"target": 6},{"source": 0,"target": 1},{"source": 0,"target": 3},{"source": 0,"target": 4}],"stats": {"node_count": 10,"edge_count": 11,"other_count": 0,"list_count": 6,"item_count": 3}}
⚙️ Under the Hood
How It Works
- First run — AI analyzes the site and generates a custom crawler (3-5 minutes)
- Subsequent runs — Crawler is reused, extraction starts immediately
The crawler is regenerated when you change the website or country.
Why It's Fast
Unlike LLM-based scrapers that call AI for every page, we use AI once to generate a compiled Rust extractor. This means:
- No per-page AI costs — extraction runs as pure code
- High throughput — up to 50 pages/second (3,000/minute)
- Consistent results — same extractor, deterministic output
Infrastructure
Extraction runs on our dedicated infrastructure, not Apify's platform. There may be a brief delay (~15-20s) while provisioning resources before the crawl starts.
Stealth
Our Rust engine includes custom HTTP and browser implementations built specifically for web scraping:
- Smart request routing (Chrome rendering, fast HTTP, direct API calls)
- Anti-detection measures to avoid blocks
- Premium residential proxies included
🛠️ Troubleshooting
Extraction taking longer than expected?
- First run: AI is generating your custom crawler (3-5 minutes). Subsequent runs start immediately.
- Provisioning: Brief delay (~15-20s) while infrastructure spins up.
Getting blocked or no results?
- Verify the start URL is accessible in your browser
- Ensure the selected country matches the website's region
- Try a smaller budget to reduce request volume
- Some sites have aggressive bot protection — report persistent issues
🎙️ Feedback & Support
We're actively improving extraction quality based on your feedback.
- Bugs, questions, feature requests: Issues tab
