YouTube Related Videos Graph Crawler
Pricing
from $2.50 / 1,000 related video rows
Crawl YouTube's related-videos graph multi-hop deep from any seed video. BFS traversal with configurable depth + branch + node cap. Each row carries depth, parent video, and full discovery path. Built for content strategy, brand safety, and recommendation-algorithm research.
Developer
SIÁN OÜ
Maintained by Community
YouTube Related Videos Graph Crawler — Multi-Hop BFS 🔁🚀
🎉 The ONLY actor that crawls YouTube's related-videos algorithm multi-hop deep — up to 4 hops
Perfect for content-strategy teams, brand-safety auditors, YouTube SEO researchers, ML training-set builders & recommendation-algorithm studies.
📋 Overview
Map the topical neighborhood YouTube's algorithm associates with any video. Every other related-video scraper hands you a flat list from one API call — this one walks the graph BFS, up to 4 hops out, with global dedup and configurable branch factor. Each row carries depth, discoveredVia (parent videoId), and discoveryPath (full hop chain back to seed) — ready for NetworkX, Gephi, BigQuery, or any graph visualization tool.
Why content strategists, ML teams & researchers choose us:
- ✅ Only multi-hop graph crawler on Apify: confirmed via marketplace scan; no other actor walks YouTube's related-videos graph beyond a single call.
- ⚡ Bulk seed support: up to 50 seed videoIds per run with global dedup, so overlapping neighborhoods produce a denser cluster, not duplicates.
- 🎯 Per-row provenance: every discovered videoId carries its `depth`, `discoveredVia` (parent), and full `discoveryPath` array, so you can build edges in two lines of NetworkX.
- 💰 BFS efficiency wedge: $0.005 per unique videoId at BRONZE; at `maxDepth=2`, `branch=10` you map ~100 videos for under $0.55.
- 💎 Safety-first design: a hard `maxNodes` cap (1–5000) prevents quota burn; geometric explosion is stopped mid-crawl when the threshold is hit.
- ✨ NEW: `discoveryPath` array on every row. Pivot it in Excel or feed it into `G.add_edges_from(zip(path, path[1:]))` on a NetworkX graph `G` to materialise the full graph instantly.
✨ Features
- 🌳 Multi-Hop BFS Traversal: up to 4 hops deep from any seed videoId, covering the entire topical neighborhood in one run.
- 📚 Bulk Seed Mode: up to 50 seeds per run, with global dedup across all branches.
- 🛤 Per-Row Provenance: `depth`, `discoveredVia` (parent), `discoveryPath` (full hop chain), graph-ready out of the box.
- 🎛 Configurable Branch Factor: `maxBranchPerNode` from 1 (chain traversal) to 20 (wide breadth).
- 🛡️ Safety Cap: `maxNodes` (1–5000) stops runaway crawls before they burn quota.
- 🎥 Shorts & Playlist Passthrough: optional toggles to emit `shorts_listing` and `playlist` items as leaf rows.
- 📊 Parsed View Counts: `viewCount` integer + `viewCountText` raw, so you no longer parse "9.6M views" yourself.
- 📦 Schema-Stable Output: every row matches a published dataset schema, predictable for downstream ETL.
- 🛡️ No Account, No API Key, No Proxies: just an Apify token and you're crawling.
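The traversal described in the features above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actor's actual implementation: `fetch_related` is a hypothetical callback standing in for the upstream related-videos call, the toy graph replaces real YouTube responses, and the sketch assumes the seed counts toward `max_nodes` and that the branch limit is applied before dedup.

```python
from collections import deque

def crawl(seed, fetch_related, max_depth=1, max_branch=10, max_nodes=500):
    """Breadth-first crawl returning one row per unique video ID."""
    rows = [{"videoId": seed, "depth": 0, "discoveredVia": None,
             "discoveryPath": [seed]}]
    seen = {seed}            # global dedup across all branches
    queue = deque(rows)      # FIFO queue gives breadth-first order
    while queue:
        parent = queue.popleft()
        if parent["depth"] >= max_depth:
            continue         # do not expand nodes at the depth limit
        # Assumption: take the first max_branch related items, then dedup.
        for child in fetch_related(parent["videoId"])[:max_branch]:
            if child in seen:
                continue     # already emitted via another path
            if len(rows) >= max_nodes:
                return rows  # hard cap stops the crawl mid-level
            seen.add(child)
            row = {"videoId": child, "depth": parent["depth"] + 1,
                   "discoveredVia": parent["videoId"],
                   "discoveryPath": parent["discoveryPath"] + [child]}
            rows.append(row)
            queue.append(row)
    return rows

# Hypothetical toy graph standing in for related-videos responses.
TOY = {"A": ["B", "C", "D"], "B": ["C", "E"], "C": ["A", "F"],
       "D": [], "E": [], "F": []}
rows = crawl("A", lambda vid: TOY.get(vid, []), max_depth=2, max_branch=2)
for r in rows:
    print(r["depth"], r["videoId"], r["discoveryPath"])
```

In the toy run, `C` is reachable from both `A` and `B` but is emitted once, which is the same dedup effect that makes real runs return fewer rows than the geometric estimate.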
🎬 Quick Start
Pass a seed videoId, pick a depth, and run. Each unique discovered videoId becomes one dataset row, stamped with depth and discovery path.

    curl -X POST "https://api.apify.com/v2/acts/sian.agency~youtube-related-videos-crawler/runs?token=YOUR_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"videoId":"dQw4w9WgXcQ","maxDepth":2,"maxBranchPerNode":10,"maxNodes":120}'

🚀 Getting Started (3 Simple Steps)
Step 1: Pick your seed(s)
- Single-seed mode: paste a video ID or any YouTube URL into `videoId`.
- Bulk-seed mode: paste up to 50 video IDs (one per line) into `seedVideoIds`.
Step 2: Tune the crawl
- `maxDepth` (0–4): how many hops out. 0 = seed only, 1 = direct neighbors; at the default branch of 10, 2 ≈ 100 videos and 3 ≈ 500 (where the default `maxNodes` cap of 500 ends the crawl).
- `maxBranchPerNode` (1–20): width per node. 10 is the balanced default.
- `maxNodes` (1–5000): hard cap on total unique videos. 500 is the safe default.
Step 3: Run it
Hit run. Watch the dataset fill with one row per unique discovered videoId. Filter by depth if you only want close neighbors.
That's it! In seconds, you'll have:
- A graph-ready dataset with `depth`, `discoveredVia`, and `discoveryPath` per row
- Parsed view counts, channel info, thumbnails, and publish dates
- HTML report with depth breakdown, success rate, and upstream call count
📥 Input Configuration
| Field | Type | Required | Description |
|---|---|---|---|
| `videoId` | string | One of | Single seed video ID or YouTube URL |
| `seedVideoIds` | string | One of | Bulk seeds, one per line or comma-separated. Max 50 |
| `maxDepth` | integer | No | Hops deep (0–4). Default 1. 0 = seed only |
| `maxBranchPerNode` | integer | No | Related videos to follow per node (1–20). Default 10 |
| `maxNodes` | integer | No | Hard cap on total unique videos (1–5000). Default 500 |
| `maxPagesPerNode` | integer | No | Continuation pages per node (1–5). Default 1 |
| `includeShorts` | boolean | No | Emit `shorts_listing` items as leaf rows. Default true |
| `includePlaylists` | boolean | No | Emit `playlist` items as leaf rows. Default false |
| `geo` | string | No | ISO 3166-1 alpha-2 country code. Defaults to US |
| `lang` | string | No | ISO 639-1 language code. Defaults to en |
Sizing intuition: rows ≈ 1 + branch + branch² + ... + branch^maxDepth, capped by maxNodes.
| Preset | maxDepth | maxBranchPerNode | maxNodes | Expected rows |
|---|---|---|---|---|
| Quick neighborhood | 1 | 10 | 500 | ~10 |
| Topical cluster | 2 | 10 | 200 | ~100 |
| Deep graph | 3 | 10 | 500 | ~500 |
| Algorithm trace | 4 | 3 | 200 | ~120 |
| Wide bulk study (5 seeds) | 2 | 5 | 500 | ~155 |
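The sizing rule and presets above can be checked with a short calculation. The following is a rough sketch that treats the geometric series as an upper bound per seed and assumes `maxNodes` caps the combined total; `estimate_rows` is a hypothetical helper, not part of the actor.

```python
def estimate_rows(max_depth, branch, max_nodes, seeds=1):
    """Upper-bound row estimate: geometric series per seed, capped by max_nodes."""
    # 1 + branch + branch^2 + ... + branch^max_depth (the leading 1 is the seed row)
    per_seed = sum(branch ** d for d in range(max_depth + 1))
    return min(seeds * per_seed, max_nodes)

presets = [
    ("Quick neighborhood", 1, 10, 500, 1),
    ("Topical cluster", 2, 10, 200, 1),
    ("Deep graph", 3, 10, 500, 1),
    ("Algorithm trace", 4, 3, 200, 1),
    ("Wide bulk study", 2, 5, 500, 5),
]
for name, depth, branch, cap, seeds in presets:
    print(f"{name}: up to {estimate_rows(depth, branch, cap, seeds)} rows")
```

The bounds come out at 11, 111, 500, 121, and 155 rows. They sit slightly above most of the table's figures because the series also counts the seed row, and real runs usually land lower once dedup removes videos reached by more than one path.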
Example — topical-cluster study (~100 rows):

    {
      "videoId": "dQw4w9WgXcQ",
      "maxDepth": 2,
      "maxBranchPerNode": 10,
      "maxNodes": 120
    }

Example — bulk-seed graph fusion:

    {
      "seedVideoIds": "dQw4w9WgXcQ\n9bZkp7q19f0\nkJQP7kiw5Fk",
      "maxDepth": 2,
      "maxBranchPerNode": 5,
      "maxNodes": 400
    }

📤 Output
Results are saved to the Apify dataset with 20+ fields including:
| Field | Type | Description |
|---|---|---|
| `depth` | integer | Hops from seed. 0 = seed itself |
| `discoveredVia` | string | Parent videoId from which this video was reached |
| `discoveryPath` | array | Ordered videoIds from seed → this row |
| `itemType` | string | video, shorts, playlist, or seed |
| `videoId` | string | 11-character YouTube video ID |
| `videoPageUrl` | string | Canonical watch URL |
| `videoTitle` | string | Title of the video / Short / playlist |
| `channelId` | string | Canonical channel ID (UC…) |
| `channelTitle` | string | Channel display name |
| `channelHandle` | string | Channel handle (e.g. @RickAstleyYT) when available |
| `channelPageUrl` | string | Canonical channel page URL |
| `viewCount` | integer | Parsed view count (e.g. 9600000 from "9.6M views") |
| `viewCountText` | string | Raw view-count text from upstream |
| `lengthText` | string | Duration string (e.g. 3:32) |
| `publishedAt` | string | ISO 8601 published timestamp |
| `thumbnailUrl` | string | Highest-resolution thumbnail URL |
| `status` | string | success, error, or no_related_videos |
Example row (depth=1 discovered video):

    {
      "_fetchedAt": "2026-05-22T08:42:11.013Z",
      "_seedVideoId": "dQw4w9WgXcQ",
      "status": "success",
      "itemType": "video",
      "depth": 1,
      "discoveredVia": "dQw4w9WgXcQ",
      "discoveryPath": ["dQw4w9WgXcQ", "yPYZpwSpKmA"],
      "videoId": "yPYZpwSpKmA",
      "videoPageUrl": "https://www.youtube.com/watch?v=yPYZpwSpKmA",
      "videoTitle": "Rick Astley - Together Forever (Official Video) [4K Remaster]",
      "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
      "channelTitle": "Rick Astley",
      "channelHandle": "@RickAstleyYT",
      "viewCount": 28000000,
      "viewCountText": "28M views",
      "lengthText": "3:32",
      "publishedAt": "2025-05-21T00:00:00Z",
      "thumbnailUrl": "https://i.ytimg.com/vi/yPYZpwSpKmA/hqdefault.jpg"
    }

The seed itself is emitted as depth=0 with itemType: "seed" (not charged). Errors come back as rows with status: "error". Empty seeds get status: "no_related_videos" (no charge).
💼 Use Cases & Examples
1. Content Strategy — Topical Neighborhood Mapping
Content-strategy teams use this to map adjacent content ideas before they trend.
Input: A viral hit videoId at maxDepth=3, maxBranchPerNode=10.
Output: ~500 related videos with channel, view counts, and discovery path.
Use: Identify the topical cluster YouTube's algorithm associates with your seed — plan content series around it.
2. Brand Safety — Adjacency Audit
Brand-safety teams use this to flag off-brand or unsafe content sitting next to their ads.
Input: Each of your brand's YouTube ad videoIds, crawled at maxDepth=2.
Output: Up to ~110 neighboring videos per ad with channel + title metadata.
Use: Audit what content sits in the recommendation neighborhood — flag risky adjacencies in minutes.
3. YouTube SEO — Reverse-Engineer Algorithm Signals
YouTube SEO consultants use this to align their titles, tags, and descriptions with what the algorithm associates.
Input: Your channel's top-performing videoIds as seeds.
Output: What the algorithm thinks each video relates to, across hops.
Use: Optimize metadata to land in the discovered topical cluster and boost recommendation surfacing.
4. AI / ML — Topically-Clustered Training Sets
ML researchers use this to produce graph datasets with depth + parent metadata for embedding training.
Input: 50 seed videoIds at maxDepth=2, maxNodes=5000.
Output: A graph dataset of up to 5,000 topically-related videos with edges.
Use: Train recommendation models on real YouTube co-occurrence signals — ready for PyTorch Geometric / DGL.
5. Researchers — Recommendation-Algorithm Studies
Algorithm researchers use this to study echo chambers and graph fanout patterns.
Input: Polarising / political seeds vs entertainment seeds at maxDepth=3.
Output: Side-by-side neighborhood graphs with full discoveryPath arrays.
Use: Measure echo-chamber width, compare fanout, export to NetworkX for full network analysis.
6. Competitor Intelligence — Channel Adjacency Mapping
Competitor-intel teams use this to map which channels live in your category's algorithmic neighborhood.
Input: Top-performing video from each major competitor as seeds.
Output: A merged graph showing channel overlap across competitor neighborhoods.
Use: Spot which creators dominate the topical cluster and target them for partnerships or competitive analysis.
7. Topic-Cluster SEO — Pillar/Cluster Content Planning
Content marketers use this to map pillar/cluster topic architectures from a single hero video.
Input: Your pillar video at maxDepth=2, maxBranchPerNode=15.
Output: Hundreds of cluster-topic candidates with engagement metrics.
Use: Build a content calendar that maps onto the algorithm's existing topical graph — faster ranking.
🔗 Integration Examples
JavaScript / Node.js

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_TOKEN' });

    const run = await client.actor('sian.agency/youtube-related-videos-crawler').call({
      videoId: 'dQw4w9WgXcQ',
      maxDepth: 2,
      maxBranchPerNode: 10,
      maxNodes: 120,
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(`Crawled ${items.length} videos across ${Math.max(...items.map(i => i.depth))} hops`);

Python

    from apify_client import ApifyClient

    client = ApifyClient('YOUR_TOKEN')
    run = client.actor('sian.agency/youtube-related-videos-crawler').call(run_input={
        'videoId': 'dQw4w9WgXcQ',
        'maxDepth': 2,
        'maxBranchPerNode': 10,
        'maxNodes': 120,
    })

    # Build a NetworkX graph from discoveryPath
    import networkx as nx

    G = nx.DiGraph()
    for item in client.dataset(run['defaultDatasetId']).iterate_items():
        path = item.get('discoveryPath') or []
        G.add_edges_from(zip(path, path[1:]))
    print(f'Graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges')

cURL

    curl -X POST "https://api.apify.com/v2/acts/sian.agency~youtube-related-videos-crawler/runs?token=YOUR_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"videoId":"dQw4w9WgXcQ","maxDepth":2,"maxBranchPerNode":10,"maxNodes":120}'

Automation Workflows (N8N / Zapier / Make)
- Trigger: Schedule (e.g. weekly graph refresh) or manual webhook
- HTTP Request: Call this actor's run API with seed videoId(s)
- Process: Iterate dataset rows in your workflow, group by `depth` or `discoveredVia`
- Action: Push to BigQuery / Gephi / Notion / your graph database
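The "group by `depth` or `discoveredVia`" step in the workflow above might look like the following when done in a script rather than a no-code tool. It is a minimal sketch using only field names from the output schema; the `summarize` helper and the sample rows are made up for illustration.

```python
from collections import Counter, defaultdict

def summarize(rows):
    """Count rows per depth and list each parent's discovered children."""
    per_depth = Counter(r["depth"] for r in rows if "depth" in r)
    children = defaultdict(list)
    for r in rows:
        parent = r.get("discoveredVia")
        if parent:  # seed rows have no parent
            children[parent].append(r["videoId"])
    return dict(per_depth), dict(children)

# Illustrative rows shaped like the actor's dataset items.
sample = [
    {"videoId": "seed0000000", "depth": 0, "discoveredVia": None},
    {"videoId": "child000001", "depth": 1, "discoveredVia": "seed0000000"},
    {"videoId": "child000002", "depth": 1, "discoveredVia": "seed0000000"},
    {"videoId": "grand000001", "depth": 2, "discoveredVia": "child000001"},
]
per_depth, children = summarize(sample)
print(per_depth)
print(children)
```

Filtering to close neighbors only is then a one-liner such as `[r for r in rows if r.get("depth") == 1]`.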
📊 Performance & Pricing
FREE Tier (Try It Now)
- Generous free credit covers a small initial neighborhood crawl, with full feature access and the same data quality.
- No credit card required.
- Perfect for kicking the tires on a single-seed `maxDepth=1` crawl.
PAID Tier (Production-Ready)
- Unlimited unique videoIds discovered per run.
- Pay-per-event: only billed for discovered videos at depth ≥ 1; seed echoes, errors, and `no_related_videos` rows are free.
- Pricing ladder scales from BRONZE to DIAMOND as your volume grows.
Pricing Highlights — related-video-row (headline):
- BRONZE: `$0.005` per unique discovered videoId
- GOLD / PLATINUM / DIAMOND: `$0.0025` per row

Run start (`apify-actor-start`):
- BRONZE+ (paying tiers): `$0.002` per run
💰 Use `maxNodes` as a cost cap: at BRONZE ($0.005/row), `maxNodes=200` bounds a run at about $1, plus the $0.002 run-start fee.
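The pricing rules above translate into a simple estimate. This sketch uses only the per-row and run-start prices listed in this section (tiers not listed here are left out), and `is_billable` encodes the stated rule that seed, error, and `no_related_videos` rows are free. How Shorts and playlist leaf rows are billed is not stated, so the sketch assumes they are charged like any other discovered row.

```python
PRICE_PER_ROW = {"BRONZE": 0.005, "GOLD": 0.0025,
                 "PLATINUM": 0.0025, "DIAMOND": 0.0025}
RUN_START_FEE = 0.002  # per run on paying tiers

def is_billable(row):
    """Only successful discovered rows at depth >= 1 are charged."""
    return (row.get("status") == "success"
            and row.get("itemType") != "seed"
            and row.get("depth", 0) >= 1)

def estimate_cost(billable_rows, tier="BRONZE"):
    """Estimated USD cost of one run with the given number of billable rows."""
    return billable_rows * PRICE_PER_ROW[tier] + RUN_START_FEE

# Worst case for maxNodes=200 at BRONZE: every row is billable.
print(f"${estimate_cost(200, 'BRONZE'):.3f}")
```

Applied to a finished run, `sum(is_billable(r) for r in rows)` gives the count to pass in, which is usually lower than `maxNodes` because the seed row and any error rows are free.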
❓ Frequently Asked Questions
Q: What's the difference between this and a regular "related videos" scraper?
A: Other actors give you one flat list from a single API call. This one walks the graph — each related video becomes a new starting point for its own related-videos call, up to maxDepth hops. You get the entire neighborhood, not just the front door.
Q: Why BFS and not DFS?
A: Breadth-first matches "show me the topical neighborhood" intent. The algorithm signals weaken with depth, so BFS surfaces the strongest associations first. For chain-style traversal, set maxBranchPerNode=1 — BFS with branch=1 IS a depth-first chain.
Q: Why does my run return fewer rows than 1 + branch + branch²?
A: Three reasons: (1) duplicates — videos discovered via multiple paths emit once; (2) the maxNodes cap kicks in mid-crawl; (3) some videos return fewer than maxBranchPerNode related items.
Q: Do you crawl Shorts and playlists?
A: No. Shorts are emitted as leaf rows when includeShorts=true but never traversed. Playlists are emitted as leaf rows when includePlaylists=true — their cover videoId is captured but not crawled further (playlists are thumbnail anchors, not real graph nodes).
Q: What happens when a seed is deleted / private / region-blocked?
A: The seed emits two rows: one itemType: "seed" (free) and one status: "error" with a friendly error message (free). Other seeds in a bulk run continue normally.
Q: Can I export to Gephi / NetworkX?
A: Yes. Every row has `discoveryPath: [seed, hop1, ..., this]`. Build edges from consecutive pairs in each path by calling `G.add_edges_from(zip(path, path[1:]))` on a NetworkX graph `G`. The `videoId` is the node ID; `videoTitle` / `channelTitle` are node attributes.
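For Gephi, one dependency-free route is to turn the paths into an edge-list CSV. The sketch below assumes Gephi's spreadsheet import accepts an edge table with `Source` and `Target` columns, which is worth confirming against your Gephi version; `edges_from_rows` and `edges_to_csv` are hypothetical helper names and the sample rows are illustrative.

```python
import csv
import io

def edges_from_rows(rows):
    """Collect unique (parent, child) edges from every discoveryPath."""
    edges = set()
    for row in rows:
        path = row.get("discoveryPath") or []
        edges.update(zip(path, path[1:]))  # consecutive pairs along the path
    return sorted(edges)

def edges_to_csv(edges):
    """Render edges as CSV text with Source/Target headers."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buf.getvalue()

sample = [
    {"videoId": "A", "discoveryPath": ["A"]},
    {"videoId": "B", "discoveryPath": ["A", "B"]},
    {"videoId": "C", "discoveryPath": ["A", "B", "C"]},
]
print(edges_to_csv(edges_from_rows(sample)))
```

Collecting edges in a set matters because every row repeats the edges of its ancestors, so the same pair appears in many paths.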
Q: What output formats are available?
A: JSON, CSV, Excel, RSS, HTML. Export directly from the Apify dataset.
Q: Is this legal?
A: Yes. We only extract publicly available data that YouTube already exposes to every viewer. See our legal section below.
YouTube is a trademark of Google LLC. This actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Google LLC.
🐛 Troubleshooting
"Invalid seed video ID" error
- YouTube video IDs are exactly 11 characters of alphanumerics, dashes, and underscores. Paste the ID directly or any full YouTube URL — the actor will extract the ID for you.
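To pre-validate seeds before starting a run, a check along these lines may help. It is a sketch of the 11-character rule described above, not the actor's own extraction logic: the set of URL shapes handled (`watch?v=`, `youtu.be/`, `/shorts/`, `/embed/`, `/live/`) is an assumption, and URLs must include a scheme such as `https://`.

```python
import re
from urllib.parse import urlparse, parse_qs

VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")

def extract_video_id(text):
    """Return the 11-character video ID from a bare ID or URL, else None."""
    text = text.strip()
    if VIDEO_ID.fullmatch(text):
        return text  # already a bare ID
    url = urlparse(text)
    host = (url.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        candidate = url.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if url.path == "/watch":
            candidate = parse_qs(url.query).get("v", [None])[0]
        else:
            parts = [p for p in url.path.split("/") if p]
            if len(parts) == 2 and parts[0] in {"shorts", "embed", "live"}:
                candidate = parts[1]
    if candidate and VIDEO_ID.fullmatch(candidate):
        return candidate
    return None

for seed in ["dQw4w9WgXcQ", "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
             "https://youtu.be/dQw4w9WgXcQ?t=5", "too-short"]:
    print(seed, "->", extract_video_id(seed))
```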
Run returns only the seed row
- Check `maxDepth`: if it's 0, only the seed is emitted by design. Bump to 1 to get direct neighbors.
Fewer rows than expected
- Global dedup means videos discovered via multiple paths emit once. Also check whether `maxNodes` is capping the crawl earlier than `maxDepth` would.
Crawl runs longer than expected
- Bigger `maxDepth` × `maxBranchPerNode` = geometric growth. Use `maxNodes` to cap total cost and runtime. At `maxDepth=2, branch=10, maxNodes=200` you're typically under a minute.
Empty discoveryPath arrays
- Only the seed row (`itemType: "seed"`) has a single-element path. All discovered rows have at least `[seed, child]`.
⚖️ Is it legal to scrape data?
Our actors are ethical and do not extract any private user data, such as email addresses, gender, or location. They only extract what the user has chosen to share publicly. We therefore believe that our actors, when used for ethical purposes by Apify users, are safe.
However, you should be aware that your results could contain personal data. Personal data is protected by the GDPR in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers.
You can also read Apify's blog post on the legality of web scraping.
🤝 Support
Join our active support community
- For issues or questions, open an issue in the actor's repository
- Check SIÁN Agency Store for more automation tools
- ✉️ apify@sian-agency.online
Built by SIÁN Agency | More Tools