YouTube Related Videos Graph Crawler

Pricing

from $2.50 / 1,000 related video rows

Crawl YouTube's related-videos graph multi-hop deep from any seed video. BFS traversal with configurable depth + branch + node cap. Each row carries depth, parent video, and full discovery path. Built for content strategy, brand safety, and recommendation-algorithm research.

Developer

SIÁN OÜ

Maintained by Community

YouTube Related Videos Graph Crawler — Multi-Hop BFS 🔁🚀

Perfect for content-strategy teams, brand-safety auditors, YouTube SEO researchers, ML training-set builders & recommendation-algorithm studies.


📋 Overview

Map the topical neighborhood YouTube's algorithm associates with any video. Every other related-video scraper hands you a flat list from one API call — this one walks the graph BFS, up to 4 hops out, with global dedup and configurable branch factor. Each row carries depth, discoveredVia (parent videoId), and discoveryPath (full hop chain back to seed) — ready for NetworkX, Gephi, BigQuery, or any graph visualization tool.

Why content strategists, ML teams & researchers choose us:

  • Only multi-hop graph crawler on Apify: confirmed via marketplace scan — no other actor walks YouTube's related-videos graph beyond a single call.
  • Bulk seed support: up to 50 seed videoIds per run with global dedup — overlapping neighborhoods produce a denser cluster, not duplicates.
  • 🎯 Per-row provenance: every discovered videoId carries its depth, discoveredVia (parent), and full discoveryPath array — build edges in two lines of NetworkX.
  • 💰 BFS efficiency wedge: $0.005 per unique videoId at BRONZE. At maxDepth=2, branch=10 you map up to 110 videos for about $0.55.
  • 💎 Safety-first design: a hard maxNodes cap (1–5000) prevents quota burn by stopping geometric growth mid-crawl once the cap is reached.
  • NEW: discoveryPath array on every row. Pivot it in Excel, or feed it into G.add_edges_from(zip(path, path[1:])) on a NetworkX graph G to materialise the full graph.

✨ Features

  • 🌳 Multi-Hop BFS Traversal: up to 4 hops deep from any seed videoId — the entire topical neighborhood in one run.
  • 📚 Bulk Seed Mode: up to 50 seeds per run, with global dedup across all branches.
  • 🛤 Per-Row Provenance: depth, discoveredVia (parent), discoveryPath (full hop chain) — graph-ready out of the box.
  • 🎛 Configurable Branch Factor: maxBranchPerNode from 1 (chain traversal) to 20 (wide breadth).
  • 🛡️ Safety Cap: maxNodes (1–5000) stops runaway crawls before they burn quota.
  • 🎥 Shorts & Playlist Passthrough: optional toggles to emit shorts_listing and playlist items as leaf rows.
  • 📊 Parsed View Counts: viewCount integer + viewCountText raw — no more parsing "9.6M views" yourself.
  • 📦 Schema-Stable Output: every row matches a published dataset schema — predictable for downstream ETL.
  • 🛡️ No Account, No API Key, No Proxies: just an Apify token and you're crawling.
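
To illustrate what the parsed view counts feature saves you from writing, here is a minimal sketch of turning a raw string such as "9.6M views" into an integer. It is not the actor's own parser; the function name parse_view_count and the set of suffixes (K, M, B) are assumptions made for this example.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

# Multipliers for abbreviated suffixes (assumed set, for illustration only).
_SUFFIX = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_PATTERN = re.compile(r"^\s*(\d[\d.,]*)\s*([KMB])?\b", re.IGNORECASE)


def parse_view_count(text: str) -> Optional[int]:
    """Convert text like '9.6M views' to an int, or None if no number is found."""
    match = _PATTERN.match(text)
    if not match:
        return None
    try:
        # Decimal avoids float rounding surprises on values like 9.6.
        number = Decimal(match.group(1).replace(",", ""))
    except InvalidOperation:
        return None
    suffix = (match.group(2) or "").upper()
    return int(number * _SUFFIX.get(suffix, 1))
```

With the actor's output you can read viewCount directly and keep viewCountText only for auditing.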

🎬 Quick Start

Pass a seed videoId, pick a depth, and run. Each unique discovered videoId becomes one dataset row, stamped with depth and discovery path.

curl -X POST "https://api.apify.com/v2/acts/sian.agency~youtube-related-videos-crawler/runs?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"videoId":"dQw4w9WgXcQ","maxDepth":2,"maxBranchPerNode":10,"maxNodes":120}'

🚀 Getting Started (3 Simple Steps)

Step 1: Pick your seed(s)

  • Single-seed mode: paste a video ID or any YouTube URL into videoId.
  • Bulk-seed mode: paste up to 50 video IDs (one per line) into seedVideoIds.

Step 2: Tune the crawl

  • maxDepth (0–4): how many hops out. 0 = seed only; 1 = direct neighbors; at the default branch of 10, depth 2 yields about 100 videos and depth 3 reaches the default maxNodes cap of 500.
  • maxBranchPerNode (1–20) — width per node. 10 is the balanced default.
  • maxNodes (1–5000) — hard cap on total unique videos. 500 is the safe default.

Step 3: Run it

Hit run. Watch the dataset fill with one row per unique discovered videoId. Filter by depth if you only want close neighbors.

That's it! In seconds, you'll have:

  • A graph-ready dataset with depth, discoveredVia, and discoveryPath per row
  • Parsed view counts, channel info, thumbnails, and publish dates
  • HTML report with depth breakdown, success rate, and upstream call count

📥 Input Configuration

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| videoId | string | One of | Single seed video ID or YouTube URL |
| seedVideoIds | string | One of | Bulk seeds, one per line or comma-separated. Max 50 |
| maxDepth | integer | No | Hops deep (0–4). Default 1. 0 = seed only |
| maxBranchPerNode | integer | No | Related videos to follow per node (1–20). Default 10 |
| maxNodes | integer | No | Hard cap on total unique videos (1–5000). Default 500 |
| maxPagesPerNode | integer | No | Continuation pages per node (1–5). Default 1 |
| includeShorts | boolean | No | Emit shorts_listing items as leaf rows. Default true |
| includePlaylists | boolean | No | Emit playlist items as leaf rows. Default false |
| geo | string | No | ISO 3166-1 alpha-2 country code. Defaults to US |
| lang | string | No | ISO 639-1 language code. Defaults to en |

Sizing intuition: rows ≈ 1 + branch + branch² + ... + branch^maxDepth, capped by maxNodes.

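| Preset | maxDepth | maxBranchPerNode | maxNodes | Expected rows |
| --- | --- | --- | --- | --- |
| Quick neighborhood | 1 | 10 | 500 | ~10 |
| Topical cluster | 2 | 10 | 200 | ~100 |
| Deep graph | 3 | 10 | 500 | ~500 |
| Algorithm trace | 4 | 3 | 200 | ~120 |
| Wide bulk study (5 seeds) | 2 | 5 | 500 | ~155 |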
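
The sizing formula can be checked against the presets with a few lines of Python. This is a rough upper-bound sketch: the helper name estimate_rows is hypothetical, it counts the seed row, and it assumes maxNodes applies to the whole run rather than per seed. Real runs usually come in lower because of dedup and videos with fewer related items.

```python
def estimate_rows(max_depth: int, branch: int, max_nodes: int, seeds: int = 1) -> int:
    """Upper bound on rows: geometric series per seed, capped by max_nodes."""
    # 1 + branch + branch^2 + ... + branch^max_depth (the leading 1 is the seed row)
    per_seed = sum(branch ** level for level in range(max_depth + 1))
    return min(seeds * per_seed, max_nodes)


# Presets from the table above
print(estimate_rows(2, 10, 200))           # topical cluster
print(estimate_rows(3, 10, 500))           # deep graph, limited by the cap
print(estimate_rows(2, 5, 500, seeds=5))   # wide bulk study
```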

Example — topical-cluster study (~100 rows):

{
  "videoId": "dQw4w9WgXcQ",
  "maxDepth": 2,
  "maxBranchPerNode": 10,
  "maxNodes": 120
}

Example — bulk-seed graph fusion:

{
  "seedVideoIds": "dQw4w9WgXcQ\n9bZkp7q19f0\nkJQP7kiw5Fk",
  "maxDepth": 2,
  "maxBranchPerNode": 5,
  "maxNodes": 400
}
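
If your seeds live in a Python list, a small helper can assemble the bulk-seed input shown above. This is a sketch under assumptions: build_bulk_input is a hypothetical name, the 50-seed limit is taken from the input table, and the client-side dedup is a convenience rather than something the actor requires.

```python
from typing import Iterable

MAX_SEEDS = 50  # per-run limit stated in the input table


def build_bulk_input(video_ids: Iterable[str], max_depth: int = 2,
                     max_branch: int = 5, max_nodes: int = 400) -> dict:
    """Build an actor input dict with newline-separated, de-duplicated seeds."""
    # dict.fromkeys keeps first-seen order while dropping duplicates
    unique = list(dict.fromkeys(v.strip() for v in video_ids if v.strip()))
    if not unique:
        raise ValueError("at least one seed video ID is required")
    if len(unique) > MAX_SEEDS:
        raise ValueError(f"{len(unique)} seeds given, the limit is {MAX_SEEDS}")
    return {
        "seedVideoIds": "\n".join(unique),
        "maxDepth": max_depth,
        "maxBranchPerNode": max_branch,
        "maxNodes": max_nodes,
    }
```

The returned dict can be passed as run_input to the Apify client or serialized with json.dumps for a raw HTTP call.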

📤 Output

Results are saved to the Apify dataset with 20+ fields including:

| Field | Type | Description |
| --- | --- | --- |
| depth | integer | Hops from seed. 0 = seed itself |
| discoveredVia | string | Parent videoId from which this video was reached |
| discoveryPath | array | Ordered videoIds from seed → this row |
| itemType | string | video, shorts, playlist, or seed |
| videoId | string | 11-character YouTube video ID |
| videoPageUrl | string | Canonical watch URL |
| videoTitle | string | Title of the video / Short / playlist |
| channelId | string | Canonical channel ID (UC…) |
| channelTitle | string | Channel display name |
| channelHandle | string | Channel handle (e.g. @RickAstleyYT) when available |
| channelPageUrl | string | Canonical channel page URL |
| viewCount | integer | Parsed view count (e.g. 9600000 from "9.6M views") |
| viewCountText | string | Raw view-count text from upstream |
| lengthText | string | Duration string (e.g. 3:32) |
| publishedAt | string | ISO 8601 published timestamp |
| thumbnailUrl | string | Highest-resolution thumbnail URL |
| status | string | success, error, or no_related_videos |

Example row (depth=1 discovered video):

{
  "_fetchedAt": "2026-05-22T08:42:11.013Z",
  "_seedVideoId": "dQw4w9WgXcQ",
  "status": "success",
  "itemType": "video",
  "depth": 1,
  "discoveredVia": "dQw4w9WgXcQ",
  "discoveryPath": ["dQw4w9WgXcQ", "yPYZpwSpKmA"],
  "videoId": "yPYZpwSpKmA",
  "videoPageUrl": "https://www.youtube.com/watch?v=yPYZpwSpKmA",
  "videoTitle": "Rick Astley - Together Forever (Official Video) [4K Remaster]",
  "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
  "channelTitle": "Rick Astley",
  "channelHandle": "@RickAstleyYT",
  "viewCount": 28000000,
  "viewCountText": "28M views",
  "lengthText": "3:32",
  "publishedAt": "2025-05-21T00:00:00Z",
  "thumbnailUrl": "https://i.ytimg.com/vi/yPYZpwSpKmA/hqdefault.jpg"
}

The seed itself is emitted as depth=0 with itemType: "seed" (not charged). Errors come back as rows with status: "error". Empty seeds get status: "no_related_videos" (no charge).
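
The row types described above can be separated in post-processing. The sketch below assumes rows are plain dicts as in the example row, and that every successful non-seed row at depth 1 or deeper counts as charged; whether Shorts and playlist leaf rows are billed the same way is not stated here, so treat that as an assumption. The function names are hypothetical.

```python
def is_charged(row: dict) -> bool:
    """True for successful discovered rows (depth >= 1), per the rules above."""
    return (
        row.get("status") == "success"
        and row.get("itemType") != "seed"
        and row.get("depth", 0) >= 1
    )


def split_rows(rows: list) -> tuple:
    """Return (charged_rows, free_rows) without modifying the input list."""
    charged = [r for r in rows if is_charged(r)]
    free = [r for r in rows if not is_charged(r)]
    return charged, free
```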


💼 Use Cases & Examples

1. Content Strategy — Topical Neighborhood Mapping

Content-strategy teams use this to map adjacent content ideas before they trend.

  • Input: A viral hit videoId at maxDepth=3, maxBranchPerNode=10.
  • Output: Up to 500 related videos (the default maxNodes cap) with channel, view counts, and discovery path.
  • Use: Identify the topical cluster YouTube's algorithm associates with your seed and plan content series around it.

2. Brand Safety — Adjacency Audit

Brand-safety teams use this to flag off-brand or unsafe content sitting next to their ads.

  • Input: Each of your brand's YouTube ad videoIds, crawled at maxDepth=2.
  • Output: Up to ~110 neighboring videos per ad with channel + title metadata.
  • Use: Audit what content sits in the recommendation neighborhood and flag risky adjacencies in minutes.

3. YouTube SEO — Reverse-Engineer Algorithm Signals

YouTube SEO consultants use this to align their titles, tags, and descriptions with what the algorithm associates.

  • Input: Your channel's top-performing videoIds as seeds.
  • Output: What the algorithm thinks each video relates to, across hops.
  • Use: Optimize metadata to land in the discovered topical cluster and boost recommendation surfacing.

4. AI / ML — Topically-Clustered Training Sets

ML researchers use this to produce graph datasets with depth + parent metadata for embedding training.

  • Input: 50 seed videoIds at maxDepth=2, maxNodes=5000.
  • Output: A graph dataset of up to 5,000 topically-related videos with edges.
  • Use: Train recommendation models on real YouTube co-occurrence signals, ready for PyTorch Geometric / DGL.

5. Researchers — Recommendation-Algorithm Studies

Algorithm researchers use this to study echo chambers and graph fanout patterns.

  • Input: Polarising / political seeds vs entertainment seeds at maxDepth=3.
  • Output: Side-by-side neighborhood graphs with full discoveryPath arrays.
  • Use: Measure echo-chamber width, compare fanout, export to NetworkX for full network analysis.

6. Competitor Intelligence — Channel Adjacency Mapping

Competitor-intel teams use this to map which channels live in your category's algorithmic neighborhood.

  • Input: Top-performing video from each major competitor as seeds.
  • Output: A merged graph showing channel overlap across competitor neighborhoods.
  • Use: Spot which creators dominate the topical cluster, then target them for partnerships or competitive analysis.
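
One way to surface channel overlap from a bulk run is to group rows by channelId and collect the seeds each channel was reached from. This is a sketch, not a built-in report: shared_channels is a hypothetical helper, and it relies on the _seedVideoId field that appears in the example row. Because of global dedup each video is attributed to a single seed, so overlap is measured at channel level rather than video level.

```python
from collections import defaultdict


def shared_channels(rows: list, min_seeds: int = 2) -> dict:
    """Map channelId -> sorted seed IDs, for channels reached from >= min_seeds seeds."""
    seeds_by_channel = defaultdict(set)
    for row in rows:
        channel = row.get("channelId")
        seed = row.get("_seedVideoId")
        # Skip seed rows themselves (depth 0) and rows missing either field
        if channel and seed and row.get("depth", 0) >= 1:
            seeds_by_channel[channel].add(seed)
    return {
        channel: sorted(seeds)
        for channel, seeds in seeds_by_channel.items()
        if len(seeds) >= min_seeds
    }
```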

7. Topic-Cluster SEO — Pillar/Cluster Content Planning

Content marketers use this to map pillar/cluster topic architectures from a single hero video.

  • Input: Your pillar video at maxDepth=2, maxBranchPerNode=15.
  • Output: Hundreds of cluster-topic candidates with engagement metrics.
  • Use: Build a content calendar that maps onto the algorithm's existing topical graph for faster ranking.


🔗 Integration Examples

JavaScript / Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('sian.agency/youtube-related-videos-crawler').call({
  videoId: 'dQw4w9WgXcQ',
  maxDepth: 2,
  maxBranchPerNode: 10,
  maxNodes: 120,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Crawled ${items.length} videos across ${Math.max(...items.map(i => i.depth))} hops`);

Python

import networkx as nx
from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')
run = client.actor('sian.agency/youtube-related-videos-crawler').call(
    run_input={
        'videoId': 'dQw4w9WgXcQ',
        'maxDepth': 2,
        'maxBranchPerNode': 10,
        'maxNodes': 120,
    }
)

# Build a NetworkX graph from discoveryPath
G = nx.DiGraph()
for item in client.dataset(run['defaultDatasetId']).iterate_items():
    # Consecutive pairs in each path are parent -> child edges
    path = item.get('discoveryPath') or []
    G.add_edges_from(zip(path, path[1:]))
print(f'Graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges')

cURL

curl -X POST "https://api.apify.com/v2/acts/sian.agency~youtube-related-videos-crawler/runs?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{"videoId":"dQw4w9WgXcQ","maxDepth":2,"maxBranchPerNode":10,"maxNodes":120}'

Automation Workflows (N8N / Zapier / Make)

  1. Trigger: Schedule (e.g. weekly graph refresh) or manual webhook
  2. HTTP Request: Call this actor's run API with seed videoId(s)
  3. Process: Iterate dataset rows in your workflow, group by depth or discoveredVia
  4. Action: Push to BigQuery / Gephi / Notion / your graph database
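
The "group by depth or discoveredVia" step in the workflow can be sketched in plain Python, for example inside a code node. Both helper names are hypothetical, and rows are assumed to be dicts shaped like the example row.

```python
from collections import Counter, defaultdict


def rows_per_depth(rows: list) -> dict:
    """Count rows at each depth, e.g. {0: 1, 1: 10, 2: 87}."""
    return dict(Counter(row.get("depth", 0) for row in rows))


def children_by_parent(rows: list) -> dict:
    """Map each parent videoId to the videoIds discovered from it."""
    children = defaultdict(list)
    for row in rows:
        parent = row.get("discoveredVia")
        if parent:  # seed rows have no parent
            children[parent].append(row["videoId"])
    return dict(children)
```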

📊 Performance & Pricing

FREE Tier (Try It Now)

  • Generous free credit covers a small initial neighborhood crawl — full feature access, same data quality.
  • No credit card required.
  • Perfect for kicking the tires on a single-seed maxDepth=1 crawl.
  • No per-run row limit beyond the maxNodes cap (1–5000).
  • Pay-per-event: only billed for discovered videos at depth ≥ 1 — seed echoes, errors, and no_related_videos rows are free.
  • Pricing ladder scales from BRONZE to DIAMOND as your volume grows.

Pricing Highlights — related-video-row (headline):

  • BRONZE: $0.005 per unique discovered videoId
  • GOLD / PLATINUM / DIAMOND: $0.0025 per row

Run start — apify-actor-start:

  • BRONZE+ (paying tiers): $0.002 per run

💰 Use maxNodes as a cost cap: at BRONZE $0.005/row, maxNodes=200 limits row charges to at most $1 per run, plus the $0.002 run-start fee.
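
The arithmetic behind a run's cost can be sketched as follows, using only the rates listed above. The helper name estimate_cost is hypothetical, the SILVER rate is left out because it is not listed here, and live pricing may differ from this snapshot.

```python
from decimal import Decimal

# Per-row rates as listed above (SILVER is not listed, so it is omitted).
ROW_RATE = {
    "BRONZE": Decimal("0.005"),
    "GOLD": Decimal("0.0025"),
    "PLATINUM": Decimal("0.0025"),
    "DIAMOND": Decimal("0.0025"),
}
RUN_START_FEE = Decimal("0.002")  # charged once per run on paying tiers


def estimate_cost(charged_rows: int, tier: str = "BRONZE") -> Decimal:
    """Estimated USD cost: run-start fee plus per-row charge for charged rows."""
    return RUN_START_FEE + ROW_RATE[tier] * charged_rows


# A full depth-2, branch-10 neighborhood has at most 110 charged rows
print(estimate_cost(110))
```

Pass only rows that are actually charged (discovered videos at depth ≥ 1), since seed, error, and no_related_videos rows are free.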

🔗 View live pricing


❓ Frequently Asked Questions

Q: What's the difference between this and a regular "related videos" scraper? A: Other actors give you one flat list from a single API call. This one walks the graph — each related video becomes a new starting point for its own related-videos call, up to maxDepth hops. You get the entire neighborhood, not just the front door.

Q: Why BFS and not DFS? A: Breadth-first matches "show me the topical neighborhood" intent. The algorithm signals weaken with depth, so BFS surfaces the strongest associations first. For chain-style traversal, set maxBranchPerNode=1 — BFS with branch=1 IS a depth-first chain.

Q: Why does my run return fewer rows than 1 + branch + branch²? A: Three reasons: (1) duplicates — videos discovered via multiple paths emit once; (2) the maxNodes cap kicks in mid-crawl; (3) some videos return fewer than maxBranchPerNode related items.
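
The three effects above (dedup, the node cap, and short related lists) can be seen in a toy breadth-first crawl. This is an illustrative sketch, not the actor's implementation: the related callable stands in for the upstream related-videos call, the branch limit is applied before dedup, and the seed counts toward max_nodes. All three are assumptions.

```python
from collections import deque
from typing import Callable, List


def crawl(seed: str, related: Callable[[str], List[str]],
          max_depth: int = 2, max_branch: int = 10, max_nodes: int = 500) -> list:
    """Breadth-first crawl that records depth, parent, and path for each node."""
    rows = [{"videoId": seed, "depth": 0, "discoveredVia": None,
             "discoveryPath": [seed]}]
    seen = {seed}
    queue = deque(rows)
    while queue:
        node = queue.popleft()
        if node["depth"] >= max_depth:
            continue  # do not expand nodes at the depth limit
        for child in related(node["videoId"])[:max_branch]:
            if child in seen:
                continue  # global dedup: each video is emitted once
            if len(seen) >= max_nodes:
                return rows  # hard cap reached mid-crawl
            seen.add(child)
            row = {"videoId": child, "depth": node["depth"] + 1,
                   "discoveredVia": node["videoId"],
                   "discoveryPath": node["discoveryPath"] + [child]}
            rows.append(row)
            queue.append(row)
    return rows
```

Calling crawl with a dictionary lookup such as lambda v: graph.get(v, []) is enough to experiment with how depth, branch, and cap settings interact.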

Q: Do you crawl Shorts and playlists? A: No. Shorts are emitted as leaf rows when includeShorts=true but never traversed. Playlists are emitted as leaf rows when includePlaylists=true — their cover videoId is captured but not crawled further (playlists are thumbnail anchors, not real graph nodes).

Q: What happens when a seed is deleted / private / region-blocked? A: The seed emits two rows: one itemType: "seed" (free) and one status: "error" with a friendly error message (free). Other seeds in a bulk run continue normally.

Q: Can I export to Gephi / NetworkX? A: Yes. Every row has discoveryPath: [seed, hop1, ..., this]. Build edges from consecutive pairs in each path by calling add_edges_from on a graph object, e.g. G = nx.DiGraph() followed by G.add_edges_from(zip(path, path[1:])). The videoId is the node ID; videoTitle / channelTitle are node attributes.
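
For Gephi, one option that needs no extra libraries is an edge-list CSV with Source and Target columns, which Gephi's spreadsheet import accepts. The sketch below assumes rows are dicts with a discoveryPath list; the function names are hypothetical.

```python
import csv
import io


def edges_from_rows(rows: list) -> list:
    """Unique (parent, child) edges from consecutive pairs of each discoveryPath."""
    edges = set()
    for row in rows:
        path = row.get("discoveryPath") or []
        edges.update(zip(path, path[1:]))
    return sorted(edges)


def edges_to_csv(edges: list) -> str:
    """Serialize edges as CSV text with a Source,Target header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()
```

Write the returned string to a .csv file and load it as an edges table; a nodes table with videoTitle and channelTitle can be built the same way.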

Q: What output formats are available? A: JSON, CSV, Excel, RSS, HTML — export directly from the Apify dataset.

Q: Is this legal? A: Yes — we only extract publicly available data that YouTube already exposes to every viewer. See our legal section below.

YouTube is a trademark of Google LLC. This actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Google LLC.


🐛 Troubleshooting

"Invalid seed video ID" error

  • YouTube video IDs are exactly 11 characters of alphanumerics, dashes, and underscores. Paste the ID directly or any full YouTube URL — the actor will extract the ID for you.
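
To validate seeds before starting a run, a client-side check along these lines may help. It is a sketch of the 11-character rule above, not the actor's own extraction logic; the set of URL shapes it recognizes (watch, youtu.be, shorts, embed, live) is an assumption, and extract_video_id is a hypothetical name.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# 11 characters of letters, digits, dashes, and underscores
_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a bare ID or a common YouTube URL, else None."""
    value = value.strip()
    if _ID.fullmatch(value):
        return value
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in {"shorts", "embed", "live"}:
                candidate = parts[1]
    return candidate if candidate and _ID.fullmatch(candidate) else None
```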

Run returns only the seed row

  • Check maxDepth — if it's 0, only the seed is emitted by design. Bump to 1 to get direct neighbors.

Fewer rows than expected

  • Global dedup means videos discovered via multiple paths emit once. Also check whether maxNodes is capping the crawl earlier than maxDepth would.

Crawl runs longer than expected

  • Row count grows geometrically, roughly maxBranchPerNode^maxDepth. Use maxNodes to cap total cost and runtime. At maxDepth=2, branch=10, maxNodes=200 you're typically under a minute.

Empty discoveryPath arrays

  • The seed row (itemType: "seed") has a single-element path, [seed]. Every discovered row has at least [seed, child], so an empty array is not expected on either.

⚖️ Legal

Our actors are ethical and do not extract any private user data, such as email addresses, gender, or location. They only extract what the user has chosen to share publicly. We therefore believe that our actors, when used for ethical purposes by Apify users, are safe.

However, you should be aware that your results could contain personal data. Personal data is protected by the GDPR in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers.

You can also read Apify's blog post on the legality of web scraping.


🤝 Support

Telegram Support

Join our active support community


Built by SIÁN Agency | More Tools