Youtube Scraper

Pricing

from $4.99 / 1,000 results

🎥 YouTube Scraper extracts structured data from videos, channels & playlists — titles, tags, views, likes, comments, captions, thumbnails & publish dates. 🔎 Perfect for SEO, competitor analysis, research & reporting. 🚀 Export-ready for CSV/JSON pipelines.

Developer: Scraper Engine (Maintained by Community)

Actor stats: 0 bookmarked · 1 total user · 0 monthly active users · last modified 5 days ago

Youtube Scraper

The Youtube Scraper is a production-ready Apify actor that extracts structured data from YouTube search results and direct video URLs — including titles, views, likes, comment counts, subscriber counts, descriptions, hashtags, thumbnails, publish dates, and optional transcripts/subtitles. It solves the challenge of collecting clean, export-ready YouTube video metadata at scale without the official API, making it ideal for marketers, developers, data analysts, and researchers. With robust anti-blocking built in, it supports reliable pipelines for competitor analysis, SEO tracking, and reporting.

What data / output can you get?

Below are the main fields pushed to the Apify dataset by the Youtube Scraper. These map directly to the actor’s output and are ready to export as CSV, JSON, or Excel.

| Data type | Description | Example value |
| --- | --- | --- |
| title | Video title | "How to use Crawlee in 10 minutes" |
| type | Content type (video or shorts) | "video" |
| id | YouTube video ID | "dQw4w9WgXcQ" |
| url | Canonical URL (shorts/videos) | "https://www.youtube.com/watch?v=dQw4w9WgXcQ" |
| thumbnailUrl | High-quality thumbnail URL | "https://i.ytimg.com/vi/dQw4w9WgXcQ/hq720.jpg" |
| viewCount | Parsed integer view count | 1500000 |
| date | ISO-like published date (when available) | "2025-01-15T00:00:00.000Z" |
| likes | Total likes (when detected) | 50000 |
| duration | HH:MM:SS (or null if unknown) | "00:03:33" |
| channelName | Channel display name | "Channel Name" |
| channelUrl | Absolute channel URL | "https://www.youtube.com/@channelname" |
| numberOfSubscribers | Subscriber count (when available) | 1000000 |
| commentsCount | Total comments (when detected) | 12000 |
| text | Description/snippet text | "Video description…" |
| descriptionLinks | URLs and hashtag links extracted from the description | [{"url":"https://example.com","text":"https://example.com"}] |
| subtitles | Available subtitle language codes (when detected) | ["en","es"] |
| hashtags | Hashtags from title/description | ["#example","#tutorial"] |
| fromYTUrl | Source YouTube results/seed URL | "https://www.youtube.com/results?search_query=crawlee" |
| order | Item index in run | 0 |
| isCreativeCommons | Creative Commons flag (best-effort) | true |
| isPurchased | Purchased/paid flag (best-effort) | false |

Bonus (when subtitles are downloaded): transcript, transcriptLanguage, transcriptFormat. Additional flags include commentsTurnedOff, isMonetized (when present). All outputs are pushed via Actor.pushData for seamless exports to CSV/JSON/Excel.

Key features

  • 🛡️ Smart anti-blocking & proxy escalation — Automatically escalates from direct → Apify datacenter → Apify residential with retries, then sticks to a working level for the rest of the run.
  • 🧪 Realistic HTTP fingerprinting — Uses the impit HTTP client to impersonate modern browsers and bypass TLS/HTTP fingerprinting checks reliably.
  • ⚡ Concurrent metadata enrichment — Batch-fetches video pages with controlled concurrency to enrich likes, commentsCount, numberOfSubscribers, subtitles, and more.
  • 🎯 Flexible search filters — Apply post-processing filters and sorting: dateFilter, videoTypeFilter, lengthFilter, sortingOrder, and sortBy for reliable ordering and selection.
  • 🎞️ Quality & format filters — Filter for isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, and isPurchased to build high-signal datasets.
  • 💬 Transcripts & subtitles — Toggle downloadSubtitles with subtitlesLanguage, subtitlesFormat (srt, text, timestamp), and preferAutoGenerated for broader coverage.
  • 🔎 YouTube search results scraper — Scrape videos from search terms at scale with pagination and safety limits.
  • 🔗 Direct video URL support — Provide a list of video URLs to extract complete metadata and optional transcripts.
  • 📦 Export-ready outputs — Structured fields for straightforward analytics, making it a robust YouTube data scraper for CSV/JSON pipelines.
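To make the post-processing filters and sorting concrete, here is a minimal client-side sketch of the same idea. The field names (date, viewCount) match the actor's output schema, but the helper itself is hypothetical and not the actor's internal code.

```python
# Hypothetical client-side equivalent of the actor's post-filter/sort step.
def post_process(items, date_after=None, sort_by=None, descending=True):
    """Keep items published after `date_after` (ISO prefix), then sort by a numeric field."""
    result = [it for it in items
              if date_after is None or (it.get("date") or "") >= date_after]
    if sort_by:
        # Missing/null values sort as 0 so they land at the end of a descending sort.
        result.sort(key=lambda it: it.get(sort_by) or 0, reverse=descending)
    return result

videos = [
    {"id": "a", "date": "2025-01-15T00:00:00.000Z", "viewCount": 1500000},
    {"id": "b", "date": "2024-06-01T00:00:00.000Z", "viewCount": 900},
    {"id": "c", "date": "2025-03-02T00:00:00.000Z", "viewCount": 42000},
]
top = post_process(videos, date_after="2025-01-01", sort_by="viewCount")
print([v["id"] for v in top])  # → ['a', 'c']
```

ISO 8601 timestamps compare correctly as strings, which is why a plain `>=` works for the date cutoff.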

How to use Youtube Scraper - step by step

  1. Create or log in to your Apify account at console.apify.com.
  2. Navigate to Actors and open “Youtube Scraper”.
  3. Add input:
    • searchTerms as a list of keywords, or
    • startUrls with direct video URLs.
  4. Configure limits and filters:
    • maxVideos, maxShorts, maxStreams per search term.
    • Quality/features (isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, isPurchased).
    • Sorting and post-filters (sortingOrder, dateFilter, videoTypeFilter, lengthFilter, sortBy).
  5. Subtitles & transcripts:
    • Enable downloadSubtitles, choose subtitlesLanguage and subtitlesFormat, and optionally preferAutoGenerated or saveSubtitlesToKvs.
  6. Proxy setup:
    • Leave default or set proxyConfiguration; auto-fallback is built in if blocks occur.
  7. Click Start to run. Monitor progress in the Log tab (you’ll see page counts, filters applied, and proxy changes).
  8. Access results in the Dataset tab and export to JSON, CSV, or Excel.
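The same run can also be started programmatically. A hedged sketch with the official Apify Python client follows; the actor ID "username/youtube-scraper" is a placeholder (copy the real one from the store page), and APIFY_TOKEN must be set in your environment for the call to execute.

```python
# Sketch: run the actor via the Apify Python client (pip install apify-client).
# "username/youtube-scraper" is a placeholder actor ID, not the real one.
import os

run_input = {
    "searchTerms": ["crawlee tutorial"],
    "maxVideos": 10,
    "downloadSubtitles": True,
    "subtitlesLanguage": "en",
    "subtitlesFormat": "srt",
    "proxyConfiguration": {"useApifyProxy": True},
}

if os.environ.get("APIFY_TOKEN"):
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    # Start the run and wait for it to finish.
    run = client.actor("username/youtube-scraper").call(run_input=run_input)
    # Stream the dataset items produced by the run.
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item["title"], item.get("viewCount"))
```

The input dictionary mirrors the JSON input shown in the "Input parameters & output format" section below, so the console and the client are interchangeable.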

Pro tip: Use the Apify dataset to plug this YouTube web scraping tool into your reporting or BI stack as a YouTube data extractor for SEO dashboards and competitor tracking.

Use cases

| Use case | Description |
| --- | --- |
| SEO teams — video metadata tracking | Track titles, views, likes, and publish dates to benchmark performance and optimize rankings using a YouTube video metadata scraper. |
| Competitor research — content analysis | Monitor competitor uploads, extract hashtags and descriptions, and compare engagement in a YouTube competitor analysis workflow. |
| Keyword discovery — search SERP mining | Use searchTerms to collect top results for queries and build a YouTube keyword dataset for content planning. |
| Research & NLP — transcript collection | Enable subtitles download to power NLP pipelines or topic modeling with a YouTube transcript and subtitles extractor. |
| Reporting & BI — export-ready metrics | Export structured fields to CSV/JSON/Excel for dashboards and periodic performance reporting. |
| Live/short-form monitoring — format filters | Filter by isLive or collect Shorts with caps (maxShorts, maxStreams) to build specialized watchlists. |
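For the reporting and competitor-research cases, a common derived metric is the like/view engagement ratio. A minimal sketch over exported items (the null-handling mirrors the actor's output, where likes may be undetected; the helper is illustrative, not part of the actor):

```python
# Illustrative post-run analysis: like/view ratio per item, null-safe.
def engagement_ratio(item):
    views, likes = item.get("viewCount"), item.get("likes")
    if not views or likes is None:
        # Missing views (or zero) or undetected likes: no meaningful ratio.
        return None
    return likes / views

items = [
    {"title": "A", "viewCount": 1500000, "likes": 50000},
    {"title": "B", "viewCount": 200000, "likes": None},  # likes not detected
]
ratios = {it["title"]: engagement_ratio(it) for it in items}
print(ratios)  # 'A' ≈ 0.033, 'B' is None
```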

Why choose Youtube Scraper?

The Youtube Scraper is built for precision, scale, and reliability on the Apify platform.

  • 🎯 Accurate metadata parsing from search + video pages (titles, views, likes, commentsCount, subscribers, hashtags).
  • 🧬 Multiformat transcripts (SRT, text, timestamp) with language selection and auto-generated fallback support.
  • 🚀 Scales with concurrency and robust pagination, ideal for batch YouTube data extraction.
  • 🧩 Developer-friendly outputs with consistent JSON fields for analytics and ETL workflows.
  • 🛡️ Safe, production-ready anti-blocking: automatic proxy escalation and browser impersonation via impit.
  • 💰 Export-ready for CSV/JSON pipelines — perfect for SEO, reporting, and research.
  • 🔄 More reliable than ad-hoc scripts or extensions, thanks to stable infrastructure and structured output.

In short: a dependable YouTube web scraping tool for teams that need consistent, structured video data at scale.

Is it legal to scrape YouTube data?

Yes — when done responsibly. The actor collects data from publicly available YouTube pages and does not access private or password-protected content.

Guidelines for responsible use:

  • Only use data from public pages.
  • Respect copyright and licensing (e.g., check Creative Commons details before reuse).
  • Comply with applicable regulations (e.g., GDPR, CCPA) and YouTube’s terms.
  • Consult your legal team for edge cases or sensitive applications.

Input parameters & output format

Example JSON input

{
  "searchTerms": ["Crawlee", "data extraction"],
  "maxVideos": 10,
  "maxShorts": 0,
  "maxStreams": 0,
  "downloadSubtitles": true,
  "saveSubtitlesToKvs": false,
  "subtitlesLanguage": "en",
  "preferAutoGenerated": false,
  "subtitlesFormat": "srt",
  "sortingOrder": "relevance",
  "dateFilter": "",
  "videoTypeFilter": "",
  "lengthFilter": "",
  "isHD": false,
  "hasCC": false,
  "isCreativeCommons": false,
  "is3D": false,
  "isLive": false,
  "isPurchased": false,
  "is4K": false,
  "is360": false,
  "hasLocation": false,
  "isHDR": false,
  "isVR180": false,
  "publishedAfter": "",
  "sortBy": "",
  "proxyConfiguration": { "useApifyProxy": false },
  "startUrls": []
}

Parameters (all optional; none are required):

  • searchTerms (array) — Enter one or more YouTube search keywords. Default: [].
  • maxVideos (integer) — Maximum regular videos per search term. Use 0 to skip. Default: 10.
  • maxShorts (integer) — Maximum Shorts per search term. Use 0 to skip. Default: 0.
  • maxStreams (integer) — Maximum live/upcoming streams per search term. Use 0 to skip. Default: 0.
  • startUrls (array) — Provide direct YouTube video, channel, playlist, or results page URLs to scrape without using search terms. Default: [].
  • downloadSubtitles (boolean) — Download subtitles/transcripts when available. Default: false.
  • saveSubtitlesToKvs (boolean) — Store each transcript in the key-value store under its own key. Default: false.
  • subtitlesLanguage (string) — Preferred language for subtitles/transcripts. Default: "en".
  • preferAutoGenerated (boolean) — Prefer auto-generated subtitles. Default: false.
  • subtitlesFormat (string) — "srt", "text", or "timestamp". Default: "srt".
  • sortingOrder (string) — Post-processing sort: "", "relevance", "date", "viewCount", "rating". Default: "".
  • dateFilter (string) — "", "hour", "today", "week", "month", "year". Default: "".
  • videoTypeFilter (string) — "", "video", "channel", "playlist", "movie". Default: "".
  • lengthFilter (string) — "", "short", "medium", "long". Default: "".
  • isHD (boolean) — Only include HD videos (>=720p). Default: false.
  • hasCC (boolean) — Require at least one non-auto CC track. Default: false.
  • isCreativeCommons (boolean) — Include only Creative Commons videos. Default: false.
  • is3D (boolean) — Include only 3D videos. Default: false.
  • isLive (boolean) — Restrict to live/live-style content. Default: false.
  • isPurchased (boolean) — Best-effort filter for purchased/paid content. Default: false.
  • is4K (boolean) — Include only 4K (2160p) videos. Default: false.
  • is360 (boolean) — Include only 360° videos. Default: false.
  • hasLocation (boolean) — Include only videos with explicit location metadata. Default: false.
  • isHDR (boolean) — Include only HDR videos. Default: false.
  • isVR180 (boolean) — Include only VR180 videos. Default: false.
  • publishedAfter (string) — Only include videos published after YYYY-MM-DD. Default: "".
  • sortBy (string) — Post-sort by "", "date", "viewCount", or "likes". Default: "".
  • proxyConfiguration (object) — Proxy settings; actor escalates if blocked. Default: {}.

Example JSON output

{
  "title": "How to use Crawlee in 10 minutes",
  "translatedTitle": null,
  "type": "video",
  "id": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "thumbnailUrl": "https://i.ytimg.com/vi/dQw4w9WgXcQ/hq720.jpg",
  "viewCount": 1500000,
  "date": "2025-01-15T00:00:00.000Z",
  "likes": 50000,
  "location": null,
  "channelName": "Channel Name",
  "channelUrl": "https://www.youtube.com/@channelname",
  "channelUsername": "channelname",
  "collaborators": null,
  "channelId": "UCxxxxxxxxxxxxxxxxxxxxxxxx",
  "numberOfSubscribers": 1000000,
  "duration": "00:03:33",
  "commentsCount": 12000,
  "text": "Video description...",
  "translatedText": null,
  "descriptionLinks": [
    { "url": "https://example.com", "text": "https://example.com" }
  ],
  "subtitles": ["en"],
  "transcript": null,
  "transcriptLanguage": "en",
  "transcriptFormat": "srt",
  "order": 0,
  "commentsTurnedOff": false,
  "fromYTUrl": "https://www.youtube.com/results?search_query=crawlee",
  "isMonetized": null,
  "hashtags": ["#example"],
  "isCreativeCommons": true,
  "isPurchased": false
}

Note: Some fields may be null when not present on the page or when detection is not possible (e.g., likes, commentsCount, numberOfSubscribers, subtitles, transcript).
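When exporting yourself rather than through the console, those nullable fields need a consistent CSV representation. A small stdlib-only sketch (the field list is a subset chosen for illustration):

```python
# Sketch: flatten dataset items to CSV, writing empty cells for null fields.
import csv
import io

FIELDS = ["id", "title", "viewCount", "likes", "commentsCount", "duration"]

def to_csv(items):
    buf = io.StringIO()
    # extrasaction="ignore" drops output fields not listed in FIELDS.
    writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for it in items:
        writer.writerow({f: ("" if it.get(f) is None else it.get(f)) for f in FIELDS})
    return buf.getvalue()

print(to_csv([{"id": "dQw4w9WgXcQ", "title": "Demo", "viewCount": 1500000,
               "likes": None, "commentsCount": 12000, "duration": "00:03:33"}]))
```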

FAQ

Do I need a YouTube API key?

No. The actor scrapes public web endpoints and page data directly, so no official YouTube API key is required.

Can this extract transcripts or subtitles?

Yes. Enable downloadSubtitles, choose a subtitlesLanguage, and select a subtitlesFormat (srt, text, or timestamp). You can also set preferAutoGenerated and optionally saveSubtitlesToKvs.
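If you fetch transcripts in srt format but want plain text downstream (e.g. for NLP), a minimal conversion sketch follows. It assumes the standard SubRip layout (index line, timing line, caption lines) and is not part of the actor itself.

```python
# Sketch: collapse an SRT transcript into plain text.
import re

def srt_to_text(srt: str) -> str:
    captions = []
    # SRT blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        for line in block.splitlines():
            # Drop the numeric index line and the "00:00:01,000 --> 00:00:03,000" timing line.
            if line.strip().isdigit() or "-->" in line:
                continue
            captions.append(line)
    return " ".join(captions)

sample = (
    "1\n00:00:01,000 --> 00:00:03,000\nHello world\n\n"
    "2\n00:00:03,500 --> 00:00:05,000\nfrom Crawlee"
)
print(srt_to_text(sample))  # → Hello world from Crawlee
```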

Does it scrape comments?

It extracts commentsCount when available, but it does not scrape individual comment bodies. The output includes totals and core engagement metrics.

Can I scrape YouTube Shorts and live streams?

Yes. Use maxShorts and maxStreams to control how many Shorts and live/upcoming streams are included per search term. You can also filter by isLive in post-processing.

Does it support direct URLs?

Yes — provide direct video URLs in startUrls to fetch full metadata and optional transcripts. The actor also works as a YouTube search results scraper via searchTerms.

How does the actor handle blocking?

It automatically escalates through connection levels: direct → Apify datacenter → Apify residential (with retries) and continues with a working level for the rest of the run.
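The escalation pattern can be sketched generically. This is not the actor's actual code; `fetch` is a stand-in for the real request call, and the retry counts are illustrative.

```python
# Generic sketch of retry-with-escalation across connection levels.
LEVELS = ["direct", "datacenter", "residential"]

def fetch_with_escalation(fetch, url, retries_per_level=2):
    for level in LEVELS:
        for _ in range(retries_per_level):
            try:
                # On success, return the level so callers can stick with it
                # for the rest of the run.
                return level, fetch(url, level)
            except Exception:
                continue  # retry, then escalate to the next level
    raise RuntimeError("all proxy levels failed")

# Demo with a fake fetcher that is blocked except through residential proxies:
def fake_fetch(url, level):
    if level != "residential":
        raise ConnectionError("blocked")
    return "<html>ok</html>"

level, body = fetch_with_escalation(fake_fetch, "https://www.youtube.com/watch?v=x")
print(level)  # → residential
```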

What filters and sorting are available?

You can filter by isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, and isPurchased, and apply dateFilter, videoTypeFilter, lengthFilter. Sorting options include sortingOrder and sortBy.

What formats can I export?

All results are stored in the Apify dataset, ready for export to JSON, CSV, or Excel. This makes it a reliable YouTube data extractor for analytics and reporting.

Closing CTA / Final thoughts

The Youtube Scraper is built to extract structured YouTube video data at scale with accuracy and reliability. With robust anti-blocking, flexible filters, and optional transcript downloads, it serves marketers, developers, data analysts, and researchers who need clean, export-ready results. Use it as a YouTube data scraper for SEO tracking, competitor monitoring, and research pipelines — and plug the dataset into your automation or BI stack to start extracting smarter insights today.