Youtube Scraper
Pricing
from $4.99 / 1,000 results
🎥 YouTube Scraper extracts structured data from videos, channels & playlists — titles, tags, views, likes, comments, captions, thumbnails & publish dates. 🔎 Perfect for SEO, competitor analysis, research & reporting. 🚀 Export-ready for CSV/JSON pipelines.
Developer
Scraper Engine
Youtube Scraper
The Youtube Scraper is a production-ready Apify actor that extracts structured data from YouTube search results and direct video URLs, including titles, views, likes, comment counts, subscriber counts, descriptions, hashtags, thumbnails, publish dates, and optional transcripts/subtitles. It collects clean, export-ready YouTube video metadata at scale without the official API, which makes it useful for marketers, developers, data analysts, and researchers. Built-in anti-blocking supports reliable pipelines for competitor analysis, SEO tracking, and reporting.
What data / output can you get?
Below are the main fields pushed to the Apify dataset by the Youtube Scraper. These map directly to the actor’s output and are ready to export as CSV, JSON, or Excel.
| Data type | Description | Example value |
|---|---|---|
| title | Video title | “How to use Crawlee in 10 minutes” |
| type | Content type (video or shorts) | “video” |
| id | YouTube video ID | “dQw4w9WgXcQ” |
| url | Canonical URL (shorts/videos) | “https://www.youtube.com/watch?v=dQw4w9WgXcQ” |
| thumbnailUrl | High-quality thumbnail URL | “https://i.ytimg.com/vi/dQw4w9WgXcQ/hq720.jpg” |
| viewCount | Parsed integer view count | 1500000 |
| date | ISO-like published date (when available) | “2025-01-15T00:00:00.000Z” |
| likes | Total likes (when detected) | 50000 |
| duration | HH:MM:SS (or null if unknown) | “00:03:33” |
| channelName | Channel display name | “Channel Name” |
| channelUrl | Absolute channel URL | “https://www.youtube.com/@channelname” |
| numberOfSubscribers | Subscriber count (when available) | 1000000 |
| commentsCount | Total comments (when detected) | 12000 |
| text | Description/snippet text | “Video description…” |
| descriptionLinks | URLs and hashtag links extracted from description | [{"url":"https://example.com","text":"https://example.com"}] |
| subtitles | Available subtitle language codes (when detected) | ["en","es"] |
| hashtags | Hashtags from title/description | ["#example","#tutorial"] |
| fromYTUrl | Source YouTube results/seed URL | “https://www.youtube.com/results?search_query=crawlee” |
| order | Item index in run | 0 |
| isCreativeCommons | Creative Commons flag (best-effort) | true |
| isPurchased | Purchased/paid flag (best-effort) | false |
Bonus fields (when subtitles are downloaded): transcript, transcriptLanguage, transcriptFormat. Additional flags, when present, include commentsTurnedOff and isMonetized. All outputs are pushed via Actor.pushData for export to CSV, JSON, or Excel.
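If you export the dataset as JSON and want your own column selection, nested fields such as hashtags and descriptionLinks need flattening first. The following is a minimal sketch, assuming items shaped like the table above; the column list and the helper names (`flatten_item`, `items_to_csv`) are illustrative choices, not part of the actor.

```python
import csv
import io

# Columns to keep; the names follow the field table above.
COLUMNS = ["id", "title", "type", "viewCount", "likes",
           "commentsCount", "hashtags", "descriptionLinks"]


def flatten_item(item: dict) -> dict:
    """Turn one dataset item into a flat row suitable for CSV."""
    row = {}
    for key in COLUMNS:
        value = item.get(key)
        if key == "hashtags":
            # List of strings -> single space-separated cell.
            value = " ".join(value or [])
        elif key == "descriptionLinks":
            # List of {"url", "text"} objects -> space-separated URLs.
            value = " ".join(link.get("url", "") for link in (value or []))
        # Missing or null fields become empty cells.
        row[key] = "" if value is None else value
    return row


def items_to_csv(items: list) -> str:
    """Serialize dataset items to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(flatten_item(item))
    return buffer.getvalue()


sample_items = [{
    "id": "dQw4w9WgXcQ",
    "title": "How to use Crawlee in 10 minutes",
    "type": "video",
    "viewCount": 1500000,
    "likes": None,  # not detected on the page
    "commentsCount": 12000,
    "hashtags": ["#example", "#tutorial"],
    "descriptionLinks": [{"url": "https://example.com", "text": "https://example.com"}],
}]
print(items_to_csv(sample_items))
```

Null values are written as empty cells here; swap in a sentinel if your BI tool needs to tell "zero" from "unknown".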
Key features
- 🛡️ Smart anti-blocking & proxy escalation — Automatically escalates from direct → Apify datacenter → Apify residential with retries, then sticks to a working level for the rest of the run.
- 🧪 Realistic HTTP fingerprinting — Uses the impit HTTP client to impersonate modern browsers and bypass TLS/HTTP fingerprinting checks reliably.
- ⚡ Concurrent metadata enrichment — Batch-fetches video pages with controlled concurrency to enrich likes, commentsCount, numberOfSubscribers, subtitles, and more.
- 🎯 Flexible search filters — Apply post-processing filters and sorting: dateFilter, videoTypeFilter, lengthFilter, sortingOrder, and sortBy for reliable ordering and selection.
- 🎞️ Quality & format filters — Filter for isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, and isPurchased to build high-signal datasets.
- 💬 Transcripts & subtitles — Toggle downloadSubtitles with subtitlesLanguage, subtitlesFormat (srt, text, timestamp), and preferAutoGenerated for broader coverage.
- 🔎 YouTube search results scraper — Scrape videos from search terms at scale with pagination and safety limits.
- 🔗 Direct video URL support — Provide a list of video URLs to extract complete metadata and optional transcripts.
- 📦 Export-ready outputs — Structured fields for straightforward analytics, making it a robust YouTube data scraper for CSV/JSON pipelines.
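The "controlled concurrency" idea behind metadata enrichment can be illustrated with a semaphore that caps how many video pages are fetched at once. This is only a sketch of the general pattern, not the actor's source code: `fetch_video_details` is a hypothetical stand-in that simulates a page fetch, and the limit of 2 is arbitrary.

```python
import asyncio


async def fetch_video_details(video_id: str) -> dict:
    """Hypothetical stand-in for fetching and parsing one video page."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"id": video_id, "likes": None, "commentsCount": None}


async def enrich_all(video_ids: list, max_concurrency: int = 5) -> list:
    """Fetch details for all IDs, running at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(video_id: str) -> dict:
        async with semaphore:  # blocks while the limit is reached
            return await fetch_video_details(video_id)

    # gather returns results in the same order as the input IDs.
    return await asyncio.gather(*(worker(v) for v in video_ids))


results = asyncio.run(enrich_all(["a", "b", "c"], max_concurrency=2))
print([r["id"] for r in results])
```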
How to use Youtube Scraper - step by step
- Create or log in to your Apify account at console.apify.com.
- Navigate to Actors and open “Youtube Scraper”.
- Add input:
  - searchTerms as a list of keywords, or
  - startUrls with direct video URLs.
- Configure limits and filters:
  - maxVideos, maxShorts, maxStreams per search term.
  - Quality/features (isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, isPurchased).
  - Sorting and post-filters (sortingOrder, dateFilter, videoTypeFilter, lengthFilter, sortBy).
- Subtitles & transcripts:
  - Enable downloadSubtitles, choose subtitlesLanguage and subtitlesFormat, and optionally preferAutoGenerated or saveSubtitlesToKvs.
- Proxy setup:
  - Leave default or set proxyConfiguration; auto-fallback is built in if blocks occur.
- Click Start to run. Monitor progress in the Log tab (you’ll see page counts, filters applied, and proxy changes).
- Access results in the Dataset tab and export to JSON, CSV, or Excel.
Pro tip: Connect the Apify dataset to your reporting or BI stack to feed SEO dashboards and competitor tracking.
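The same steps can be run programmatically instead of through the Console. Below is a hedged sketch using the `apify-client` Python package. The actor ID is a hypothetical placeholder (this page does not state it, so copy the real one from the actor's page), and the token is assumed to live in an `APIFY_TOKEN` environment variable.

```python
import os

# Hypothetical placeholder: replace with the actor ID shown in Apify Console.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(search_terms, max_videos=10, download_subtitles=False):
    """Assemble a run input using parameter names from this page."""
    return {
        "searchTerms": list(search_terms),
        "maxVideos": max_videos,
        "maxShorts": 0,
        "maxStreams": 0,
        "downloadSubtitles": download_subtitles,
        "subtitlesLanguage": "en",
        "subtitlesFormat": "srt",
    }


def run_scraper(run_input: dict, token: str) -> list:
    """Start the actor, wait for it to finish, and return dataset items."""
    # Imported lazily so the rest of the sketch works without the package
    # installed (pip install apify-client).
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__" and os.environ.get("APIFY_TOKEN"):
    items = run_scraper(build_run_input(["Crawlee"]), os.environ["APIFY_TOKEN"])
    print(f"Fetched {len(items)} items")
```

Nothing is sent unless `APIFY_TOKEN` is set, so the input builder can be tried on its own first.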
Use cases
| Use case name | Description |
|---|---|
| SEO teams — video metadata tracking | Track titles, views, likes, and publish dates to benchmark performance and optimize rankings using a YouTube video metadata scraper. |
| Competitor research — content analysis | Monitor competitor uploads, extract hashtags and descriptions, and compare engagement for a YouTube competitor analysis scraper workflow. |
| Keyword discovery — search SERP mining | Use searchTerms to collect top results for queries and build a YouTube keyword scraper dataset for content planning. |
| Research & NLP — transcript collection | Enable subtitles download to power NLP pipelines or topic modeling with a YouTube transcript scraper and subtitles extractor. |
| Reporting & BI — export-ready metrics | Export structured fields to CSV/JSON/Excel for dashboards and periodic performance reporting using a YouTube data extractor. |
| Live/short-form monitoring — format filters | Filter by isLive or collect Shorts with caps (maxShorts, maxStreams) to build specialized watchlists. |
Why choose Youtube Scraper?
The Youtube Scraper is built for precision, scale, and reliability on the Apify platform.
- 🎯 Accurate metadata parsing from search + video pages (titles, views, likes, commentsCount, subscribers, hashtags).
- 🧬 Multiformat transcripts (SRT, text, timestamp) with language selection and auto-generated fallback support.
- 🚀 Scales with concurrency and robust pagination, ideal for batch YouTube data extraction.
- 🧩 Developer-friendly outputs with consistent JSON fields for analytics and ETL workflows.
- 🛡️ Safe, production-ready anti-blocking: automatic proxy escalation and browser impersonation via impit.
- 💰 Export-ready for CSV/JSON pipelines — perfect for SEO, reporting, and research.
- 🔄 More reliable than ad-hoc scripts or extensions, thanks to stable infrastructure and structured output.
In short: a dependable YouTube web scraping tool for teams that need consistent, structured video data at scale.
Is it legal / ethical to use Youtube Scraper?
Yes — when done responsibly. The actor collects data from publicly available YouTube pages and does not access private or password-protected content.
Guidelines for responsible use:
- Only use data from public pages.
- Respect copyright and licensing (e.g., check Creative Commons details before reuse).
- Comply with applicable regulations (e.g., GDPR, CCPA) and YouTube’s terms.
- Consult your legal team for edge cases or sensitive applications.
Input parameters & output format
Example JSON input
{"searchTerms": ["Crawlee", "data extraction"],"maxVideos": 10,"maxShorts": 0,"maxStreams": 0,"downloadSubtitles": true,"saveSubtitlesToKvs": false,"subtitlesLanguage": "en","preferAutoGenerated": false,"subtitlesFormat": "srt","sortingOrder": "relevance","dateFilter": "","videoTypeFilter": "","lengthFilter": "","isHD": false,"hasCC": false,"isCreativeCommons": false,"is3D": false,"isLive": false,"isPurchased": false,"is4K": false,"is360": false,"hasLocation": false,"isHDR": false,"isVR180": false,"publishedAfter": "","sortBy": "","proxyConfiguration": { "useApifyProxy": false },"startUrls": []}
Parameters (all optional; none are required):
- searchTerms (array) — Enter one or more YouTube search keywords. Default: [].
- maxVideos (integer) — Maximum regular videos per search term. Use 0 to skip. Default: 10.
- maxShorts (integer) — Maximum Shorts per search term. Use 0 to skip. Default: 0.
- maxStreams (integer) — Maximum live/upcoming streams per search term. Use 0 to skip. Default: 0.
- startUrls (array) — Provide direct YouTube video, channel, playlist, or results page URLs to scrape without using search terms. Default: [].
- downloadSubtitles (boolean) — Download subtitles/transcripts when available. Default: false.
- saveSubtitlesToKvs (boolean) — Store each transcript in the key-value store under its own key. Default: false.
- subtitlesLanguage (string) — Preferred language for subtitles/transcripts. Default: "en".
- preferAutoGenerated (boolean) — Prefer auto-generated subtitles. Default: false.
- subtitlesFormat (string) — "srt", "text", or "timestamp". Default: "srt".
- sortingOrder (string) — Post-processing sort: "", "relevance", "date", "viewCount", "rating". Default: "".
- dateFilter (string) — "", "hour", "today", "week", "month", "year". Default: "".
- videoTypeFilter (string) — "", "video", "channel", "playlist", "movie". Default: "".
- lengthFilter (string) — "", "short", "medium", "long". Default: "".
- isHD (boolean) — Only include HD videos (>=720p). Default: false.
- hasCC (boolean) — Require at least one non-auto CC track. Default: false.
- isCreativeCommons (boolean) — Include only Creative Commons videos. Default: false.
- is3D (boolean) — Include only 3D videos. Default: false.
- isLive (boolean) — Restrict to live/live-style content. Default: false.
- isPurchased (boolean) — Best-effort filter for purchased/paid content. Default: false.
- is4K (boolean) — Include only 4K (2160p) videos. Default: false.
- is360 (boolean) — Include only 360° videos. Default: false.
- hasLocation (boolean) — Include only videos with explicit location metadata. Default: false.
- isHDR (boolean) — Include only HDR videos. Default: false.
- isVR180 (boolean) — Include only VR180 videos. Default: false.
- publishedAfter (string) — Only include videos published after YYYY-MM-DD. Default: "".
- sortBy (string) — Post-sort by "", "date", "viewCount", or "likes". Default: "".
- proxyConfiguration (object) — Proxy settings; actor escalates if blocked. Default: {}.
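Because several parameters accept only a fixed set of strings, it can help to check an input locally before starting a run. The sketch below encodes the allowed values listed above; the `validate_input` helper is an illustrative addition, and the actor may apply its own, possibly different, validation.

```python
from datetime import date

# Allowed values copied from the parameter list above.
ALLOWED = {
    "subtitlesFormat": ("srt", "text", "timestamp"),
    "sortingOrder": ("", "relevance", "date", "viewCount", "rating"),
    "dateFilter": ("", "hour", "today", "week", "month", "year"),
    "videoTypeFilter": ("", "video", "channel", "playlist", "movie"),
    "lengthFilter": ("", "short", "medium", "long"),
    "sortBy": ("", "date", "viewCount", "likes"),
}
COUNT_FIELDS = ("maxVideos", "maxShorts", "maxStreams")


def validate_input(run_input: dict) -> list:
    """Return a list of human-readable problems; empty means the input looks fine."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in run_input and run_input[key] not in allowed:
            problems.append(f"{key}: {run_input[key]!r} is not one of {allowed}")
    for key in COUNT_FIELDS:
        value = run_input.get(key, 0)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            problems.append(f"{key}: expected a non-negative integer, got {value!r}")
    published_after = run_input.get("publishedAfter", "")
    if published_after:
        try:
            date.fromisoformat(published_after)
        except ValueError:
            problems.append(f"publishedAfter: {published_after!r} is not YYYY-MM-DD")
    return problems


print(validate_input({"subtitlesFormat": "vtt", "maxVideos": -1}))
```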
Example JSON output
{"title": "How to use Crawlee in 10 minutes","translatedTitle": null,"type": "video","id": "dQw4w9WgXcQ","url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ","thumbnailUrl": "https://i.ytimg.com/vi/dQw4w9WgXcQ/hq720.jpg","viewCount": 1500000,"date": "2025-01-15T00:00:00.000Z","likes": 50000,"location": null,"channelName": "Channel Name","channelUrl": "https://www.youtube.com/@channelname","channelUsername": "channelname","collaborators": null,"channelId": "UCxxxxxxxxxxxxxxxxxxxxxxxx","numberOfSubscribers": 1000000,"duration": "00:03:33","commentsCount": 12000,"text": "Video description...","translatedText": null,"descriptionLinks": [{ "url": "https://example.com", "text": "https://example.com" }],"subtitles": ["en"],"transcript": null,"transcriptLanguage": "en","transcriptFormat": "srt","order": 0,"commentsTurnedOff": false,"fromYTUrl": "https://www.youtube.com/results?search_query=crawlee","isMonetized": null,"hashtags": ["#example"],"isCreativeCommons": true,"isPurchased": false}
Note: Some fields may be null when not present on the page or when detection is not possible (e.g., likes, commentsCount, numberOfSubscribers, subtitles, transcript).
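Since several fields can be null, downstream calculations should guard against missing values. The helpers below are a small sketch of that idea: one converts the HH:MM:SS duration to seconds, the other computes a simple engagement ratio. The ratio (likes plus comments, divided by views) is an illustrative metric chosen for this example, not something the actor outputs.

```python
def duration_to_seconds(duration):
    """Convert "HH:MM:SS" to seconds; returns None when duration is null."""
    if not duration:
        return None
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def engagement_rate(item: dict):
    """(likes + comments) / views, or None when views are missing or zero."""
    views = item.get("viewCount")
    if not views:
        return None
    # Treat undetected likes/comments as 0 (a simplifying assumption).
    likes = item.get("likes") or 0
    comments = item.get("commentsCount") or 0
    return (likes + comments) / views


item = {"viewCount": 1500000, "likes": 50000, "commentsCount": 12000, "duration": "00:03:33"}
print(duration_to_seconds(item["duration"]))  # 213
print(engagement_rate(item))
```

Counting undetected likes as zero understates engagement; returning None whenever any input is missing is the stricter alternative.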
FAQ
Do I need a YouTube API key?
No. The actor scrapes public web endpoints and page data directly, so no official YouTube API key is required.
Can this extract transcripts or subtitles?
Yes. Enable “downloadSubtitles,” choose a “subtitlesLanguage,” and select a “subtitlesFormat” (srt, text, or timestamp). You can also “preferAutoGenerated” and optionally “saveSubtitlesToKvs.”
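For NLP work it is often easier to handle a transcript as timed segments than as raw text. The sketch below parses standard SRT into start/end/text records. It assumes the transcript field holds plain SRT text when subtitlesFormat is "srt", which this page does not spell out, so check a real item first.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_seconds(hours, minutes, seconds, millis):
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(srt_text: str) -> list:
    """Split SRT text into a list of {"start", "end", "text"} segments."""
    segments = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for index, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                groups = match.groups()
                segments.append({
                    "start": _to_seconds(*groups[:4]),
                    "end": _to_seconds(*groups[4:]),
                    # Everything after the timing line is cue text.
                    "text": " ".join(part.strip() for part in lines[index + 1:]),
                })
                break
    return segments


sample_srt = """1
00:00:01,000 --> 00:00:04,000
Hello world

2
00:00:04,500 --> 00:00:06,000
Second line
continues
"""
for segment in parse_srt(sample_srt):
    print(segment)
```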
Does it scrape comments?
It extracts commentsCount when available, but it does not scrape individual comment bodies. The output includes totals and core engagement metrics.
Can I scrape YouTube Shorts and live streams?
Yes. Use maxShorts and maxStreams to control how many Shorts and live/upcoming streams are included per search term. You can also filter by isLive in post-processing.
Does it support direct URLs?
Yes — provide direct video URLs in startUrls to fetch full metadata and optional transcripts. The actor also works as a YouTube search results scraper via searchTerms.
How does the actor handle blocking?
It automatically escalates through connection levels: direct → Apify datacenter → Apify residential (with retries) and continues with a working level for the rest of the run.
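The escalation behavior described above can be pictured as a small state machine: try the current level a few times, move up when blocked, and never move back down. The code below is a simplified illustration of that logic only. The `Blocked` exception, the `fetch` callable, and the retry count of 2 are assumptions made for the sketch, and the actor's real implementation may differ.

```python
LEVELS = ["direct", "datacenter", "residential"]


class Blocked(Exception):
    """Raised by the fetch callable when a request appears to be blocked."""


class EscalatingFetcher:
    def __init__(self, fetch, retries_per_level=2):
        self.fetch = fetch  # callable(url, level) -> response body
        self.retries_per_level = retries_per_level
        self.level_index = 0  # sticky: only ever increases during a run

    def get(self, url):
        while self.level_index < len(LEVELS):
            level = LEVELS[self.level_index]
            for _ in range(self.retries_per_level):
                try:
                    return self.fetch(url, level)
                except Blocked:
                    continue  # retry on the same level
            self.level_index += 1  # all retries failed: escalate
        raise Blocked(f"all connection levels exhausted for {url}")


# Demo with a fake fetch that only succeeds through the datacenter level.
calls = []


def fake_fetch(url, level):
    calls.append((url, level))
    if level != "datacenter":
        raise Blocked(level)
    return f"<html for {url}>"


fetcher = EscalatingFetcher(fake_fetch)
fetcher.get("u1")  # two blocked direct attempts, then datacenter succeeds
fetcher.get("u2")  # starts at datacenter straight away
print(calls)
```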
What filters and sorting are available?
You can filter by isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, and isPurchased, and apply dateFilter, videoTypeFilter, lengthFilter. Sorting options include sortingOrder and sortBy.
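To show what post-processing filters and sorting amount to, here is a sketch that applies a publishedAfter cutoff and a sortBy ordering to already-collected items. It illustrates the concept rather than the actor's internal code. Descending order, treating the cutoff as midnight UTC (inclusive), and placing items with missing values last are all assumptions.

```python
from datetime import datetime, timezone


def parse_date(value):
    """Parse the ISO-like date field; returns None when it is missing."""
    if not value:
        return None
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    # Treat timestamps without an offset as UTC so comparisons stay valid.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def sort_value(item, field):
    if field == "date":
        parsed = parse_date(item.get("date"))
        return parsed.timestamp() if parsed else None
    return item.get(field)


def post_process(items, published_after="", sort_by=""):
    """Filter by publish date, then sort descending with nulls last."""
    result = list(items)
    if published_after:
        cutoff = datetime.fromisoformat(published_after).replace(tzinfo=timezone.utc)
        result = [
            item for item in result
            if (parsed := parse_date(item.get("date"))) is not None and parsed >= cutoff
        ]
    if sort_by in ("date", "viewCount", "likes"):
        result.sort(
            key=lambda item: (sort_value(item, sort_by) is not None,
                              sort_value(item, sort_by) or 0),
            reverse=True,
        )
    return result


videos = [
    {"id": "a", "date": "2025-01-15T00:00:00.000Z", "viewCount": 10},
    {"id": "b", "date": "2024-06-01T00:00:00.000Z", "viewCount": 50},
    {"id": "c", "date": None, "viewCount": None},
]
print([v["id"] for v in post_process(videos, sort_by="viewCount")])
```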
What formats can I export?
All results are stored in the Apify dataset, ready for export to JSON, CSV, or Excel. This makes it a reliable YouTube data extractor for analytics and reporting.
Final thoughts
The Youtube Scraper is built to extract structured YouTube video data at scale with accuracy and reliability. With robust anti-blocking, flexible filters, and optional transcript downloads, it serves marketers, developers, data analysts, and researchers who need clean, export-ready results. Use it as a YouTube data scraper for SEO tracking, competitor monitoring, and research pipelines — and plug the dataset into your automation or BI stack to start extracting smarter insights today.