YouTube Scraper

  • bernardo/youtube-scraper
  • Users 5.3k
  • Runs 74.6k
  • Created by Bernard O.

YouTube crawler and video scraper. Alternative YouTube API with no limits or quotas. Extract channel names, likes, view counts, and subscriber counts.

What does YouTube Scraper do?

YouTube Scraper is an unofficial YouTube API that goes beyond the limitations of the official YouTube Data API. It lets you extract information from the video platform without quota or unit restrictions and get unlimited data on:

  • a list of YouTube search results based on a search term
  • individual video information: view count, description, release date, number of likes, comments, duration, URL, and even subtitles (both user and auto-generated)
  • channel details: number of subscribers, description, and videos

How to scrape YouTube data?

You can either scrape data by inputting a search term or a direct URL of a video, search results page, or channel. If both fields are filled out, the scraper prioritizes the URL input. Check out How to scrape YouTube for a step-by-step tutorial.

Why scrape YouTube?

  • Monitor the market: see mentions of your brand, track the position of your content in search results, or get insights into competitor activity
  • Find current trends and opinions shared by content creators and commenting users
  • Filter your search results based on more advanced criteria
  • Identify harmful or illegal comments and videos
  • Scrape subtitles for offline reading or increased accessibility
  • Accumulate information on products and services from video reviews and automate your buying decisions

Scraping YouTube is legal as long as you adhere to regulations concerning copyright and personal data. This scraper deals with cookies and privacy consent dialogs on your behalf, so be aware that the results from your YouTube scrape might contain personal information.

Personal data is protected by GDPR (EU Regulation 2016/679), and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so.

If you're unsure whether your reason is legitimate, please consult your lawyers. You can also read our blog post on the legality of web scraping.

How much does it cost to scrape YouTube?

With Apify's Free Plan, you get $5 of usage credits each month, which translates into around 2,000 scraped items from YouTube (roughly $0.0025 per item). Check out our platform pricing page if you want to scrape data in bulk and need more credits.

Should I use a proxy when scraping YouTube?

Just like with other social media-related actors, using a proxy is essential if you want your scraper to run properly. You can either use your own proxy or stick to the default Apify Proxy servers. Datacenter proxies are recommended for use with this actor.
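As an illustration, here is what a proxyConfiguration fragment with your own proxy servers might look like; the proxy URL below is a placeholder, not a real endpoint:

```json
{
    "proxyConfiguration": {
        "useApifyProxy": false,
        "proxyUrls": ["http://user:password@myproxy.example.com:8000"]
    }
}
```

To use the default Apify Proxy instead, set "useApifyProxy": true and omit "proxyUrls", as shown in the input examples further down.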

Input parameters

You can either use the user-friendly UI in Apify Console to set up your actor or provide the input directly as JSON. YouTube Scraper recognizes these fields:

searchKeywords - a YouTube search query can be used instead of a URL

startUrls - insert a specific YouTube link to scrape search result pages, channels, or videos

maxResults - set how many videos should be scraped from each search query or channel

maxComments - limit the number of comments that you want to scrape

downloadSubtitles - scrape user-created or auto-generated captions and convert them to .srt format

  • subtitlesLanguage - only download subtitles in selected language

  • preferAutoGeneratedSubtitles - prefer autogenerated speech-to-text subtitles to user-created ones

  • saveSubsToKVS - save the scraped subtitles to the Apify key-value store

proxyConfiguration (required) - configure proxy settings

verboseLog - turn on verbose logging for accurate monitoring and for more comprehensive information about the runs
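Putting the subtitle-related fields above together, an input fragment that downloads English auto-generated captions might look like this (the "en" language code is an example value):

```json
{
    "downloadSubtitles": true,
    "subtitlesLanguage": "en",
    "preferAutoGeneratedSubtitles": true,
    "saveSubsToKVS": true
}
```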

Go to the Input Schema tab to see more technical details on the scraper's input.

Here are examples of the schema in JSON for the various types of input:

Scraping by URL

Input a link to a video, search results page or a YouTube channel:

{
    "downloadSubtitles": false,
    "preferAutoGeneratedSubtitles": false,
    "proxyConfiguration": {
        "useApifyProxy": true
    },
    "saveSubsToKVS": false,
    "simplifiedInformation": false,
    "startUrls": [
        {
            "url": "https://www.youtube.com/watch?v=oxy8udgWRmo"
        }
    ],
    "verboseLog": false
}

Scraping by search term

Insert keywords that you would normally use in the YouTube search bar:

{
    "downloadSubtitles": false,
    "maxResults": 10,
    "preferAutoGeneratedSubtitles": false,
    "proxyConfiguration": {
        "useApifyProxy": true
    },
    "saveSubsToKVS": false,
    "searchKeywords": "terminator dark fate trailer",
    "simplifiedInformation": false,
    "verboseLog": false
}

Output

After the scrape has completed, you can download your data in a number of formats, including JSON, CSV, XML, RSS, and HTML table. Here's an output example in JSON:

{
  "title": "Terminator: Dark Fate - Official Trailer (2019) - Paramount Pictures",
  "id": "oxy8udgWRmo",
  "url": "https://www.youtube.com/watch?v=oxy8udgWRmo",
  "viewCount": 19826925,
  "date": "2019-08-29T00:00:00+00:00",
  "likes": 144263,
  "dislikes": null,
  "location": "DOUBLE DOSE CAFÉ",
  "channelName": "Paramount Pictures",
  "channelUrl": "https://www.youtube.com/c/paramountpictures",
  "numberOfSubscribers": 2680000,
  "duration": "2:34",
  "commentsCount": 25236,
  "details": "<span dir=\"auto\" class=\"style-sco..."
}
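As a sketch of working with this output downstream, the snippet below computes a simple like-to-view ratio for each scraped video and sorts by it. The field names (title, viewCount, likes) match the output example above; the sample data is illustrative:

```javascript
// Compute a like-to-view engagement ratio for each scraped video.
// Skips items with no views or with likes hidden (likes === null).
function engagementRatio(items) {
    return items
        .filter((item) => item.viewCount > 0 && item.likes !== null)
        .map((item) => ({
            title: item.title,
            ratio: item.likes / item.viewCount,
        }))
        .sort((a, b) => b.ratio - a.ratio);
}

// Sample items shaped like the scraper's output above
const videos = [
    { title: 'Terminator: Dark Fate - Official Trailer', viewCount: 19826925, likes: 144263 },
    { title: 'Unlisted clip with hidden likes', viewCount: 1000, likes: null },
];

console.log(engagementRatio(videos));
// → one entry for the trailer, with ratio ≈ 0.0073
```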

Notes on customizing the actor

Extend output function

The extend output function allows you to omit items, add extra properties to the output (using the page variable), or change the shape of your output altogether:

async ({ item }) => {
    // remove information from the item
    item.details = undefined;
    // or delete item.details;
    return item;
}
async ({ item, page }) => {
    // add more info, in this case, the shortLink for the video
    const shortLink = await page.evaluate(() => {
        const link = document.querySelector('link[rel="shortlinkUrl"]');
        if (link) {
            return link.href;
        }
    });
    return {
        ...item,
        shortLink,
    }
}
async ({ item }) => {
    // omit item, just return null
    return null;
}

Extend scraper function

Extend scraper function allows you to add functionality to the existing baseline behavior. For example, you may enqueue related videos, but not recursively:

async ({ page, request, requestQueue, customData, Apify }) => {
    if (request.userData.label === 'DETAIL' && !request.userData.isRelated) {
        await page.waitForSelector('ytd-watch-next-secondary-results-renderer');
        const related = await page.evaluate(() => {
            return [...document.querySelectorAll('ytd-watch-next-secondary-results-renderer a[href*="watch?v="]')].map(a => a.href);
        });
        for (const url of related) {
            await requestQueue.addRequest({
                url,
                userData: {
                    label: 'DETAIL',
                    isRelated: true,
                },
            });
        }
    }
}

NB: If this function throws an exception, the actor will retry the URL it was visiting.

Other video and social media scrapers

We have other video-related scrapers in stock for you; to see more of those, check out the Video Category in Apify Store or the compilation of Social Media Scrapers.

Integrations and YouTube Scraper

Last but not least, YouTube Scraper can be connected with almost any cloud service or web app thanks to integrations on the Apify platform. You can integrate with Make, Zapier, Slack, Airbyte, GitHub, Google Sheets, Google Drive, and more. Or you can use webhooks to carry out an action whenever an event occurs, e.g. get a notification whenever YouTube Scraper successfully finishes a run.

Using YouTube Scraper with the Apify API

The Apify API gives you programmatic access to the Apify platform. The API is organized around RESTful HTTP endpoints that enable you to manage, schedule, and run Apify actors. The API also lets you access any datasets, monitor actor performance, fetch results, create and update versions, and more.

To access the API using Node.js, use the apify-client NPM package. To access the API using Python, use the apify-client PyPI package. Check out the Apify API reference docs for full details or click on the API tab for code examples.
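A minimal sketch of calling the actor with the apify-client NPM package is below. The buildInput helper is a hypothetical name introduced here for illustration; the input fields come from the Input parameters section above, and you need a real API token to actually run it:

```javascript
// Assemble a minimal actor input using fields documented above
function buildInput(searchKeywords, maxResults = 10) {
    return {
        searchKeywords,
        maxResults,
        proxyConfiguration: { useApifyProxy: true },
    };
}

async function runScraper(token, searchKeywords) {
    // require inside the function so the sketch stays loadable
    // even when apify-client is not installed
    const { ApifyClient } = require('apify-client');
    const client = new ApifyClient({ token });

    // Start the actor run and wait for it to finish
    const run = await client.actor('bernardo/youtube-scraper').call(
        buildInput(searchKeywords),
    );

    // Fetch the scraped items from the run's default dataset
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    return items;
}

// Example (requires a real token, e.g. from process.env.APIFY_TOKEN):
// runScraper(process.env.APIFY_TOKEN, 'web scraping tutorial')
//     .then((items) => console.log(items.length, 'items scraped'));
```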

Your feedback

We're always working on improving the performance of our actors. So if you've got any technical feedback on YouTube Scraper, or simply found a bug, please create an issue on the actor's Issues tab in Apify Console.

Industries

See how YouTube Scraper is used in industries around the world