Facebook Video Transcript Extractor

Pricing

$5.00/month + usage

Extract transcripts from Facebook video

Rating

0.0 (0)

Developer

ius iyb

Maintained by Community

Actor stats

  • Bookmarked: 5
  • Total users: 198
  • Monthly active users: 3
  • Last modified: 10 months ago

This Actor extracts text transcripts from Facebook video pages. Given the URL of a video posted on Facebook, it returns the transcript embedded in that page, when one is available.

Features

  • Extracts transcript data from Facebook video pages
  • Sends browser-like request headers to mimic a real browser
  • Provides detailed error reporting
  • Works with Apify proxy to avoid IP blocks and rate limiting
  • Simple configuration through INPUT_SCHEMA

Usage

Input Configuration

The Actor accepts the following input parameters:

Field               Type    Description
url                 String  Required. URL of the Facebook video page from which to extract the transcript
proxyConfiguration  Object  Optional. Proxy settings to route requests through

Example input:

{
  "url": "https://web.facebook.com/briantylercohen/videos/1350752639547526",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
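Before starting a run, it can help to check an input object against the two fields described above. The following is a minimal sketch, not the Actor's own validation: `validateInput` is a hypothetical helper, and the facebook.com hostname check is an assumption made only for illustration.

```javascript
// Hypothetical helper: checks an input object against the documented fields.
// Returns a list of problems; an empty list means the input looks usable.
function validateInput(input) {
  if (input === null || typeof input !== "object") {
    return ["input must be an object"];
  }
  const errors = [];

  // "url" is the only required field.
  if (typeof input.url !== "string" || input.url.trim() === "") {
    errors.push("url is required and must be a non-empty string");
  } else {
    try {
      const { protocol, hostname } = new URL(input.url);
      if (protocol !== "https:" && protocol !== "http:") {
        errors.push("url must use http or https");
      }
      // Assumption: video pages live on facebook.com or a subdomain
      // such as web.facebook.com. The Actor may accept other hosts.
      if (hostname !== "facebook.com" && !hostname.endsWith(".facebook.com")) {
        errors.push("url must point to facebook.com");
      }
    } catch {
      errors.push("url is not a valid URL");
    }
  }

  // "proxyConfiguration" is optional, but must be an object when present.
  const proxy = input.proxyConfiguration;
  if (proxy !== undefined && (proxy === null || typeof proxy !== "object" || Array.isArray(proxy))) {
    errors.push("proxyConfiguration must be an object when provided");
  }
  return errors;
}
```

Calling `validateInput` with the example input above returns an empty array, while an input without `url` returns one error message.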

Running the Actor

  1. Apify Platform: The easiest way to run the Actor is through the Apify platform. Just search for "Facebook Video Transcript Extractor" in the Apify Store.

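  2. Command Line (via Apify CLI): Run the Actor locally from its project directory. The -p (--purge) flag clears the local storage left over from previous runs.

    apify run -p

  3. API: You can also run the Actor programmatically via the Apify API.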
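As a rough illustration of the API option, the sketch below calls the Apify API's run-sync-get-dataset-items endpoint, which starts a run and returns its dataset items in one request. The Actor ID and token are placeholders that you must replace with your own values, `buildRunRequest` and `runActor` are hypothetical helper names, and the global `fetch` assumes Node.js 18 or newer.

```javascript
const APIFY_API_BASE = "https://api.apify.com/v2";

// Builds the HTTP request without sending it, so it can be inspected offline.
// actorId uses the "username~actor-name" form; the value is a placeholder here.
function buildRunRequest(actorId, token, input) {
  const url = `${APIFY_API_BASE}/acts/${encodeURIComponent(actorId)}/run-sync-get-dataset-items`;
  return {
    url,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      // The request body is the Actor input described under "Input Configuration".
      body: JSON.stringify(input),
    },
  };
}

// Sends the request and resolves with the dataset items of the finished run.
async function runActor(actorId, token, input) {
  const { url, options } = buildRunRequest(actorId, token, input);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Apify API returned HTTP ${response.status}`);
  }
  return response.json();
}

// Example call (not executed here; replace the placeholders first):
// runActor("your-username~your-actor-name", process.env.APIFY_TOKEN, {
//   url: "https://web.facebook.com/briantylercohen/videos/1350752639547526",
//   proxyConfiguration: { useApifyProxy: true },
// }).then((items) => console.log(items));
```

Keeping request construction separate from sending makes it possible to log or test the request without spending platform usage.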

Output

The Actor saves extracted transcripts to the default dataset. Each item in the dataset has the following structure:

{
  "url": "https://web.facebook.com/briantylercohen/videos/1350752639547526",
  "transcript": "This is the extracted transcript text..."
}

In case of errors or if no transcript is found, the output will look like:

{
  "url": "https://web.facebook.com/briantylercohen/videos/1350752639547526",
  "transcript": null,
  "error": "Error message or 'No transcript found in the page'"
}
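Because a dataset can mix both shapes, a consumer usually needs to separate successful items from failed ones. The following is a small sketch of one way to do that; `partitionResults` is a hypothetical helper, and the "Unknown error" fallback is an assumption for items that carry no `error` field.

```javascript
// Hypothetical helper: splits dataset items into the two documented shapes.
function partitionResults(items) {
  const transcripts = [];
  const failures = [];
  for (const item of items) {
    if (typeof item.transcript === "string" && item.transcript.length > 0) {
      // Success shape: { url, transcript }
      transcripts.push({ url: item.url, transcript: item.transcript });
    } else {
      // Failure shape: { url, transcript: null, error }
      failures.push({ url: item.url, error: item.error ?? "Unknown error" });
    }
  }
  return { transcripts, failures };
}
```

The `failures` list can then be logged or retried, while `transcripts` is passed on for further processing.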

Limitations

  • This Actor relies on the current structure of Facebook's video pages. If Facebook changes its page structure or how transcripts are embedded, the Actor may need to be updated.
  • Facebook may rate-limit or block requests that appear automated. Using the Apify proxy helps mitigate this issue.
  • Not all Facebook videos have transcripts available.

Technical Details

The Actor performs the following steps:

  1. Takes the input URL and configures the HTTP request with browser-like headers
  2. Fetches the HTML content of the Facebook video page
  3. Parses the page to locate script tags containing transcript data
  4. Extracts the transcript using a regex pattern
  5. Saves the results to the Apify dataset
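The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Actor's source code: the header values and the `transcript_text` key in the regex are invented for the example, script tags are located with a regex rather than jsdom to keep the sketch dependency-free, the built-in `fetch` (Node.js 18 or newer) stands in for axios, and proxy handling and saving to the dataset are left out.

```javascript
// Assumption: illustrative browser-like headers; the Actor's real set is not documented.
const BROWSER_HEADERS = {
  "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
  Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
  "Accept-Language": "en-US,en;q=0.9",
};

// Assumption: hypothetical pattern. The key name "transcript_text" is invented;
// the real key depends on how Facebook currently embeds transcript data.
const TRANSCRIPT_PATTERN = /"transcript_text"\s*:\s*"((?:[^"\\]|\\.)*)"/;

// Steps 3 and 4: scan script tags and pull the transcript out with a regex.
function extractTranscript(html) {
  const scriptPattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  for (const [, body] of html.matchAll(scriptPattern)) {
    const match = TRANSCRIPT_PATTERN.exec(body);
    if (match) {
      try {
        // Decode JSON string escapes (\n, \", \uXXXX) in the captured text.
        return JSON.parse(`"${match[1]}"`);
      } catch {
        return match[1]; // Fall back to the raw capture if decoding fails.
      }
    }
  }
  return null;
}

// Steps 1 and 2, plus building a record in the documented output shape.
async function fetchTranscript(url) {
  try {
    const response = await fetch(url, { headers: BROWSER_HEADERS });
    if (!response.ok) {
      return { url, transcript: null, error: `HTTP ${response.status}` };
    }
    const transcript = extractTranscript(await response.text());
    return transcript === null
      ? { url, transcript: null, error: "No transcript found in the page" }
      : { url, transcript };
  } catch (err) {
    return { url, transcript: null, error: err.message };
  }
}
```

Keeping `extractTranscript` as a pure function of the HTML string means the pattern can be adjusted and tested against saved pages whenever Facebook changes its markup.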

Dependencies

  • axios: For making HTTP requests
  • jsdom: For parsing and traversing the HTML
  • apify: The Apify SDK for integrating with the Apify platform

License

This project is licensed under the Apache License 2.0.