AI Video Data Extractor: Youtube, Instagram, TikTok, FB, X, etc

Turn any video into structured JSON with AI. Define your custom schema (strings, numbers, arrays, objects, enums, booleans), provide video URLs, and get perfectly formatted data back. Works on YouTube, TikTok, X, Instagram, Facebook. 99+ languages. Smart retry on rate limits. No parsing code needed.

Pricing

from $20.00 / 1,000 results

Rating

5.0

(1)

Developer

InVideoIQ

Last modified

4 months ago

🚀 AI Video Data Extractor: Turn Any Video Into Structured JSON

Define a JSON schema. Paste video URLs from YouTube, TikTok, X, Instagram, Facebook, Vimeo, Loom, and more.

Get back clean, structured JSON that exactly matches your schema.

No transcript parsing. No brittle prompts. No inconsistent outputs.

Just structured data ready for APIs, databases, CRMs, spreadsheets, or AI pipelines.

💰 $0.02 per video • 🌍 99+ languages • ⚡ Fast extraction

🎯 What Can You Extract From Videos?

Most video AI tools give you summaries or transcripts.

Those are useful for reading, but hard to automate.

This actor goes one step further:

➡️ Extract structured data directly from videos

Examples of what you can extract:

companies mentioned
products and brands
people speaking
pricing or offers
claims or positioning
topics and key insights
quotes or hooks
FAQs and objections
CTAs or promotions

Instead of reading the video yourself, you get a structured dataset.

💡 Example: How To Extract Structured Data From a Video

Input

Video URL:

https://youtube.com/watch?v=veRbckoCwkc

Schema:

{
  "main_topic": {
    "type": "String",
    "description": "Main theme of the video"
  },
  "people_mentioned": {
    "type": "Array",
    "description": "People mentioned in the video",
    "items": {
      "type": "String",
      "description": "Name of a person mentioned"
    }
  },
  "tools_mentioned": {
    "type": "Array",
    "description": "Tools or products mentioned",
    "items": {
      "type": "String",
      "description": "Name of the tool"
    }
  }
}

Output

{
  "video_url": "...",
  "main_topic": "Optimizing productivity using AI tools",
  "people_mentioned": ["Sam Altman"],
  "tools_mentioned": ["ChatGPT", "Notion AI"]
}

Now your pipeline can store or process the data automatically.

✨ Why Use AI Video Data Extractor?

Most video tools produce:

transcripts
summaries
chat interfaces

Those are great for humans but difficult for automation.

This actor is designed for data pipelines and automation workflows.

Custom schema extraction

You define the exact JSON structure you want returned.

Multi-platform support

Works with video URLs from:

YouTube
TikTok
Instagram
Facebook
X (Twitter)
Vimeo
Loom
Dailymotion
Rumble

Built for automation

Outputs clean JSON ready for:

APIs
databases
spreadsheets
CRMs
LLM pipelines
data analysis

Fast extraction

If a transcript already exists, the actor goes directly to structured extraction.

Cost-efficient

Most standard runs cost $0.02 per result, even when analyzing thousands of videos.

Works even when transcripts don't exist

For videos without subtitles, the actor can automatically generate a transcript first using speech-to-text models, then run extraction.

🔄 Extract Data From Videos Without Subtitles

Some videos (especially Instagram) do not provide subtitles.

When transcribe_if_transcript_missing is enabled, the actor automatically:

Attempts extraction using available transcripts
If none exist, generates a transcript via speech-to-text models
Caches the transcript on the backend and reruns the extraction using the generated transcript

You don't need to handle transcription yourself.

This significantly improves the success rate for social video sources.

Important notes:

Instagram videos go directly to transcription first
Transcription fallback only applies to non-YouTube platforms
Transcription adds a cost, but only the first time. Once the transcript is cached, you can run as many schemas on it as you like without additional transcription costs
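
The rules above can be modeled as a small decision function. This is only a sketch of the documented behavior, not the actor's actual implementation: the function name `plan_transcript_source`, its arguments, and the returned labels are all hypothetical, and it assumes Instagram videos cannot be processed at all when the fallback is disabled.

```python
def plan_transcript_source(platform: str, has_subtitles: bool,
                           fallback_enabled: bool, cached: bool = False) -> str:
    """Return which transcript source the documented rules imply.

    Hypothetical helper: names and labels are illustrative only.
    """
    platform = platform.lower()
    if cached:
        # A previously generated transcript is reused without a new transcription charge.
        return "cached_transcript"
    if platform == "instagram":
        # Instagram videos go directly to transcription.
        return "speech_to_text" if fallback_enabled else "unavailable"
    if has_subtitles:
        return "existing_transcript"
    if platform == "youtube":
        # The transcription fallback does not apply to YouTube.
        return "unavailable"
    return "speech_to_text" if fallback_enabled else "unavailable"


for args in [("instagram", False, True), ("youtube", False, True), ("tiktok", True, True)]:
    print(args, "->", plan_transcript_source(*args))
```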

👥 Who Uses AI Video Data Extractor?

This actor is designed for teams that process video content at scale.

Lead generation teams

Extract:

company names
founders
pricing offers
CTAs
emails or phone numbers
pain points

Market research teams

Turn competitor videos into structured datasets:

topics
positioning
pricing
features
objections

Content teams

Extract:

hooks
key quotes
product mentions
content angles
topics

AI builders

Feed structured outputs into:

AI agents
RAG pipelines
enrichment workflows
scoring systems

Brand and e-commerce teams

Analyze social videos for:

product mentions
promotions
sentiment
creator messaging

📈 Use Cases for Video Data Extraction

Extract products, brands, prices, discounts, and claims from TikTok or Instagram videos
Convert YouTube interviews into speaker insights, key takeaways, objections, and company mentions
Turn webinars into FAQs, action items, roadmap items, and feature requests
Monitor creators for brand mentions or sponsorships to support trend spotting, competitor monitoring, and content research
Extract structured fields for CRM enrichment, sales intelligence, or knowledge base ingestion
Build datasets from video content for classification, benchmarking, or LLM evaluation

🛠️ How To Extract Data From Videos Using This Actor

Step 1: Open the actor and enter your video URLs

Paste one or more public video URLs into the video_urls field. You can mix platforms freely in a single run.

Step 2: Define your JSON schema

In the schema field, describe exactly which fields you want extracted. Each field needs a type and a description.

Step 3: (Optional) Add extraction instructions

Use what_to_extract to guide the AI with natural language, for example: "Focus on the products discussed and the speaker's opinion on each."

Step 4: Run and get structured JSON

The actor retrieves or generates the transcript, runs AI extraction, and returns clean JSON matching your schema: one dataset item per video. You can download the extracted dataset in JSON, CSV, Excel, or HTML format directly from the Apify dashboard.

Run it your way

Because this is an Apify Actor, you also get:

API access: Call it programmatically from any language or platform; the API tab has ready-made code examples
Scheduling: Set up recurring runs to monitor video content automatically
Integrations: Connect to Zapier, Make, Google Sheets, webhooks, and more
Monitoring: Track run history, costs, and results from the Apify dashboard
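
As a rough illustration of the four steps, the input can be assembled in Python from the field names used in this README (`video_urls`, `schema`, `what_to_extract`, `transcribe_if_transcript_missing`). The helper names `build_run_input` and `run_actor` are hypothetical, the token and actor ID are placeholders, and passing the schema as a nested object rather than a JSON string is an assumption; the Input and API tabs show the exact shape. `run_actor` relies on the `apify-client` package and is not executed here.

```python
import json


def build_run_input(video_urls, schema, what_to_extract=None,
                    transcribe_if_transcript_missing=False):
    """Assemble the actor input from the fields named in this README."""
    if not video_urls:
        raise ValueError("video_urls must contain at least one URL")
    run_input = {
        "video_urls": list(video_urls),
        "schema": schema,  # assumption: passed as an object, not a JSON string
        "transcribe_if_transcript_missing": transcribe_if_transcript_missing,
    }
    if what_to_extract:
        run_input["what_to_extract"] = what_to_extract
    return run_input


def run_actor(token, actor_id, run_input):
    """Start a run and return its dataset items (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # imported lazily so the sketch loads without it

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


schema = {
    "main_topic": {"type": "String", "description": "Main theme of the video"},
}
run_input = build_run_input(
    ["https://youtube.com/watch?v=veRbckoCwkc"],
    schema,
    what_to_extract="Focus on the products discussed and the speaker's opinion on each.",
)
print(json.dumps(run_input, indent=2))
# items = run_actor("<YOUR_APIFY_TOKEN>", "<ACTOR_ID>", run_input)  # placeholders, not real values
```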

🔗 Supported Video Platforms

Platform	Notes
YouTube	Full support, uses available subtitles
TikTok	Full support
Instagram	Requires `transcribe_if_transcript_missing` enabled (most Instagram videos lack subtitle tracks)
Facebook	Public videos
X (Twitter)	Paste the tweet URL containing the video
Loom	Full support
Dailymotion	Full support
Vimeo	Full support
Rumble	Full support

📋 Input and Output Example

{
  "main_topic": {
    "type": "String",
    "description": "The overarching theme of the discussion"
  },
  "summary": {
    "type": "String",
    "description": "A 3-4 sentence summary covering the key points, recommendations, and takeaways from the video"
  },
  "foundational_habits": {
    "type": "Array",
    "description": "Basic habits required before adding supplements such as sleep or nutrition",
    "items": {
      "type": "String",
      "description": "Name of a foundational habit"
    }
  },
  "supplements_mentioned": {
    "type": "Array",
    "description": "List of all supplements discussed",
    "items": {
      "type": "Object",
      "description": "Information about a specific supplement",
      "properties": {
        "name": {
          "type": "String",
          "description": "Name of the supplement"
        },
        "category": {
          "type": "Enum",
          "values": ["Fatty Acid", "Amino Acid/Protein", "Adaptogen", "Vitamin/Mineral", "Other"],
          "description": "Categorization of the supplement"
        },
        "recommended_dosage_mg": {
          "type": "Number",
          "description": "Recommended daily dosage in milligrams if mentioned. Use 0 if not mentioned."
        },
        "is_weight_dependent": {
          "type": "Boolean",
          "description": "Whether the dosage needs to be adjusted based on body weight"
        }
      }
    }
  }
}

Example Dataset Output

{
  "video_url": "https://www.youtube.com/watch?v=veRbckoCwkc",
  "main_topic": "Optimal Supplementation for Health and Performance",
  "summary": "The video explores how to optimize health and performance through targeted supplementation. It emphasizes that foundational habits like sleep, nutrition, and exercise must be in place before adding supplements. Key supplements discussed include Omega-3 fatty acids for general health and Creatine for performance, with specific dosage guidance provided.",
  "foundational_habits": [
    "Getting adequate sleep",
    "Proper nutrition and hydration",
    "Regular exercise routine"
  ],
  "supplements_mentioned": [
    {
      "name": "Omega-3 Fatty Acids",
      "category": "Fatty Acid",
      "recommended_dosage_mg": 1000,
      "is_weight_dependent": false
    },
    {
      "name": "Creatine Monohydrate",
      "category": "Amino Acid/Protein",
      "recommended_dosage_mg": 5000,
      "is_weight_dependent": true
    }
  ]
}
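
Nested output like this often needs flattening before it goes into a spreadsheet or a relational table. The following is a minimal sketch, assuming a dataset item shaped like the example above (abbreviated here); the helper `supplement_rows` is hypothetical, and mapping a dosage of 0 to an empty cell simply follows the "Use 0 if not mentioned" convention in the schema description.

```python
import csv
import io

item = {
    "video_url": "https://www.youtube.com/watch?v=veRbckoCwkc",
    "main_topic": "Optimal Supplementation for Health and Performance",
    "supplements_mentioned": [
        {"name": "Omega-3 Fatty Acids", "category": "Fatty Acid",
         "recommended_dosage_mg": 1000, "is_weight_dependent": False},
        {"name": "Creatine Monohydrate", "category": "Amino Acid/Protein",
         "recommended_dosage_mg": 5000, "is_weight_dependent": True},
    ],
}


def supplement_rows(dataset_item):
    """Yield one flat dict per supplement, repeating the video-level fields."""
    for supplement in dataset_item.get("supplements_mentioned", []):
        dosage = supplement.get("recommended_dosage_mg")
        yield {
            "video_url": dataset_item["video_url"],
            "main_topic": dataset_item["main_topic"],
            "name": supplement["name"],
            "category": supplement["category"],
            # 0 means "not mentioned" in this schema, so write an empty cell instead.
            "recommended_dosage_mg": dosage if dosage else "",
            "is_weight_dependent": supplement["is_weight_dependent"],
        }


rows = list(supplement_rows(item))
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```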

💳 How Much Does AI Video Data Extraction Cost?

Pricing is designed to stay affordable at scale. On the Apify free plan, you get $5 of platform usage credits per month, enough to run hundreds of extractions and test the actor before committing to a paid plan.

Standard extraction

$0.02 per result

In the normal case:

1 video = 1 result

Transcription fallback

If transcript generation is required:

+$0.035 per transcription

Long transcript scaling

Every 15,000 tokens counts as 1 billed result unit.

Approximate reference:

15,000 tokens ≈ 1 hour 15 minutes of speech

Examples

Normal extraction with total_tokens = 8,000: $0.02
Normal extraction with total_tokens = 12,000: $0.02
Successful extraction with total_tokens = 32,000: $0.06
Video "X" needs transcription fallback and extraction succeeds with total_tokens = 12,000: $0.055 the first time
A secondary extraction on the same video "X" succeeds with total_tokens = 12,000: $0.02
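
The examples above are consistent with a simple rule: round the token count up to whole 15,000-token units, charge $0.02 per unit, and add $0.035 once if a transcript has to be generated. The sketch below encodes that reading. The rounding rule and the one-unit minimum are inferred from the examples rather than stated by the author, and `estimate_cost` is a hypothetical helper, not part of the actor.

```python
from decimal import Decimal
from math import ceil

PRICE_PER_RESULT = Decimal("0.02")          # standard extraction, per billed result unit
PRICE_PER_TRANSCRIPTION = Decimal("0.035")  # charged once, when a transcript is generated
TOKENS_PER_UNIT = 15_000


def estimate_cost(total_tokens: int, needs_transcription: bool = False) -> Decimal:
    """Estimate the charge for one video under the assumed rounding rule."""
    # Assumption: partial units round up, and every result bills at least one unit.
    units = max(1, ceil(total_tokens / TOKENS_PER_UNIT))
    cost = units * PRICE_PER_RESULT
    if needs_transcription:
        cost += PRICE_PER_TRANSCRIPTION
    return cost


for tokens, transcribed in [(8_000, False), (12_000, False), (32_000, False), (12_000, True)]:
    print(tokens, transcribed, estimate_cost(tokens, transcribed))
```

Decimal arithmetic is used so that amounts such as $0.055 compare exactly instead of picking up floating-point error.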

⚠️ Schema Rules

To keep extraction reliable:

Every field must include a description
Max 10 root fields
Max 3 nesting levels
Level 3 must contain only primitive values
Max 10 subfields per object

Supported types:

String
Number
Boolean
Integer
Array
Object
Enum

Tip: the best schemas are specific. Instead of asking for a vague "summary", define the business fields you actually want, such as products, pain_points, pricing, claims, cta, audience, or sentiment.
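
A schema can be checked against these rules locally before starting a run. The validator below is a sketch under stated assumptions: root fields count as level 1, array `items` and object `properties` each descend one level, and `Enum` is treated as a primitive. The actor's own validation may count levels differently, and `validate_schema` is a hypothetical helper.

```python
PRIMITIVES = {"String", "Number", "Boolean", "Integer", "Enum"}
SUPPORTED = PRIMITIVES | {"Array", "Object"}
MAX_FIELDS = 10  # applies to root fields and to the subfields of any object
MAX_DEPTH = 3


def validate_schema(schema: dict) -> list[str]:
    """Return a list of rule violations (empty when the schema passes)."""
    errors = []
    if len(schema) > MAX_FIELDS:
        errors.append(f"more than {MAX_FIELDS} root fields")
    for name, spec in schema.items():
        _check_field(name, spec, 1, errors)
    return errors


def _check_field(path, spec, level, errors):
    ftype = spec.get("type")
    if ftype not in SUPPORTED:
        errors.append(f"{path}: unsupported type {ftype!r}")
        return
    if not spec.get("description"):
        errors.append(f"{path}: missing description")
    if level == MAX_DEPTH and ftype not in PRIMITIVES:
        errors.append(f"{path}: level {MAX_DEPTH} allows only primitive types")
        return
    if ftype == "Array" and "items" in spec:
        _check_field(f"{path}[]", spec["items"], level + 1, errors)
    elif ftype == "Object":
        props = spec.get("properties", {})
        if len(props) > MAX_FIELDS:
            errors.append(f"{path}: more than {MAX_FIELDS} subfields")
        for sub_name, sub_spec in props.items():
            _check_field(f"{path}.{sub_name}", sub_spec, level + 1, errors)


good = {"main_topic": {"type": "String", "description": "Main theme of the video"}}
bad = {"tags": {"type": "Array", "items": {"type": "String", "description": "A tag"}}}
print(validate_schema(good))
print(validate_schema(bad))
```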

⚠️ Limitations

Transcript length limit: Very long videos (over 3 hours) may fail if the transcript exceeds the processing token limit.
Transcript availability: If a video has no available transcript, enable transcribe_if_transcript_missing to automatically generate one via speech-to-text. Currently, the transcription fallback does not support YouTube videos.
Fallback adds cost: Enabling transcript generation improves coverage but incurs an additional speech-to-text charge (only on the first run — transcripts are cached for subsequent extractions).

❓ FAQ

Is video data extraction legal?

This actor processes publicly available video content and does not extract private user data such as email addresses, gender, or location — only information that users have chosen to share publicly. However, results may contain personal data. Personal data is protected by the GDPR in the European Union and by other regulations around the world. You should not extract personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers.

What happens if a video has no transcript?

If transcribe_if_transcript_missing is enabled, the actor automatically generates a transcript using speech-to-text and then runs extraction. This works for most social video platforms. YouTube videos currently use available subtitles only.

Can I run different schemas on the same video?

Yes. Run the same video through multiple schemas — one for lead gen, another for content research, another for brand sentiment — and merge the results in your pipeline. Transcription is cached after the first run, so subsequent extractions only incur the base extraction cost.

What languages are supported?

The actor supports 99+ languages. Results are returned in the same language used in your schema descriptions and field definitions.

How do I integrate the results into my workflow?

You can download results as JSON, CSV, Excel, or HTML from the Apify dashboard. You can also access results programmatically via the API tab, connect to Zapier, Make, Google Sheets, or use webhooks for real-time data transfer.

🔗 Related Actors and Integration Ideas

Use this actor when you need structured JSON answers from video content.

Use one of the related actors below when your need is different or combine them for more powerful workflows:

Video Transcript Extractor: pay per result, $10 / 1000 results. Best for transcript retrieval plus rich metadata.
Video Transcript Scraper: rental model, $20 / month + usage. Best if you prefer the rental pricing model for transcript and metadata retrieval.
Video Transcriber: best when you need speech-to-text for videos that do not already have transcripts or subtitles.

Workflow ideas

Transcript + Extraction: Use Video Transcript Extractor to get raw transcripts for archiving, then run this actor on the same URLs with a custom schema for structured insights. You get two complementary outputs from a single video library.
Social monitoring pipeline: Schedule this actor to run daily on new creator or competitor video URLs. Feed the structured JSON into Google Sheets, a database, or a webhook for automated alerting.
Multi-schema analysis: Run the same video through multiple schemas (one for lead gen fields, another for content research, another for brand sentiment) and merge the results in your pipeline.
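
For the multi-schema idea, merging can be as simple as joining dataset items on `video_url`, the key that appears in the example outputs. The sketch below is one possible approach, not a feature of the actor: `merge_by_video` is a hypothetical helper, the sample items are made up, and prefixing field names per schema is just one way to avoid collisions.

```python
from collections import defaultdict


def merge_by_video(*result_sets):
    """Merge dataset items from several runs into one record per video_url.

    Each positional argument is a (prefix, items) pair; the prefix namespaces
    the extracted fields so schemas with overlapping field names do not collide.
    """
    merged = defaultdict(dict)
    for prefix, items in result_sets:
        for item in items:
            record = merged[item["video_url"]]
            record["video_url"] = item["video_url"]
            for key, value in item.items():
                if key != "video_url":
                    record[f"{prefix}.{key}"] = value
    return list(merged.values())


# Made-up items standing in for two runs on the same video.
lead_gen = [{"video_url": "https://example.com/v/1", "companies": ["Acme"]}]
sentiment = [{"video_url": "https://example.com/v/1", "sentiment": "positive"}]
merged_records = merge_by_video(("lead_gen", lead_gen), ("sentiment", sentiment))
print(merged_records)
```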

💬 Support

Feature requests and improvements are welcome.

Open an issue in the Issues tab if you need:

new schema capabilities
platform improvements
bug fixes
performance enhancements

Need a custom workflow or integration? Reach out through the Issues tab; we're happy to help tailor the actor to your use case.

