YouTube Video Reverse Engineer AI Script, Frame & Hook Analysis
Under maintenance
Pricing
from $0.30 / video minute (full)
Turn any YouTube video into a blueprint. AI extracts hook formulas, script structure, retention techniques, style DNA, and audience engagement from top comments. Outputs are ready-to-use prompts: feed them to ChatGPT for scripts or Midjourney for visuals. Premium adds frame-by-frame visual analysis.
Developer
Yuliia Kulakova
Maintained by Community
YouTube Video Reverse Engineer — AI Script & Hook Analysis

Reverse-engineer any YouTube video with AI. Extract the complete content blueprint: script structure, hook formulas, retention techniques, style DNA, pacing analysis, thumbnail psychology, visual production patterns, and ready-to-use recreation prompts. Decode why videos go viral and replicate the exact formula for your content.
What does this YouTube video analyzer do?
This AI-powered YouTube video analysis tool deconstructs videos into their structural DNA. Unlike simple transcript scrapers or metadata extractors, it identifies why a video works by mapping every storytelling technique, retention hook, emotional trigger, and visual pattern used throughout the video.
Feed it any YouTube URL — from MrBeast challenges to tech tutorials to educational content — and get back a complete structural blueprint you can adapt to your own niche.
Key outputs:
- Script structure breakdown — Hook type/timing/score, intro thesis, body segments with content types, CTA analysis, section timing percentages, words-per-minute pacing
- Style DNA extraction — Niche classification, target audience profiling, sentence rhythm patterns, tone mapping, script flow analysis, transition styles, curiosity gap mechanics with timestamps
- Retention technique identification — Open loops, pattern interrupts, callbacks, running bits, escalation patterns, emotional beats — each with exact timestamps and mechanical explanations
- Recreation prompts — A complete AI-ready prompt that captures the video's structural blueprint, plus title formulas, hook templates, and section-by-section instructions
- Visual analysis (Premium) — Frame-by-frame breakdown: shot types, on-screen text, camera work, lighting, color palette, transitions, visual retention techniques
- Visual Style Profile (Premium) — Complete visual DNA including color grading, composition patterns, and a master image style prompt for consistent visual recreation
- Thumbnail analysis (Premium) — Color psychology, composition techniques, emotion detection, curiosity elements, CTR score, plus alternative thumbnail concepts with image prompts
How it works
Step 1: Paste YouTube URL(s) → choose Full or Premium
Step 2: AI extracts transcript, analyzes structure, identifies techniques
Step 3: Get a complete video blueprint with recreation prompts in structured JSON
The actor extracts the transcript (auto-captions or manual), preprocesses it (WPM calculation, pause detection, natural break identification), then runs multi-pass AI analysis to identify structure, map retention techniques, extract style DNA, and generate actionable recreation prompts.
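The preprocessing step can be pictured roughly as follows. This is a minimal sketch, not the actor's actual code: it assumes transcript lines use the `[MM:SS] text` form seen in the sample output, and the helper names (`parseTranscript`, `wordsPerMinute`, `findLongGaps`) and the 4-second gap threshold are hypothetical.

```javascript
// Hypothetical helpers: not the actor's actual preprocessing code.
// Assumes transcript lines look like "[MM:SS] text", as in the sample output.
function parseTranscript(transcript) {
  const cues = [];
  for (const line of transcript.split('\n')) {
    const match = line.match(/^\[(\d+):(\d{2})\]\s*(.*)$/);
    if (!match) continue; // skip lines without a timestamp
    const startSec = Number(match[1]) * 60 + Number(match[2]);
    const words = match[3].split(/\s+/).filter(Boolean).length;
    cues.push({ startSec, words });
  }
  return cues;
}

// Average speaking pace over the whole video.
function wordsPerMinute(totalWords, durationSeconds) {
  return Math.round(totalWords / (durationSeconds / 60));
}

// Crude pause proxy: gaps between consecutive cue start times that exceed
// an assumed threshold. Real pause detection would need cue end times.
function findLongGaps(cues, minGapSec = 4) {
  const gaps = [];
  for (let i = 1; i < cues.length; i++) {
    const gapSec = cues[i].startSec - cues[i - 1].startSec;
    if (gapSec > minGapSec) gaps.push({ atSec: cues[i - 1].startSec, gapSec });
  }
  return gaps;
}
```

With the sample video's figures (3,436 words over 846 seconds), `wordsPerMinute` gives 244, which matches the `avgWordsPerMinute` value in the sample output.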
Premium tier additionally extracts video frames and runs visual AI analysis to build a complete Visual Style Profile and thumbnail breakdown.
Pricing
| Tier | Base Price | What You Get |
|---|---|---|
| Full | $0.25 / minute of video | Structure + Style DNA + retention techniques + recreation prompts |
| Premium | $0.45 / minute of video | Full + thumbnail analysis + frame-by-frame visual analysis + Visual Style Profile |
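Frame density multiplier (Premium only): The default screenshot interval is 15 seconds. Denser intervals mean more frames and more AI analysis, so they cost proportionally more. The increase applies only to the frame analysis surcharge, which the examples below imply is the $0.20/min difference between Premium and Full:
- 15s interval = base rate ($0.45/min)
- 10s interval = +50% on the frame analysis surcharge ($0.55/min)
- 5s interval = +200% on the frame analysis surcharge ($0.85/min)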
Examples:
- 10-min video, Full depth = $2.50
- 14-min video, Premium (15s interval) = $6.30
- 14-min video, Premium (5s interval) = $11.90
- 45-min video, Full depth = $11.25
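The arithmetic behind these examples can be sketched as a small estimator. It is an assumption-based reconstruction, not the actor's billing code: it treats the frame analysis surcharge as the difference between the Premium and Full rates, scaled by `15 / interval`, which reproduces all four examples. The function and constant names are hypothetical.

```javascript
// Hypothetical estimator derived from the published rates and examples.
const FULL_RATE = 0.25;          // $ per video minute, Full tier
const PREMIUM_RATE = 0.45;       // $ per video minute, Premium at the 15s interval
const DEFAULT_INTERVAL_SEC = 15;

function round2(value) {
  return Math.round(value * 100) / 100;
}

function estimateCost(minutes, depth = 'full', intervalSec = DEFAULT_INTERVAL_SEC) {
  if (depth === 'full') return round2(minutes * FULL_RATE);
  // Assumption: surcharge = Premium rate minus Full rate, scaled by how many
  // more frames are extracted than at the default interval.
  const surcharge = PREMIUM_RATE - FULL_RATE;
  const multiplier = DEFAULT_INTERVAL_SEC / intervalSec;
  return round2(minutes * (FULL_RATE + surcharge * multiplier));
}

console.log(estimateCost(10, 'full'));        // 2.5
console.log(estimateCost(14, 'premium', 15)); // 6.3
console.log(estimateCost(14, 'premium', 5));  // 11.9
console.log(estimateCost(45, 'full'));        // 11.25
```

The estimator takes minutes as given. How partial minutes are billed is not stated here, so treat the Dry Run result as authoritative for real runs.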
Use Dry Run mode to preview the exact cost before processing — returns video metadata, estimated frames, and price with zero charges.
Input example
{
  "urls": ["https://www.youtube.com/watch?v=fMfipiV_17o"],
  "language": "en",
  "analysisDepth": "full",
  "recreationTopic": "productivity tips for developers",
  "dryRun": false
}
| Parameter | Description | Default |
|---|---|---|
| urls | YouTube video URLs (batch supported) | required |
| language | Preferred transcript language (ISO code) | "en" |
| analysisDepth | "full" or "premium" | "full" |
| screenshotIntervalSec | Premium: frame extraction interval (seconds) | 15 |
| recreationTopic | Adapt recreation prompt to your niche | optional |
| dryRun | Preview cost without processing | false |
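A small helper can apply the defaults from the table and catch obvious mistakes before a run is started. This is only a sketch: `buildInput` is a hypothetical name, and omitting `screenshotIntervalSec` for Full runs is a design choice made here, not documented actor behavior.

```javascript
// Hypothetical helper: applies the documented defaults and basic checks.
function buildInput({
  urls,
  language = 'en',
  analysisDepth = 'full',
  screenshotIntervalSec = 15,
  recreationTopic,
  dryRun = false,
}) {
  if (!Array.isArray(urls) || urls.length === 0) {
    throw new Error('urls is required and must be a non-empty array');
  }
  if (!['full', 'premium'].includes(analysisDepth)) {
    throw new Error('analysisDepth must be "full" or "premium"');
  }
  const input = { urls, language, analysisDepth, dryRun };
  // The frame interval is only meaningful for Premium runs.
  if (analysisDepth === 'premium') input.screenshotIntervalSec = screenshotIntervalSec;
  if (recreationTopic) input.recreationTopic = recreationTopic;
  return input;
}
```

The returned object can be passed as the run input in place of a hand-written JSON literal.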
Output example (real MrBeast analysis)
Analyzed: "Would You Sit In Snakes For $10,000?" by MrBeast (14 min, 3,436 words)
{
  "url": "https://www.youtube.com/watch?v=fMfipiV_17o",
  "videoId": "fMfipiV_17o",
  "videoTitle": "Would You Sit In Snakes For $10,000?",
  "channel": "MrBeast",
  "durationSeconds": 846,
  "transcriptWordCount": 3436,
  "avgWordsPerMinute": 244,
  "analysisDepth": "full",
  "structure": {
    "hook": {
      "type": "immediate_challenge",
      "start_sec": 0,
      "end_sec": 18,
      "summary": "Opens with a bathtub full of snakes, immediate challenge proposition ('sit in this tub for $10K for your mom'), and Chandler instantly leaving before MrBeast finishes. Zero preamble — viewer drops into stakes + visual spectacle + comedy.",
      "score": 9,
      "score_reasoning": "Perfect MrBeast hook formula: (1) visual spectacle immediately, (2) money stakes in first 5 seconds, (3) comedy payoff within 10 seconds, (4) emotional stakes (for your mom). Self-contained micro-story optimized for retention."
    },
    "intro": {
      "start_sec": 18,
      "end_sec": 35,
      "thesis": "A compilation of escalating physical/psychological challenges where contestants face fears for cash prizes — testing how far people will go for money."
    },
    "body_segments": [
      {
        "title": "Snake Bathtub ($10K)",
        "start_sec": 0,
        "end_sec": 80,
        "key_points": ["Chandler's instant refusal = comedy gold", "20 snakes added progressively builds tension", "Prize goes to mom (emotional stakes)"]
      },
      {
        "title": "Cockroach Money Grab ($9,340)",
        "start_sec": 80,
        "end_sec": 140,
        "key_points": ["Random crew member inclusion", "MrBeast's chaotic time-keeping = comedy", "Oddball dollar amount adds authenticity"]
      }
    ],
    "total_body_segments": 11,
    "overall_structure": "rapid_compilation"
  },
  "style_dna": {
    "niche": "entertainment / challenge / philanthropy",
    "target_audience": "Ages 8-25, primarily male, global. YouTube-native viewers who enjoy spectacle, money, and fast-paced entertainment.",
    "hook_style": "cold_open_spectacle — zero preamble. Visual spectacle + money stakes + comedy all within 15 seconds.",
    "script_flow": "rapid_compilation — 11+ discrete challenges in 14 minutes (~75 seconds per segment). Each is a self-contained mini-story.",
    "sentence_rhythm": "staccato_chaotic — rapid-fire dialogue, short interrupted sentences. Universal comprehension across ages and languages.",
    "tone": "chaotic_generous — oscillates between genuine excitement at giving money, deliberate trolling, and chaotic energy.",
    "retention_techniques": [
      "cold_open — no intro, no logo, first frame is the challenge itself",
      "running_bit_thread — Noah's curling marathon spans entire video, provides continuity",
      "segment_variety — no two segments use the same mechanic",
      "escalating_money — prizes climb: $9K → $10K → $20K → $52K → car",
      "character_callbacks — Chandler's phobia, Karl's island loss, Chris's competitive streak"
    ],
    "curiosity_gaps": [
      {"timestamp_sec": 0, "text": "If any of you sits in this tub of snakes...", "mechanic": "immediate_stakes"},
      {"timestamp_sec": 400, "text": "I'm gonna count your reps... but I'm not gonna tell you when to stop", "mechanic": "unknown_endpoint"}
    ]
  },
  "recreation": {
    "recreation_prompt": "Create a 14-minute rapid-fire challenge compilation video script with 10-12 discrete segments. Follow this structural blueprint:\n\nCOLD OPEN (2%): Open on the most visually spectacular challenge. No intro, no greeting. First sentence states challenge and money stakes. Second beat: someone reacts comically. Third beat: someone does it.\n\nSEGMENTS (96%): 10-12 discrete challenges rotating between types: FEAR, SKILL/ENDURANCE, GUESSING/LUCK, DOUBLE-OR-NOTHING, WHOLESOME SURPRISE...",
    "title_formulas": [
      "Would You [Scary Action] For $[Amount]?",
      "I Gave [Person] $[Amount] If They [Challenge]",
      "$[Amount] vs [Scary Thing] — Who Wins?"
    ],
    "hook_template": "Open cold on [VISUAL SPECTACLE]. State '$[AMOUNT] if you [CHALLENGE]' within 3 seconds. [PERSON] immediately refuses/fails for comedy. [ANOTHER PERSON] steps up — tension begins."
  },
  "transcript": "[00:00] This is a bathtub full of snakes. Hey there, little guy...\n[00:05] If any of you sits in this tub of snakes, I'll give your mom $10,000...\n..."
}
Note: Output truncated for display. Full output includes all 11 body segments, 9 retention techniques, complete recreation prompt with per-section instructions, and full timestamped transcript.
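Because the output is structured JSON, the timing fields can be post-processed directly. The sketch below assumes the item shape shown in the sample output; `sectionShares`, `formatTimestamp`, and `listCuriosityGaps` are hypothetical helper names, not part of the actor.

```javascript
// Hypothetical helpers; they assume the item shape from the sample output.

// Share of total runtime taken by the hook and intro, in percent (one decimal).
function sectionShares(item) {
  const { hook, intro } = item.structure;
  const total = item.durationSeconds;
  const pct = (start, end) => Math.round(((end - start) / total) * 1000) / 10;
  return {
    hookPct: pct(hook.start_sec, hook.end_sec),
    introPct: pct(intro.start_sec, intro.end_sec),
  };
}

// Seconds to "M:SS".
function formatTimestamp(sec) {
  const minutes = Math.floor(sec / 60);
  const seconds = String(sec % 60).padStart(2, '0');
  return `${minutes}:${seconds}`;
}

// One readable line per curiosity gap.
function listCuriosityGaps(item) {
  return item.style_dna.curiosity_gaps.map(
    (gap) => `${formatTimestamp(gap.timestamp_sec)} [${gap.mechanic}] ${gap.text}`
  );
}
```

For the sample video (hook 0 to 18 s, intro 18 to 35 s, 846 s total), `sectionShares` yields a hook share of 2.1% and an intro share of 2%.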
Premium tier additional output
Premium adds three powerful visual analysis layers on top of Full. Real example from a TED Talk analysis:
{
  "thumbnail_analysis": {
    "dominant_colors": [
      "#8B2F8B (deep magenta — TED curtains)",
      "#4A4A4A (charcoal — speaker's blazer)",
      "#6B8FBF (steel blue — shirt)",
      "#D4A76A (warm skin tones)"
    ],
    "text_overlay": "No text overlay on the thumbnail itself.",
    "faces": {
      "count": 1,
      "primary_face": "Middle-aged man with gray curly hair, captured mid-gesture — mouth slightly open, eyes focused. Confident and animated.",
      "expression": "animated_engaged",
      "eye_contact": false
    },
    "composition": "Rule of thirds with speaker right-center. Raised hand creates diagonal line. Purple TED curtain provides rich backdrop.",
    "click_score": 6,
    "click_reasoning": "Standard TED thumbnail — professional but not optimized for CTR. No text overlay, no eye contact, muted colors.",
    "thumbnail_concepts": [
      {
        "concept": "Brain Under Siege",
        "description": "Split-face: left half calm, right half stressed with red tint and chaotic symbols. Text: 'YOUR BRAIN IS LYING TO YOU'",
        "image_prompt": "Close-up split portrait divided down center. Left: calm, blue-toned, organized. Right: stressed, reddened skin, floating chaotic symbols (clock, pills, keys). Bold white text at bottom. Black background."
      },
      {
        "concept": "Pre-Mortem Checklist",
        "description": "Overhead shot of a neat checklist with red X marks through disaster scenarios",
        "image_prompt": "Top-down flatlay: white paper checklist with handwritten items, several crossed out in red marker. Coffee cup corner. Warm lighting. Text overlay: 'THE CHECKLIST THAT SAVES LIVES'"
      }
    ]
  },
  "visual_analysis": [
    {
      "frame_index": 0,
      "timestamp_sec": 0,
      "shot_type": "medium_shot — chest-up framing capturing speaker's upper body and hands",
      "on_screen_text": "None — clean frame",
      "visual_elements": [
        "Speaker center-right in charcoal blazer and steel blue shirt",
        "Right hand raised near temple (brain/thinking gesture)",
        "Deep magenta-purple TED curtain filling background",
        "Metal staircase visible upper-left adding depth"
      ],
      "camera_angle": "eye-level, slightly left of center",
      "camera_movement": "static — locked-off tripod",
      "lighting": "Professional three-point stage lighting with warm golden skin tones against cool purple backdrop",
      "color_palette": ["#8B2F8B", "#4A4A4A", "#D4A76A"]
    }
  ],
  "visual_style_profile": {
    "art_style": "Classic TED Talk production — single-speaker stage with professional multi-camera capture. Clean, polished broadcast quality.",
    "color_palette": "Deep magenta-purple (55%), charcoal gray (15%), near-black (15%), steel blue (8%), warm gold skin tones (7%).",
    "lighting_style": "Professional three-point: key light above-front, fill from sides, purple color wash on backdrop.",
    "camera_work": "Medium shot dominant. Slow, deliberate movements. TED's 'invisible production' philosophy.",
    "composition_patterns": "Consistent rule-of-thirds. Speaker's hand gesture creates diagonal leading line. Strong negative space balance.",
    "visual_retention_techniques": "Minimal — content carries attention. Hand gestures provide visual interest at key points.",
    "master_image_style_prompt": "Professional TED-style stage photography: single speaker on minimalist stage with deep magenta-purple curtain backdrop, warm golden three-point lighting highlighting animated hand gestures. Business-casual wardrobe. Medium shot chest-up framing. Clean, authoritative, educational tone."
  }
}
The master_image_style_prompt can be fed directly into image generation tools (Midjourney, DALL-E, Stable Diffusion) to recreate the visual style for your own content.
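When handing the style to an image tool, it can help to carry the hex palette alongside the prompt. A minimal sketch, assuming the Premium item shape shown above; `styleBrief` is a hypothetical helper and the regex simply picks the first six-digit hex code from each `dominant_colors` entry.

```javascript
// Hypothetical helper; assumes the Premium item shape shown above.
function styleBrief(item) {
  const hexPattern = /#[0-9A-Fa-f]{6}\b/;
  const palette = item.thumbnail_analysis.dominant_colors
    .map((entry) => entry.match(hexPattern)) // null when no hex code is present
    .filter(Boolean)
    .map((match) => match[0].toUpperCase());
  return {
    prompt: item.visual_style_profile.master_image_style_prompt,
    palette,
  };
}
```

The resulting `prompt` string is what would be pasted into the image generator; the `palette` array is a convenience for tools or templates that accept explicit colors.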
Who is this for?
- YouTube creators — Analyze why competitor videos outperform yours. Get the exact hook formula, pacing, and retention techniques in seconds instead of hours of manual note-taking.
- Content agencies — Deconstruct client competitors at scale. Batch-analyze 50 videos to find recurring patterns that drive views and engagement.
- Scriptwriters — Extract proven video structure templates. Get hook formulas, emotional arc patterns, and pacing rhythms from any successful video.
- Course creators — Study what makes educational content engaging. Identify optimal pacing, curiosity gaps, and knowledge delivery patterns.
- SEO and growth teams — Extract title formulas and hook patterns for data-driven A/B testing of YouTube content strategy.
- AI content creators — Video-to-prompt pipeline. Get ready-to-use prompts for script generation, image generation, and thumbnail creation.
Dry run (cost preview)
{
  "urls": ["https://www.youtube.com/watch?v=fMfipiV_17o"],
  "dryRun": true,
  "analysisDepth": "premium"
}
Returns:
{
  "videoId": "fMfipiV_17o",
  "title": "Would You Sit In Snakes For $10,000?",
  "channel": "MrBeast",
  "durationSeconds": 846,
  "durationMinutes": 15,
  "transcriptTokens": 5571,
  "estimatedCost": 6.75,
  "pricePerMinute": 0.45,
  "captionsAvailable": true,
  "dryRun": true
}
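A dry run pairs naturally with a budget check before the real run. The sketch below assumes the dry run yields one record per URL with the fields shown in the sample above; `checkBudget` is a hypothetical helper, not part of the actor.

```javascript
// Hypothetical helper; assumes one dry-run record per URL with the
// `estimatedCost`, `captionsAvailable`, and `videoId` fields shown above.
function checkBudget(dryRunItems, maxUsd) {
  const total = dryRunItems.reduce((sum, item) => sum + item.estimatedCost, 0);
  const missingCaptions = dryRunItems
    .filter((item) => !item.captionsAvailable)
    .map((item) => item.videoId);
  return {
    total: Math.round(total * 100) / 100,
    withinBudget: total <= maxUsd,
    missingCaptions, // videos that would likely fail for lack of captions
  };
}
```

Note that the sample reports 15 minutes and $6.75 for an 846-second video, which suggests duration is rounded up to whole minutes for billing. That is an inference from the sample, not a documented rule.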
Frequently asked questions
How is this different from YouTube transcript scrapers?
Transcript scrapers give you raw text. This actor analyzes the text with AI to extract the structural blueprint (hook formulas, retention techniques, emotional arcs, pacing patterns) and generates actionable recreation prompts. It's the difference between reading sheet music and understanding music theory.
What languages are supported?
Any language with YouTube captions available. Set the language parameter to the ISO code (e.g., "es", "de", "ja", "ru", "ko"). Falls back to the first available caption track if your preferred language isn't found.
Can I batch-analyze an entire YouTube channel?
Yes. Pass multiple URLs in the urls array to batch-analyze videos. Analyze 10-50 videos from one channel to identify their recurring patterns, hook formulas, and content evolution over time.
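Finding recurring patterns across a batch comes down to aggregating fields from each result. A minimal sketch, assuming every analyzed item has the `structure.hook` fields from the sample output; `summarizeBatch` is a hypothetical helper, and items without a `structure` object (for example, videos that failed) are simply skipped because their exact shape is not documented here.

```javascript
// Hypothetical helper; assumes analyzed items carry structure.hook.type and
// structure.hook.score as in the sample output.
function summarizeBatch(items) {
  const analyzed = items.filter((item) => item.structure && item.structure.hook);
  const hookTypes = {};
  let scoreSum = 0;
  for (const item of analyzed) {
    const { type, score } = item.structure.hook;
    hookTypes[type] = (hookTypes[type] || 0) + 1;
    scoreSum += score;
  }
  return {
    analyzedCount: analyzed.length,
    skippedCount: items.length - analyzed.length,
    hookTypes, // e.g. how often each hook type appears on the channel
    avgHookScore: analyzed.length ? Math.round((scoreSum / analyzed.length) * 10) / 10 : null,
  };
}
```

The same pattern extends to other fields, such as counting `overall_structure` values or collecting `title_formulas` across videos.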
What if a video has no captions?
The actor returns a clear error for that video and continues processing the rest of the batch. Most YouTube videos have auto-generated captions available.
How accurate is the AI analysis?
Structure detection aligns within 5-10 seconds of actual transitions. Retention technique identification catches 80-90% of intentional hooks. Recreation prompts produce scripts that match the original's pacing, style, and engagement patterns.
Can I use the recreation prompts with ChatGPT, Claude, or other AI?
Yes. The recreation prompt is designed to be pasted directly into any AI assistant to generate a new script matching the original's structural formula while adapting to your topic.
How long does analysis take?
Full depth: 1-3 minutes per video. Premium depth: 3-7 minutes per video depending on length. The actor handles videos up to 60 minutes.
What's the frame-by-frame visual analysis?
Premium tier extracts screenshots at regular intervals and analyzes each frame for shot composition, camera work, lighting, text overlays, visual effects, and transitions. These are synthesized into a Visual Style Profile for consistent visual recreation.
Integration and API
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_TOKEN' });

const run = await client.actor('brilliant_gum/youtube-video-reverse-engineer').call({
  urls: ['https://youtube.com/watch?v=fMfipiV_17o'],
  analysisDepth: 'full',
  recreationTopic: 'tech product reviews',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();

console.log(items[0].recreation.recreation_prompt);
// → "Create a 14-minute rapid-fire challenge compilation..."
console.log(items[0].recreation.title_formulas);
// → ["Would You [Action] For $[Amount]?", ...]
Works with Apify webhooks — send results to Zapier, Make, n8n, or your own endpoint when analysis completes.
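One way to wire this up from code is to pass ad-hoc webhooks when starting the run. The sketch below only builds the webhook definitions; `buildWebhooks` is a hypothetical helper, the endpoint URL is a placeholder, and the usage in the trailing comment assumes the `webhooks` start option of `apify-client`, which is worth checking against the client documentation for your version.

```javascript
// Hypothetical helper: builds ad-hoc webhook definitions for a run.
// The request URL is a placeholder for your own endpoint (or a Zapier/Make/n8n hook).
function buildWebhooks(requestUrl, { notifyOnFailure = true } = {}) {
  const eventTypes = ['ACTOR.RUN.SUCCEEDED'];
  if (notifyOnFailure) eventTypes.push('ACTOR.RUN.FAILED');
  return [{ eventTypes, requestUrl }];
}

// Usage sketch (not executed here; assumes `client` and `input` already exist):
// await client.actor('brilliant_gum/youtube-video-reverse-engineer')
//   .start(input, { webhooks: buildWebhooks('https://example.com/hooks/analysis') });
```

Using `start` rather than `call` returns as soon as the run is queued, so the webhook, not the calling script, signals when results are ready.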
Related tools
- YouTube Transcript Scraper — If you only need raw transcripts without analysis
- YouTube Channel Analyzer — For channel-level metrics and growth analytics