Script Architect
Pricing
from $50.00 / 1,000 scripts generated
AI-powered script writer for TikTok, Instagram Reels & YouTube Shorts. Generates viral short-form video scripts with hooks, scene breakdowns & captions — powered by Gemini 2.0 Flash + LangGraph.
Developer: Rahul Agarwal
Last modified: 18 days ago
🎬 Script Architect
AI agent that researches trending short-form content and generates structured video scripts with scene-by-scene visual prompts for Instagram Reels, TikTok, and YouTube Shorts.
Built on Apify + LangGraph + Gemini 2.0 Flash.
How It Works
- Scrapes trending content for your topic via Apify sub-actors (TikTok/Instagram)
- Analyzes patterns — hook styles, engagement metrics, hashtag clusters, video durations
- Generates a structured script with narrative arc enforcement, timed scenes, and AI-ready visual prompts
- Validates everything — Pydantic cross-validators enforce duration math, scene sequencing, and visual prompt quality
Input
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `topic` | string | ✅ | (none) | A specific content angle (10–200 chars). Bad: "AI". Good: "Why AI tools make junior devs slower before they make them faster" |
| `targetAudience` | string | | null | Who is this for? Include role + pain point. Example: "Junior devs overwhelmed by AI coding assistants" |
| `contentGoal` | enum | | awareness | awareness · education · conversion · entertainment · thought_leadership |
| `platform` | enum | | INSTAGRAM_REEL | INSTAGRAM_REEL · TIKTOK · YOUTUBE_SHORT |
| `targetDuration` | integer | | 45 | Target duration in seconds (15–60) |
| `tone` | enum | | contrarian | contrarian · storyteller · insider · skeptic · teacher · agitator · auto |
| `hookStyle` | enum | | auto | contradiction · stat · story · question · bold_claim · curiosity_gap · auto |
| `visualStyle` | enum | | auto | talking_head · b_roll_cinematic · screen_recording · motion_graphics · mixed · auto |
| `referenceCreators` | array | | [] | Up to 3 Instagram/TikTok usernames to study (e.g. ["hubermanlab", "garyvee"]) |
| `brandVoice` | string | | null | 1–2 sentence voice description. Example: "Direct and technical, like a senior engineer mentoring over coffee" |
| `avoidTopics` | array | | [] | Topics to never mention (e.g. ["competitor names", "pricing"]) |
| `contentPurpose` | enum | | standalone | standalone · series_episode · repurposed_longform |
| `outputLanguage` | string | | en | ISO 639-1 language code |
| `debug` | boolean | | false | Verbose logging + saves prompt/output to key-value store |
Example Input
{
  "topic": "Why AI tools make junior devs slower before they make them faster",
  "targetAudience": "Junior developers overwhelmed by AI coding assistants",
  "contentGoal": "thought_leadership",
  "platform": "TIKTOK",
  "targetDuration": 45,
  "tone": "contrarian",
  "hookStyle": "bold_claim",
  "visualStyle": "b_roll_cinematic",
  "brandVoice": "Direct and technical but accessible. Like a senior engineer mentoring over coffee.",
  "avoidTopics": ["specific tool names", "pricing"]
}
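The limits in the Input table can be checked on the client side before a run is started. The following is a minimal sketch in plain Python, not the actor's own validation code: `validate_input` is a hypothetical helper, and it covers only a few of the rules stated in the table.

```python
PLATFORMS = {"INSTAGRAM_REEL", "TIKTOK", "YOUTUBE_SHORT"}
CONTENT_GOALS = {
    "awareness", "education", "conversion", "entertainment", "thought_leadership",
}


def validate_input(run_input: dict) -> list[str]:
    """Return the problems found in an actor input dict (empty list if none).

    Hypothetical helper that mirrors the Input table; not part of the actor.
    """
    problems = []

    # topic is the only required field.
    topic = run_input.get("topic")
    if not isinstance(topic, str) or not 10 <= len(topic) <= 200:
        problems.append("topic is required and must be 10-200 characters")

    # Optional fields fall back to the defaults listed in the table.
    duration = run_input.get("targetDuration", 45)
    if not isinstance(duration, int) or not 15 <= duration <= 60:
        problems.append("targetDuration must be an integer from 15 to 60")

    if run_input.get("platform", "INSTAGRAM_REEL") not in PLATFORMS:
        problems.append("platform must be one of " + ", ".join(sorted(PLATFORMS)))

    if run_input.get("contentGoal", "awareness") not in CONTENT_GOALS:
        problems.append("contentGoal must be one of " + ", ".join(sorted(CONTENT_GOALS)))

    if len(run_input.get("referenceCreators", [])) > 3:
        problems.append("referenceCreators accepts at most 3 usernames")

    return problems


good = {
    "topic": "Why AI tools make junior devs slower before they make them faster",
    "platform": "TIKTOK",
    "targetDuration": 45,
}
print(validate_input(good))             # []
print(validate_input({"topic": "AI"}))  # ['topic is required and must be 10-200 characters']
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem with an input in one pass.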
Output
The agent produces a structured JSON script pushed to the Apify dataset:
| Field | Type | Description |
|---|---|---|
| `hook` | string | Opening line for the first 3 seconds |
| `hook_type` | enum | Hook formula used (bold_claim, question, contradiction, etc.) |
| `scenes` | array | Scene-by-scene breakdown (see below) |
| `total_duration` | float | Total script duration in seconds |
| `platform` | enum | Target platform |
| `visual_style` | string | Global visual treatment applied to all scenes |
| `tone_used` | enum | Tone archetype applied |
| `content_strategy` | string | One-line strategy summary |
| `trending_patterns_used` | array | Patterns borrowed from trend research |
| `cta` | string | Call-to-action text |
| `music_mood` | string | Suggested music mood (optional) |
Scene Structure
Each scene in the `scenes` array contains:
| Field | Type | Description |
|---|---|---|
| `id` | int | Sequential scene number (1, 2, 3, ...) |
| `role` | enum | Narrative function: hook · context · conflict · revelation · cta |
| `text` | string | Spoken/displayed script text (10–200 chars) |
| `visual_prompt` | string | AI image/video generation prompt (60–300 chars), including subject, environment, lighting, camera angle, and 9:16 framing |
| `negative_prompt` | string | What to exclude from visuals (optional) |
| `duration_target` | float | Scene duration in seconds (2–10s) |
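The scene fields and their limits can be modelled as a small typed record. This is a sketch using a standard-library dataclass; the actor itself is described as using Pydantic v2, whose model code is not shown here, so the class below is an illustrative stand-in and not the actor's schema.

```python
from dataclasses import dataclass
from typing import Optional

SCENE_ROLES = {"hook", "context", "conflict", "revelation", "cta"}


@dataclass
class Scene:
    """One entry of the `scenes` array, with the limits from the table above."""

    id: int
    role: str
    text: str
    visual_prompt: str
    duration_target: float
    negative_prompt: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the documented ranges as soon as a scene is built.
        if self.role not in SCENE_ROLES:
            raise ValueError(f"unknown role: {self.role!r}")
        if not 10 <= len(self.text) <= 200:
            raise ValueError("text must be 10-200 characters")
        if not 60 <= len(self.visual_prompt) <= 300:
            raise ValueError("visual_prompt must be 60-300 characters")
        if not 2 <= self.duration_target <= 10:
            raise ValueError("duration_target must be 2-10 seconds")


# A scene dict from the dataset maps directly onto the dataclass fields.
first = Scene(**{
    "id": 1,
    "role": "hook",
    "text": "AI makes you slower.",
    "visual_prompt": (
        "Close-up of hands struggling to type, surrounded by error messages, "
        "harsh red lighting, shallow depth of field, cinematic, 9:16 vertical frame"
    ),
    "negative_prompt": None,
    "duration_target": 3.0,
})
print(first.role, first.duration_target)  # hook 3.0
```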
Example Output
{
  "hook": "AI makes you slower.",
  "hook_type": "bold_claim",
  "scenes": [
    {
      "id": 1,
      "role": "hook",
      "text": "AI makes you slower.",
      "visual_prompt": "Close-up of hands struggling to type, surrounded by error messages, harsh red lighting, shallow depth of field, cinematic, 9:16 vertical frame, mobile-optimized composition",
      "negative_prompt": null,
      "duration_target": 3.0
    },
    {
      "id": 2,
      "role": "context",
      "text": "At first, anyway.",
      "visual_prompt": "Abstract animation of AI learning, lines of code forming and reforming, dark background with bright accents, futuristic, 9:16 vertical frame, mobile-optimized composition",
      "negative_prompt": null,
      "duration_target": 4.0
    },
    {
      "id": 3,
      "role": "conflict",
      "text": "...but no understanding.",
      "visual_prompt": "Junior dev staring blankly at code, confused expression, face illuminated by screen, desaturated colors, 9:16 vertical frame, mobile-optimized composition",
      "negative_prompt": null,
      "duration_target": 6.0
    },
    {
      "id": 4,
      "role": "revelation",
      "text": "Learn fundamentals first.",
      "visual_prompt": "Hands methodically writing code, syntax highlighting, warm lighting, focused, 9:16 vertical frame, mobile-optimized composition",
      "negative_prompt": null,
      "duration_target": 6.0
    },
    {
      "id": 5,
      "role": "cta",
      "text": "Agree or disagree?",
      "visual_prompt": "Developer leaning back, thoughtful expression, screen reflecting, soft lighting, 9:16 vertical frame, mobile-optimized composition",
      "negative_prompt": null,
      "duration_target": 6.0
    }
  ],
  "total_duration": 44.0,
  "platform": "TIKTOK",
  "visual_style": "Dark, moody cinematic B-roll footage. Focus on hands coding, screens glowing, and abstract algorithms. Grainy, desaturated color grading, 9:16 vertical frame",
  "tone_used": "contrarian",
  "content_strategy": "Challenge the idea that AI is always beneficial for junior devs.",
  "trending_patterns_used": [
    "Contrarian viewpoint to spark debate",
    "Problem-Agitate-Solve narrative structure"
  ],
  "cta": "Agree or disagree?",
  "music_mood": null
}
Validation Rules
The agent enforces these constraints on every output:
- ✅ Narrative arc: First scene = `hook`, last scene = `cta`, at least one `conflict` or `revelation` in between
- ✅ Duration math: `total_duration` must match sum of scene durations (±3s)
- ✅ Scene IDs: Must be unique and sequential `[1, 2, 3, ...]`
- ✅ Visual prompts: 60+ characters with subject, lighting, camera angle, and 9:16 framing
- ✅ Target duration: Output checked against your `targetDuration` input (±5s tolerance)
- ✅ Speaking rate: Scene text validated against duration at language-appropriate words-per-minute
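Most of these rules are simple enough to re-check on a downloaded dataset item. The sketch below is a plain-Python approximation, assuming the output shape documented above; `check_script` is a hypothetical helper, not the actor's Pydantic validators. The speaking-rate rule is left out because no words-per-minute figures are given here, and the visual-prompt check covers length only.

```python
def check_script(script: dict, target_duration: float) -> list[str]:
    """Return the rule violations found in a generated script dict."""
    errors = []
    scenes = script["scenes"]
    roles = [scene["role"] for scene in scenes]

    # Narrative arc: hook first, cta last, conflict or revelation in between.
    if not roles or roles[0] != "hook" or roles[-1] != "cta":
        errors.append("first scene must be hook and last scene must be cta")
    if not {"conflict", "revelation"} & set(roles[1:-1]):
        errors.append("need a conflict or revelation between hook and cta")

    # Duration math: total_duration within 3s of the summed scene durations.
    scene_sum = sum(scene["duration_target"] for scene in scenes)
    if abs(script["total_duration"] - scene_sum) > 3:
        errors.append("total_duration must be within 3s of the scene durations")

    # Scene IDs: exactly 1, 2, 3, ... in order.
    if [scene["id"] for scene in scenes] != list(range(1, len(scenes) + 1)):
        errors.append("scene ids must be sequential starting at 1")

    # Visual prompts: length only; content checks (lighting, framing) are omitted.
    if any(len(scene["visual_prompt"]) < 60 for scene in scenes):
        errors.append("visual prompts must be at least 60 characters")

    # Target duration: within 5s of the requested targetDuration.
    if abs(script["total_duration"] - target_duration) > 5:
        errors.append("total_duration must be within 5s of targetDuration")

    return errors


prompt = "Close-up of hands typing on a laptop, warm lighting, eye-level shot, 9:16 vertical frame"
script = {
    "scenes": [
        {"id": 1, "role": "hook", "duration_target": 3.0, "visual_prompt": prompt},
        {"id": 2, "role": "revelation", "duration_target": 8.0, "visual_prompt": prompt},
        {"id": 3, "role": "cta", "duration_target": 4.0, "visual_prompt": prompt},
    ],
    "total_duration": 15.0,
}
print(check_script(script, target_duration=15))  # []
```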
Running Locally
- Set your Google API key: `export GOOGLE_API_KEY=your_key_here`
- Run with the test input: `apify run`. The default input is in `storage/key_value_stores/default/INPUT.json`.
Deploying to Apify
- Push to Apify: `apify push`
- Set `GOOGLE_API_KEY` in the actor's Environment Variables on Apify Console.
- Run from the Console or via API.
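For the API route, a run is started by POSTing the input JSON to the Apify API's run-actor endpoint. The sketch below only builds the request with the standard library and does not send it. The actor ID `your-username~script-architect` and the token value are placeholders, not values taken from this README; substitute the real ones from the Apify Console.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder actor ID and token (assumptions for illustration only).
request = build_run_request(
    actor_id="your-username~script-architect",
    token="YOUR_APIFY_TOKEN",
    run_input={"topic": "Why AI tools make junior devs slower before they make them faster"},
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would submit it; the JSON response describes the started run.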
Pricing (Pay-Per-Event)
| Event | Price | Trigger |
|---|---|---|
| Script Generated | $0.05 | Per complete script pushed to dataset |
| Trending Research | $0.02 | Per trending content search |
| Creator Analysis | $0.03 | Per creator profile analyzed |
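As a worked example of the pay-per-event arithmetic, the sketch below totals a run from the three prices in the table. The event counts are assumptions chosen for illustration, since the number of trending searches a run triggers is not stated here, and the dictionary keys are hypothetical labels, not official event names.

```python
# Prices from the table above, in US cents to avoid float rounding.
PRICE_CENTS = {
    "script_generated": 5,
    "trending_research": 2,
    "creator_analysis": 3,
}


def estimate_cost_cents(scripts: int, searches: int, creators: int) -> int:
    """Return the estimated charge in cents for the given event counts."""
    return (
        scripts * PRICE_CENTS["script_generated"]
        + searches * PRICE_CENTS["trending_research"]
        + creators * PRICE_CENTS["creator_analysis"]
    )


# Assumed run: one script, one trending search, two reference creators.
cents = estimate_cost_cents(scripts=1, searches=1, creators=2)
print(f"${cents / 100:.2f}")  # $0.13
```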
Tech Stack
- Runtime: Python 3.14 on Apify
- LLM: Google Gemini 2.0 Flash
- Agent Framework: LangGraph (ReAct pattern)
- Schema Validation: Pydantic v2
- Data Sources: Apify sub-actors (`clockworks/tiktok-scraper`, `apify/instagram-hashtag-scraper`, `apify/instagram-scraper`)