This workflow corresponds to n8n.io template #13675, which we link to as the canonical source.
This workflow follows the Form Trigger → Google Gemini recipe pattern.
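The Gemini prompt embedded in this workflow asks the model to derive a `safe_crop_zone` from a normalized `webcam_region`, adding 2% padding so that no part of the webcam overlay leaks into the crop. The following is a minimal JavaScript sketch of that arithmetic, not code from the template. The function name `cropCandidates`, the `strategy` labels, and the ranking by preserved area are assumptions made for illustration only.

```javascript
// Sketch only: cropCandidates and the strategy labels are hypothetical names,
// not identifiers defined by the workflow or by Creatomate.

// Round to two decimals to avoid floating-point noise such as 0.6799999999999999.
const round2 = (v) => Math.round(v * 100) / 100;
// Keep coordinates inside the normalized 0..1 frame.
const clamp01 = (v) => Math.min(1, Math.max(0, v));

// webcam: { x_min, y_min, x_max, y_max } in normalized 0..1 coordinates.
// Returns the rectangular strips that exclude the padded webcam box,
// sorted from largest to smallest preserved area.
function cropCandidates(webcam, padding = 0.02) {
  const strips = [
    { strategy: 'above', x_min: 0, y_min: 0, x_max: 1, y_max: webcam.y_min - padding },
    { strategy: 'below', x_min: 0, y_min: webcam.y_max + padding, x_max: 1, y_max: 1 },
    { strategy: 'left', x_min: 0, y_min: 0, x_max: webcam.x_min - padding, y_max: 1 },
    { strategy: 'right', x_min: webcam.x_max + padding, y_min: 0, x_max: 1, y_max: 1 },
  ];

  return strips
    .map((s) => {
      const zone = {
        strategy: s.strategy,
        x_min: round2(clamp01(s.x_min)),
        y_min: round2(clamp01(s.y_min)),
        x_max: round2(clamp01(s.x_max)),
        y_max: round2(clamp01(s.y_max)),
      };
      const width = Math.max(0, zone.x_max - zone.x_min);
      const height = Math.max(0, zone.y_max - zone.y_min);
      return { ...zone, content_preserved_percent: Math.round(width * height * 100) };
    })
    // Drop strips that collapse to nothing (webcam touches that frame edge).
    .filter((z) => z.content_preserved_percent > 0)
    .sort((a, b) => b.content_preserved_percent - a.content_preserved_percent);
}

// The prompt's first example: webcam overlay in the bottom-right corner.
const candidates = cropCandidates({ x_min: 0.75, y_min: 0.70, x_max: 1.0, y_max: 1.0 });
console.log(candidates);
```

For the bottom-right example the sketch produces a left strip and a top strip, which correspond to the prompt's OPTION B and OPTION A. The prompt generally prefers the full-width strip for horizontal screen recordings, so a caller may want to select by `strategy` rather than always taking the first candidate.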
The workflow JSON
Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.
{
"id": "VqxKMC7FYTLF6vFdAkNx0",
"meta": {
"templateCredsSetupCompleted": true
},
"name": "Video Digestion Workflow",
"tags": [],
"nodes": [
{
"id": "d58b7efa-9d69-45ec-b522-0c81a6063a29",
"name": "Setup Instructions",
"type": "n8n-nodes-base.stickyNote",
"position": [
1904,
4912
],
"parameters": {
"color": 4,
"width": 300,
"height": 340,
"content": "## YouTube Shorts Automation v7.0\n### @adamfreelances\n\n**APIFY YOUTUBE APPROACH**\n\nSimplified workflow:\n1. Form takes YouTube URL\n2. Apify downloads video\n3. Apify gets transcript\n4. Gemini detects meaningful moments\n5. Returns time ranges + descriptions\n\n**Output per moment:**\n- Start/end timestamps\n- App name\n- Descriptive action"
},
"typeVersion": 1
},
{
"id": "46a168f3-a4f6-4634-94df-5d3f8835ab32",
"name": "Stage 1",
"type": "n8n-nodes-base.stickyNote",
"position": [
2208,
4400
],
"parameters": {
"color": 5,
"width": 300,
"height": 180,
"content": "## Stage 1: YouTube URL Input\n\n**Trigger:** Form submission\n\n1. User submits YouTube URL\n2. Apify downloads video\n3. Apify extracts transcript\n4. Combine data for processing"
},
"typeVersion": 1
},
{
"id": "7f5458dc-6dbc-403c-9104-8c1900dd2eb5",
"name": "Stage 2A - Transcript",
"type": "n8n-nodes-base.stickyNote",
"position": [
2336,
4912
],
"parameters": {
"color": 6,
"width": 320,
"height": 180,
"content": "## Stage 2A: Transcript\n\nApify YouTube Transcript:\n- Gets transcript directly from YT\n- No audio processing needed\n- Fast, reliable transcription\n\nReturns full transcript text."
},
"typeVersion": 1
},
{
"id": "d055a42b-24dc-4a25-8cca-1cb417d3b47f",
"name": "YouTube URL Form",
"type": "n8n-nodes-base.formTrigger",
"position": [
2000,
4656
],
"parameters": {
"path": "youtube-form-trigger",
"options": {},
"formTitle": "YouTube Video Processor",
"formFields": {
"values": [
{
"fieldLabel": "YouTube URL",
"placeholder": "https://www.youtube.com/watch?v=...",
"requiredField": true
}
]
},
"formDescription": "Submit a YouTube video URL for automated shorts creation"
},
"typeVersion": 2.1
},
{
"id": "89bb693e-17e5-44e5-ad8c-6d342db43dbb",
"name": "Apify YouTube Downloader",
"type": "@apify/n8n-nodes-apify.apify",
"position": [
2224,
4656
],
"parameters": {
"actorId": {
"__rl": true,
"mode": "list",
"value": "y1IMcEPawMQPafm02",
"cachedResultUrl": "https://console.apify.com/actors/y1IMcEPawMQPafm02/input",
"cachedResultName": "Youtube Video Downloader (epctex/youtube-video-downloader)"
},
"operation": "Run actor and get dataset",
"customBody": "={\n \"includeFailedVideos\": false,\n \"maxRequestRetries\": 2,\n \"proxy\": {\n \"useApifyProxy\": true,\n \"apifyProxyGroups\": [\n \"RESIDENTIAL\"\n ]\n },\n \"quality\": \"720\",\n \"startUrls\": [\n \"{{ $json['YouTube URL'] }}\"\n ],\n \"useFfmpeg\": false\n}\n"
},
"credentials": {
"apifyApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "9e5a7510-d65a-4ae8-9dc9-cbc62dae7f49",
"name": "Apify YouTube Transcript",
"type": "@apify/n8n-nodes-apify.apify",
"position": [
2448,
4656
],
"parameters": {
"actorId": {
"__rl": true,
"mode": "list",
"value": "Uwpce1RSXlrzF6WBA",
"cachedResultUrl": "https://console.apify.com/actors/Uwpce1RSXlrzF6WBA/input",
"cachedResultName": "YouTube Video Transcript (starvibe/youtube-video-transcript)"
},
"operation": "Run actor and get dataset",
"customBody": "={\n \"include_transcript_text\": true,\n \"language\": \"en\",\n \"youtube_url\": \"{{ $('YouTube URL Form').item.json['YouTube URL'] }}\"\n}"
},
"credentials": {
"apifyApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "4503cec3-8aad-400c-9c68-8d8ffc85de16",
"name": "Set Video Data",
"type": "n8n-nodes-base.code",
"position": [
2672,
4656
],
"parameters": {
"jsCode": "// ============================================================================\n// SET VIDEO DATA - Consolidate Apify outputs\n// ============================================================================\n// Combines data from both Apify nodes (downloader + transcript)\n// ============================================================================\n\nconst formData = $('YouTube URL Form').first().json;\nconst downloaderData = $('Apify YouTube Downloader').first().json;\nconst transcriptData = $input.first().json;\n\n// Extract video URL from downloader\n// Downloader returns: { sourceUrl, downloadUrl }\nconst videoUrl = downloaderData.downloadUrl || '';\nconst sourceUrl = downloaderData.sourceUrl || formData['YouTube URL'];\n\n// Extract video info from transcript node (it has all the metadata)\nconst videoId = transcriptData.message?.match(/ID '([^']+)'/)?.[1] || sourceUrl.match(/[?&]v=([^&]+)/)?.[1] || '';\nconst videoTitle = transcriptData.title || 'Unknown Video';\nconst videoDuration = transcriptData.duration_seconds || 0;\nconst channelTitle = transcriptData.channel_name || '';\nconst videoDescription = transcriptData.description;\nconst channelId = transcriptData.channel_id || '';\nconst thumbnail = transcriptData.thumbnail || '';\nconst publishedAt = transcriptData.published_at || '';\n\n// Extract transcript from Apify response\nlet transcriptText = '';\nlet transcriptSegments = [];\n\nif (transcriptData.transcript && Array.isArray(transcriptData.transcript)) {\n // Apify returns array of {text, start, duration}\n transcriptSegments = transcriptData.transcript.map((seg, index) => ({\n index,\n text: seg.text || '',\n start: seg.start || 0,\n end: (seg.start || 0) + (seg.duration || 0),\n start_formatted: formatTime(seg.start || 0),\n end_formatted: formatTime((seg.start || 0) + (seg.duration || 0))\n }));\n transcriptText = transcriptSegments.map(s => s.text).join(' ');\n}\n\nfunction formatTime(seconds) {\n const mins = Math.floor(seconds 
/ 60);\n const secs = Math.floor(seconds % 60);\n return `${mins}:${secs.toString().padStart(2, '0')}`;\n}\n\nreturn [{\n json: {\n // Source info\n youtube_url: sourceUrl,\n video_id: videoId,\n video_name: videoTitle,\n video_url: videoUrl,\n channel_title: channelTitle,\n channel_id: channelId,\n thumbnail: thumbnail,\n published_at: publishedAt,\n \n // Duration\n duration: videoDuration,\n duration_formatted: formatTime(videoDuration),\n description: videoDescription,\n \n // Transcript data\n transcript_text: transcriptText,\n segments: transcriptSegments,\n segment_count: transcriptSegments.length,\n \n // Extra metadata from transcript node\n like_count: transcriptData.like_count || 0,\n comment_count: transcriptData.comment_count || 0,\n subscriber_count: transcriptData.subscriber_count || 0,\n is_auto_generated: transcriptData.is_auto_generated || false,\n \n // Metadata\n timestamp: new Date().toISOString()\n }\n}];"
},
"typeVersion": 2
},
{
"id": "b619e38f-7b4a-49c1-b61e-1b03b99def96",
"name": "Parse Key Actions",
"type": "n8n-nodes-base.code",
"position": [
3136,
4656
],
"parameters": {
"jsCode": "// ============================================================================\n// PARSE KEY ACTIONS v4.0 (Webcam Crop Support)\n// ============================================================================\n// - Parses Gemini response with new webcam detection schema\n// - Filters out talking-head-only clips (no usable screen content)\n// - KEEPS webcam overlay clips that have usable screen content + crop data\n// - Categorizes clips for downstream Creatomate processing\n// ============================================================================\n\nconst geminiOutput = $input.first().json;\nconst videoData = $('Set Video Data').first().json;\n\n// ============================================================================\n// PARSE GEMINI RESPONSE\n// ============================================================================\n\nlet keyMoments = [];\nlet responseText = '';\n\ntry {\n // Handle different Gemini response structures\n if (Array.isArray(geminiOutput)) {\n responseText = geminiOutput[0]?.content?.parts?.[0]?.text || '';\n } else if (geminiOutput.content?.parts) {\n responseText = geminiOutput.content.parts[0]?.text || '';\n } else if (typeof geminiOutput === 'string') {\n responseText = geminiOutput;\n } else if (geminiOutput.text) {\n responseText = geminiOutput.text;\n } else if (geminiOutput.output) {\n const output = geminiOutput.output;\n if (Array.isArray(output) && output[0]?.content?.parts) {\n responseText = output[0].content.parts[0]?.text || '';\n } else if (Array.isArray(output) && output[0]?.content) {\n responseText = output[0].content[0]?.text || JSON.stringify(output);\n } else {\n responseText = typeof output === 'string' ? 
output : JSON.stringify(output);\n }\n } else {\n responseText = JSON.stringify(geminiOutput);\n }\n \n // Clean markdown formatting\n responseText = responseText\n .replace(/```json\\n?/gi, '')\n .replace(/```\\n?/gi, '')\n .trim();\n \n keyMoments = JSON.parse(responseText);\n \n if (!Array.isArray(keyMoments)) {\n throw new Error('Response is not a JSON array');\n }\n \n} catch (parseError) {\n throw new Error(`FATAL: Failed to parse Gemini response - ${parseError.message}`);\n}\n\n// ============================================================================\n// VALIDATE: Must have at least 1 clip\n// ============================================================================\n\nif (keyMoments.length === 0) {\n throw new Error('FATAL: Gemini returned 0 key moments. Video may not contain usable B-roll.');\n}\n\nconsole.log(`\u2713 Parsed ${keyMoments.length} key moments from Gemini`);\n\n// ============================================================================\n// CATEGORIZE CLIPS BY USABILITY\n// ============================================================================\n\nconst totalClips = keyMoments.length;\n\n// Category 1: Clean screen recordings (no person visible)\nconst cleanScreenClips = keyMoments.filter(m => m.person_visible !== true);\n\n// Category 2: Webcam overlay clips WITH usable screen content (can be cropped)\nconst webcamOverlayClips = keyMoments.filter(m => \n m.person_visible === true && \n m.has_usable_screen_content === true &&\n m.safe_crop_zone !== null &&\n m.creatomate_crop !== null\n);\n\n// Category 3: Talking head only (NOT usable - will be filtered out)\nconst talkingHeadClips = keyMoments.filter(m => \n m.person_visible === true && \n (m.has_usable_screen_content === false || m.has_usable_screen_content === undefined)\n);\n\n// Category 4: Webcam overlay but MISSING crop data (unusable due to incomplete data)\nconst incompleteWebcamClips = keyMoments.filter(m =>\n m.person_visible === true &&\n 
m.has_usable_screen_content === true &&\n (m.safe_crop_zone === null || m.creatomate_crop === null)\n);\n\nconsole.log(`\ud83d\udcca Clip Categorization:`);\nconsole.log(` - Clean screen recordings: ${cleanScreenClips.length}`);\nconsole.log(` - Webcam overlay (croppable): ${webcamOverlayClips.length}`);\nconsole.log(` - Talking head only (filtered): ${talkingHeadClips.length}`);\nconsole.log(` - Incomplete webcam data (filtered): ${incompleteWebcamClips.length}`);\n\n// ============================================================================\n// COMBINE USABLE CLIPS\n// ============================================================================\n\n// Usable clips = clean screen + webcam overlay with valid crop data\nconst usableClips = [...cleanScreenClips, ...webcamOverlayClips];\n\n// Sort by start_seconds to maintain chronological order\nusableClips.sort((a, b) => (a.start_seconds || 0) - (b.start_seconds || 0));\n\n// Add processing metadata to each clip\nconst processedClips = usableClips.map((clip, index) => ({\n ...clip,\n // Add processing flags\n _processing: {\n clip_index: index,\n requires_crop: clip.person_visible === true && clip.has_usable_screen_content === true,\n crop_strategy: clip.safe_crop_zone?.strategy || null,\n content_preserved_percent: clip.safe_crop_zone?.content_preserved_percent || 100\n }\n}));\n\nconst filteredOutCount = talkingHeadClips.length + incompleteWebcamClips.length;\nconst clipsRequiringCrop = processedClips.filter(c => c._processing.requires_crop).length;\n\nconsole.log(`\u2713 Total usable clips: ${processedClips.length}`);\nconsole.log(`\u2713 Clips requiring crop: ${clipsRequiringCrop}`);\nconsole.log(`\u2713 Clips filtered out: ${filteredOutCount}`);\n\n// ============================================================================\n// VALIDATE: Must have clips AFTER filtering\n// ============================================================================\n\nif (processedClips.length === 0) {\n const 
errorDetails = [\n `Total clips from Gemini: ${totalClips}`,\n `Talking head clips (no screen content): ${talkingHeadClips.length}`,\n `Webcam clips with missing crop data: ${incompleteWebcamClips.length}`,\n `Clean screen clips: ${cleanScreenClips.length}`,\n `Croppable webcam clips: ${webcamOverlayClips.length}`\n ].join('. ');\n \n throw new Error(`FATAL: No usable B-roll clips after filtering. ${errorDetails}`);\n}\n\n// ============================================================================\n// QUICK STATS\n// ============================================================================\n\nconst uniqueApps = [...new Set(processedClips.map(m => m.app).filter(Boolean))];\nconst totalDuration = processedClips.reduce((sum, m) => {\n return sum + ((m.end_seconds || 0) - (m.start_seconds || 0));\n}, 0);\n\n// Webcam position distribution (for clips that have webcam)\nconst webcamPositions = webcamOverlayClips.reduce((acc, clip) => {\n const pos = clip.webcam_region?.position || 'unknown';\n acc[pos] = (acc[pos] || 0) + 1;\n return acc;\n}, {});\n\nconsole.log(`Apps detected: ${uniqueApps.join(', ') || 'none'}`);\nconsole.log(`Total action time: ${totalDuration.toFixed(0)}s`);\nif (Object.keys(webcamPositions).length > 0) {\n console.log(`Webcam positions: ${JSON.stringify(webcamPositions)}`);\n}\n\n// ============================================================================\n// RETURN\n// ============================================================================\n\nreturn [{\n json: {\n youtube_url: videoData.youtube_url,\n video_id: videoData.video_id,\n video_name: videoData.video_name,\n video_url: videoData.video_url,\n videoDescription: videoData.description,\n channel_title: videoData.channel_title,\n transcript_text: videoData.transcript_text,\n duration: videoData.duration,\n segments: videoData.segments,\n key_moments: processedClips,\n visual_analysis: {\n // Counts\n total_from_gemini: totalClips,\n usable_clips_count: processedClips.length,\n 
clean_screen_clips: cleanScreenClips.length,\n webcam_overlay_clips: webcamOverlayClips.length,\n clips_requiring_crop: clipsRequiringCrop,\n \n // Filtered out\n filtered_out_total: filteredOutCount,\n filtered_talking_head: talkingHeadClips.length,\n filtered_incomplete_data: incompleteWebcamClips.length,\n \n // Metadata\n unique_apps: uniqueApps,\n webcam_positions: webcamPositions,\n total_action_time_seconds: totalDuration,\n video_duration_seconds: videoData.duration,\n analyzed_at: new Date().toISOString()\n }\n }\n}];"
},
"typeVersion": 2
},
{
"id": "1c6080c7-750f-4141-a9e2-9941f96a1a2f",
"name": "Stage 2B - Visual1",
"type": "n8n-nodes-base.stickyNote",
"position": [
2768,
4912
],
"parameters": {
"color": 3,
"width": 340,
"height": 340,
"content": "## Stage 2B: Visual Analysis\n\n**KEY ACTION DETECTION v2**\n\nGemini watches the whole video and\nidentifies MEANINGFUL moments:\n\n- What the user accomplished\n- Time ranges (start \u2192 end)\n- Descriptive actions, not clicks\n\nExamples:\n- \"User created a Code node and\n connected it to the trigger\"\n- \"User typed a prompt into Claude\n asking for API integration help\""
},
"typeVersion": 1
},
{
"id": "723c1460-12da-41fc-b88e-612735eb388b",
"name": "AI Section Analyzer1",
"type": "@n8n/n8n-nodes-langchain.openAi",
"position": [
3408,
4656
],
"parameters": {
"modelId": {
"__rl": true,
"mode": "list",
"value": "gpt-5.2",
"cachedResultName": "GPT-5.2"
},
"options": {
"textFormat": {
"textOptions": {
"type": "json_schema",
"schema": "{\n \"type\": \"object\",\n \"properties\": {\n \"video_overview\": {\n \"type\": \"object\",\n \"properties\": {\n \"summary\": {\n \"type\": \"string\",\n \"description\": \"2-3 sentence summary of what the video covers\"\n },\n \"one_liner\": {\n \"type\": \"string\",\n \"description\": \"Single punchy sentence describing the video\"\n },\n \"main_argument\": {\n \"type\": \"string\",\n \"description\": \"The core thesis or main point being made\"\n },\n \"target_audience\": {\n \"type\": \"string\",\n \"description\": \"Who this video is for\"\n },\n \"content_style\": {\n \"type\": \"string\",\n \"description\": \"e.g., tutorial, walkthrough, explanation, case study\"\n },\n \"tone\": {\n \"type\": \"string\",\n \"description\": \"e.g., technical, conversational, educational\"\n },\n \"key_takeaways\": {\n \"type\": \"array\",\n \"items\": { \"type\": \"string\" },\n \"description\": \"3-5 main things viewers will learn\"\n },\n \"problems_addressed\": {\n \"type\": \"array\",\n \"items\": { \"type\": \"string\" },\n \"description\": \"Pain points or challenges the video solves\"\n },\n \"tools_mentioned\": {\n \"type\": \"array\",\n \"items\": { \"type\": \"string\" },\n \"description\": \"Software, platforms, or tools discussed\"\n },\n \"frameworks_explained\": {\n \"type\": \"array\",\n \"items\": { \"type\": \"string\" },\n \"description\": \"Methodologies, frameworks, or approaches taught\"\n },\n \"suggested_titles\": {\n \"type\": \"array\",\n \"items\": { \"type\": \"string\" },\n \"description\": \"3-5 alternative YouTube title options\"\n },\n \"seo_keywords\": {\n \"type\": \"array\",\n \"items\": { \"type\": \"string\" },\n \"description\": \"10-15 keywords for SEO and discoverability\"\n }\n },\n \"additionalProperties\": false,\n \"required\": [\n \"summary\",\n \"one_liner\",\n \"main_argument\",\n \"target_audience\",\n \"content_style\",\n \"tone\",\n \"key_takeaways\",\n \"problems_addressed\",\n \"tools_mentioned\",\n 
\"frameworks_explained\",\n \"suggested_titles\",\n \"seo_keywords\"\n ]\n }\n },\n \"additionalProperties\": false,\n \"required\": [\"video_overview\"]\n}"
}
}
},
"responses": {
"values": [
{
"role": "system",
"content": "=You are a video content analyzer for @adamfreelances, a technical YouTube channel focused on AI development, n8n automation, and building custom CRM systems.\n\nYour job is to analyze video transcripts to extract comprehensive metadata for:\n- Content repurposing\n- SEO optimization\n- Video understanding\n\nAlways maintain the brand voice: technical, educational, anti-guru (no hype, real implementation focus).\n\nRespond ONLY with valid JSON matching the required schema. No markdown, no explanations, just the JSON object."
},
{
"content": "=Analyze this video content and extract key metadata.\n\n## TRANSCRIPT\n{{ $json.transcript_text }}\n\n## KEY MOMENTS (Visual Analysis)\nThese are the on-screen actions detected by visual analysis:\n{{ $json.key_moments.toJsonString() }}\n\n## VIDEO METADATA\n- Duration: {{ $json.duration }} seconds\n- Segment count: {{ $json.segments.length }}\n- Video name: {{ $json.video_name }}\n- Video Description: {{ $('Set Video Data').item.json.description }}\n\n## INSTRUCTIONS\n1. Divide the content into 3-8 logical sections based on topic changes\n2. Extract a comprehensive video overview with summary, key takeaways, and SEO data\n3. Identify the best sections that could work as standalone YouTube Shorts (15-60 seconds)\n4. List all tools, frameworks, and technologies mentioned\n5. Generate suggested titles and SEO keywords\n\nReturn the structured JSON analysis."
}
]
},
"builtInTools": {}
},
"credentials": {
"openAiApi": {
"name": "<your credential>"
}
},
"typeVersion": 2.1
},
{
"id": "f90b2679-31fd-403c-adc8-59fc9c35c443",
"name": "Call 'Shorts Creation'1",
"type": "n8n-nodes-base.executeWorkflow",
"position": [
4224,
4656
],
"parameters": {
"options": {
"waitForSubWorkflow": false
},
"workflowId": {
"__rl": true,
"mode": "list",
"value": "nEzi6P6Sf5b5Fepd",
"cachedResultUrl": "/workflow/nEzi6P6Sf5b5Fepd",
"cachedResultName": "Short Creation (HUMAN)"
},
"workflowInputs": {
"value": {},
"schema": [],
"mappingMode": "defineBelow",
"matchingColumns": [],
"attemptToConvertTypes": false,
"convertFieldsToString": true
}
},
"typeVersion": 1.3
},
{
"id": "743ac45d-5f3f-45a0-a3cc-f624fa4e02b8",
"name": "Analyse Video",
"type": "@n8n/n8n-nodes-langchain.googleGemini",
"position": [
2896,
4656
],
"parameters": {
"text": "=Analyze this ENTIRE video from start to finish and find 3-second B-roll clips with maximum visual movement. You MUST scan the complete video duration, not just the intro.\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nPERSON VISIBILITY & WEBCAM OVERLAY DETECTION\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nThis prompt handles TWO distinct scenarios:\n\nSCENARIO A: FULL-FRAME PERSON (person_visible: true, has_usable_screen_content: false)\n- Talking head fills most/all of the frame\n- No significant screen recording content behind them\n- These clips are NOT usable for B-roll\n- Mark as: person_visible: true, has_usable_screen_content: false\n\nSCENARIO B: SCREEN RECORDING WITH WEBCAM OVERLAY (person_visible: true, has_usable_screen_content: true)\n- Main content is screen recording (code editor, browser, app UI, etc.)\n- Small webcam overlay shows person's face in a corner\n- The screen content IS usable if we crop out the webcam region\n- Mark as: person_visible: true, has_usable_screen_content: true\n- MUST provide webcam_region and safe_crop_zone data\n\nSCENARIO C: CLEAN SCREEN RECORDING (person_visible: false)\n- Pure screen recording with no person visible anywhere\n- Ideal B-roll - no cropping needed\n- Mark as: person_visible: false, 
has_usable_screen_content: true\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n5-SECOND BUFFER CHECK (CRITICAL FOR ALL SCENARIOS)\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nYou are extracting 3-second clips, but you MUST verify a 5-SECOND WINDOW for consistency.\nFor a clip timestamped start to end, check this EXPANDED range:\n\nBUFFER START = start minus 1 second\nBUFFER END = end plus 1 second\n\nCheck these 5 points within the 5-second buffer zone:\n\n1. BUFFER START (-1 sec before clip starts)\n2. CLIP START (actual start of the 3-sec clip)\n3. CLIP MIDDLE (1.5 seconds into the clip)\n4. CLIP END (actual end of the 3-sec clip)\n5. 
BUFFER END (+1 sec after clip ends)\n\nExample: For a clip at 0:10-0:13, check: 0:09, 0:10, 0:11.5, 0:13, 0:14\n\nFor SCENARIO B (webcam overlay), verify the webcam stays in the SAME position across all 5 checkpoints.\nIf the webcam moves or disappears mid-clip, the safe_crop_zone is invalid.\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\u26a0\ufe0f PERSON DETECTION ACCURACY - READ CAREFULLY \u26a0\ufe0f\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nWHAT COUNTS AS \"PERSON VISIBLE\" - BE EXTREMELY STRICT:\n\u2717 Full face on screen = PERSON VISIBLE\n\u2717 Partial face (forehead, chin, cheek, ear) = PERSON VISIBLE\n\u2717 Side profile of face = PERSON VISIBLE\n\u2717 Back of head = PERSON VISIBLE\n\u2717 Webcam overlay in corner (ANY size, even tiny) = PERSON VISIBLE\n\u2717 Hands typing on physical keyboard = PERSON VISIBLE\n\u2717 Arms or shoulders visible = PERSON VISIBLE\n\u2717 Body/torso visible = PERSON VISIBLE\n\u2717 Reflection of person in screen = PERSON VISIBLE\n\u2717 Picture-in-picture with person = PERSON VISIBLE\n\u2717 Blurred/transitioning person (fading in/out) = PERSON VISIBLE\n\u2717 Person visible for even a SINGLE FRAME within buffer = PERSON VISIBLE\n\u2717 Silhouette or shadow of person = PERSON VISIBLE\n\nONLY mark person_visible: false when 
you are 100% CERTAIN:\n\u2713 Pure screen recording with only cursor/typing on screen\n\u2713 Software UI with absolutely no human elements\n\u2713 Animated graphics/slides with no person\n\u2713 Terminal/code editor with no webcam overlay anywhere\n\u2713 Browser/app interface with no picture-in-picture\n\nWHEN IN DOUBT, MARK person_visible: true\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nWEBCAM REGION DETECTION (FOR SCENARIO B ONLY)\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nWhen a webcam overlay is present over usable screen content, you MUST:\n\n1. IDENTIFY WEBCAM CORNER POSITION\n - \"top-left\", \"top-right\", \"bottom-left\", \"bottom-right\", or \"custom\"\n \n2. MEASURE WEBCAM BOUNDING BOX (normalized 0-1 coordinates)\n All values are percentages of the total frame expressed as decimals (0.0 to 1.0)\n \n webcam_region: {\n x_min: [left edge, 0.0 = left side of frame],\n y_min: [top edge, 0.0 = top of frame],\n x_max: [right edge, 1.0 = right side of frame],\n y_max: [bottom edge, 1.0 = bottom of frame]\n }\n\n3. 
CALCULATE SAFE CROP ZONE\n The largest RECTANGULAR region that completely EXCLUDES the webcam.\n This will be used by Creatomate to render only the clean portion.\n \n For corner webcams, the safe zone is typically the OPPOSITE strip:\n - Webcam bottom-right \u2192 safe zone is everything ABOVE or to the LEFT\n - Webcam top-left \u2192 safe zone is everything BELOW or to the RIGHT\n \n Choose the crop that preserves the MOST screen content while fully excluding the webcam.\n\n4. ADD PADDING BUFFER\n Add 2% padding around the webcam region to ensure no face pixels leak through.\n If webcam_region.x_min = 0.75, use 0.73 for safe_crop_zone.x_max\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nSAFE CROP ZONE CALCULATION EXAMPLES\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nEXAMPLE 1: Webcam in BOTTOM-RIGHT corner\nFrame: 1920x1080, Webcam occupies bottom-right ~25% width, ~30% height\n\nwebcam_region: {\n x_min: 0.75, // starts at 75% from left\n y_min: 0.70, // starts at 70% from top\n x_max: 1.0, // extends to right edge\n y_max: 1.0 // extends to bottom edge\n}\n\nOPTION A - Crop to TOP portion (preserves full width):\nsafe_crop_zone: {\n x_min: 0.0,\n y_min: 0.0,\n x_max: 1.0,\n y_max: 0.68 // 2% padding above webcam\n}\nResult: Full 1920px width, top 68% of height 
(734px)\n\nOPTION B - Crop to LEFT portion (preserves full height):\nsafe_crop_zone: {\n x_min: 0.0,\n y_min: 0.0,\n x_max: 0.73, // 2% padding left of webcam\n y_max: 1.0\n}\nResult: Left 73% of width (1402px), full 1080px height\n\nChoose based on which preserves more important content. Generally prefer OPTION A for horizontal screen recordings.\n\n\nEXAMPLE 2: Webcam in TOP-LEFT corner\nFrame: 1920x1080, Webcam occupies top-left ~20% width, ~25% height\n\nwebcam_region: {\n x_min: 0.0,\n y_min: 0.0,\n x_max: 0.20,\n y_max: 0.25\n}\n\nsafe_crop_zone: {\n x_min: 0.0,\n y_min: 0.27, // 2% padding below webcam\n x_max: 1.0,\n y_max: 1.0\n}\nResult: Full width, bottom 73% of height\n\n\nEXAMPLE 3: Webcam in BOTTOM-LEFT corner (circular overlay)\nFrame: 1920x1080, Circular webcam overlay\n\nwebcam_region: {\n x_min: 0.02,\n y_min: 0.65,\n x_max: 0.22,\n y_max: 0.98\n}\n\nsafe_crop_zone: {\n x_min: 0.0,\n y_min: 0.0,\n x_max: 1.0,\n y_max: 0.63\n}\nResult: Full width, top 63% of height\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nQUANTITY REQUIREMENTS\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nMAXIMIZE CLIPS - Extract as many qualifying B-roll clips as possible\n- For videos under 10 minutes: Find 50-100 clips\n- For videos 10-20 minutes: Find 100-150 clips\n- For 
videos 20+ minutes: Find 150-250+ clips\n\n3-SECOND CLIPS = MORE CLIPS - Shorter duration means you can capture more moments\n\nIf you're finding fewer than 50, LOWER your threshold - include clips with moderate movement\n\nBetter to include a borderline clip than miss good B-roll\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nFULL VIDEO ANALYSIS REQUIREMENT\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nAnalyze from 0:00 to the very end\n\n- Intros often have more animation, but the BODY of the video contains valuable B-roll too\n- Specifically look for clips in the MIDDLE and LATER portions (after 2:00)\n- Aim for clips distributed EVENLY across the video timeline, not clustered at the start\n- Break the video into quarters and find clips in EACH quarter\n- LEAVE NO GAPS - If there's 30+ seconds without a clip, look harder at that section\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nWHAT MAKES GREAT B-ROLL (3 
seconds)\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nHIGH PRIORITY - Lots of visual movement:\n- Rapid typing with code appearing on screen\n- Dragging and dropping nodes/elements\n- Drawing connections between nodes\n- UI elements animating or expanding\n- Terminal output scrolling rapidly\n- Files/folders being created or moved\n- Cursor moving purposefully across screen\n- Workflow executing with visible progress\n- Code being highlighted or selected\n- Browser navigation with page loads\n- Switching between applications\n- Form filling with visible text entry\n- API responses populating on screen\n- Clicking through menus or dropdowns\n- Selecting options or checkboxes\n- Copy/paste actions with visible results\n\nMEDIUM PRIORITY - Include to maximize coverage:\n- Moderate scrolling through code or content\n- Clicking buttons with visible state changes\n- Tab switching in browser or IDE\n- Expanding/collapsing sections\n- Cursor selection highlighting text\n- Loading spinners with content appearing after\n- Mouse moving between elements\n- Tooltips or hover states appearing\n\nSKIP - Not visually interesting:\n- Completely static screens (2+ seconds no movement)\n- Still slides or title cards held for reading\n- Mouse sitting 
idle\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nOUTPUT FORMAT - CREATOMATE-READY JSON\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nYOUR ENTIRE RESPONSE MUST BE EXACTLY THIS FORMAT - NOTHING ELSE:\n\n[\n {\n \"clip_id\": \"clip_001\",\n \"start\": \"0:05\",\n \"end\": \"0:08\",\n \"start_seconds\": 5,\n \"end_seconds\": 8,\n \"app\": \"VS Code\",\n \"action\": \"Rapid typing - function declaration appearing character by character\",\n \"screen_description\": \"Dark VS Code editor with JavaScript file open, green syntax highlighting\",\n \"person_visible\": false,\n \"has_usable_screen_content\": true,\n \"buffer_check\": {\n \"0:04\": \"clean\",\n \"0:05\": \"clean\",\n \"0:06.5\": \"clean\",\n \"0:08\": \"clean\",\n \"0:09\": \"clean\",\n \"summary\": \"all 5 points verified person-free\"\n },\n \"webcam_region\": null,\n \"safe_crop_zone\": null,\n \"creatomate_crop\": null\n },\n {\n \"clip_id\": \"clip_002\",\n \"start\": \"1:42\",\n \"end\": \"1:45\",\n \"start_seconds\": 102,\n \"end_seconds\": 105,\n \"app\": \"n8n\",\n \"action\": \"Dragging connection line between two workflow nodes\",\n \"screen_description\": \"n8n canvas with HTTP Request node connecting to Code node, dark theme\",\n \"person_visible\": true,\n \"has_usable_screen_content\": true,\n 
\"buffer_check\": {\n \"1:41\": \"webcam bottom-right\",\n \"1:42\": \"webcam bottom-right\",\n \"1:43.5\": \"webcam bottom-right\",\n \"1:45\": \"webcam bottom-right\",\n \"1:46\": \"webcam bottom-right\",\n \"summary\": \"webcam overlay consistent across all 5 points, position stable\"\n },\n \"webcam_region\": {\n \"position\": \"bottom-right\",\n \"x_min\": 0.75,\n \"y_min\": 0.72,\n \"x_max\": 1.0,\n \"y_max\": 1.0,\n \"shape\": \"rectangle\",\n \"approximate_size_percent\": 7\n },\n \"safe_crop_zone\": {\n \"x_min\": 0.0,\n \"y_min\": 0.0,\n \"x_max\": 1.0,\n \"y_max\": 0.70,\n \"strategy\": \"crop_above_webcam\",\n \"content_preserved_percent\": 70\n },\n \"creatomate_crop\": {\n \"source_x\": \"0%\",\n \"source_y\": \"0%\",\n \"source_width\": \"100%\",\n \"source_height\": \"70%\",\n \"fit\": \"fill\",\n \"clip\": true\n }\n },\n {\n \"clip_id\": \"clip_003\",\n \"start\": \"3:15\",\n \"end\": \"3:18\",\n \"start_seconds\": 195,\n \"end_seconds\": 198,\n \"app\": \"Chrome\",\n \"action\": \"Scrolling through API documentation page\",\n \"screen_description\": \"Browser showing Supabase docs with code examples visible\",\n \"person_visible\": true,\n \"has_usable_screen_content\": true,\n \"buffer_check\": {\n \"3:14\": \"webcam top-left circular\",\n \"3:15\": \"webcam top-left circular\",\n \"3:16.5\": \"webcam top-left circular\",\n \"3:18\": \"webcam top-left circular\",\n \"3:19\": \"webcam top-left circular\",\n \"summary\": \"circular webcam overlay consistent, position stable\"\n },\n \"webcam_region\": {\n \"position\": \"top-left\",\n \"x_min\": 0.01,\n \"y_min\": 0.02,\n \"x_max\": 0.18,\n \"y_max\": 0.28,\n \"shape\": \"circle\",\n \"approximate_size_percent\": 4\n },\n \"safe_crop_zone\": {\n \"x_min\": 0.0,\n \"y_min\": 0.30,\n \"x_max\": 1.0,\n \"y_max\": 1.0,\n \"strategy\": \"crop_below_webcam\",\n \"content_preserved_percent\": 70\n },\n \"creatomate_crop\": {\n \"source_x\": \"0%\",\n \"source_y\": \"30%\",\n \"source_width\": \"100%\",\n 
\"source_height\": \"70%\",\n \"fit\": \"fill\",\n \"clip\": true\n }\n },\n {\n \"clip_id\": \"clip_004\",\n \"start\": \"5:22\",\n \"end\": \"5:25\",\n \"start_seconds\": 322,\n \"end_seconds\": 325,\n \"app\": \"Talking Head\",\n \"action\": \"Speaker explaining concept\",\n \"screen_description\": \"Person speaking directly to camera, blurred background\",\n \"person_visible\": true,\n \"has_usable_screen_content\": false,\n \"buffer_check\": {\n \"5:21\": \"full frame person\",\n \"5:22\": \"full frame person\",\n \"5:23.5\": \"full frame person\",\n \"5:25\": \"full frame person\",\n \"5:26\": \"full frame person\",\n \"summary\": \"talking head - no usable screen content\"\n },\n \"webcam_region\": null,\n \"safe_crop_zone\": null,\n \"creatomate_crop\": null\n }\n]\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nFIELD SCHEMA - ALL FIELDS REQUIRED\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nCORE TIMING FIELDS:\n| Field | Type | Format | Example |\n|-------|------|--------|---------|\n| clip_id | string | \"clip_NNN\" | \"clip_001\" |\n| start | string | \"M:SS\" or \"MM:SS\" | \"0:05\", \"12:34\" |\n| end | string | \"M:SS\" or \"MM:SS\" | \"0:08\", \"12:37\" |\n| start_seconds | integer | seconds from 0:00 | 5, 754 |\n| end_seconds | integer | start_seconds + 3 
| 8, 757 |\n\nCONTENT DESCRIPTION FIELDS:\n| Field | Type | Description |\n|-------|------|-------------|\n| app | string | Application or platform shown on screen |\n| action | string | What is visually happening (be specific about movement) |\n| screen_description | string | Brief description of what the screen looks like |\n\nPERSON DETECTION FIELDS:\n| Field | Type | Description |\n|-------|------|-------------|\n| person_visible | boolean | true if ANY person element visible at ANY buffer checkpoint |\n| has_usable_screen_content | boolean | true if main content is screen recording (even with webcam overlay) |\n| buffer_check | object | Per-timestamp verification with status for each of 5 checkpoints |\n\nWEBCAM REGION FIELDS (null if person_visible is false OR has_usable_screen_content is false):\n| Field | Type | Description |\n|-------|------|-------------|\n| webcam_region.position | string | \"top-left\", \"top-right\", \"bottom-left\", \"bottom-right\", \"custom\" |\n| webcam_region.x_min | float | Left edge as 0-1 decimal |\n| webcam_region.y_min | float | Top edge as 0-1 decimal |\n| webcam_region.x_max | float | Right edge as 0-1 decimal |\n| webcam_region.y_max | float | Bottom edge as 0-1 decimal |\n| webcam_region.shape | string | \"rectangle\", \"circle\", \"rounded\" |\n| webcam_region.approximate_size_percent | integer | Rough % of frame the webcam occupies |\n\nSAFE CROP ZONE FIELDS (null if no usable crop possible):\n| Field | Type | Description |\n|-------|------|-------------|\n| safe_crop_zone.x_min | float | Left edge of safe area as 0-1 decimal |\n| safe_crop_zone.y_min | float | Top edge of safe area as 0-1 decimal |\n| safe_crop_zone.x_max | float | Right edge of safe area as 0-1 decimal |\n| safe_crop_zone.y_max | float | Bottom edge of safe area as 0-1 decimal |\n| safe_crop_zone.strategy | string | \"crop_above_webcam\", \"crop_below_webcam\", \"crop_left_of_webcam\", \"crop_right_of_webcam\" |\n| 
safe_crop_zone.content_preserved_percent | integer | % of original frame preserved after crop |\n\nCREATOMATE CROP FIELDS (null if no crop needed or not usable):\n| Field | Type | Description |\n|-------|------|-------------|\n| creatomate_crop.source_x | string | Starting X position as percentage (e.g., \"0%\", \"25%\") |\n| creatomate_crop.source_y | string | Starting Y position as percentage (e.g., \"0%\", \"30%\") |\n| creatomate_crop.source_width | string | Width to extract as percentage (e.g., \"100%\", \"75%\") |\n| creatomate_crop.source_height | string | Height to extract as percentage (e.g., \"70%\", \"100%\") |\n| creatomate_crop.fit | string | Always \"fill\" for cropped content |\n| creatomate_crop.clip | boolean | Always true for cropped content |\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nCREATOMATE INTEGRATION NOTES\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nThe creatomate_crop object is designed to map directly to Creatomate's composition element.\n\nFor a clip with webcam in bottom-right (safe zone is top 70%):\n{\n \"creatomate_crop\": {\n \"source_x\": \"0%\", // Start from left edge\n \"source_y\": \"0%\", // Start from top edge\n \"source_width\": \"100%\", // Full width\n \"source_height\": \"70%\", // Only top 70%\n \"fit\": \"fill\",\n 
\"clip\": true\n }\n}\n\nThis translates to Creatomate RenderScript as:\n{\n \"type\": \"composition\",\n \"width\": 1080, // Your target output width\n \"height\": 1920, // Your target output height\n \"elements\": [{\n \"type\": \"video\",\n \"source\": \"VIDEO_URL\",\n \"trim_start\": start_seconds,\n \"trim_duration\": 3,\n \"x\": \"50%\",\n \"y\": \"50%\",\n \"width\": \"100%\",\n \"height\": \"142.86%\", // 100 / 0.70 = ~143% to scale up the cropped region\n \"y_anchor\": \"0%\", // Anchor to top to show the top 70%\n \"fit\": \"fill\",\n \"clip\": true\n }]\n}\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nSTRICT RESPONSE RULES\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\n- START WITH [ - First character of your response must be [\n- END WITH ] - Last character of your response must be ]\n- NO TEXT BEFORE - No \"Here are the clips:\" or any introduction\n- NO TEXT AFTER - No \"I found X clips\" or any summary\n- NO MARKDOWN - Do not wrap in ```json``` code blocks\n- VALID JSON ONLY - Response must parse with 
JSON.parse()\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nCLIP EXTRACTION RULES\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\n1. MAXIMIZE CLIPS - Extract as many as possible. Fill the context window.\n\n2. STRICT 3-SECOND CLIPS - Every clip must be exactly 3 seconds (end_seconds = start_seconds + 3)\n\n3. 5-SECOND BUFFER CHECK - Verify each of the 5 timestamps individually\n\n4. WEBCAM CONSISTENCY - For clips with webcam overlay, verify webcam position is stable across all 5 buffer checkpoints\n\n5. FULL VIDEO COVERAGE - Include clips from intro, middle, AND end sections proportionally\n\n6. MOVEMENT PRIORITY - Prioritize clips where things are visually happening\n\n7. ONE ACTION PER CLIP - Each clip = single focused activity\n\n8. DESCRIPTIVE ACTIONS - Describe what's visually happening, not what's being discussed\n\n9. NO OVERLAPPING - Clips should not overlap. Minimum 1 second gap between clips.\n\n10. CONSECUTIVE OK - Back-to-back clips from the same screen are fine if they capture different actions\n\n11. INCLUDE WEBCAM OVERLAYS - Clips with webcam overlays ARE usable if has_usable_screen_content is true. Provide the crop data.\n\n12. 
SKIP TALKING HEAD ONLY - Only skip clips where the entire frame is the person with no screen content behind them.\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nDISTRIBUTION CHECK\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\nBefore finalizing, verify your clips are distributed across the video:\n\n- First quarter (0% - 25%): Should have ~25% of clips\n- Second quarter (25% - 50%): Should have ~25% of clips\n- Third quarter (50% - 75%): Should have ~25% of clips\n- Final quarter (75% - 100%): Should have ~25% of clips\n\nIf any quarter has less than 15% of total clips, go back and find more moments in that section.\n\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\nCOMMON MISTAKES TO 
AVOID\n\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\n\n\u274c \"The screen recording started so it must be clean\" - WRONG, check each timestamp\n\u274c \"I didn't see a face at the start so the whole clip is fine\" - WRONG, check ALL 5 points\n\u274c \"It's mostly screen recording\" - WRONG, detect webcam overlays and provide crop data\n\u274c \"The webcam overlay is small/in the corner\" - STILL requires webcam_region and safe_crop_zone\n\u274c \"The person is blurry/transitioning\" - WRONG, still counts as person visible\n\u274c \"I checked nearby and it was clean\" - WRONG, check the EXACT timestamps listed\n\u274c Skipping clips just because they have webcam overlay - WRONG, these ARE usable with cropping\n\u274c Setting has_usable_screen_content to false when there's a webcam over screen recording - WRONG, the screen content IS usable\n\u274c Providing inconsistent webcam positions across buffer checkpoints",
"modelId": {
"__rl": true,
"mode": "list",
"value": "models/gemini-pro-latest",
"cachedResultName": "models/gemini-pro-latest"
},
"options": {},
"resource": "video",
"operation": "analyze",
"videoUrls": "={{ $json.video_url }}"
},
"credentials": {
"googlePalmApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.1
},
{
"id": "4aec8a26-0cbc-494b-957b-f4db32eb11f7",
"name": "Edit Fields",
"type": "n8n-nodes-base.set",
"position": [
3760,
4656
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "c611f8f0-bc81-4af9-82c5-d40ca070ab90",
"name": "youtube_url",
"type": "string",
"value": "={{ $('Parse Key Actions').item.json.youtube_url }}"
},
{
"id": "ac96a0a0-23b0-40b0-a291-514a649bb4ee",
"name": "video_id",
"type": "string",
"value": "={{ $('Parse Key Actions').item.json.video_id }}"
},
{
"id": "6816246e-309e-483a-91be-500e9c2f8ece",
"name": "video_name",
"type": "string",
"value": "={{ $('Parse Key Actions').item.json.video_name }}"
},
{
"id": "9a3028c8-0ba1-4a35-ad03-1298540a93a1",
"name": "video_url",
"type": "string",
"value": "={{ $('Parse Key Actions').item.json.video_url }}"
},
{
"id": "7fbca206-e0c5-4f56-a534-3ad65d06e363",
"name": "videoDescription",
"type": "string",
"value": "={{ $('Parse Key Actions').item.json.videoDescription }}"
},
{
"id": "57b1b81e-d0a1-4b46-b737-343c72f50852",
"name": "channel_title",
"type": "string",
"value": "={{ $('Parse Key Actions').item.json.channel_title }}"
},
{
"id": "74d26ea6-187f-48dd-95a1-d10d3fbc1c79",
"name": "transcript_text",
"type": "string",
"value": "={{ $('Parse Key Actions').item.json.transcript_text }}"
},
{
"id": "9e54ab35-aff4-4a18-a78b-c1423d3c7772",
"name": "segments",
"type": "array",
"value": "={{ $('Parse Key Actions').item.json.segments }}"
},
{
"id": "4469a520-7d1b-4056-95ff-cfa957696415",
"name": "key_moments",
"type": "array",
"value": "={{ $('Parse Key Actions').item.json.key_moments }}"
},
{
"id": "beabfa7b-b5b1-4fc2-bd7d-f835049060db",
"name": "visual_analysis",
"type": "object",
"value": "={{ $('Parse Key Actions').item.json.visual_analysis }}"
},
{
"id": "0b297340-ee40-4248-a480-bf06d62707df",
"name": "visual_analysis.video_duration_seconds",
"type": "number",
"value": "={{ $('Parse Key Actions').item.json.visual_analysis.video_duration_seconds }}"
},
{
"id": "c876ad00-ef7a-495a-a3d0-ef6be6dc3a19",
"name": "output[0].content[0].text.video_overview",
"type": "object",
"value": "={{ $json.output[0].content[0].text.video_overview }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "97463a77-24b3-4f9e-9f83-2b912b105c13",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
2816,
4112
],
"parameters": {
"width": 640,
"height": 432,
"content": "## Watch the tutorial \n\nhttps://youtu.be/Gg_bNn-NeI8\n\n@[youtube](Gg_bNn-NeI8)"
},
"typeVersion": 1
}
],
"active": true,
"settings": {
"callerPolicy": "workflowsFromSameOwner",
"timeSavedMode": "fixed",
"availableInMCP": true,
"executionOrder": "v1"
},
"versionId": "be652460-42d9-4ddf-af31-cde72ab4cedc",
"connections": {
"Edit Fields": {
"main": [
[
{
"node": "Call 'Shorts Creation'1",
"type": "main",
"index": 0
}
]
]
},
"Analyse Video": {
"main": [
[
{
"node": "Parse Key Actions",
"type": "main",
Credentials you'll need
Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
apifyApi, googlePalmApi, openAiApi
About this workflow
This workflow takes any YouTube video URL and automatically extracts a rich, structured analysis — including transcript, key visual moments, video metadata, SEO keywords, and content section breakdowns. It's designed as the foundation layer for content repurposing, feeding its…
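The Gemini prompt in the JSON above asks for a bare JSON array of clips, each exactly 3 seconds long, with at least a 1-second gap between clips (rule 9) and with `start`/`end` strings that agree with `start_seconds`/`end_seconds`. A minimal sketch of how a downstream step might check a response against those rules follows. The function names `to_seconds` and `validate_clips` are hypothetical and not part of the workflow; only the field names come from the prompt.

```python
import json


def to_seconds(stamp: str) -> int:
    """Convert an "M:SS" or "MM:SS" timestamp into whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def validate_clips(raw: str) -> list[str]:
    """Return a list of rule violations found in the model's JSON response.

    Hypothetical helper: it checks only the timing rules stated in the prompt.
    """
    clips = json.loads(raw)  # raises ValueError if the response is not valid JSON
    problems = []
    previous_end = None
    for clip in sorted(clips, key=lambda c: c["start_seconds"]):
        cid = clip["clip_id"]
        start, end = clip["start_seconds"], clip["end_seconds"]
        # Prompt rule: end_seconds = start_seconds + 3
        if end != start + 3:
            problems.append(f"{cid}: not exactly 3 seconds")
        # The "M:SS" strings must describe the same instants as the integers
        if to_seconds(clip["start"]) != start or to_seconds(clip["end"]) != end:
            problems.append(f"{cid}: timestamp and seconds fields disagree")
        # Prompt rule 9: minimum 1 second gap between clips
        if previous_end is not None and start - previous_end < 1:
            problems.append(f"{cid}: less than 1 second after the previous clip")
        previous_end = end
    return problems
```

An empty list means the response passed these checks. Anything else could be logged or used to re-prompt the model, depending on how strict the pipeline needs to be.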
Source: https://n8n.io/workflows/13675/ (original creator credit).
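The prompt's field schema describes `safe_crop_zone` as 0-1 decimals and `creatomate_crop` as percentage strings, so one can be derived from the other rather than trusting the model to keep them consistent. The sketch below shows that arithmetic under the assumption that the crop is a simple axis-aligned rectangle. `to_creatomate_crop` and `preserved_percent` are hypothetical helper names, not functions from the workflow or from Creatomate.

```python
def to_creatomate_crop(zone: dict) -> dict:
    """Map a 0-1 safe_crop_zone onto the prompt's percentage-string crop fields."""

    def pct(value: float) -> str:
        # Round to a whole percent to avoid float noise such as 30.000000000000004
        return f"{round(value * 100)}%"

    return {
        "source_x": pct(zone["x_min"]),
        "source_y": pct(zone["y_min"]),
        "source_width": pct(zone["x_max"] - zone["x_min"]),
        "source_height": pct(zone["y_max"] - zone["y_min"]),
        "fit": "fill",   # the schema says this is always "fill" for cropped content
        "clip": True,    # the schema says this is always true for cropped content
    }


def preserved_percent(zone: dict) -> int:
    """Share of the original frame area kept by the crop, as a whole percent."""
    width = zone["x_max"] - zone["x_min"]
    height = zone["y_max"] - zone["y_min"]
    return round(width * height * 100)
```

Applied to the `safe_crop_zone` of `clip_003` in the prompt (`y_min` of 0.30, full width), this reproduces the `creatomate_crop` values and the `content_preserved_percent` of 70 given there, which is a quick way to cross-check model output before rendering.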