This workflow corresponds to n8n.io template #14190 — we link there as the canonical source.
This workflow follows the Agent → Anthropic Chat recipe pattern.
The workflow JSON
Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, update the file paths in the two Read nodes, and activate it.
{
"meta": {
"templateCredsSetupCompleted": true
},
"name": "Voice Clone Talking Avatar",
"tags": [],
"nodes": [
{
"id": "646f3aea-5b19-40fd-af2f-302c1d1bf63d",
"name": "Sticky Note - Overview",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1072,
-1504
],
"parameters": {
"width": 668,
"height": 820,
"content": "## Try It Out!\n### Clone a voice from a short audio clip and generate a talking avatar video with a custom first frame.\n\nThis workflow reads a reference audio file and a first frame image, clones the voice to speak new text, and generates a talking avatar video using the image as the opening frame.\n\n### How it works\n1. **Manual Trigger** starts the workflow\n2. **Set Fields** defines the text for the avatar to speak and the video prompt\n3. **Read Reference Audio** loads a short voice sample into the `audio` binary field\n4. **Read First Frame Image** loads the avatar image into the `image` binary field\n5. **deAPI Clone a Voice** clones the voice and generates new speech\n6. **Merge** combines the cloned audio and the first frame image\n7. **AI Agent** crafts a talking-avatar-optimized prompt and boosts it with the **deAPI Video Prompt Booster** tool, using the first frame image for visual context\n8. **deAPI Generate From Audio** creates a talking avatar video synced to the cloned speech, using the AI-crafted prompt and first frame image\n\n### Requirements\n- [deAPI](https://deapi.ai) account for voice cloning, prompt boosting, and video generation\n- Anthropic account for the AI Agent\n- A short reference audio file (3-10 seconds, MP3/WAV/FLAC/OGG)\n- A first frame image for the avatar (PNG/JPG)\n- n8n instance must be on **HTTPS**\n\n### Need Help?\nJoin the [n8n Discord](https://discord.gg/n8n) or ask in the [Forum](https://community.n8n.io/)!\n\nHappy Automating!"
},
"typeVersion": 1
},
{
"id": "d2771ad3-78e1-4c2f-a793-4e25751ef170",
"name": "Sticky Note - Trigger",
"type": "n8n-nodes-base.stickyNote",
"position": [
-2064,
-544
],
"parameters": {
"color": 7,
"width": 436,
"height": 540,
"content": "## 1. Start & Configure\nClick **Test Workflow** to run.\n\nThe **Set Fields** node defines:\n- **text** \u2014 what the avatar will say\n- **video_prompt** \u2014 visual description for the avatar video\n- **lang** \u2014 language for voice cloning"
},
"typeVersion": 1
},
{
"id": "64ec1cde-af35-4658-8ba0-6936bfa76578",
"name": "Sticky Note - Read Files",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1600,
-736
],
"parameters": {
"color": 7,
"width": 452,
"height": 828,
"content": "## 2. Load Files\nReads both input files in parallel into separate binary fields:\n\n**Reference Audio** (top branch)\n- Output field: `audio`\n- Duration: 3-10 seconds\n- Formats: MP3, WAV, FLAC, OGG, M4A\n- Max size: 10 MB\n\n**First Frame Image** (bottom branch)\n- Output field: `image`\n- The avatar image shown as the video's opening frame\n- Formats: PNG, JPG\n\nUpdate the **File Path** in each node."
},
"typeVersion": 1
},
{
"id": "4ffd73a6-6797-44c4-ac4b-f76394d5d7da",
"name": "Sticky Note - Clone",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1088,
-656
],
"parameters": {
"color": 7,
"width": 380,
"height": 540,
"content": "## 3. Clone Voice\n[deAPI Documentation](https://docs.deapi.ai)\n\n**Clone a Voice** uses **Qwen3 TTS VoiceClone** to clone the voice from the reference audio."
},
"typeVersion": 1
},
{
"id": "efaaee2d-7061-421a-91ac-5cb2db7c65b8",
"name": "Sticky Note - Generate Video",
"type": "n8n-nodes-base.stickyNote",
"position": [
-608,
-592
],
"parameters": {
"color": 7,
"width": 856,
"height": 764,
"content": "## 4. Merge, AI Boost & Generate Video\n[deAPI Documentation](https://docs.deapi.ai)\n\n**Merge** combines the cloned audio (`audio`) with the first frame image (`image`) into a single item.\n\n**AI Agent** takes the user's video prompt and speech text, crafts a talking-avatar-optimized prompt focusing on lip sync, facial expressions, and natural movement, then uses the **Video Prompt Booster** tool with the first frame image for final optimization.\n\n**Generate From Audio** uses **LTX-2.3 22B** to create a talking avatar video:\n- Audio in `audio` drives the speech sync\n- Image in `image` sets the opening frame\n- AI-crafted prompt guides the visual scene"
},
"typeVersion": 1
},
{
"id": "588ff323-1e9d-454f-9c50-e98891276a93",
"name": "Sticky Note - Example",
"type": "n8n-nodes-base.stickyNote",
"position": [
-2512,
-544
],
"parameters": {
"color": 6,
"width": 416,
"height": 560,
"content": "### Example Input\n\n**Reference Audio:**\nA 5-second clip of someone speaking naturally\n\n**First Frame Image:**\nA photo or AI-generated image of the avatar/presenter\n\n**Text to Speak:**\n\"Welcome to our channel! Today we're going to explore the latest advancements in artificial intelligence and how they can help your business grow.\"\n\n**Video Prompt:**\n\"A professional presenter speaking confidently in a modern, well-lit studio with a blurred tech background\"\n\n**Language:**\nEnglish"
},
"typeVersion": 1
},
{
"id": "d938d462-6785-4997-96ae-bb19e06f7fa8",
"name": "Manual Trigger",
"type": "n8n-nodes-base.manualTrigger",
"position": [
-2000,
-240
],
"parameters": {},
"typeVersion": 1
},
{
"id": "f0d4f059-282c-4e05-a99b-5ffeb04c414e",
"name": "Set Fields",
"type": "n8n-nodes-base.set",
"position": [
-1792,
-240
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "field-text",
"name": "text",
"type": "string",
"value": "Welcome to our channel! Today we're going to explore the latest advancements in artificial intelligence and how they can help your business grow."
},
{
"id": "field-video-prompt",
"name": "video_prompt",
"type": "string",
"value": "A professional presenter speaking confidently in a modern, well-lit studio with a blurred tech background"
},
{
"id": "field-lang",
"name": "lang",
"type": "string",
"value": "English"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "f4a9a87f-3ade-45dc-9bde-f7525666601d",
"name": "Read Reference Audio",
"type": "n8n-nodes-base.readWriteFile",
"position": [
-1424,
-368
],
"parameters": {
"options": {
"dataPropertyName": "audio"
},
"fileSelector": "/path/to/your/reference-audio.mp3"
},
"typeVersion": 1
},
{
"id": "19ae98a9-224f-4477-82ef-f6dcca6360de",
"name": "Read First Frame Image",
"type": "n8n-nodes-base.readWriteFile",
"position": [
-1424,
-112
],
"parameters": {
"options": {
"dataPropertyName": "image"
},
"fileSelector": "/path/to/your/avatar-image.png"
},
"typeVersion": 1
},
{
"id": "2d6b23b0-b162-4107-affe-4c25a513678d",
"name": "deAPI Clone a Voice",
"type": "n8n-nodes-deapi.deapi",
"position": [
-944,
-368
],
"parameters": {
"lang": "={{ $('Set Fields').item.json.lang }}",
"text": "={{ $('Set Fields').item.json.text }}",
"options": {
"waitTimeout": 120
},
"refAudio": "={{ $('Read Reference Audio').item.binary.audio }}",
"resource": "audio"
},
"credentials": {
"deApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "27248ed4-ab57-429b-bb4c-0e93fe9c9d84",
"name": "AI Agent",
"type": "@n8n/n8n-nodes-langchain.agent",
"position": [
-304,
-208
],
"parameters": {
"text": "=Create an optimized video generation prompt for a talking avatar video.\n\nThe user wants the avatar to say:\n\"{{ $('Set Fields').item.json.text }}\"\n\nTheir initial video prompt idea:\n\"{{ $('Set Fields').item.json.video_prompt }}\"\n\nUse the videoPromptBooster tool to optimize your final prompt. Return ONLY the boosted prompt text, nothing else.",
"options": {
"systemMessage": "You are a specialist in creating prompts for talking avatar videos. Your goal is to produce a single, highly descriptive video generation prompt that will result in a realistic talking avatar.\n\nKey principles for talking avatar prompts:\n- Emphasize natural lip synchronization and mouth movements matching speech\n- Include subtle facial expressions, eye blinks, and micro-expressions\n- Describe natural head movements (slight nods, tilts) that accompany speech\n- Specify consistent lighting and camera angle (typically front-facing, head-and-shoulders)\n- Maintain visual consistency with the provided first frame image\n- Match the tone of the speech content (e.g., enthusiastic, professional, casual)\n\nWorkflow:\n1. Analyze the speech text to understand the tone and energy level\n2. Refine the user's video prompt idea to focus on talking avatar qualities\n3. Use the videoPromptBooster tool to optimize your refined prompt\n4. Return ONLY the final boosted prompt text \u2014 no explanations, no formatting"
},
"promptType": "define"
},
"typeVersion": 1.7
},
{
"id": "b12a2aeb-2b81-4e3e-9dd3-3aedb6019dc8",
"name": "Anthropic Chat Model",
"type": "@n8n/n8n-nodes-langchain.lmChatAnthropic",
"position": [
-400,
-16
],
"parameters": {
"model": {
"__rl": true,
"mode": "list",
"value": "claude-opus-4-6",
"cachedResultName": "Claude Opus 4.6"
},
"options": {}
},
"credentials": {
"anthropicApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.3
},
{
"id": "3b3078a2-6848-4933-b402-1a9b9c9015fd",
"name": "Video prompt booster in deAPI",
"type": "n8n-nodes-deapi.deapiTool",
"position": [
-64,
-16
],
"parameters": {
"prompt": "={{ /*n8n-auto-generated-fromAI-override*/ $fromAI('Prompt', ``, 'string') }}",
"options": {
"binaryPropertyName": "={{ $('Read First Frame Image').item.binary.image }}"
},
"resource": "prompt",
"operation": "boostVideo"
},
"credentials": {
"deApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "bc99539a-9cdd-407c-a4bf-a8b734589bba",
"name": "deAPI Generate From Audio",
"type": "n8n-nodes-deapi.deapi",
"position": [
80,
-208
],
"parameters": {
"prompt": "={{ $json.output }}",
"options": {
"frames": 241,
"firstFrame": "={{ $('Read First Frame Image').item.binary.image }}",
"waitTimeout": 300
},
"resource": "video",
"operation": "generateFromAudio",
"audioBinaryProperty": "={{ $('deAPI Clone a Voice').item.binary.data }}"
},
"credentials": {
"deApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "1c978df7-b6d0-4ca9-a074-abc8254d57b7",
"name": "Merge",
"type": "n8n-nodes-base.merge",
"position": [
-544,
-208
],
"parameters": {
"mode": "combine",
"options": {},
"combineBy": "combineByPosition"
},
"typeVersion": 3.2
}
],
"active": false,
"settings": {
"executionOrder": "v1"
},
"connections": {
"Merge": {
"main": [
[
{
"node": "AI Agent",
"type": "main",
"index": 0
}
]
]
},
"AI Agent": {
"main": [
[
{
"node": "deAPI Generate From Audio",
"type": "main",
"index": 0
}
]
]
},
"Set Fields": {
"main": [
[
{
"node": "Read Reference Audio",
"type": "main",
"index": 0
},
{
"node": "Read First Frame Image",
"type": "main",
"index": 0
}
]
]
},
"Manual Trigger": {
"main": [
[
{
"node": "Set Fields",
"type": "main",
"index": 0
}
]
]
},
"deAPI Clone a Voice": {
"main": [
[
{
"node": "Merge",
"type": "main",
"index": 0
}
]
]
},
"Anthropic Chat Model": {
"ai_languageModel": [
[
{
"node": "AI Agent",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Read Reference Audio": {
"main": [
[
{
"node": "deAPI Clone a Voice",
"type": "main",
"index": 0
}
]
]
},
"Read First Frame Image": {
"main": [
[
{
"node": "Merge",
"type": "main",
"index": 1
}
]
]
},
"Video prompt booster in deAPI": {
"ai_tool": [
[
{
"node": "AI Agent",
"type": "ai_tool",
"index": 0
}
]
]
}
}
}
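To check the wiring after import, the `connections` object above can be flattened into a list of edges. The following is a minimal Python sketch, not part of n8n: the helper name `list_edges` is hypothetical, and `CONNECTIONS` is a hand-copied excerpt of the JSON above rather than the full object.

```python
from typing import Iterator

# Excerpt of the "connections" object from the workflow JSON above.
CONNECTIONS = {
    "deAPI Clone a Voice": {
        "main": [[{"node": "Merge", "type": "main", "index": 0}]]
    },
    "Read First Frame Image": {
        "main": [[{"node": "Merge", "type": "main", "index": 1}]]
    },
    "Anthropic Chat Model": {
        "ai_languageModel": [
            [{"node": "AI Agent", "type": "ai_languageModel", "index": 0}]
        ]
    },
}


def list_edges(connections: dict) -> Iterator[tuple[str, str, str, int]]:
    """Yield (source, target, connection_type, target_input_index) tuples."""
    for source, outputs in connections.items():
        # Each connection type ("main", "ai_tool", ...) maps to a list of
        # output slots; each slot holds the links leaving that output.
        for conn_type, slots in outputs.items():
            for links in slots:
                for link in links or []:
                    yield source, link["node"], conn_type, link["index"]


edges = sorted(list_edges(CONNECTIONS))
for src, dst, kind, idx in edges:
    print(f"{src} -> {dst} [{kind}, input {idx}]")
```

In this excerpt the Merge node receives the cloned voice on input 0 and the first frame image on input 1, which is what the `combineByPosition` setting pairs into a single item.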
Credentials you'll need
Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
anthropicApi
deApi
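The placeholders left in the published JSON (credential names set to `<your credential>` and file paths starting with `/path/to/your/`) can be turned into a setup checklist by scanning the `nodes` array. This is a rough Python sketch under stated assumptions: `pending_setup` is a hypothetical helper, and `NODES` is a trimmed excerpt of the workflow's nodes, not the full export.

```python
PLACEHOLDER_CREDENTIAL = "<your credential>"
PLACEHOLDER_PATH_PREFIX = "/path/to/your/"

# Trimmed excerpt of the "nodes" array from the workflow JSON.
NODES = [
    {
        "name": "Read Reference Audio",
        "parameters": {"fileSelector": "/path/to/your/reference-audio.mp3"},
    },
    {
        "name": "deAPI Clone a Voice",
        "credentials": {"deApi": {"name": "<your credential>"}},
    },
    {
        "name": "Anthropic Chat Model",
        "credentials": {"anthropicApi": {"name": "<your credential>"}},
    },
    {"name": "Merge", "parameters": {"mode": "combine"}},
]


def pending_setup(nodes: list[dict]) -> list[str]:
    """Return to-do items for placeholders still present in the nodes."""
    todo = []
    for node in nodes:
        # Credentials are keyed by credential type (e.g. "deApi").
        for cred_type, cred in node.get("credentials", {}).items():
            if cred.get("name") == PLACEHOLDER_CREDENTIAL:
                todo.append(f"{node['name']}: set {cred_type} credential")
        # File-reading nodes keep their path in parameters.fileSelector.
        path = node.get("parameters", {}).get("fileSelector", "")
        if path.startswith(PLACEHOLDER_PATH_PREFIX):
            todo.append(f"{node['name']}: replace file path {path}")
    return todo


todo = pending_setup(NODES)
for item in todo:
    print(item)
```

Running the same function over the full `nodes` array should also flag the other deAPI nodes and the Read First Frame Image node, since they carry the same placeholders.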
About this workflow
Content creators who want a consistent on-screen avatar without filming themselves
Marketing teams producing personalized video messages at scale
Educators building video lessons with a virtual presenter
Anyone who wants to turn text into a talking avatar video using a cloned…
Source: https://n8n.io/workflows/14190/ (original creator credit).