{
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Multilingual Video Localization",
  "tags": [],
  "nodes": [
    {
      "id": "5a7d171d-f9f3-4a92-ad0e-4f4bea085d2c",
      "name": "Sticky Note - Overview",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1360,
        880
      ],
      "parameters": {
        "width": 668,
        "height": 960,
        "content": "## Try It Out!\n### Localize a spokesperson video into another language with a new presenter \u2014 no filming required.\n\nThis workflow transcribes a video, translates the speech, generates dubbed audio, creates a lip-synced video.\n\n### How it works\n1. **Manual Trigger** starts the workflow\n2. **Set Fields** defines the target language\n3. **Read Source Video** and **Read Local Presenter Image** load the input files in parallel\n4. **deAPI Transcribe Video** extracts the original speech as text with timestamps\n5. **AI Agent** translates the transcript into the target language\n6. **deAPI Generate Speech** creates dubbed audio in the target language\n7. **deAPI Generate From Audio** produces a lip-synced talking-head video from the dubbed audio, using the local presenter image as the first frame\n\n### Requirements\n- [deAPI](https://deapi.ai) account for transcription, TTS, video generation\n- Anthropic account for the AI Agent (translation)\n- A spokesperson video\n- A reference image of the local presenter\n- n8n instance must be on **HTTPS**\n\n### Need Help?\nJoin the [n8n Discord](https://discord.gg/n8n) or ask in the [Forum](https://community.n8n.io/)!\n\nHappy Automating!"
      },
      "typeVersion": 1
    },
    {
      "id": "4845a8b4-833f-40b4-a819-306e59b27d13",
      "name": "Sticky Note - Example",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -64,
        1888
      ],
      "parameters": {
        "color": 6,
        "width": 380,
        "height": 400,
        "content": "### Example Input\n\n**Source Video:**\nAn 8-second clip of a presenter speaking in English\n\n**Local Presenter Image:**\nA photo of the person who should appear in the localized video\n\n**Target Language:**\nSpanish"
      },
      "typeVersion": 1
    },
    {
      "id": "7a3b16f3-4445-49b7-968e-c80368a31265",
      "name": "Sticky Note - Trigger",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        352,
        1888
      ],
      "parameters": {
        "color": 7,
        "width": 400,
        "height": 460,
        "content": "## 1. Start & Configure\nClick **Test Workflow** to run.\n\nThe **Set Fields** node defines:\n- **target_language** \u2014 the language for the localized video (e.g. Spanish, Japanese, French)"
      },
      "typeVersion": 1
    },
    {
      "id": "6940f5b8-44b6-4249-bd6c-d33240ef2758",
      "name": "Sticky Note - Read Files",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        784,
        1712
      ],
      "parameters": {
        "color": 7,
        "width": 452,
        "height": 764,
        "content": "## 2. Load Files\nReads both input files in parallel:\n\n**Source Video** (top branch)\n- Output field: `video`\n- The original spokesperson video\n- Formats: MP4, MPEG, MOV, AVI, WMV, OGG\n\n**Local Presenter Image** (bottom branch)\n- Output field: `image`\n- Reference photo of the local presenter\n- Formats: JPG, JPEG, PNG, GIF, BMP, WebP\n- Max size: 10 MB\n\nUpdate the **File Path** in each node."
      },
      "typeVersion": 1
    },
    {
      "id": "f34468c8-4e5b-4f2c-8c71-04422f1add7f",
      "name": "Sticky Note - Transcribe",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1280,
        1888
      ],
      "parameters": {
        "color": 7,
        "width": 424,
        "height": 476,
        "content": "## 3. Transcribe\n[deAPI Documentation](https://docs.deapi.ai)\n\n**Transcribe Video** uses **Whisper Large V3** to extract the spoken text from the video.\n\nTimestamps are included so the AI can preserve pacing during translation."
      },
      "typeVersion": 1
    },
    {
      "id": "dc61ce3a-0ed5-4f2f-8642-529c58eb5ef8",
      "name": "Sticky Note - Translate",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1744,
        1888
      ],
      "parameters": {
        "color": 7,
        "width": 400,
        "height": 540,
        "content": "## 4. Translate\nThe **AI Agent** translates the transcript into the target language.\n\nIt preserves the natural tone and pacing of the original speech, adapting idioms and cultural references for the target audience."
      },
      "typeVersion": 1
    },
    {
      "id": "6477e3df-490a-4a37-8b9c-9a6e8c5d11ad",
      "name": "Sticky Note - TTS",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2176,
        1888
      ],
      "parameters": {
        "color": 7,
        "width": 400,
        "height": 492,
        "content": "## 5. Generate Dubbed Speech\n[deAPI Documentation](https://docs.deapi.ai)\n\n**Generate Speech** uses **Qwen3** to create natural-sounding speech from the translated text.\n\nSwap for **Clone a Voice** to preserve the original speaker's voice characteristics."
      },
      "typeVersion": 1
    },
    {
      "id": "af9b5122-2305-4803-880d-efaa7bfc28de",
      "name": "Sticky Note - Audio to Video",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2608,
        1888
      ],
      "parameters": {
        "color": 7,
        "width": 528,
        "height": 588,
        "content": "## 6. Generate Lip-Synced Video\n[deAPI Documentation](https://docs.deapi.ai)\n\n**Generate From Audio** uses **LTX-2.3 22B** to create a talking-head video synced to the dubbed speech.\n\nThe local presenter image is used as the first frame to guide the visual appearance."
      },
      "typeVersion": 1
    },
    {
      "id": "c1a91124-f87e-43fe-bd88-74ff43a00a7b",
      "name": "Manual Trigger",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        384,
        2192
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "f12fc28b-b55d-4b9e-b3f9-059751d56012",
      "name": "Set Fields",
      "type": "n8n-nodes-base.set",
      "position": [
        608,
        2192
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "field-target-language",
              "name": "target_language",
              "type": "string",
              "value": "Spanish"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "d13bdc81-57a2-4a9d-be84-76826691d664",
      "name": "Read Source Video",
      "type": "n8n-nodes-base.readWriteFile",
      "position": [
        960,
        2096
      ],
      "parameters": {
        "options": {
          "dataPropertyName": "video"
        },
        "fileSelector": "/path/to/your/spokesperson-video.mp4"
      },
      "typeVersion": 1
    },
    {
      "id": "6ef5ce65-2131-4608-b307-17459c0f7ce7",
      "name": "Read Local Presenter Image",
      "type": "n8n-nodes-base.readWriteFile",
      "position": [
        960,
        2288
      ],
      "parameters": {
        "options": {
          "dataPropertyName": "image"
        },
        "fileSelector": "/path/to/your/local-presenter.jpg"
      },
      "typeVersion": 1
    },
    {
      "id": "baccfe84-3b58-439e-816a-7602083bd603",
      "name": "deAPI Transcribe Video",
      "type": "n8n-nodes-deapi.deapi",
      "position": [
        1344,
        2096
      ],
      "parameters": {
        "source": "binary",
        "options": {
          "waitTimeout": 120
        },
        "resource": "video",
        "operation": "transcribe",
        "binaryPropertyName": "video"
      },
      "credentials": {
        "deApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "a685c3d7-ba6e-4bf9-88c1-a2ffca1a628e",
      "name": "Extract from File",
      "type": "n8n-nodes-base.extractFromFile",
      "position": [
        1552,
        2096
      ],
      "parameters": {
        "options": {},
        "operation": "text",
        "destinationKey": "text"
      },
      "typeVersion": 1.1
    },
    {
      "id": "91159328-15ff-4f45-abf8-9189631d8bf4",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        1840,
        2096
      ],
      "parameters": {
        "text": "=Translate the following transcript into {{ $('Set Fields').item.json.target_language }}.\n\nReturn ONLY the translated text, without timestamps, line numbers, or formatting. Preserve the natural pacing and tone of the original speech.\n\nTranscript:\n{{ $json.text }}",
        "options": {
          "systemMessage": "You are a professional translator specializing in video localization. Your goal is to produce natural-sounding translations that work well when spoken aloud.\n\nKey principles:\n- Preserve the tone, energy, and intent of the original speech\n- Adapt idioms and cultural references for the target audience\n- Keep sentences at a similar length to the original for lip-sync compatibility\n- Use natural spoken language, not formal written style\n- Return ONLY the translated text \u2014 no explanations, notes, or formatting"
        },
        "promptType": "define"
      },
      "typeVersion": 1.7
    },
    {
      "id": "9bdee101-3077-46f9-8796-7baa8488bf8d",
      "name": "Anthropic Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatAnthropic",
      "position": [
        1840,
        2288
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "claude-opus-4-6",
          "cachedResultName": "Claude Opus 4.6"
        },
        "options": {}
      },
      "credentials": {
        "anthropicApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "10f8b667-47bb-4333-a5d6-18b0617562f7",
      "name": "deAPI Generate Speech",
      "type": "n8n-nodes-deapi.deapi",
      "position": [
        2336,
        2096
      ],
      "parameters": {
        "text": "={{ $json.output }}",
        "model": "Qwen3_TTS_12Hz_1_7B_CustomVoice",
        "options": {
          "waitTimeout": 120
        },
        "resource": "audio",
        "operation": "generateSpeech",
        "qwen3Lang": "={{ $('Set Fields').item.json.target_language }}"
      },
      "credentials": {
        "deApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "bdb0e104-d7a2-4d39-b30f-d7b000bda81c",
      "name": "Merge Audio + Image",
      "type": "n8n-nodes-base.merge",
      "position": [
        2672,
        2272
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "combineBy": "combineByPosition"
      },
      "typeVersion": 3.2
    },
    {
      "id": "8d95a8c1-38ba-469b-94bb-41b4faedb3f4",
      "name": "deAPI Generate From Audio",
      "type": "n8n-nodes-deapi.deapi",
      "position": [
        2896,
        2272
      ],
      "parameters": {
        "prompt": "A person speaking naturally to the camera, subtle head movements and facial expressions, professional lighting, medium close-up shot, steady camera",
        "options": {
          "frames": 241,
          "firstFrame": "image",
          "waitTimeout": 300
        },
        "resource": "video",
        "operation": "generateFromAudio"
      },
      "credentials": {
        "deApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "connections": {
    "AI Agent": {
      "main": [
        [
          {
            "node": "deAPI Generate Speech",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Fields": {
      "main": [
        [
          {
            "node": "Read Source Video",
            "type": "main",
            "index": 0
          },
          {
            "node": "Read Local Presenter Image",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Manual Trigger": {
      "main": [
        [
          {
            "node": "Set Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract from File": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Read Source Video": {
      "main": [
        [
          {
            "node": "deAPI Transcribe Video",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Merge Audio + Image": {
      "main": [
        [
          {
            "node": "deAPI Generate From Audio",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Anthropic Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "deAPI Generate Speech": {
      "main": [
        [
          {
            "node": "Merge Audio + Image",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "deAPI Transcribe Video": {
      "main": [
        [
          {
            "node": "Extract from File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "deAPI Generate From Audio": {
      "main": [
        []
      ]
    },
    "Read Local Presenter Image": {
      "main": [
        [
          {
            "node": "Merge Audio + Image",
            "type": "main",
            "index": 1
          }
        ]
      ]
    }
  }
}