Convert Japanese Scripts to Multilingual Speech with GPT-4 and ElevenLabs

By Cheng Siong Chin (@cschin) on n8n.io

This workflow provides enterprise-grade translation and text-to-speech automation for international communication teams, content publishers, and localization services. It addresses producing high-quality multilingual audio content with consistent accuracy and natural delivery at…

Category: AI & RAG · Trigger: Event · Nodes: 21 · Complexity: ★★★★☆ · AI nodes: yes
Integrations: Agent, Agent Tool, OpenAI Chat, Output Parser Structured, HTTP Request

This workflow corresponds to n8n.io template #12384 — we link there as the canonical source.

This workflow follows the Agent → Agent Tool recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.

{
  "id": "U3bppdQVqjN1gMaG",
  "name": "Japanese Script to Multilingual Speech Synthesis with AI Translation",
  "tags": [],
  "nodes": [
    {
      "id": "7c182039-8de7-4161-a9f8-7d2ec0e4426d",
      "name": "Manual Trigger",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -2448,
        160
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "ca5b3553-981d-458a-8e3e-6e6094a74adc",
      "name": "Workflow Configuration",
      "type": "n8n-nodes-base.set",
      "position": [
        -2224,
        160
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "id-1",
              "name": "japaneseScript",
              "type": "string",
              "value": "<__PLACEHOLDER_VALUE__Japanese text to translate__>"
            },
            {
              "id": "id-2",
              "name": "targetLanguages",
              "type": "string",
              "value": "English,Spanish,French,German"
            },
            {
              "id": "id-3",
              "name": "voiceId",
              "type": "string",
              "value": "<__PLACEHOLDER_VALUE__ElevenLabs Voice ID__>"
            },
            {
              "id": "id-4",
              "name": "elevenLabsApiKey",
              "type": "string",
              "value": "<__PLACEHOLDER_VALUE__ElevenLabs API Key__>"
            },
            {
              "id": "id-5",
              "name": "voiceStability",
              "type": "number",
              "value": 0.5
            },
            {
              "id": "id-6",
              "name": "voiceSimilarityBoost",
              "type": "number",
              "value": 0.75
            },
            {
              "id": "id-7",
              "name": "modelId",
              "type": "string",
              "value": "eleven_multilingual_v2"
            }
          ]
        },
        "includeOtherFields": true
      },
      "typeVersion": 3.4
    },
    {
      "id": "ea3d6b8d-80ed-4362-a6d0-82d9d487509d",
      "name": "Translation Orchestrator Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -1864,
        160
      ],
      "parameters": {
        "text": "=Japanese script: {{ $json.japaneseScript }}\nTarget languages: {{ $json.targetLanguages }}",
        "options": {
          "systemMessage": "You are a multilingual translation orchestrator specializing in Japanese to multilingual translations.\n\nYour task is to:\n1. Call the Translation Agent Tool for EACH target language separately\n2. Ensure context-aware, culturally appropriate translations\n3. Maintain tone, formality, and nuance from the original Japanese text\n4. Return all translations in a structured format\n\nFor each language, call the Translation Agent Tool with the Japanese text and target language. Collect all results and return them in the structured output format."
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 3.1
    },
    {
      "id": "b6f0a7dd-8f9f-4946-a1c1-7f7863ddd751",
      "name": "Translation Agent Tool",
      "type": "@n8n/n8n-nodes-langchain.agentTool",
      "position": [
        -1872,
        384
      ],
      "parameters": {
        "text": "=Japanese text: {{ $fromAI(\"japaneseText\") }}\nTarget language: {{ $fromAI(\"targetLanguage\") }}",
        "options": {
          "systemMessage": "You are an expert Japanese translator with deep cultural knowledge.\n\nYour task is to:\n1. Translate the provided Japanese text to the target language\n2. Preserve the original tone, formality level, and emotional nuance\n3. Apply cultural adaptations where necessary (idioms, honorifics, cultural references)\n4. Provide context notes explaining translation choices\n5. Return the translation in the structured format\n\nEnsure the translation sounds natural in the target language while maintaining fidelity to the original meaning."
        },
        "hasOutputParser": true,
        "toolDescription": "Translates Japanese text to a target language with cultural context awareness"
      },
      "typeVersion": 3
    },
    {
      "id": "1b6b82c5-e2d3-44b7-ac99-e65832edfb74",
      "name": "OpenAI Chat Model - Orchestrator",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -2000,
        384
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4.1-mini"
        },
        "options": {},
        "builtInTools": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "954573a2-d5ae-4e31-976f-5f5f8282fb02",
      "name": "OpenAI Chat Model - Translation",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -1856,
        592
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4.1-mini"
        },
        "options": {},
        "builtInTools": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "cea6e291-db44-4ad6-92fc-29c4bf35f37c",
      "name": "Structured Output Parser - Orchestrator",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        -1584,
        384
      ],
      "parameters": {
        "schemaType": "manual",
        "inputSchema": "{\n\t\"type\": \"object\",\n\t\"properties\": {\n\t\t\"translations\": {\n\t\t\t\"type\": \"array\",\n\t\t\t\"items\": {\n\t\t\t\t\"type\": \"object\",\n\t\t\t\t\"properties\": {\n\t\t\t\t\t\"language\": {\n\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t},\n\t\t\t\t\t\"translatedText\": {\n\t\t\t\t\t\t\"type\": \"string\"\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\t\t},\n\t\t\"contextNotes\": {\n\t\t\t\"type\": \"string\"\n\t\t},\n\t\t\"culturalAdaptations\": {\n\t\t\t\"type\": \"array\",\n\t\t\t\"items\": {\n\t\t\t\t\"type\": \"string\"\n\t\t\t}\n\t\t}\n\t}\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "39c186f2-3d8e-4c67-bd33-9812cb3f4735",
      "name": "Structured Output Parser - Translation",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        -1664,
        608
      ],
      "parameters": {
        "schemaType": "manual",
        "inputSchema": "{\n\t\"type\": \"object\",\n\t\"properties\": {\n\t\t\"language\": {\n\t\t\t\"type\": \"string\"\n\t\t},\n\t\t\"translatedText\": {\n\t\t\t\"type\": \"string\"\n\t\t},\n\t\t\"contextNotes\": {\n\t\t\t\"type\": \"string\"\n\t\t},\n\t\t\"culturalAdaptations\": {\n\t\t\t\"type\": \"array\",\n\t\t\t\"items\": {\n\t\t\t\t\"type\": \"string\"\n\t\t\t}\n\t\t}\n\t}\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "b1dc20cc-3260-470c-8011-15c3024a9cc4",
      "name": "Prepare Translation Request",
      "type": "n8n-nodes-base.set",
      "position": [
        -1328,
        240
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "id-1",
              "name": "translatedText",
              "type": "string",
              "value": "={{ $json.output.translations[0].translatedText }}"
            },
            {
              "id": "id-2",
              "name": "language",
              "type": "string",
              "value": "={{ $json.output.translations[0].language }}"
            },
            {
              "id": "id-3",
              "name": "voiceId",
              "type": "string",
              "value": "={{ $('Workflow Configuration').first().json.voiceId }}"
            },
            {
              "id": "id-4",
              "name": "elevenLabsApiKey",
              "type": "string",
              "value": "={{ $('Workflow Configuration').first().json.elevenLabsApiKey }}"
            },
            {
              "id": "id-5",
              "name": "voiceSettings",
              "type": "object",
              "value": "={{ { \"stability\": $('Workflow Configuration').first().json.voiceStability, \"similarity_boost\": $('Workflow Configuration').first().json.voiceSimilarityBoost } }}"
            },
            {
              "id": "id-6",
              "name": "modelId",
              "type": "string",
              "value": "={{ $('Workflow Configuration').first().json.modelId }}"
            }
          ]
        },
        "includeOtherFields": true
      },
      "typeVersion": 3.4
    },
    {
      "id": "36b9e86e-b15c-4376-9dd3-fccb54b1f75c",
      "name": "ElevenLabs Text-to-Speech",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -1152,
        240
      ],
      "parameters": {
        "url": "=https://api.elevenlabs.io/v1/text-to-speech/{{ $json.voiceId }}",
        "method": "POST",
        "options": {
          "response": {
            "response": {
              "responseFormat": "file",
              "outputPropertyName": "audioData"
            }
          }
        },
        "jsonBody": "={{ JSON.stringify({ text: $json.translatedText, model_id: $json.modelId, voice_settings: $json.voiceSettings }) }}",
        "sendBody": true,
        "sendHeaders": true,
        "specifyBody": "json",
        "headerParameters": {
          "parameters": [
            {
              "name": "xi-api-key",
              "value": "={{ $json.elevenLabsApiKey }}"
            },
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        }
      },
      "typeVersion": 4.3
    },
    {
      "id": "9f42f910-6d78-43da-a65a-154b1400759d",
      "name": "Audio Quality Validation",
      "type": "n8n-nodes-base.code",
      "position": [
        -928,
        160
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// Runs once per item, so read the current item and return a single object.\nconst item = $input.item;\nconst audioData = item.binary?.audioData;\n\nif (!audioData) {\n  return {\n    json: {\n      language: item.json.language,\n      isValid: false,\n      error: 'No audio data received',\n      fileSize: 0\n    }\n  };\n}\n\n// Assumes the binary payload is held in memory as base64.\nconst fileSize = audioData.data ? Buffer.from(audioData.data, 'base64').length : 0;\nconst minSize = 1000;\nconst maxSize = 50 * 1024 * 1024;\n\nconst isValid = fileSize >= minSize && fileSize <= maxSize;\n\nreturn {\n  json: {\n    // Carried forward so later nodes can use $json.language.\n    language: item.json.language,\n    isValid: isValid,\n    fileSize: fileSize,\n    fileSizeKB: Math.round(fileSize / 1024),\n    mimeType: audioData.mimeType || 'audio/mpeg',\n    fileName: audioData.fileName || 'audio.mp3',\n    error: !isValid ? 'File size ' + Math.round(fileSize / 1024) + 'KB is outside valid range (1KB - 50MB)' : null\n  },\n  binary: {\n    audioData: audioData\n  }\n};"
      },
      "typeVersion": 2
    },
    {
      "id": "9075ac77-1754-4d0b-a620-05b9c23ab7ed",
      "name": "Check Audio Quality",
      "type": "n8n-nodes-base.if",
      "position": [
        -704,
        160
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "leftValue": "",
            "caseSensitive": false,
            "typeValidation": "loose"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "id-1",
              "operator": {
                "type": "boolean",
                "operation": "true"
              },
              "leftValue": "={{ $('Audio Quality Validation').item.json.isValid }}"
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "05ae6861-47f5-49f9-84c9-a97590ed1424",
      "name": "Standardize Audio Output",
      "type": "n8n-nodes-base.set",
      "position": [
        -464,
        192
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "id-1",
              "name": "status",
              "type": "string",
              "value": "success"
            },
            {
              "id": "id-2",
              "name": "audioFileName",
              "type": "string",
              "value": "={{ $json.language }}_{{ $now.toFormat(\"yyyyMMdd_HHmmss\") }}.mp3"
            },
            {
              "id": "id-3",
              "name": "audioSizeKB",
              "type": "number",
              "value": "={{ $json.fileSizeKB }}"
            },
            {
              "id": "id-4",
              "name": "audioMimeType",
              "type": "string",
              "value": "={{ $json.mimeType }}"
            },
            {
              "id": "id-5",
              "name": "deliveryFormat",
              "type": "string",
              "value": "binary"
            },
            {
              "id": "id-6",
              "name": "timestamp",
              "type": "string",
              "value": "={{ $now.toISO() }}"
            }
          ]
        },
        "includeOtherFields": true
      },
      "typeVersion": 3.4
    },
    {
      "id": "79de1456-30b2-43a7-a207-d65ab5111fd9",
      "name": "Handle Quality Failure",
      "type": "n8n-nodes-base.set",
      "position": [
        -464,
        368
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "id-1",
              "name": "status",
              "type": "string",
              "value": "failed"
            },
            {
              "id": "id-2",
              "name": "errorMessage",
              "type": "string",
              "value": "={{ $json.error }}"
            },
            {
              "id": "id-3",
              "name": "fileSize",
              "type": "number",
              "value": "={{ $json.fileSize }}"
            },
            {
              "id": "id-4",
              "name": "timestamp",
              "type": "string",
              "value": "={{ $now.toISO() }}"
            }
          ]
        },
        "includeOtherFields": true
      },
      "typeVersion": 3.4
    },
    {
      "id": "4caa407f-6968-467d-b10d-3635fd2e8bd4",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1328,
        -336
      ],
      "parameters": {
        "color": 6,
        "width": 448,
        "height": 336,
        "content": "## Prerequisites\nOpenAI API access with GPT-4 capabilities, active ElevenLabs subscription.\n## Use Cases\nEnterprise content localization, multilingual customer communications\n## Customization\nAdd language-specific translation agents, modify orchestration routing logic\n## Benefits\nDelivers consistent translation quality through intelligent routing"
      },
      "typeVersion": 1
    },
    {
      "id": "bb35e9cf-c635-4c3f-a098-69b773a621a9",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1776,
        -240
      ],
      "parameters": {
        "width": 400,
        "height": 240,
        "content": "## Setup Steps\n1. Add your OpenAI credential to both \"OpenAI Chat Model\" nodes\n2. Enter your ElevenLabs API key and voice ID in \"Workflow Configuration\"\n3. Set the Japanese script and target languages in \"Workflow Configuration\"\n4. Adjust the orchestrator and translation prompts for your content types\n5. Set the file-size thresholds in \"Audio Quality Validation\" to match your expected output"
      },
      "typeVersion": 1
    },
    {
      "id": "1de0077d-55dc-4015-9b19-59cfeccb69cc",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2496,
        -272
      ],
      "parameters": {
        "width": 672,
        "height": 272,
        "content": "## How It Works\nThis workflow automates translation and text-to-speech for international communication teams, content publishers, and localization services. It targets the production of multilingual audio content with consistent accuracy and natural delivery at scale. An orchestrator agent receives the Japanese script and the list of target languages, then calls a translation agent tool once for each language. The translation agent preserves tone and formality and returns structured output (translated text, context notes, and cultural adaptations) that feeds the ElevenLabs text-to-speech API. Each audio file is then validated by file size. Files that pass proceed to standardized formatting for delivery, while failures go to a dedicated error-handling branch that records the error message and file size."
      },
      "typeVersion": 1
    },
    {
      "id": "66277d1a-66eb-482b-a5b0-8682e0754049",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1008,
        32
      ],
      "parameters": {
        "color": 7,
        "width": 736,
        "height": 576,
        "content": "## Quality validator checks audio and standardizes output\n**Why**: Ensures output meets publication requirements before delivery, preventing costly quality issues"
      },
      "typeVersion": 1
    },
    {
      "id": "f7230d6c-f285-4d4f-9e2a-94835088b238",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2032,
        16
      ],
      "parameters": {
        "color": 7,
        "width": 640,
        "height": 720,
        "content": "## Translation agent converts text with cultural context\n**Why**: Delivers accurate, natural translations appropriate for target audience expectations and regional nuances"
      },
      "typeVersion": 1
    },
    {
      "id": "2c57d05d-3e4d-4f0c-a182-a9846581bb5d",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2480,
        16
      ],
      "parameters": {
        "color": 7,
        "width": 432,
        "height": 608,
        "content": "## Orchestrator analyzes content and selects translation strategy\n**Why**: Optimizes approach based on content complexity, domain, and language pairs for superior results"
      },
      "typeVersion": 1
    },
    {
      "id": "13c5e1eb-8698-44d1-b534-8d82b62e5581",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1376,
        16
      ],
      "parameters": {
        "color": 7,
        "width": 352,
        "height": 608,
        "content": "## ElevenLabs generates professional audio with optimized parameters\n**Why**: Creates broadcast-quality speech with proper pronunciation and natural intonation"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "availableInMCP": false,
    "executionOrder": "v1"
  },
  "versionId": "e89fbfba-0bdf-4b41-aec6-26057a686dd4",
  "connections": {
    "Manual Trigger": {
      "main": [
        [
          {
            "node": "Workflow Configuration",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Audio Quality": {
      "main": [
        [
          {
            "node": "Standardize Audio Output",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Handle Quality Failure",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Translation Agent Tool": {
      "ai_tool": [
        [
          {
            "node": "Translation Orchestrator Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "Workflow Configuration": {
      "main": [
        [
          {
            "node": "Translation Orchestrator Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Audio Quality Validation": {
      "main": [
        [
          {
            "node": "Check Audio Quality",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "ElevenLabs Text-to-Speech": {
      "main": [
        [
          {
            "node": "Audio Quality Validation",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Prepare Translation Request": {
      "main": [
        [
          {
            "node": "ElevenLabs Text-to-Speech",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Translation Orchestrator Agent": {
      "main": [
        [
          {
            "node": "Prepare Translation Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model - Translation": {
      "ai_languageModel": [
        [
          {
            "node": "Translation Agent Tool",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model - Orchestrator": {
      "ai_languageModel": [
        [
          {
            "node": "Translation Orchestrator Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser - Translation": {
      "ai_outputParser": [
        [
          {
            "node": "Translation Agent Tool",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser - Orchestrator": {
      "ai_outputParser": [
        [
          {
            "node": "Translation Orchestrator Agent",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    }
  }
}
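
As written, the "Prepare Translation Request" node reads only the first entry of the `translations` array, so a single language reaches ElevenLabs per run. One way to synthesize every target language is to fan the array out into one item per translation before the request is prepared. The following is a minimal sketch of that step in plain JavaScript, in the shape an n8n Code node (Run Once for All Items) would return. The function name `fanOutTranslations` and the sample data are hypothetical, and the sketch assumes the agent's parsed result sits under an `output` key.

```javascript
// Hypothetical helper: turn one orchestrator result into one item per language.
// Assumption: the parsed agent result has the shape { output: { translations: [...] } }.
function fanOutTranslations(agentJson, config) {
  const translations = agentJson?.output?.translations ?? [];
  return translations
    // Skip entries without usable text so no empty request is sent downstream.
    .filter((t) => typeof t.translatedText === "string" && t.translatedText.trim() !== "")
    .map((t) => ({
      json: {
        language: t.language,
        translatedText: t.translatedText,
        voiceId: config.voiceId,
        modelId: config.modelId,
        voiceSettings: {
          stability: config.voiceStability,
          similarity_boost: config.voiceSimilarityBoost,
        },
      },
    }));
}

// Sample data for illustration only.
const sampleAgentJson = {
  output: {
    translations: [
      { language: "English", translatedText: "Good morning." },
      { language: "Spanish", translatedText: "Buenos días." },
      { language: "French", translatedText: "" }, // dropped: empty text
    ],
  },
};
const sampleConfig = {
  voiceId: "voice-id-placeholder",
  modelId: "eleven_multilingual_v2",
  voiceStability: 0.5,
  voiceSimilarityBoost: 0.75,
};

const items = fanOutTranslations(sampleAgentJson, sampleConfig);
console.log(items.map((i) => i.json.language)); // [ 'English', 'Spanish' ]
```

Inside an n8n Code node, the last step would instead be something like `return fanOutTranslations($input.first().json, $('Workflow Configuration').first().json);`, after which the downstream nodes run once per language.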

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
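
In this workflow the ElevenLabs call does not use an n8n credential: the "ElevenLabs Text-to-Speech" HTTP Request node sends the key from "Workflow Configuration" in an `xi-api-key` header. The sketch below shows the request that node is meant to produce, written as plain JavaScript so its shape can be checked outside n8n. The function name `buildTtsRequest` and the placeholder values are hypothetical; the URL, header names, and body fields are taken from the workflow JSON above.

```javascript
// Hypothetical helper mirroring the "ElevenLabs Text-to-Speech" node's request.
function buildTtsRequest({ voiceId, apiKey, text, modelId, voiceSettings }) {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    options: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        "Content-Type": "application/json",
      },
      // JSON.stringify quotes and escapes the text, so quotes or line breaks
      // in a translation cannot break the request body.
      body: JSON.stringify({
        text,
        model_id: modelId,
        voice_settings: voiceSettings,
      }),
    },
  };
}

// Placeholder values for illustration only; no request is sent here.
const request = buildTtsRequest({
  voiceId: "voice-id-placeholder",
  apiKey: "api-key-placeholder",
  text: 'She said "hello".\nThen she left.',
  modelId: "eleven_multilingual_v2",
  voiceSettings: { stability: 0.5, similarity_boost: 0.75 },
});

console.log(request.url);
console.log(JSON.parse(request.options.body).model_id);
```

Passing `request.url` and `request.options` to `fetch` would send the call. Because the key travels as a plain workflow value, it is worth keeping it out of any exported copy of the workflow.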

Source: https://n8n.io/workflows/12384/ (original creator credit).
