Telegram Bot: Analyze Images with GPT-4o-mini/NVIDIA VILA & Generate Images…

Original n8n title: Telegram Bot: Analyze Images with Gpt-4o-mini/nvidia Vila & Generate Images with Stable Diffusion 3

By Cheng Siong Chin (@cschin) on n8n.io

Transform your Telegram bot into an AI vision system using GPT-4o-Mini and NVIDIA Stable Diffusion 3. Perfect for content moderators, researchers, and developers. At start: processes Telegram messages (images→analysis, text→image generation). At Router: routes by content type…

AI & RAG · Trigger: Event · Nodes: 13 · Complexity: ★★★★☆ · AI nodes: yes · Integrations: Telegram Trigger, HTTP Request, Gmail, OpenAI, Telegram

This workflow corresponds to n8n.io template #9823 — we link there as the canonical source.

This workflow follows the Gmail → HTTP Request recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.
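Before importing, it can help to confirm that the export is internally consistent. The following is a minimal sketch, assuming the usual n8n export shape (a `nodes` array plus a `connections` map keyed by node name). `findDanglingConnections` is a hypothetical helper, not part of n8n, and `sample` is a hand-written stand-in rather than the full export.

```javascript
// Hypothetical helper (not an n8n API): reports connections whose source or
// target name does not match any entry in the workflow's "nodes" array.
function findDanglingConnections(workflow) {
  const names = new Set(workflow.nodes.map((node) => node.name));
  const problems = [];
  for (const [source, outputs] of Object.entries(workflow.connections || {})) {
    if (!names.has(source)) problems.push(`unknown source: ${source}`);
    // outputs looks like { main: [ [ { node, type, index } ] ] }
    for (const branches of Object.values(outputs)) {
      for (const branch of branches) {
        for (const link of branch || []) {
          if (!names.has(link.node)) problems.push(`unknown target: ${link.node}`);
        }
      }
    }
  }
  return problems;
}

// Tiny stand-in for a workflow export: two nodes, one connection.
const sample = {
  nodes: [{ name: 'Telegram Trigger' }, { name: 'Extract Message Data' }],
  connections: {
    'Telegram Trigger': {
      main: [[{ node: 'Extract Message Data', type: 'main', index: 0 }]],
    },
  },
};

console.log(findDanglingConnections(sample)); // prints []
```

An empty result means every connection points at a node that exists. To check the real export, parse the downloaded file with `JSON.parse` and pass the result to the same function.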

{
  "id": "zeql9BTOvMW9EP5c",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Smart Telegram Bot: Text\u2194Image via GPT-4V-NVIDIA Vila, Email the Results",
  "tags": [],
  "nodes": [
    {
      "id": "1e0660c6-9518-49d7-aa32-4228d94b8338",
      "name": "\ud83d\udcf1 Telegram Trigger",
      "type": "n8n-nodes-base.telegramTrigger",
      "position": [
        -832,
        -80
      ],
      "parameters": {
        "updates": [
          "message"
        ],
        "additionalFields": {}
      },
      "credentials": {
        "telegramApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "69846c8f-72b2-45e2-b32b-6b53716ee5f7",
      "name": "\ud83d\udccb Extract Message Data",
      "type": "n8n-nodes-base.code",
      "position": [
        -656,
        -80
      ],
      "parameters": {
        "jsCode": "// Extract and prepare message data\nconst item = $input.item.json;\nconst messageData = {\n  text: item.message?.text || item.message?.caption || '',\n  from: item.message?.from?.username || 'unknown',\n  chatId: item.message?.chat?.id,\n  messageId: item.message?.message_id,\n  timestamp: new Date().toISOString(),\n  hasPhoto: !!item.message?.photo,\n  hasDocument: !!item.message?.document,\n  hasAudio: !!item.message?.audio,\n  hasVoice: !!item.message?.voice,\n  photoFileId: item.message?.photo?.[item.message.photo.length - 1]?.file_id || null,\n  audioFileId: item.message?.audio?.file_id || null,\n  voiceFileId: item.message?.voice?.file_id || null\n};\n\n// Determine content type\nif (messageData.hasPhoto) {\n  messageData.contentType = 'image';\n} else if (messageData.hasAudio) {\n  messageData.contentType = 'audio';\n} else if (messageData.hasVoice) {\n  messageData.contentType = 'voice';\n} else if (messageData.hasDocument) {\n  messageData.contentType = 'document';\n} else {\n  messageData.contentType = 'text';\n}\n\nconsole.log('\ud83d\udce5 Processing message:', messageData);\nreturn { json: messageData };"
      },
      "typeVersion": 2
    },
    {
      "id": "46a6c0fe-4788-49e5-9476-e0e6adb9b505",
      "name": "NVIDIA API NVIDIA Stable Diffusion 3",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -96,
        112
      ],
      "parameters": {
        "url": "https://ai.api.nvidia.com/v1/genai/stabilityai/stable-diffusion-3-medium",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "prompt",
              "value": "={{ $json.result.text }}"
            },
            {
              "name": "cfg_scale",
              "value": "5"
            },
            {
              "name": "aspect_ratio",
              "value": "16:9"
            },
            {
              "name": "seed",
              "value": "0"
            },
            {
              "name": "steps",
              "value": "50"
            },
            {
              "name": "negative_prompt"
            }
          ]
        },
        "headerParameters": {
          "parameters": [
            {
              "name": "Authorization",
              "value": "Bearer YOUR_TOKEN_HERE"
            },
            {
              "name": "Accept",
              "value": "application/json"
            }
          ]
        }
      },
      "credentials": {
        "httpBearerAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "a6e5eaa6-1996-477f-8856-304b5ae427a3",
      "name": "HTTP Nvidia Vila",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -48,
        -272
      ],
      "parameters": {
        "url": "https://ai.api.nvidia.com/v1/vlm/nvidia/vila",
        "method": "POST",
        "options": {},
        "jsonBody": "={\n  \"messages\": [\n    {\n      \"role\": \"user\",\n      \"content\": \"Describe this image {{ $json.result.file_id }}    >\"\n    }\n  ],\n  \"max_tokens\": 1024,\n  \"temperature\": 0.20,\n  \"top_p\": 0.70,\n  \"stream\": false,\n  \"model\": \"nvidia/vila\"\n}",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "genericCredentialType"
      },
      "credentials": {
        "httpBearerAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "5391f189-99fb-4dd7-bbae-86edb1cb8cd4",
      "name": "\ud83d\udd00 Merge AI Results",
      "type": "n8n-nodes-base.merge",
      "position": [
        176,
        -176
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "joinMode": "keepEverything",
        "fieldsToMatchString": "choices[0].message.content, content"
      },
      "typeVersion": 3
    },
    {
      "id": "65a4515d-97f2-4660-963b-d63d315640b5",
      "name": "Send a message of image-to-text results",
      "type": "n8n-nodes-base.gmail",
      "position": [
        336,
        -176
      ],
      "parameters": {
        "sendTo": " ",
        "message": "=Hi Sir/Mdm,\n{{ $json.choices[0].message.content }}\n ",
        "options": {},
        "subject": "Results"
      },
      "credentials": {
        "gmailOAuth2": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 2.1
    },
    {
      "id": "5dc968b3-ad4f-41ca-a8ef-90945bddd604",
      "name": "Send a message of text-to-image results",
      "type": "n8n-nodes-base.gmail",
      "position": [
        304,
        112
      ],
      "parameters": {
        "sendTo": " ",
        "message": "=Hi Sir/Mdm,\nPls see the attached file based on your input.\n ",
        "options": {
          "attachmentsUi": {
            "attachmentsBinary": [
              {}
            ]
          }
        },
        "subject": "Results"
      },
      "credentials": {
        "gmailOAuth2": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 2.1
    },
    {
      "id": "77d75186-4dfa-4b87-af2b-7144079569f3",
      "name": "Convert to Binary File",
      "type": "n8n-nodes-base.code",
      "position": [
        128,
        112
      ],
      "parameters": {
        "jsCode": "// Process NVIDIA API response and convert base64 to binary\nconst items = $input.all();\n\nreturn items.map(item => {\n  // NVIDIA API returns base64 image in 'image' field\n  let base64Image;\n  \n  // Check different possible response formats\n  if (item.json.image) {\n    base64Image = item.json.image;\n  } else if (item.json.artifacts && item.json.artifacts[0]) {\n    base64Image = item.json.artifacts[0].base64;\n  } else if (item.json.data && item.json.data[0]) {\n    base64Image = item.json.data[0].b64_json;\n  }\n  \n  // Fail with a clear message if no known field held image data\n  if (typeof base64Image !== 'string') {\n    throw new Error('No base64 image found in NVIDIA API response');\n  }\n  \n  // Clean base64 string (remove data URL prefix if present)\n  const base64Clean = base64Image.replace(/^data:image\\/\\w+;base64,/, '');\n  \n  // Convert to buffer\n  const buffer = Buffer.from(base64Clean, 'base64');\n  \n  return {\n    json: {\n      fileName: 'generated-image.png',\n      mimeType: 'image/png',\n      prompt: item.json.prompt || 'Generated image'\n    },\n    binary: {\n      data: {\n        data: buffer.toString('base64'),\n        mimeType: 'image/png',\n        fileName: 'generated-image.png'\n      }\n    }\n  };\n});"
      },
      "typeVersion": 2
    },
    {
      "id": "2ffd5359-3dbe-407f-8933-ff320c8a1bd1",
      "name": "Analyze image using GPT-4O-Mini",
      "type": "@n8n/n8n-nodes-langchain.openAi",
      "position": [
        -48,
        -80
      ],
      "parameters": {
        "modelId": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini",
          "cachedResultName": "GPT-4O-MINI"
        },
        "options": {},
        "resource": "image",
        "inputType": "base64",
        "operation": "analyze"
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.8
    },
    {
      "id": "8cd4cdf1-0f3e-4eec-89d0-3a6f743ab985",
      "name": "\ud83d\udd00 Route by Content Type - Image or Text",
      "type": "n8n-nodes-base.switch",
      "position": [
        -496,
        -80
      ],
      "parameters": {
        "rules": {
          "values": [
            {
              "outputKey": "image",
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "192c9420-66ef-4732-b27d-a9cbdde74210",
                    "operator": {
                      "type": "string",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.contentType }}",
                    "rightValue": "image"
                  }
                ]
              },
              "renameOutput": true
            },
            {
              "outputKey": "text",
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "fb219757-9c3f-4ce6-a339-c2ba6918c92c",
                    "operator": {
                      "type": "string",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.contentType }}",
                    "rightValue": "text"
                  }
                ]
              },
              "renameOutput": true
            }
          ]
        },
        "options": {
          "fallbackOutput": 2
        }
      },
      "typeVersion": 3.2
    },
    {
      "id": "407b4d3b-eebd-443a-9b62-f43fa8b92668",
      "name": "Get Text from Telegram",
      "type": "n8n-nodes-base.telegram",
      "position": [
        -272,
        112
      ],
      "parameters": {
        "text": "=\ud83d\udcac **Text Message Received**\n\nI received your text message. For advanced AI processing, please send an image with your question or request.\n\n**Your message:** {{ $json.text }}\n\nPls check your email",
        "chatId": "={{ $json.chatId }}",
        "additionalFields": {}
      },
      "credentials": {
        "telegramApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "ae9d115e-c443-4fc1-baf7-b5e687430475",
      "name": "\ud83d\udcf8 Get Image file from Telegram",
      "type": "n8n-nodes-base.telegram",
      "position": [
        -272,
        -176
      ],
      "parameters": {
        "fileId": "={{ $json.photoFileId }}",
        "resource": "file",
        "additionalFields": {}
      },
      "credentials": {
        "telegramApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "f3e451f4-c75d-4fc5-b8a4-acb7a3ff31f5",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        496,
        -704
      ],
      "parameters": {
        "width": 656,
        "height": 1040,
        "content": "## Introduction\nTransform your Telegram bot into an AI vision system using GPT-4o-Mini and NVIDIA Stable Diffusion 3. Perfect for content moderators, researchers, and developers.\n\n## Workflow Explanatory\n1. **At start**: Processes Telegram messages: images\u2192analysis, text\u2192image generation\n2. **At Router**: Routes by content type\n3. **Upper path**: Analyzes images using Nvidia Vila + GPT-4o-Mini\n4. **Lower path**: Generates images from text via Stable Diffusion 3\n5. **At Merge**: Combines AI results\n6. **At Gmail**: Emails processed results\n\n## How It Works\n1. **Telegram Trigger** listens for messages (images, text, documents)\n2. **Content Router** directs images \u2192 AI analysis, text \u2192 image generation\n3. **Image Analysis**: Downloads image \u2192 GPT-4o-Mini vision analysis \u2192 Email results\n4. **Image Generation**: Text prompt \u2192 Stable Diffusion 3 \u2192 Email generated image\n5. **Gmail Notifications** send formatted reports\n\n## Prerequisites\n- Telegram Bot token (via @BotFather)\n- OpenAI API key (GPT-4 Vision)\n- NVIDIA API key (free tier available)\n- Gmail OAuth2 credentials\n\n## Setup Steps\n1. ** Create Telegram Bot** - Create Telegram bot and obtain token\n2. ** Configure API Credentials** - Configure API credentials in HTTP Request nodes\n3. ** Set Up Gmail OAuth2** - Set up Gmail OAuth2\n4. ** Import and Activate Workflow** - Import workflow, update credentials, and activate\n\n## Customization Options\n- Add more AI models (Anthropic, Gemini)\n- Route audio/documents to transcription/OCR\n- Replace Gmail with Slack or Discord\n- Connect to databases for storage\n\n## Benefits\n- **Speed**: Seconds per analysis vs. hours manually\n- **Accuracy**: AI-powered visual understanding\n- **Intelligence**: Historical tracking enables trend analysis"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "185d529d-b861-4c3a-b4af-fe30a79e7093",
  "connections": {
    "HTTP Nvidia Vila": {
      "main": [
        [
          {
            "node": "\ud83d\udd00 Merge AI Results",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\ud83d\udcf1 Telegram Trigger": {
      "main": [
        [
          {
            "node": "\ud83d\udccb Extract Message Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\ud83d\udd00 Merge AI Results": {
      "main": [
        [
          {
            "node": "Send a message of image-to-text results",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Convert to Binary File": {
      "main": [
        [
          {
            "node": "Send a message of text-to-image results",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get Text from Telegram": {
      "main": [
        [
          {
            "node": "NVIDIA API NVIDIA Stable Diffusion 3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\ud83d\udccb Extract Message Data": {
      "main": [
        [
          {
            "node": "\ud83d\udd00 Route by Content Type - Image or Text",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Analyze image using GPT-4O-Mini": {
      "main": [
        [
          {
            "node": "\ud83d\udd00 Merge AI Results",
            "type": "main",
            "index": 1
          }
        ]
      ]
    },
    "\ud83d\udcf8 Get Image file from Telegram": {
      "main": [
        [
          {
            "node": "Analyze image using GPT-4O-Mini",
            "type": "main",
            "index": 0
          },
          {
            "node": "HTTP Nvidia Vila",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "NVIDIA API NVIDIA Stable Diffusion 3": {
      "main": [
        [
          {
            "node": "Convert to Binary File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\ud83d\udd00 Route by Content Type - Image or Text": {
      "main": [
        [
          {
            "node": "\ud83d\udcf8 Get Image file from Telegram",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Get Text from Telegram",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
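
The routing done by the Extract Message Data and Route by Content Type nodes can be sketched outside n8n as plain functions. This is an illustrative sketch, not the workflow's own code: `classifyMessage`, `largestPhotoFileId`, and `routeOutput` are hypothetical names, the branch labels are made up for readability, and the update objects are hand-written stand-ins for Telegram payloads.

```javascript
// Same precedence as the Code node above: photo, audio, voice, document, else text.
function classifyMessage(update) {
  const message = update.message || {};
  if (message.photo) return 'image';
  if (message.audio) return 'audio';
  if (message.voice) return 'voice';
  if (message.document) return 'document';
  return 'text';
}

// Telegram lists photo sizes smallest first, so the last entry is the largest.
function largestPhotoFileId(update) {
  const photo = update.message?.photo;
  return photo && photo.length > 0 ? photo[photo.length - 1].file_id : null;
}

// Hypothetical mirror of the Switch rules: only two outputs are wired
// in the exported "connections" block.
function routeOutput(contentType) {
  if (contentType === 'image') return 'image-analysis';
  if (contentType === 'text') return 'image-generation';
  return null; // no connected branch for this content type
}

const updates = [
  { message: { photo: [{ file_id: 'small' }, { file_id: 'large' }], caption: 'What is this?' } },
  { message: { text: 'a lighthouse at dusk' } },
  { message: { voice: { file_id: 'v1' } } },
];

for (const update of updates) {
  const type = classifyMessage(update);
  console.log(type, '->', routeOutput(type), '| photo:', largestPhotoFileId(update));
}
```

Audio, voice, and document messages match neither Switch rule, and the connections above wire only the image and text outputs, so those messages appear to go unprocessed unless you add a branch for them.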

Credentials you'll need

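Each integration node will prompt for credentials when you import. We strip credential IDs before publishing, so you'll add your own. The JSON above references four credential types: telegramApi (the Telegram Trigger and both Telegram nodes), httpBearerAuth (both NVIDIA HTTP Request nodes), openAiApi (the GPT-4o-mini image analysis node), and gmailOAuth2 (both Gmail nodes).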
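
To see which credential types a workflow export expects before importing it, the `credentials` key on each node can be read directly. The following is a minimal sketch under that assumption. `credentialTypesByNode` and `uniqueCredentialTypes` are hypothetical helpers, and `trimmed` is a cut-down stand-in that reuses node names and credential keys from the JSON above.

```javascript
// Hypothetical helper: maps each node name to the credential types it declares.
function credentialTypesByNode(workflow) {
  const result = {};
  for (const node of workflow.nodes) {
    const types = Object.keys(node.credentials || {});
    if (types.length > 0) result[node.name] = types;
  }
  return result;
}

// Hypothetical helper: de-duplicated, alphabetically sorted credential types.
function uniqueCredentialTypes(workflow) {
  const all = Object.values(credentialTypesByNode(workflow)).flat();
  return [...new Set(all)].sort();
}

// Cut-down stand-in; nodes without credentials (such as sticky notes) are skipped.
const trimmed = {
  nodes: [
    { name: 'Telegram Trigger', credentials: { telegramApi: { name: '<your credential>' } } },
    { name: 'HTTP Nvidia Vila', credentials: { httpBearerAuth: { name: '<your credential>' } } },
    { name: 'Analyze image using GPT-4O-Mini', credentials: { openAiApi: { name: '<your credential>' } } },
    { name: 'Send a message of image-to-text results', credentials: { gmailOAuth2: { name: '<your credential>' } } },
    { name: 'Sticky Note1' },
  ],
};

console.log(uniqueCredentialTypes(trimmed));
```

Note that the Stable Diffusion node in the export also carries a literal `Bearer YOUR_TOKEN_HERE` Authorization header alongside its httpBearerAuth credential, so that placeholder may need replacing or removing separately.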

Source: https://n8n.io/workflows/9823/ (original creator credit).

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

AI & RAG

Send a target niche and location via Telegram message. Workflow discovers businesses via Google Maps API. AI enriches contacts with email and LinkedIn data via Serper. GPT-4o scores and qualifies each le

Telegram Trigger, OpenAI, Google Sheets +3
AI & RAG

💥 Automate YouTube thumbnail creation from video links -vide. Uses telegramTrigger, httpRequest, googleDrive, gmail. Event-driven trigger; 25 nodes.

Telegram Trigger, HTTP Request, Google Drive +6
AI & RAG

This n8n template demonstrates how to capture Telegram voice messages, transcribe them into text using AssemblyAI, analyze the transcript with AI for summary and sentiment insights, and finally delive

Telegram, HTTP Request, OpenAI +2
AI & RAG

Ask questions like “How much did I spend on food last month?” and get instant answers from your financial data — directly in Telegram.

Telegram Trigger, OpenAI, Google Sheets +2