AutomationFlowsAI & RAG › Extract & Structure Invoice Data with Google Vision Ocr, Gemini LLM & Google…

Extract & Structure Invoice Data with Google Vision Ocr, Gemini LLM & Google…

Original n8n title: Extract & Structure Invoice Data with Google Vision Ocr, Gemini LLM & Google Sheets

ByBudi SJ @budisj on n8n.io

This workflow automates the process of collecting uploaded invoices, extracting text using Google Vision API, and processing the extracted text with an LLM to produce structured data containing key transaction details such as date, voucher number, transaction detail, vendor, and…

Event trigger★★★★☆ complexityAI-powered17 nodesChain LlmOpenRouter ChatOutput Parser StructuredGoogle DriveHTTP RequestTelegramGoogle SheetsForm Trigger
AI & RAG Trigger: Event Nodes: 17 Complexity: ★★★★☆ AI nodes: yes Added:

This workflow corresponds to n8n.io template #7302 — we link there as the canonical source.

This workflow follows the Chainllm → Form Trigger recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "id": "weV4UB3UBRHEfe1k",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Automated Invoice Collection & Data Extraction Using Vision API and LLM",
  "tags": [],
  "nodes": [
    {
      "id": "2b378a23-08b9-4b7e-b1c5-114087802a71",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -340,
        -480
      ],
      "parameters": {
        "width": 580,
        "height": 1200,
        "content": "# Automated Invoice Collection & Data Extraction Using Vision API and LLM\n\nThis workflow automates the process of collecting uploaded invoices, extracting text using **Google Vision API**, and processing the extracted text with an LLM to produce structured data containing key transaction details such as **date, voucher number, transaction detail, vendor, and transaction value**.  \nThe final data is saved to **Google Sheets** and a notification is sent to **Telegram** in real time.\n\n## \u2728 Key Features\n- **Invoice Upload Form**  \n  Users can upload invoice images through a provided form.\n- **Google Drive Integration**  \n  Files are stored in a specified Google Drive folder with a shareable preview link.\n- **OCR via Google Vision API**  \n  Converts invoice images to text using `TEXT_DETECTION`.\n- **Data Structuring via LLM**  \n  Uses LLM model to parse and structure data.\n- **Structured Output Parser**  \n  Ensures consistent output with required columns.\n- **Data Cleaning**  \n  Cleans and formats numeric values without currency symbols.\n- **Google Sheets Sync**  \n  Appends or updates transaction data in Google Sheets (matched by file ID).  \n  **Template:** [Google Sheets](https://docs.google.com/spreadsheets/d/1HMzQtFK9T-GDxGFSD7ErW_QLlq-PvCvoFASiHGG2fGM/edit?gid=0#gid=0)\n- **Telegram Notification**  \n  Sends a transaction summary directly to a Telegram chat/group.\n\n---\n\n## \ud83d\udd10 Required Credentials\n- **Google Vision API Key** \u2192 for OCR processing.  \n- **OpenRouter API Key** \u2192 to access the Gemini Flash LLM.  \n- **Google Drive OAuth2** \u2192 to upload and download invoice files.  \n- **Google Sheets OAuth2** \u2192 to write or update spreadsheet data.  \n- **Telegram Bot Token** \u2192 to send notifications to Telegram.  \n- **Telegram Chat ID** \u2192 target chat/group for notifications.\n\n---\n\n## \ud83c\udf81 Benefits\n- **Fully automated** from invoice upload to structured reporting.\n- **Time-saving** by eliminating manual transaction data entry.\n- **Real-time integration** with Google Sheets for reporting and auditing.\n- **Instant notifications** via Telegram for quick transaction monitoring.\n- **Duplicate prevention** using file ID as a matching key.\n- **Flexible** for accounting, finance, or administrative teams.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "e4ac27f9-6db0-489f-bd7f-033fd792fc7c",
      "name": "Set Vision API",
      "type": "n8n-nodes-base.set",
      "position": [
        1120,
        20
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "586cf442-13be-4996-9824-366e20ab864e",
              "name": "visionAPI",
              "type": "string",
              "value": "YOUR_VISION_API_KEY_HERE"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "2e81ab2f-5f8f-4676-bfbf-9753bad4369e",
      "name": "Basic LLM Chain",
      "type": "@n8n/n8n-nodes-langchain.chainLlm",
      "position": [
        1540,
        20
      ],
      "parameters": {
        "text": "={{ $json.responses[0].fullTextAnnotation.text }}",
        "batching": {},
        "messages": {
          "messageValues": [
            {
              "message": "=You are a professional accountant experienced in preparing financial reports and classifying expenses into appropriate categories.\nYou will receive input data extracted via OCR that needs to be parsed and structured.\n\nInstructions:\n\nOutput must be in the same language as the input.\nParse and return the data in this column order:\nDATE : Extract in YYYY-MM-DD format, if not found use current date\nNO VOUCHER : Extract invoice or voucher number, if not found return not found\nTRANSACTION DETAIL : Short summary of products or services, if not found return not found\nVENDOR : Store or vendor name, if not found return not found\nVALUE : Transaction total amount as a number without currency symbol, if not found return 0\n\nMissing text fields : not found\nMissing numeric values : 0\nOutput only the structured data, no extra explanations"
            }
          ]
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 1.7
    },
    {
      "id": "d0ca7089-c557-48d8-a854-310efbc5b5be",
      "name": "OpenRouter Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
      "position": [
        1540,
        220
      ],
      "parameters": {
        "model": "google/gemini-2.0-flash-exp:free",
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "22bbee6d-a897-4f2a-b0c9-06b7574ebf8e",
      "name": "Structured Output Parser",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        1700,
        220
      ],
      "parameters": {
        "jsonSchemaExample": "{\n  \"DATE\": \"2025-07-07\",\n  \"NO VOUCHER\": \"INV-001234\",\n  \"TRANSACTION DETAIL\": \"Product summary\",\n  \"VENDOR\": \"Vendor Name or Shop Name\",\n  \"VALUE\": 0\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "8a108ae0-3061-49a9-aef1-c426f078cca1",
      "name": "Download file",
      "type": "n8n-nodes-base.googleDrive",
      "position": [
        740,
        20
      ],
      "parameters": {
        "fileId": {
          "__rl": true,
          "mode": "id",
          "value": "={{ $json.id }}"
        },
        "options": {},
        "operation": "download"
      },
      "typeVersion": 3
    },
    {
      "id": "0a96b787-f8e4-408c-80d8-dc2a868c93e4",
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1320,
        20
      ],
      "parameters": {
        "url": "=https://vision.googleapis.com/v1/images:annotate?key={{ $json.visionAPI }}",
        "method": "POST",
        "options": {},
        "jsonBody": "={\n  \"requests\": [\n    {\n      \"image\": {\n        \"content\": \"{{ $('Code').item.json.base64 }}\"\n      },\n      \"features\": [\n        {\n          \"type\": \"TEXT_DETECTION\"\n        }\n      ]\n    }\n  ]\n}\n",
        "sendBody": true,
        "specifyBody": "json"
      },
      "typeVersion": 4.2
    },
    {
      "id": "c2b91608-a4bc-4c65-9697-af3d05e6b097",
      "name": "Send a text message",
      "type": "n8n-nodes-base.telegram",
      "position": [
        2300,
        20
      ],
      "parameters": {
        "text": "=\ud83d\udcb3 New transaction : \n- Date : {{ $json.DATE }}\n- Transaction detail : {{ $json['TRANSACTION DETAIL'] }}\n- Vendor : {{ $json.VENDOR }}\n- Total transaction : {{ $json.VALUE }}\n\n---",
        "chatId": "YOUR_TELEGRAM_CHAT_ID",
        "additionalFields": {
          "appendAttribution": false
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "89ae0865-ac70-4cdb-8f14-e64dd023ede2",
      "name": "Upload file",
      "type": "n8n-nodes-base.googleDrive",
      "position": [
        520,
        20
      ],
      "parameters": {
        "name": "={{ $json.Image[0].filename }}",
        "driveId": {
          "__rl": true,
          "mode": "list",
          "value": "My Drive"
        },
        "options": {},
        "folderId": {
          "__rl": true,
          "mode": "list",
          "value": "YOUR_GOOGLE_DRIVE_FOLDER_ID"
        },
        "inputDataFieldName": "=Image"
      },
      "typeVersion": 3
    },
    {
      "id": "2dc29bd4-058e-426f-a78c-e99b15f67c82",
      "name": "Code",
      "type": "n8n-nodes-base.code",
      "position": [
        940,
        20
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// Convert file to base64 for Vision API\nconst base64 = item.binary.data.data; // 'data' is the binary name\nreturn {\n  json: {\n    base64\n  }\n};\n"
      },
      "typeVersion": 2
    },
    {
      "id": "a755e54a-5361-46aa-8db0-089a22195b96",
      "name": "Code1",
      "type": "n8n-nodes-base.code",
      "position": [
        1880,
        20
      ],
      "parameters": {
        "jsCode": "function cleanNumber(n) {\n  return parseInt(n.toString().replace(/[.,]/g, '')) || 0;\n}\n\n// Get output from $json\nconst output = $json.output || {};\n\n// If there are Items, process them. If not, skip.\nconst items = Array.isArray(output.Items)\n  ? output.Items.map(item => ({\n      ...item,\n      Quantity: cleanNumber(item.Quantity),\n      \"Unit Price\": cleanNumber(item[\"Unit Price\"]),\n      Total: cleanNumber(item.Total),\n    }))\n  : undefined;\n\n// Optional: clean Total Amount as well\nconst totalAmount = output[\"Total Amount\"]\n  ? cleanNumber(output[\"Total Amount\"])\n  : 0;\n\nreturn {\n  json: {\n    output: {\n      ...output,\n      ...(items && { Items: items }),\n      \"Total Amount\": totalAmount,\n    }\n  }\n};\n"
      },
      "typeVersion": 2
    },
    {
      "id": "606d3086-073c-40b5-b65e-c9de3101d86d",
      "name": "Append or update row in sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        2100,
        20
      ],
      "parameters": {
        "columns": {
          "value": {
            "ID": "={{ $('Download file').item.json.id }}",
            "IMG": "={{ $('Download file').item.json.webViewLink }}",
            "DATE": "={{ $json.output.DATE }}",
            "VALUE": "={{ $json.output.VALUE }}",
            "VENDOR": "={{ $json.output.VENDOR }}",
            "FILE NAME": "={{ $('Download file').item.json.name }}",
            "NO VOUCHER": "={{ $json.output['NO VOUCHER'] }}",
            "TRANSACTION DETAIL": "={{ $json.output['TRANSACTION DETAIL'] }}"
          },
          "schema": [
            {
              "id": "ID",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "ID",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "FILE NAME",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "FILE NAME",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "IMG",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "IMG",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "DATE",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "DATE",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "NO VOUCHER",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "NO VOUCHER",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "TRANSACTION DETAIL",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "TRANSACTION DETAIL",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "VENDOR",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "VENDOR",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "VALUE",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "VALUE",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "ID"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "appendOrUpdate",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "YOUR_GOOGLE_SHEETS_DOCUMENT_ID"
        }
      },
      "typeVersion": 4.6
    },
    {
      "id": "71add4c5-64ef-49e2-9fbf-7b6034b17705",
      "name": "On form submission",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        300,
        20
      ],
      "parameters": {
        "options": {
          "appendAttribution": false
        },
        "formTitle": "Upload Purchase Invoice",
        "formFields": {
          "values": [
            {
              "fieldType": "file",
              "fieldLabel": "Image",
              "requiredField": true,
              "acceptFileTypes": ".jpg, .png, .jpeg"
            }
          ]
        },
        "responseMode": "lastNode"
      },
      "typeVersion": 2.2
    },
    {
      "id": "98b5b08e-2b7c-4360-a224-43fd6059bb34",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        260,
        -200
      ],
      "parameters": {
        "color": 2,
        "width": 600,
        "height": 400,
        "content": "- This node triggers the workflow when a user submits a form titled \"Upload Purchase Invoice\".\n- Uploads the file received from the form into a specific Google Drive folder.\n- Downloads the file from Google Drive using the file ID obtained from the previous Upload file node."
      },
      "typeVersion": 1
    },
    {
      "id": "4e897daa-cde8-4d25-8217-9f722c68fe4d",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        900,
        -200
      ],
      "parameters": {
        "color": 4,
        "width": 560,
        "height": 400,
        "content": "- Converts the downloaded invoice file (binary) into Base64 format so that it can be sent to the Google Vision API.\n- Adds the Google Vision API key into the workflow data so it can be used in the HTTP request.\n- Sends the Base64-encoded image to the Google Vision API for OCR processing."
      },
      "typeVersion": 1
    },
    {
      "id": "510e3239-fbbd-4b47-9c31-813994c3e251",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1480,
        -200
      ],
      "parameters": {
        "color": 5,
        "width": 560,
        "height": 580,
        "content": "- Processes OCR text through an LLM to extract and structure key invoice details.\n- Forces the LLM to produce output in a fixed JSON schema.\n- Cleans and normalizes numeric fields from the LLM output."
      },
      "typeVersion": 1
    },
    {
      "id": "71dcd4e0-84cc-4b20-ac60-6701cc4e5f60",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2060,
        -200
      ],
      "parameters": {
        "color": 6,
        "width": 380,
        "height": 580,
        "content": "- Writes the extracted and cleaned invoice data into a Google Sheets document, either adding a new row or updating an existing one.\n- Sends a Telegram message summarizing the new or updated transaction entry."
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "a58b0780-6c63-40bb-84bd-a621579d6eef",
  "connections": {
    "Code": {
      "main": [
        [
          {
            "node": "Set Vision API",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Code1": {
      "main": [
        [
          {
            "node": "Append or update row in sheet",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Upload file": {
      "main": [
        [
          {
            "node": "Download file",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "Basic LLM Chain",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Download file": {
      "main": [
        [
          {
            "node": "Code",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Vision API": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Basic LLM Chain": {
      "main": [
        [
          {
            "node": "Code1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "On form submission": {
      "main": [
        [
          {
            "node": "Upload file",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenRouter Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "Basic LLM Chain",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser": {
      "ai_outputParser": [
        [
          {
            "node": "Basic LLM Chain",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "Append or update row in sheet": {
      "main": [
        [
          {
            "node": "Send a text message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

This workflow automates the process of collecting uploaded invoices, extracting text using Google Vision API, and processing the extracted text with an LLM to produce structured data containing key transaction details such as date, voucher number, transaction detail, vendor, and…

Source: https://n8n.io/workflows/7302/ — original creator credit. Request a take-down →

More AI & RAG workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

AI & RAG

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

Output Parser Structured, Telegram, N8N Nodes Tesseractjs +14
AI & RAG

Automatically capture, categorize, and log expenses from receipts, PDFs, voice notes, or text — powered by AI and integrated with Telegram and Google Sheets.

HTTP Request, OpenAI, Chain Llm +6
AI & RAG

This workflow is perfect for graphic designers, creative agencies, marketing teams, or freelancers who regularly use AI-generated images in their projects. It's specifically beneficial for teams that

Google Sheets, Google Drive, HTTP Request +5
AI & RAG

Automate your lead intake, scoring, and outreach pipeline. This workflow collects leads from forms, enriches and scores them using Relevance AI, routes them by quality, and triggers the right follow-u

Form Trigger, HTTP Request, Chain Llm +6
AI & RAG

This workflow automates the process of recording financial transactions from photos of receipts or shopping receipts. Users simply send an image of the receipt via Telegram. The image is processed usi

Telegram Trigger, HTTP Request, Chain Llm +7