{
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "nodes": [
    {
      "id": "fcfc23ba-55ac-4aa1-a83b-583e66e3f351",
      "name": "Main Overview",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -240,
        -64
      ],
      "parameters": {
        "width": 380,
        "height": 732,
        "content": "## Invoice OCR Processor with Mistral AI\n\nAutomatically extract data from invoices using Mistral's dedicated OCR model and GPT-4o-mini. Process multiple PDF, PNG, or JPG invoices, extract financial fields, rate each result with a confidence level, and save to Google Sheets.\n\n### How it works\n1. User uploads invoices via the form (supports multiple files)\n2. Each file is converted to base64 and sent to Mistral OCR\n3. GPT-4o-mini extracts structured invoice fields\n4. A validation step assigns a confidence level (high, medium, or low) based on how many of the four required fields (invoice number, date, vendor name, total) were found\n5. Results are appended to Google Sheets with a status (Processed, Review, or Error)\n\n### Setup steps\n- [ ] Create Mistral API Key at console.mistral.ai\n- [ ] Create OpenAI API Key at platform.openai.com\n- [ ] Connect Google Sheets credentials\n- [ ] Create a Sheet with the required columns\n- [ ] Update `YOUR_GOOGLE_SHEET_ID` in the Sheets node\n\n### Fields extracted\nInvoice Number, Date, Vendor Name, Tax ID, Subtotal, Tax Rate, Tax Amount, Total, Currency"
      },
      "typeVersion": 1
    },
    {
      "id": "1776f4d6-46c9-45a9-bcba-c5f495287ba4",
      "name": "Warning - Credentials",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        160,
        -64
      ],
      "parameters": {
        "color": 2,
        "width": 844,
        "height": 372,
        "content": "## \u26a0\ufe0f Required Credentials\n\n**1. Mistral API Key**\n- Go to [console.mistral.ai](https://console.mistral.ai)\n- Create HTTP Header Auth credential in n8n\n- Header Name: `Authorization`\n- Header Value: `Bearer YOUR_TOKEN_HERE`\n\n**2. OpenAI API Key**\n- Standard OpenAI credential\n\n**3. Google Sheets**\n- Create Sheet with columns:\n`Invoice Number | Invoice Date | Vendor Name | Vendor Tax ID | Subtotal | Tax Rate (%) | Tax Amount | Total Amount | Currency | Filename | Confidence | Status | Issues | Processed At | Pages`"
      },
      "typeVersion": 1
    },
    {
      "id": "367c1a8f-4b1e-48bc-b735-84ba4dc55f90",
      "name": "Section 1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        160,
        336
      ],
      "parameters": {
        "color": 6,
        "width": 392,
        "height": 328,
        "content": "## 1. Input"
      },
      "typeVersion": 1
    },
    {
      "id": "b2a8ec06-1c62-497c-8c40-53786e5acf4d",
      "name": "Section 2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        576,
        336
      ],
      "parameters": {
        "color": 6,
        "width": 728,
        "height": 328,
        "content": "## 2. OCR Processing"
      },
      "typeVersion": 1
    },
    {
      "id": "e7561968-3f94-4fb8-a5c4-3fadb9e289a9",
      "name": "Section 3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1328,
        336
      ],
      "parameters": {
        "color": 6,
        "width": 440,
        "height": 328,
        "content": "## 3. AI Extraction"
      },
      "typeVersion": 1
    },
    {
      "id": "8c43ab4e-a9b6-4cbf-a829-a1515507b241",
      "name": "Section 4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1792,
        336
      ],
      "parameters": {
        "color": 6,
        "width": 352,
        "height": 328,
        "content": "## 4. Output"
      },
      "typeVersion": 1
    },
    {
      "id": "e14a7935-b21e-4901-9f0c-5591ac7e05d2",
      "name": "Invoice Upload Form",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        208,
        416
      ],
      "parameters": {
        "options": {},
        "formTitle": "Invoice Upload",
        "formFields": {
          "values": [
            {
              "fieldType": "file",
              "fieldLabel": "Invoices",
              "multipleFiles": true,
              "requiredField": true,
              "acceptFileTypes": ".pdf,.png,.jpg,.jpeg"
            }
          ]
        },
        "formDescription": "Upload one or more invoices (PDF, PNG, or JPG) to extract data automatically."
      },
      "typeVersion": 2.2
    },
    {
      "id": "1f06b5ad-962a-40da-9222-0cfb85a55f1e",
      "name": "Split Files",
      "type": "n8n-nodes-base.code",
      "position": [
        384,
        416
      ],
      "parameters": {
        "jsCode": "// Split binary files from Form Trigger into separate items\nconst binaryData = $input.first().binary;\nconst jsonData = $input.first().json;\n\nif (!binaryData || Object.keys(binaryData).length === 0) {\n  throw new Error('No files uploaded');\n}\n\nconst output = Object.entries(binaryData).map(([key, fileData]) => {\n  return {\n    json: {\n      ...jsonData,\n      binaryKey: key,\n      filename: fileData.fileName || key,\n      mimeType: fileData.mimeType || 'application/octet-stream'\n    },\n    binary: {\n      data: fileData\n    }\n  };\n});\n\nreturn output;"
      },
      "typeVersion": 2
    },
    {
      "id": "a6a110ad-3317-421e-aa42-c6b0f9900b8d",
      "name": "Loop",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        624,
        416
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 3
    },
    {
      "id": "8ead97c6-2786-4cc8-9beb-67730fbb1ced",
      "name": "Prepare OCR Request",
      "type": "n8n-nodes-base.code",
      "position": [
        832,
        416
      ],
      "parameters": {
        "jsCode": "// Convert binary to base64 and build Mistral OCR request body\nconst item = $input.first();\nconst mimeType = item.json.mimeType;\nconst filename = item.json.filename;\n\n// Read the file through the n8n helper so this works in both in-memory and\n// filesystem binary data modes; in filesystem mode the binary `.data`\n// property holds a reference rather than the base64 content\nconst buffer = await this.helpers.getBinaryDataBuffer(0, 'data');\nconst base64Data = buffer.toString('base64');\n\n// Build data URL\nconst dataUrl = `data:${mimeType};base64,${base64Data}`;\n\n// PDF, DOCX, and PPTX are sent as documents; everything else as an image\nconst documentMimeTypes = [\n  'application/pdf',\n  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',\n  'application/vnd.openxmlformats-officedocument.presentationml.presentation'\n];\nconst isDocument = documentMimeTypes.includes(mimeType);\n\n// Build the document part of the Mistral OCR request body\nconst document = isDocument\n  ? { type: 'document_url', document_url: dataUrl }\n  : { type: 'image_url', image_url: dataUrl };\n\nreturn [{\n  json: {\n    filename: filename,\n    mimeType: mimeType,\n    requestBody: { model: 'mistral-ocr-latest', document }\n  }\n}];"
      },
      "typeVersion": 2
    },
    {
      "id": "681609e2-c494-4be6-8899-22ec68af4e9c",
      "name": "Mistral OCR",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        992,
        416
      ],
      "parameters": {
        "url": "https://api.mistral.ai/v1/ocr",
        "method": "POST",
        "options": {
          "timeout": 120000
        },
        "jsonBody": "={{ JSON.stringify($json.requestBody) }}",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth"
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "7ea867d4-60ca-4877-a330-2c3bdfe148b3",
      "name": "Format OCR Text",
      "type": "n8n-nodes-base.code",
      "position": [
        1152,
        416
      ],
      "parameters": {
        "jsCode": "// Prepare OCR text for extraction\nconst ocrResult = $input.first().json;\nconst originalData = $('Prepare OCR Request').first().json;\n\nlet fullText = '';\nlet pageCount = 1;\n\nif (ocrResult.pages && Array.isArray(ocrResult.pages)) {\n  fullText = ocrResult.pages.map(page => page.markdown || '').join('\\n\\n');\n  pageCount = ocrResult.pages.length;\n} else if (ocrResult.text) {\n  fullText = ocrResult.text;\n}\n\nif (!fullText || fullText.trim() === '') {\n  fullText = 'No text extracted from document';\n}\n\nreturn [{\n  json: {\n    ocrText: fullText,\n    filename: originalData.filename,\n    pageCount: pageCount\n  }\n}];"
      },
      "typeVersion": 2
    },
    {
      "id": "116842fd-81f1-4101-868e-c16ba286bd90",
      "name": "OpenAI Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        1360,
        560
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "478eb4e0-8b77-436b-9d77-7520db07db9f",
      "name": "Extract Invoice Fields",
      "type": "@n8n/n8n-nodes-langchain.informationExtractor",
      "position": [
        1360,
        416
      ],
      "parameters": {
        "text": "={{ $json.ocrText }}",
        "options": {
          "systemPromptTemplate": "You are an invoice data extractor. Extract the requested fields precisely from the OCR text provided. Use dots for decimal separators (not commas). Return null for any field you cannot find. Format dates as YYYY-MM-DD."
        },
        "attributes": {
          "attributes": [
            {
              "name": "invoice_number",
              "description": "The invoice number or document ID"
            },
            {
              "name": "invoice_date",
              "description": "Invoice date in YYYY-MM-DD format"
            },
            {
              "name": "vendor_name",
              "description": "Vendor/supplier company name"
            },
            {
              "name": "vendor_tax_id",
              "description": "Tax ID, VAT, RFC, or business registration number"
            },
            {
              "name": "subtotal",
              "type": "number",
              "description": "Subtotal before taxes (number only, no currency symbol)"
            },
            {
              "name": "tax_rate",
              "type": "number",
              "description": "Tax rate percentage (e.g., 16 or 21)"
            },
            {
              "name": "tax_amount",
              "type": "number",
              "description": "Total tax amount (number only)"
            },
            {
              "name": "total_amount",
              "type": "number",
              "description": "Total amount including taxes (number only)"
            },
            {
              "name": "currency",
              "description": "Currency code (EUR, USD, MXN, GBP, etc.)"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "ace27530-d7a9-48c5-bae5-7b64e10737bc",
      "name": "Validate & Score",
      "type": "n8n-nodes-base.code",
      "position": [
        1648,
        416
      ],
      "parameters": {
        "jsCode": "// Validate extracted data and assign confidence level\n// The Information Extractor node usually nests its result under `output`;\n// fall back to the top-level JSON in case it is not nested\nconst raw = $input.first().json;\nconst extracted = raw.output ?? raw;\nconst metadata = $('Format OCR Text').first().json;\n\n// Required fields for confidence calculation\nconst requiredFields = ['invoice_number', 'invoice_date', 'vendor_name', 'total_amount'];\nlet filledCount = 0;\nconst issues = [];\n\nrequiredFields.forEach(field => {\n  const value = extracted[field];\n  if (value !== null && value !== undefined && value !== '' && value !== 0) {\n    filledCount++;\n  } else {\n    issues.push(`Missing: ${field.replace(/_/g, ' ')}`);\n  }\n});\n\n// Assign confidence level\nlet confidence, status;\nif (filledCount >= 4) {\n  confidence = 'high';\n  status = 'Processed';\n} else if (filledCount >= 2) {\n  confidence = 'medium';\n  status = 'Review';\n} else {\n  confidence = 'low';\n  status = 'Error';\n}\n\nreturn [{\n  json: {\n    invoice_number: extracted.invoice_number || '',\n    invoice_date: extracted.invoice_date || '',\n    vendor_name: extracted.vendor_name || '',\n    vendor_tax_id: extracted.vendor_tax_id || '',\n    subtotal: extracted.subtotal || 0,\n    tax_rate: extracted.tax_rate || 0,\n    tax_amount: extracted.tax_amount || 0,\n    total_amount: extracted.total_amount || 0,\n    currency: extracted.currency || '',\n    filename: metadata.filename,\n    confidence: confidence,\n    status: status,\n    issues: issues.length > 0 ? issues.join('; ') : 'None',\n    processed_at: new Date().toISOString(),\n    page_count: metadata.pageCount\n  }\n}];"
      },
      "typeVersion": 2
    },
    {
      "id": "5c5cc09b-5b67-424f-b96f-b6ffc0ae3f17",
      "name": "Save to Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1840,
        416
      ],
      "parameters": {
        "columns": {
          "value": {
            "Pages": "={{ $json.page_count }}",
            "Issues": "={{ $json.issues }}",
            "Status": "={{ $json.status }}",
            "Currency": "={{ $json.currency }}",
            "Filename": "={{ $json.filename }}",
            "Subtotal": "={{ $json.subtotal }}",
            "Confidence": "={{ $json.confidence }}",
            "Tax Amount": "={{ $json.tax_amount }}",
            "Vendor Name": "={{ $json.vendor_name }}",
            "Invoice Date": "={{ $json.invoice_date }}",
            "Processed At": "={{ $json.processed_at }}",
            "Tax Rate (%)": "={{ $json.tax_rate }}",
            "Total Amount": "={{ $json.total_amount }}",
            "Vendor Tax ID": "={{ $json.vendor_tax_id }}",
            "Invoice Number": "={{ $json.invoice_number }}"
          },
          "mappingMode": "defineBelow"
        },
        "options": {},
        "operation": "append",
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "id",
          "value": "YOUR_GOOGLE_SHEET_ID"
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.5
    },
    {
      "id": "3ee3a3d3-991d-4183-95af-82dbacecd9f4",
      "name": "Rate Limit",
      "type": "n8n-nodes-base.wait",
      "position": [
        1984,
        416
      ],
      "parameters": {
        "amount": 1
      },
      "typeVersion": 1.1
    }
  ],
  "connections": {
    "Loop": {
      "main": [
        [],
        [
          {
            "node": "Prepare OCR Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Rate Limit": {
      "main": [
        [
          {
            "node": "Loop",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Mistral OCR": {
      "main": [
        [
          {
            "node": "Format OCR Text",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Files": {
      "main": [
        [
          {
            "node": "Loop",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Model": {
      "ai_languageModel": [
        [
          {
            "node": "Extract Invoice Fields",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Save to Sheets": {
      "main": [
        [
          {
            "node": "Rate Limit",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Format OCR Text": {
      "main": [
        [
          {
            "node": "Extract Invoice Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Validate & Score": {
      "main": [
        [
          {
            "node": "Save to Sheets",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Invoice Upload Form": {
      "main": [
        [
          {
            "node": "Split Files",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Prepare OCR Request": {
      "main": [
        [
          {
            "node": "Mistral OCR",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Invoice Fields": {
      "main": [
        [
          {
            "node": "Validate & Score",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}