AutomationFlowsEmail & Gmail › Extract Structured Data From Gmail Attachments to Google Sheets, Gpt Vision

Extract Structured Data From Gmail Attachments to Google Sheets, Gpt Vision

ByMasaki Go @pippi on n8n.io

Automatically extract structured information from emails using AI-powered document analysis. This workflow processes emails from specified domains, classifies them by type, and extracts structured data from various attachment formats.

Cron / scheduled trigger★★★★★ complexity48 nodesGmailHTTP RequestAWS S3Google Sheets
Email & Gmail Trigger: Cron / scheduled Nodes: 48 Complexity: ★★★★★ Added:

This workflow corresponds to n8n.io template #13004 — we link there as the canonical source.

This workflow follows the Gmail → Google Sheets recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Extract structured data from Gmail attachments to Google Sheets using GPT-4o Vision",
  "nodes": [
    {
      "id": "762adf7a-01c4-44f8-800b-b084812cf0c3",
      "name": "Workflow Description",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2384,
        1216
      ],
      "parameters": {
        "width": 380,
        "height": 776,
        "content": "##  Email Attachment Processor Template\n\nAutomatically extract structured information from emails using AI.\n\n### What this workflow does:\n1. **Monitors Gmail** for emails from specified domains\n2. **Classifies emails** into custom categories\n3. **Processes attachments** (Excel, PPTX, PDF, images) using GPT Vision\n4. **Extracts structured data** based on your schema\n5. **Saves to Google Sheets** with classification\n6. **Labels processed emails** in Gmail\n7. **Sends to custom API** (optional)\n8. **Notifies via Slack** (optional)\n\n### Requirements:\n- Gmail OAuth2 credentials\n- OpenAI API key (for GPT-4 Vision)\n- Google Sheets OAuth2 credentials\n- AWS S3 bucket (for image processing)\n- ConvertAPI account (for PPTX/PDF conversion)\n- Slack OAuth2 (optional)\n\n### Setup:\n1. Configure all credentials\n2. Update Google Sheets document ID in Variables\n3. Update S3 bucket name in Variables\n4. Customize domain filter in 'Domain Filter' node\n5. Customize classification keywords as needed"
      },
      "typeVersion": 1
    },
    {
      "id": "1a385d9d-4e9a-4a56-beb3-4d3e7dd9649b",
      "name": "Email Classification Info",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        3040,
        1184
      ],
      "parameters": {
        "color": 7,
        "width": 332,
        "height": 336,
        "content": "###  Email Type Detection\n\n**Customize keywords for your use case:**\n\n**Category A (e.g., Cancellation):**\n- Add your keywords here\n\n**Category B (e.g., Updates):**\n- Add your keywords here\n\n**Category C (e.g., New Items):**\n- Default category\n\nCustomize keywords in the 'Domain Filter' code node."
      },
      "typeVersion": 1
    },
    {
      "id": "579e95c7-8c34-4446-a229-061652df9808",
      "name": "Gmail Labels Info",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        6560,
        1264
      ],
      "parameters": {
        "color": 3,
        "width": 280,
        "height": 216,
        "content": "###  Gmail Labels\n\nProcessed emails are labeled:\n- Processed_CategoryA\n- Processed_CategoryB\n- Processed_CategoryC\n\nLabels are created automatically if they don't exist.\n\n**Customize label names in the code nodes.**"
      },
      "typeVersion": 1
    },
    {
      "id": "e852a330-e3ae-4b4f-b887-eec29afafa25",
      "name": "File Processing Info",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        3696,
        1152
      ],
      "parameters": {
        "color": 7,
        "width": 300,
        "height": 344,
        "content": "###  File Processing Routes\n\n**Excel files (Category C):**\n\u2192 Direct data extraction\n\n**Excel files (Category A/B):**\n\u2192 GPT Vision analysis\n\n**PPTX/PDF files:**\n\u2192 Convert to images \u2192 GPT Vision\n\n**Image attachments:**\n\u2192 Direct GPT Vision analysis\n\n**Text-only emails:**\n\u2192 GPT text analysis"
      },
      "typeVersion": 1
    },
    {
      "id": "eb599c2d-491a-4824-85b9-a503a4b65c47",
      "name": "ConvertAPI Info",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        4176,
        1760
      ],
      "parameters": {
        "color": 3,
        "width": 300,
        "height": 284,
        "content": "###  ConvertAPI Setup\n\nUsed for converting PPTX/PDF to images for GPT Vision analysis.\n\nhttps://www.convertapi.com\n\n*This service is paid but offers free initial conversions.*\n\nReplace the HTTP Header Auth credential with your ConvertAPI secret."
      },
      "typeVersion": 1
    },
    {
      "id": "5b69b7ca-b4e1-42d3-ab8b-cfb915f745a1",
      "name": "Google Sheets Info",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        6128,
        1136
      ],
      "parameters": {
        "color": 3,
        "width": 300,
        "height": 368,
        "content": "###  Google Sheets Output\n\nExtracted data is saved with columns:\n- email_type (Category)\n- field_1 (Date)\n- field_2 (Time)\n- field_3 (Name)\n- field_4 (Amount)\n- And more...\n\n**Customize the output schema in the 'Parse All Results' node.**\n\n**Update the Document ID** in Variables or directly in the Google Sheets node."
      },
      "typeVersion": 1
    },
    {
      "id": "9a32ec51-62fb-49c8-9941-60bf6a1adc13",
      "name": "AWS S3 Info",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        4208,
        1024
      ],
      "parameters": {
        "color": 7,
        "width": 300,
        "height": 288,
        "content": "###  AWS S3 Setup\n\nImages are temporarily uploaded to S3 for GPT Vision processing.\n\n**Required:**\n1. Create an S3 bucket\n2. Configure AWS credentials in n8n\n3. Update bucket name in Variables\n\nImages are used for GPT Vision API calls and can be cleaned up periodically."
      },
      "typeVersion": 1
    },
    {
      "id": "61bd3fd0-9fdd-4cf0-9edf-28721a2358da",
      "name": "Schedule Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        2832,
        1536
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "minutes",
              "minutesInterval": 30
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "11ce6b59-1436-4ef5-a0ef-71f9aa98c763",
      "name": "Gmail - Get Emails",
      "type": "n8n-nodes-base.gmail",
      "position": [
        3040,
        1536
      ],
      "parameters": {
        "limit": 1,
        "filters": {
          "q": "-label:Processed_CategoryA -label:Processed_CategoryB -label:Processed_CategoryC",
          "receivedAfter": "={{ $now.minus({ days: 60 }).toISO() }}"
        },
        "operation": "getAll"
      },
      "typeVersion": 2.1
    },
    {
      "id": "a6dff01e-96d5-4c57-8def-5bd8c1216c26",
      "name": "Domain Filter",
      "type": "n8n-nodes-base.code",
      "position": [
        3264,
        1536
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// ============================================================\n// DOMAIN FILTER & EMAIL CLASSIFICATION\n// ============================================================\n// Customize this node for your specific use case\n// ============================================================\n\nconst json = $input.item.json;\nconst subject = json.subject || json.Subject || '';\nconst snippet = json.snippet || '';\nconst fromRaw = json.from || json.From || '';\nconst fromText = typeof fromRaw === 'object' ? (fromRaw.text || JSON.stringify(fromRaw)) : String(fromRaw);\nconst text = (subject + ' ' + snippet + ' ' + fromText).toLowerCase();\n\n// ============================================================\n// STEP 1: Configure your allowed domains\n// ============================================================\n// Add the domains you want to monitor (without @ symbol)\n// Example: ['company.com', 'partner.co.jp', 'supplier.net']\nconst allowedDomains = [\n  // Add your domains here\n  // 'example.com',\n  // 'example.co.jp',\n];\n\n// Check if email is from an allowed domain\nconst fromLower = fromText.toLowerCase();\nconst isFromAllowedDomain = allowedDomains.length === 0 || allowedDomains.some(d => fromLower.includes(d));\n\n// ============================================================\n// STEP 2: Configure classification keywords\n// ============================================================\n\n// Category A keywords (e.g., Cancellations, Deletions)\nconst categoryAKeywords = [\n  // Add your Category A keywords here\n  // 'cancel', 'cancelled', 'delete', 'remove'\n];\n\n// Category B keywords (e.g., Updates, Changes)\nconst categoryBKeywords = [\n  // Add your Category B keywords here\n  // 'update', 'change', 'modify', 'reschedule'\n];\n\n// Category C keywords (e.g., New items - this is the default)\nconst categoryCKeywords = [\n  // Add your Category C keywords here (optional)\n  // 'new', 'offer', 'proposal', 'available'\n];\n\n// ============================================================\n// STEP 3: Configure identifier codes (optional)\n// ============================================================\n// If your emails contain specific codes like [ABC] or \u3010XYZ\u3011\nconst identifierCodes = [\n  // Add your identifier codes here (lowercase)\n  // 'abc', 'xyz', 'code1'\n];\n\n// Normalize full-width characters to ASCII\nconst normalizeAscii = value => String(value || '').replace(/[\uff21-\uff3a\uff10-\uff19]/g, ch => String.fromCharCode(ch.charCodeAt(0) - 0xFEE0));\nconst textUpper = normalizeAscii(subject + ' ' + snippet + ' ' + fromText).toUpperCase();\n\n// Check for identifier codes in brackets\nconst hasIdentifierCode = identifierCodes.some(c => \n  textUpper.includes('\u3010' + c.toUpperCase() + '\u3011') || \n  textUpper.includes('[' + c.toUpperCase() + ']')\n);\n\n// ============================================================\n// Classification Logic\n// ============================================================\n\n// Skip if not from allowed domain and no identifier match\nif (!isFromAllowedDomain && !hasIdentifierCode) {\n  return { json: { _skip: true } };\n}\n\n// Check if email matches any category keywords\nconst isCategoryA = categoryAKeywords.some(k => text.includes(k.toLowerCase()));\nconst isCategoryB = categoryBKeywords.some(k => text.includes(k.toLowerCase()));\nconst hasCategoryC = categoryCKeywords.length === 0 || categoryCKeywords.some(k => text.includes(k.toLowerCase()));\n\n// If no keywords configured and domain doesn't match, skip\nif (categoryAKeywords.length === 0 && categoryBKeywords.length === 0 && categoryCKeywords.length === 0) {\n  // Process all emails from allowed domains when no keywords configured\n  if (!isFromAllowedDomain) {\n    return { json: { _skip: true } };\n  }\n}\n\n// Determine email type/category\nlet emailType = 'CategoryC'; // Default\nif (isCategoryA) emailType = 'CategoryA';\nelse if (isCategoryB) emailType = 'CategoryB';\n\nreturn { json: { ...json, _skip: false, _email_type: emailType } };"
      },
      "typeVersion": 2
    },
    {
      "id": "e303d6d7-5c68-4fc6-97d4-4e5e6c171472",
      "name": "Filter Valid Emails",
      "type": "n8n-nodes-base.filter",
      "position": [
        3488,
        1536
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "boolean",
                "operation": "notEquals"
              },
              "leftValue": "={{ $json._skip }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "6a426693-2bfc-4d4a-a17a-9f89973f9dc9",
      "name": "Gmail - Get Attachments",
      "type": "n8n-nodes-base.gmail",
      "position": [
        3712,
        1536
      ],
      "parameters": {
        "simple": false,
        "options": {
          "downloadAttachments": true,
          "dataPropertyAttachmentsPrefixName": "attachment_"
        },
        "messageId": "={{ $json.id }}",
        "operation": "get"
      },
      "typeVersion": 2.1
    },
    {
      "id": "7ce1026b-c28f-4e63-95fc-8dcb61814c3f",
      "name": "Detect File Types",
      "type": "n8n-nodes-base.code",
      "position": [
        3920,
        1536
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// ============================================================\n// FILE TYPE DETECTION\n// ============================================================\n// Identifies attachment types and preserves metadata\n// ============================================================\n\nconst item = $input.item;\nconst binary = item.binary || {};\nconst json = item.json;\n\nconst subject = json.Subject || json.subject || '';\nconst fromData = json.From || json.from || '';\n\nlet emailType = json._email_type || '';\nif (!emailType) {\n  try {\n    const currentId = json.id;\n    const filteredItems = $('Filter Valid Emails').all();\n    for (const f of filteredItems) {\n      if (f.json.id === currentId && f.json._email_type) {\n        emailType = f.json._email_type;\n        break;\n      }\n    }\n  } catch(e) {}\n}\nemailType = emailType || 'CategoryC';\n\nlet fileInfo = { excel: null, pptx: null, pdf: null, image: null };\nlet attachments = [];\n\nfor (const key of Object.keys(binary)) {\n  const bin = binary[key];\n  if (!bin || !bin.fileName) continue;\n  const fn = bin.fileName.toLowerCase();\n  const mime = (bin.mimeType || '').toLowerCase();\n  attachments.push({ filename: bin.fileName, mimeType: bin.mimeType, key: key });\n  \n  if (fn.endsWith('.xlsx') || fn.endsWith('.xls')) fileInfo.excel = { key, fileName: bin.fileName };\n  else if (fn.endsWith('.pptx') || fn.endsWith('.ppt')) fileInfo.pptx = { key, fileName: bin.fileName };\n  else if (fn.endsWith('.pdf') || mime.includes('pdf')) fileInfo.pdf = { key, fileName: bin.fileName };\n  else if (fn.endsWith('.png') || fn.endsWith('.jpg') || fn.endsWith('.jpeg') || mime.startsWith('image/')) fileInfo.image = { key, fileName: bin.fileName, mimeType: bin.mimeType || 'image/png' };\n}\n\n// Variable format types need GPT Vision even for Excel\nconst isVariableFormat = emailType === 'CategoryA' || emailType === 'CategoryB';\nconst hasExcelDirect = fileInfo.excel !== null && !isVariableFormat;\nconst hasFileConvert = fileInfo.pptx !== null || fileInfo.pdf !== null || (fileInfo.excel !== null && isVariableFormat);\n\nlet convType = null, convKey = null;\nif (fileInfo.pptx) { convType = 'pptx'; convKey = fileInfo.pptx.key; }\nelse if (fileInfo.pdf) { convType = 'pdf'; convKey = fileInfo.pdf.key; }\nelse if (fileInfo.excel && isVariableFormat) { convType = 'xlsx'; convKey = fileInfo.excel.key; }\n\nconst output = {\n  ...json,\n  id: json.id,\n  Subject: subject,\n  subject: subject,\n  From: fromData,\n  from: fromData,\n  attachments: attachments,\n  has_excel: hasExcelDirect,\n  has_pptx_pdf: hasFileConvert,\n  has_image: fileInfo.image !== null,\n  excel_binary_key: fileInfo.excel ? fileInfo.excel.key : null,\n  conversion_type: convType,\n  conversion_binary_key: convKey,\n  image_binary_key: fileInfo.image ? fileInfo.image.key : null,\n  _email_type: emailType\n};\n\nreturn {\n  json: output,\n  binary: item.binary\n};"
      },
      "typeVersion": 2
    },
    {
      "id": "64aef15e-2374-4845-acc6-0dfe45d35c8f",
      "name": "IF - Has Excel?",
      "type": "n8n-nodes-base.if",
      "position": [
        4144,
        1536
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "boolean",
                "operation": "equals"
              },
              "leftValue": "={{ $json.has_excel }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "7c133bed-216b-49fe-a6d9-8e7292391f87",
      "name": "Extract from Excel",
      "type": "n8n-nodes-base.extractFromFile",
      "position": [
        4368,
        1344
      ],
      "parameters": {
        "options": {},
        "operation": "xlsx",
        "binaryPropertyName": "={{ $json.excel_binary_key || 'attachment_0' }}"
      },
      "typeVersion": 1
    },
    {
      "id": "ae047edd-7921-47cd-84c1-98f9fd5b3ae2",
      "name": "Parse Excel Data",
      "type": "n8n-nodes-base.code",
      "position": [
        5008,
        1216
      ],
      "parameters": {
        "jsCode": "// ============================================================\n// EXCEL DATA EXTRACTION\n// ============================================================\n// Customize this node to parse your specific Excel format\n// ============================================================\n\n// Map codes to full names (customize for your use case)\nconst codeToNameMap = {\n  // 'CODE1': 'Full Name 1',\n  // 'CODE2': 'Full Name 2',\n};\n\nfunction excelTimeToHHMM(fraction) {\n  if (!fraction && fraction !== 0) return '';\n  const totalMinutes = Math.round(fraction * 24 * 60);\n  const hours = Math.floor(totalMinutes / 60);\n  const minutes = totalMinutes % 60;\n  return String(hours).padStart(2, '0') + ':' + String(minutes).padStart(2, '0');\n}\n\nlet excelSourceItems = [];\ntry { excelSourceItems = $('IF - Has Excel?').all(); } catch(e) {}\n\nconst allRows = $input.all().map(i => i.json);\nconst results = [];\n\nlet hintYear = new Date().getFullYear();\nlet hintMonth = new Date().getMonth() + 1;\nlet currentName = '';\nlet currentType = 'Regular';\n\nconst dataBySource = {};\n\nfor (const row of allRows) {\n  const allText = JSON.stringify(row);\n  const firstKey = Object.keys(row)[0] || '';\n  const firstVal = row[firstKey] || '';\n  \n  // Extract year/month from text (customize pattern as needed)\n  const ymMatch = allText.match(/(20\\d{2})[-\\/\u5e74]\\s*(\\d{1,2})/);\n  if (ymMatch) {\n    hintYear = parseInt(ymMatch[1]);\n    hintMonth = parseInt(ymMatch[2]);\n  }\n  \n  // Extract source name from brackets (customize pattern as needed)\n  const nameMatch = allText.match(/\u3010([A-Z]+)[^\u3011]*\u3011/i);\n  if (nameMatch) {\n    const code = nameMatch[1].toUpperCase();\n    currentName = codeToNameMap[code] || code;\n  }\n  \n  // Check for codes in the text\n  for (const code of Object.keys(codeToNameMap)) {\n    if (allText.toUpperCase().includes(code)) {\n      currentName = codeToNameMap[code] || code;\n      break;\n    }\n  }\n  \n  // Skip header rows (customize as needed)\n  if (allText.includes('Header1') && allText.includes('Header2')) continue;\n\n  // ============================================================\n  // CUSTOMIZE: Add your Excel parsing logic here\n  // ============================================================\n  \n  // Example: Parse rows with specific structure\n  if (row.__EMPTY_1 !== undefined && typeof row.__EMPTY_1 === 'number') {\n    const startTime = typeof row.__EMPTY_2 === 'number'\n      ? excelTimeToHHMM(row.__EMPTY_2)\n      : excelTimeToHHMM(row.__EMPTY_1);\n    const endTime = typeof row.__EMPTY_4 === 'number'\n      ? excelTimeToHHMM(row.__EMPTY_4)\n      : excelTimeToHHMM(row.__EMPTY_2);\n\n    if (startTime) {\n      const sourceName = currentName || 'Unknown';\n      const amount = typeof row.__EMPTY_7 === 'number' ? row.__EMPTY_7 : (row.__EMPTY_5 || 0);\n      const dataItem = {\n        date: hintYear + '-' + String(hintMonth).padStart(2,'0') + '-' + String(row.__EMPTY_1).padStart(2,'0'),\n        day_of_week: firstVal,\n        start_time: startTime,\n        end_time: endTime,\n        amount: amount,\n        type: currentType,\n        name: '',\n        source_name: sourceName\n      };\n      if (!dataBySource[sourceName]) dataBySource[sourceName] = [];\n      dataBySource[sourceName].push(dataItem);\n    }\n  }\n}\n\n// Build results\nfor (const srcItem of excelSourceItems) {\n  const meta = srcItem.json;\n  const emailId = meta.id || '';\n  const emailType = meta._email_type || 'CategoryC'; \n  \n  let defaultSource = '';\n  const subjectText = (meta.Subject || meta.subject || '');\n  const sourceMatch = subjectText.match(/\u3010([A-Za-z]+)\u3011/i);\n  if (sourceMatch) defaultSource = codeToNameMap[sourceMatch[1].toUpperCase()] || sourceMatch[1].toUpperCase();\n\n  let dataForEmail = dataBySource[defaultSource] || [];\n\n  if (!defaultSource && dataForEmail.length == 0) {\n    const allSources = Object.keys(dataBySource);\n    if (allSources.length === 1) {\n      dataForEmail = dataBySource[allSources[0]] || [];\n    } else if (allSources.length > 1) {\n      dataForEmail = allSources.flatMap(name => dataBySource[name] || []);\n    }\n  }\n\n  results.push({\n    json: {\n      _meta_subject: meta.Subject || meta.subject,\n      _meta_from: meta.from || meta.From,\n      _meta_id: emailId,\n      email_id: emailId, \n      _meta_source_name: defaultSource,\n      _email_type: emailType,\n      choices: [{ message: { content: JSON.stringify(dataForEmail.length ? dataForEmail : [{notes: '\u26a0\ufe0fNo data found'}]) } }]\n    }\n  });\n}\n\nreturn results.length ? results : [{ json: { _no_data: true } }];"
      },
      "typeVersion": 2
    },
    {
      "id": "edbfd491-dbab-411d-a3d2-c674ee34020d",
      "name": "IF - Has PPTX/PDF?",
      "type": "n8n-nodes-base.if",
      "position": [
        4368,
        1536
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "boolean",
                "operation": "equals"
              },
              "leftValue": "={{ $json.has_pptx_pdf }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "49b24d00-f787-4b0d-8162-d41f62f6ca0d",
      "name": "ConvertAPI to PNG",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        4592,
        1488
      ],
      "parameters": {
        "url": "=https://v2.convertapi.com/convert/{{ $json.conversion_type }}/to/png",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "contentType": "multipart-form-data",
        "authentication": "genericCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "File",
              "parameterType": "formBinaryData",
              "inputDataFieldName": "={{ $json.conversion_binary_key }}"
            }
          ]
        },
        "genericAuthType": "httpHeaderAuth"
      },
      "typeVersion": 4.2
    },
    {
      "id": "bdd00767-8b17-4d7f-bc38-588ba5a923b9",
      "name": "Prepare for S3 Upload",
      "type": "n8n-nodes-base.code",
      "position": [
        4800,
        1488
      ],
      "parameters": {
        "jsCode": "// ============================================================\n// PREPARE IMAGES FOR S3 UPLOAD\n// ============================================================\n\nconst allItems = $input.all();\nconst results = [];\n\nlet pptxPdfItems = [];\ntry {\n  const allFileItems = $('Detect File Types').all();\n  pptxPdfItems = allFileItems.filter(item => item.json.has_pptx_pdf === true);\n} catch(e) {}\n\nlet filterItems = [];\ntry { filterItems = $('Filter Valid Emails').all(); } catch(e) {}\n\nfor (let idx = 0; idx < allItems.length; idx++) {\n  const item = allItems[idx];\n  const response = item.json;\n  \n  let emailId='', subject='', fromData='', attachments=[], emailType='';\n  \n  if (pptxPdfItems[idx]) {\n    const meta = pptxPdfItems[idx].json;\n    emailId = meta.id || '';\n    subject = meta.Subject || '';\n    fromData = meta.From || '';\n    attachments = meta.attachments || [];\n    emailType = meta._email_type || '';\n  }\n  \n  if (!emailType && emailId) {\n    for (const f of filterItems) {\n      if (f.json.id === emailId && f.json._email_type) {\n        emailType = f.json._email_type;\n        break;\n      }\n    }\n  }\n  emailType = emailType || 'CategoryC';\n  \n  if (response.Files && response.Files.length > 0) {\n    const maxPages = Math.min(response.Files.length, 15);\n    const ts = Date.now();\n    \n    for (let i = 0; i < maxPages; i++) {\n      const fileData = response.Files[i];\n      if (fileData && fileData.FileData) {\n        const fileName = 'convert-' + emailId + '-page-' + (i+1) + '-' + ts + '.png';\n        results.push({\n          json: {\n            _meta_subject: subject,\n            _meta_from: fromData,\n            _meta_id: emailId,\n            _meta_attachments: attachments,\n            _meta_process_type: 'pptx_pdf',\n            _email_type: emailType,\n            _page_index: i,\n            _page_filename: fileName,\n            _total_pages: maxPages\n          },\n          binary: {\n            data: {\n              data: fileData.FileData,\n              mimeType: 'image/png',\n              fileName: fileName,\n              fileExtension: 'png'\n            }\n          }\n        });\n      }\n    }\n  }\n}\n\nif (results.length === 0) return [{ json: { _skip: true } }];\nreturn results;"
      },
      "typeVersion": 2
    },
    {
      "id": "9a844a79-cb50-439f-abcf-ea9ef20d3510",
      "name": "S3 Upload (PPTX)",
      "type": "n8n-nodes-base.awsS3",
      "position": [
        5024,
        1488
      ],
      "parameters": {
        "fileName": "={{ $json._page_filename }}",
        "operation": "upload",
        "bucketName": "={{ $vars.S3_BUCKET_NAME || 'your-bucket-name' }}",
        "additionalFields": {}
      },
      "typeVersion": 2
    },
    {
      "id": "22f24843-db51-4bc5-aac7-bf9cf8034222",
      "name": "Collect S3 URLs",
      "type": "n8n-nodes-base.code",
      "position": [
        5248,
        1488
      ],
      "parameters": {
        "jsCode": "// ============================================================\n// COLLECT S3 URLs FOR GPT VISION\n// ============================================================\n\nlet prepItems = [];\ntry { prepItems = $('Prepare for S3 Upload').all(); } catch(e) {}\n\nconst bucket = $vars.S3_BUCKET_NAME || 'your-bucket-name';\nconst region = $vars.AWS_REGION || 'us-east-1';\nconst emailGroups = new Map();\n\nfor (const prep of prepItems) {\n  const p = prep.json;\n  const emailId = p._meta_id || '';\n  const filename = p._page_filename || '';\n  if (!emailId || !filename) continue;\n  \n  const s3Url = 'https://' + bucket + '.s3.' + region + '.amazonaws.com/' + filename;\n  \n  if (!emailGroups.has(emailId)) {\n    emailGroups.set(emailId, {\n      subject: p._meta_subject || '',\n      from: p._meta_from || '',\n      emailId: emailId,\n      attachments: p._meta_attachments || [],\n      emailType: p._email_type || 'CategoryC',\n      sourceName: p._meta_source_name || '',\n      urls: []\n    });\n  }\n  emailGroups.get(emailId).urls.push({ url: s3Url, index: p._page_index || 0 });\n}\n\nconst results = [];\nfor (const [emailId, data] of emailGroups) {\n  data.urls.sort((a, b) => a.index - b.index);\n  results.push({\n    json: {\n      _meta_subject: data.subject,\n      _meta_from: data.from,\n      _meta_id: data.emailId,\n      _meta_attachments: data.attachments,\n      _meta_source_name: data.sourceName,\n      _email_type: data.emailType,\n      _s3_urls: data.urls.map(u => u.url)\n    }\n  });\n}\n\nif (results.length === 0) return [{ json: { _skip: true } }];\nreturn results;"
      },
      "typeVersion": 2
    },
    {
      "id": "f92643b3-d9a2-4053-b2ca-8f76294e0e67",
      "name": "Prepare GPT Request (PPTX)",
      "type": "n8n-nodes-base.code",
      "position": [
        5472,
        1488
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// ============================================================\n// PREPARE GPT VISION REQUEST FOR PPTX/PDF\n// ============================================================\n// Customize the system prompt and extraction fields\n// ============================================================\n\n// Map codes to full names (customize for your use case)\nconst codeToNameMap = {\n  // 'CODE1': 'Full Name 1',\n  // 'CODE2': 'Full Name 2',\n};\n\nconst item = $input.item.json;\nconst subject = item._meta_subject || '';\nconst emailId = item._meta_id || '';\nconst s3Urls = item._s3_urls || [];\nconst emailType = item._email_type || 'CategoryC';\n\n// Extract source name from subject\nconst sourceMatch = subject.match(/\u3010([A-Za-z]+)\u3011/i);\nconst sourceCode = sourceMatch ? sourceMatch[1].toUpperCase() : '';\nconst sourceName = codeToNameMap[sourceCode] || sourceCode;\n\n// Build prompt based on email type\nlet prompt = '';\nif (emailType === 'CategoryA') {\n  prompt = 'Subject: ' + subject + '\\n\\nThis is a CANCELLATION notice. Extract cancellation details.';\n} else if (emailType === 'CategoryB') {\n  prompt = 'Subject: ' + subject + '\\n\\nThis is an UPDATE notice. Extract the changes.';\n} else {\n  prompt = 'Subject: ' + subject + '\\n\\nExtract all relevant information from this image.';\n}\n\nlet userContent = [{ type: 'text', text: prompt }];\nfor (const url of s3Urls) {\n  userContent.push({ type: 'image_url', image_url: { url: url, detail: 'high' } });\n}\n\n// ============================================================\n// CUSTOMIZE: Update the system prompt and fields for your use case\n// ============================================================\nconst systemPrompt = `You are a data extraction expert. Extract structured information accurately from images.\n\n\u3010Fields to extract\u3011\n- date (YYYY-MM-DD format)\n- day_of_week\n- start_time (HH:MM format)\n- end_time (HH:MM format)\n- source_code\n- name\n- amount (numbers only)\n- category\n- type\n- notes\n\n\u3010Output\u3011JSON array only, no explanation`;\n\nconst gptBody = {\n  model: 'gpt-4o',\n  messages: [\n    { role: 'system', content: systemPrompt },\n    { role: 'user', content: userContent }\n  ],\n  max_tokens: 4096,\n  temperature: 0.1\n};\n\nreturn {\n  json: {\n    _meta_subject: subject,\n    _meta_from: item._meta_from,\n    _meta_id: emailId,\n    _meta_attachments: item._meta_attachments || [],\n    _meta_process_type: 'pptx_pdf',\n    _meta_source_name: sourceName,\n    _email_type: emailType,\n    gpt_request_body: JSON.stringify(gptBody)\n  }\n};"
      },
      "typeVersion": 2
    },
    {
      "id": "eeb04c4a-6278-4f95-b7a1-270ea8052e46",
      "name": "GPT Vision Analysis",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        5664,
        1264
      ],
      "parameters": {
        "url": "https://api.openai.com/v1/chat/completions",
        "method": "POST",
        "options": {
          "timeout": 300000
        },
        "jsonBody": "={{ $json.gpt_request_body }}",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "openAiApi"
      },
      "typeVersion": 4.2
    },
    {
      "id": "59b8220a-6ea1-42d6-88c4-91225668facf",
      "name": "IF - Has Image?",
      "type": "n8n-nodes-base.if",
      "position": [
        4592,
        1648
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "boolean",
                "operation": "equals"
              },
              "leftValue": "={{ $json.has_image }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "68b29f63-ef7c-437a-806b-19ae1e9e1be3",
      "name": "Prepare Image Metadata",
      "type": "n8n-nodes-base.code",
      "position": [
        4800,
        1616
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// ============================================================\n// PREPARE IMAGE METADATA FOR S3 UPLOAD\n// ============================================================\n\nconst item = $input.item;\nconst json = item.json;\nconst subject = json.Subject || '';\nconst emailId = json.id || 'unknown';\n\nlet emailType = json._email_type || '';\nif (!emailType && emailId) {\n  try {\n    const filteredItems = $('Filter Valid Emails').all();\n    for (const f of filteredItems) {\n      if (f.json.id === emailId && f.json._email_type) {\n        emailType = f.json._email_type;\n        break;\n      }\n    }\n  } catch(e) {}\n}\nemailType = emailType || 'CategoryC';\n\nconst timestamp = Date.now();\nconst fileName = 'email-image-' + emailId + '-' + timestamp + '.png';\n\nreturn {\n  json: {\n    _meta_subject: subject,\n    _meta_from: json.From,\n    _meta_id: json.id,\n    _meta_attachments: json.attachments || [],\n    _meta_process_type: 'image',\n    _email_type: emailType,\n    _image_key: json.image_binary_key || 'attachment_0',\n    _s3_filename: fileName\n  },\n  binary: item.binary\n};"
      },
      "typeVersion": 2
    },
    {
      "id": "45f24d85-b9d3-4532-997c-55d6036aaa4a",
      "name": "S3 Upload (Image)",
      "type": "n8n-nodes-base.awsS3",
      "position": [
        5024,
        1648
      ],
      "parameters": {
        "fileName": "={{ $json._s3_filename }}",
        "operation": "upload",
        "bucketName": "={{ $vars.S3_BUCKET_NAME || 'your-bucket-name' }}",
        "additionalFields": {},
        "binaryPropertyName": "={{ $json._image_key }}"
      },
      "typeVersion": 2
    },
    {
      "id": "e6c8c547-bd98-4e4e-9c7f-8efb515349f1",
      "name": "Prepare GPT Request (Image)",
      "type": "n8n-nodes-base.code",
      "position": [
        5248,
        1648
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// ============================================================\n// PREPARE GPT VISION REQUEST FOR IMAGE ATTACHMENTS\n// ============================================================\n\nconst codeToNameMap = {\n  // 'CODE1': 'Full Name 1',\n};\n\nconst item = $input.item.json;\nlet meta = {};\ntry { meta = $('Prepare Image Metadata').first().json; } catch(e) {}\n\nconst subject = meta._meta_subject || '';\nconst emailId = meta._meta_id || '';\nconst emailType = meta._email_type || 'CategoryC';\n\nconst sourceMatch = subject.match(/\u3010([A-Za-z]+)\u3011/i);\nconst sourceCode = sourceMatch ? sourceMatch[1].toUpperCase() : '';\nconst sourceName = codeToNameMap[sourceCode] || sourceCode;\n\nconst imageUrl = item.Location || '';\nif (!imageUrl) return { json: { _skip: true } };\n\nlet prompt = '';\nif (emailType === 'CategoryA') {\n  prompt = 'Subject: ' + subject + '\\n\\nThis is a CANCELLATION notice.';\n} else if (emailType === 'CategoryB') {\n  prompt = 'Subject: ' + subject + '\\n\\nThis is an UPDATE notice.';\n} else {\n  prompt = 'Subject: ' + subject + '\\n\\nExtract all relevant data from this image.';\n}\n\nconst gptBody = {\n  model: 'gpt-4o',\n  messages: [\n    { role: 'system', content: 'You are a data extraction expert. Extract structured information from images.\\n\\n\u3010Fields\u3011date(YYYY-MM-DD), day_of_week, start_time(HH:MM), end_time(HH:MM), source_code, name, amount(numbers only), category, type, notes\\n\\n\u3010Output\u3011JSON array only' },\n    { role: 'user', content: [{ type: 'text', text: prompt }, { type: 'image_url', image_url: { url: imageUrl, detail: 'high' } }] }\n  ],\n  max_tokens: 4096\n};\n\nreturn {\n  json: {\n    _meta_subject: subject,\n    _meta_from: meta._meta_from || '',\n    _meta_id: emailId,\n    _meta_attachments: meta._meta_attachments || [],\n    _meta_process_type: 'image',\n    _meta_source_name: sourceName,\n    _email_type: emailType,\n    gpt_request_body: JSON.stringify(gptBody)\n  }\n};"
      },
      "typeVersion": 2
    },
    {
      "id": "16937348-70e9-46da-89eb-9ab047956476",
      "name": "GPT Image Analysis",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        5472,
        1760
      ],
      "parameters": {
        "url": "https://api.openai.com/v1/chat/completions",
        "method": "POST",
        "options": {},
        "jsonBody": "={{ $json.gpt_request_body }}",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "openAiApi"
      },
      "typeVersion": 4.2
    },
    {
      "id": "3b289245-1499-4dd3-849f-2b39cf04f65a",
      "name": "Prepare GPT Request (Text)",
      "type": "n8n-nodes-base.code",
      "position": [
        4800,
        1840
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// ============================================================\n// PREPARE GPT REQUEST FOR TEXT-ONLY EMAILS\n// ============================================================\n\nconst item = $input.item.json;\nconst subject = item.Subject || '';\nconst emailId = item.id || '';\n\nlet emailType = item._email_type || '';\nif (!emailType && emailId) {\n  try {\n    const filteredItems = $('Filter Valid Emails').all();\n    for (const f of filteredItems) {\n      if (f.json.id === emailId && f.json._email_type) {\n        emailType = f.json._email_type;\n        break;\n      }\n    }\n  } catch(e) {}\n}\nemailType = emailType || 'CategoryC';\n\nconst textBody = item.textPlain || item.text || '';\nlet userPrompt = 'Subject: ' + subject + '\\n\\nBody:\\n' + textBody.substring(0, 3000);\nif (emailType === 'CategoryA') userPrompt = '\u3010CANCELLATION\u3011' + userPrompt;\nelse if (emailType === 'CategoryB') userPrompt = '\u3010UPDATE\u3011' + userPrompt;\n\nconst gptBody = {\n  model: 'gpt-4o-mini',\n  messages: [\n    { role: 'system', content: 'You are a data extraction expert. Extract structured information from email text.\\n\\n\u3010Fields\u3011date, day_of_week, start_time, end_time, source_code, name, amount, category, type, notes\\n\\n\u3010Output\u3011JSON array only' },\n    { role: 'user', content: userPrompt }\n  ],\n  max_tokens: 2048\n};\n\nreturn {\n  json: {\n    _meta_subject: subject,\n    _meta_from: item.From,\n    _meta_id: emailId,\n    _meta_attachments: item.attachments || [],\n    _meta_process_type: 'text',\n    _email_type: emailType,\n    gpt_request_body: JSON.stringify(gptBody)\n  }\n};"
      },
      "typeVersion": 2
    },
    {
      "id": "27d9b48b-9762-43e5-b624-f810651664e3",
      "name": "GPT Text Analysis",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        5024,
        1840
      ],
      "parameters": {
        "url": "https://api.openai.com/v1/chat/completions",
        "method": "POST",
        "options": {},
        "jsonBody": "={{ $json.gpt_request_body }}",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "openAiApi"
      },
      "typeVersion": 4.2
    },
    {
      "id": "f0ac5f4d-bf3b-4a45-ab48-b1cf17f43ce1",
      "name": "Merge All Paths",
      "type": "n8n-nodes-base.merge",
      "position": [
        5904,
        1536
      ],
      "parameters": {},
      "typeVersion": 3,
      "alwaysOutputData": true
    },
    {
      "id": "0a1f3790-23fa-4b48-a1c6-857c88758b86",
      "name": "Parse All Results",
      "type": "n8n-nodes-base.code",
      "position": [
        6128,
        1536
      ],
      "parameters": {
        "jsCode": "// ============================================================\n// PARSE ALL GPT RESPONSES AND FORMAT FOR OUTPUT\n// ============================================================\n// Customize the output schema for your Google Sheets\n// ============================================================\n\n// Map codes to full names (customize for your use case)\nconst codeToNameMap = {\n  // 'CODE1': 'Full Name 1',\n  // 'CODE2': 'Full Name 2',\n};\n\nconst allItems = $input.all();\nconst results = [];\n\nlet pptxPrepItems = [], imagePrepItems = [], textPrepItems = [];\ntry { pptxPrepItems = $('Prepare GPT Request (PPTX)').all(); } catch(e) {}\ntry { imagePrepItems = $('Prepare GPT Request (Image)').all(); } catch(e) {}\ntry { textPrepItems = $('Prepare GPT Request (Text)').all(); } catch(e) {}\n\nlet pptxPrepIndex = 0, imagePrepIndex = 0, textPrepIndex = 0;\n\nfor (let i = 0; i < allItems.length; i++) {\n  const item = allItems[i];\n  const data = item.json;\n  \n  const hasMetadata = !!data._meta_id;\n  let subject, from, emailId, attachments, sourceName, emailType;\n  \n  if (hasMetadata) {\n    subject = data._meta_subject || '';\n    from = data._meta_from || '';\n    emailId = data._meta_id || '';\n    attachments = data._meta_attachments || [];\n    sourceName = data._meta_source_name || '';\n    emailType = data._email_type || 'CategoryC';\n  } else {\n    const modelName = data.model || '';\n    const isTextModel = modelName.includes('gpt-4o-mini');\n    let meta = null;\n    if (isTextModel && textPrepIndex < textPrepItems.length) {\n      meta = textPrepItems[textPrepIndex].json;\n      textPrepIndex++;\n    } else if (imagePrepIndex < imagePrepItems.length) {\n      meta = imagePrepItems[imagePrepIndex].json;\n      imagePrepIndex++;\n    } else if (pptxPrepIndex < pptxPrepItems.length) {\n      meta = pptxPrepItems[pptxPrepIndex].json;\n      pptxPrepIndex++;\n    }\n\n    if (meta) {\n      subject = meta._meta_subject || '';\n      from = meta._meta_from || '';\n      emailId = meta._meta_id || '';\n      attachments = meta._meta_attachments || [];\n      sourceName = meta._meta_source_name || '';\n      emailType = meta._email_type || 'CategoryC';\n    } else {\n      subject = ''; from = ''; emailId = ''; attachments = []; sourceName = ''; emailType = 'CategoryC';\n    }\n  }\n  \n  if (from && typeof from === 'object') {\n    from = from.text || from.address || (from.value && from.value[0] ? from.value[0].address : '') || '';\n  }\n  from = String(from || '');\n  \n  let content = '';\n  if (data.choices && data.choices[0] && data.choices[0].message) {\n    content = data.choices[0].message.content || '';\n  }\n  \n  // Check for unreadable content\n  const unreadablePatterns = ['sorry', 'cannot read', 'unable to', 'not able to'];\n  const isUnreadable = unreadablePatterns.some(p => content.toLowerCase().includes(p));\n  \n  if (isUnreadable) {\n    results.push({\n      json: {\n        email_type: emailType,\n        date: '', day_of_week: '', start_time: '', end_time: '',\n        source_name: sourceName, name: '', amount: '0', category: '', type: '',\n        status: 'Needs Review', client: '',\n        notes: '\u26a0\ufe0fFile unreadable - manual review required',\n        created_at: new Date().toISOString(),\n        email_subject: subject, sender: from, email_id: emailId,\n        attachments: attachments.map(a => a.filename || '').join(', '),\n        _unreadable: true\n      }\n    });\n    continue;\n  }\n  \n  // Parse JSON from GPT response\n  let items = [];\n  try {\n    let jsonStr = content;\n    const codeMatch = content.match(/```json\\s*([\\s\\S]*?)\\s*```/);\n    if (codeMatch) jsonStr = codeMatch[1];\n    else {\n      const arrMatch = content.match(/\\[[\\s\\S]*\\]/);\n      if (arrMatch) jsonStr = arrMatch[0];\n    }\n    const parsed = JSON.parse(jsonStr);\n    items = Array.isArray(parsed) ? parsed : [parsed];\n  } catch(e) { \n    if (emailId) {\n      results.push({\n        json: {\n          email_type: emailType,\n          date: '', day_of_week: '', start_time: '', end_time: '',\n          source_name: sourceName, name: '', amount: '0', category: '', type: '',\n          status: 'Needs Review', client: '',\n          notes: '\u26a0\ufe0fParse failed - manual review required',\n          created_at: new Date().toISOString(),\n          email_subject: subject, sender: from, email_id: emailId,\n          attachments: attachments.map(a => a.filename || '').join(', '),\n          _unreadable: true\n        }\n      });\n    }\n    continue;\n  }\n  \n  if (items.length === 0) continue;\n  \n  // ============================================================\n  // CUSTOMIZE: Map extracted data to your output schema\n  // ============================================================\n  for (const s of items) {\n    let sourceNameFinal = sourceName || s.source_name || '';\n    if (s.source_code) {\n      const code = s.source_code.toUpperCase();\n      sourceNameFinal = codeToNameMap[code] || sourceNameFinal || code;\n    }\n    \n    // Determine category from time (customize as needed)\n    let category = s.category || '';\n    if (!category && s.start_time) {\n      const h = parseInt(s.start_time.split(':')[0]) || 0;\n      if (h >= 5 && h < 10) category = 'Morning';\n      else if (h >= 10 && h < 15) category = 'Afternoon';\n      else if (h >= 15 && h < 23) category = 'Evening';\n      else category = 'Night';\n    }\n\n    results.push({\n      json: {\n        email_type: emailType,\n        date: s.date || '',\n        day_of_week: s.day_of_week || '',\n        start_time: s.start_time || '',\n        end_time: s.end_time || '',\n        source_name: sourceNameFinal,\n        name: s.name || '',\n        amount: String(s.amount || 0),\n        category: category,\n        type: s.type || 'Regular',\n        status: 'Pending',\n        client: '',\n        notes: s.notes || '',\n        created_at: new Date().toISOString(),\n        email_subject: subject,\n        sender: from,\n        email_id: emailId,\n        attachments: attachments.map(a => a.filename || '').join(', ')\n      }\n    });\n  }\n}\n\nif (results.length === 0) return [{ json: { _no_data: true } }];\nreturn results;"
      },
      "typeVersion": 2
    },
    {
      "id": "3b91edb4-d360-4441-ac58-bcaf1da85514",
      "name": "Filter Valid Data",
      "type": "n8n-nodes-base.filter",
      "position": [
        6352,
        1536
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "string",
                "operation": "notEmpty",
                "singleValue": true
              },
              "leftValue": "={{ $json[\"email_type\"] }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "07942913-2997-42f5-a93f-b4459ca4c119",
      "name": "Save to Google Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        6560,
        1536
      ],
      "parameters": {
        "columns": {
          "value": {},
          "mappingMode": "autoMapInputData"
        },
        "options": {},
        "operation": "append",
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "id",
          "value": "={{ $vars.GOOGLE_SHEET_ID || 'your-google-sheet-id' }}"
        }
      },
      "typeVersion": 4.4
    },
    {
      "id": "93650748-aff3-423d-afa6-48dd53b5230b",
      "name": "Extract Email IDs",
      "type": "n8n-nodes-base.code",
      "position": [
        6784,
        1536
      ],
      "parameters": {
        "jsCode": "// ============================================================\n// EXTRACT UNIQUE EMAIL IDs FOR LABELING\n// ============================================================\n\nconst allItems = $input.all();\nconst emailMap = new Map();\n\nfunction addEmail(emailId, emailType, isUnreadable) {\n  if (!emailId) return;\n  if (!emailMap.has(emailId)) {\n    emailMap.set(emailId, { type: isUnreadable ? 'NeedsReview' : emailType });\n  }\n  if (isUnreadable) emailMap.get(emailId).type = 'NeedsReview';\n}\n\nfor (const item of allItems) {\n  const emailId = item.json.email_id;\n  const emailType = item.json.email_type || 'CategoryC';\n  const isUnreadable = item.json._unreadable === true;\n  addEmail(emailId, emailType, isUnreadable);\n}\n\ntry {\n  const filteredItems = $('Filter Valid Emails').all();\n  for (const f of filteredItems) {\n    addEmail(f.json.id, f.json._email_type || 'CategoryC', false);\n  }\n} catch (e) {}\n\nreturn Array.from(emailMap.entries()).map(([id, data]) => ({\n  json: { email_id: id, email_type: data.type }\n}));"
      },
      "typeVersion": 2
    },
    {
      "id": "613a3c8a-e875-48f5-b75b-1b121b92f173",
      "name": "Get Gmail Labels",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        6784,
        1760
      ],
      "parameters": {
        "url": "https://gmail.googleapis.com/gmail/v1/users/me/labels",
        "options": {},
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "gmailOAuth2"
      },
      "typeVersion": 4.2
    },
    {
      "id": "dab2e312-225d-4df4-a45d-4b1ab1a37558",
      "name": "Check Labels",
      "type": "n8n-nodes-base.code",
      "position": [
        7008,
        1760
      ],
      "parameters": {
        "jsCode": "// ============================================================\n// CHECK WHICH LABELS EXIST\n// ============================================================\n// Customize label names for your use case\n// ============================================================\n\nconst response = $input.first().json;\nconst labels = response.labels || [];\n\n// Define your label names here\nconst labelNames = [\n  'Processed_CategoryA',\n  'Processed_CategoryB', \n  'Processed_CategoryC'\n];\n\nconst existingLabels = {};\nconst missingLabels = [];\n\nfor (const name of labelNames) {\n  const found = labels.find(l => l.name === name);\n  if (found) existingLabels[name] = found.id;\n  else missingLabels.push(name);\n}\n\nreturn [{ \n  json: { \n    existing_labels: existingLabels, \n    missing_labels: missingLabels, \n    needs_creation: missingLabels.length > 0 \n  } \n}];"
      },
      "typeVersion": 2
    },
    {
      "id": "f169d45b-e952-4df5-bd93-7e61a4f07980",
      "name": "IF - Need Labels?",
      "type": "n8n-nodes-base.if",
      "position": [
        7232,
        1760
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "boolean",
                "operation": "equals"
              },
              "leftValue": "={{ $json.needs_creation }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "137c484a-e4ce-4996-b603-4f2164309dc3",
      "name": "Prepare Label Creation",
      "type": "n8n-nodes-base.code",
      "position": [
        7440,
        1664
      ],
      "parameters": {
        "jsCode": "// Prepare label creation requests\nconst items = $input.all();\nconst results = [];\nfor (const item of items) {\n  const data = item.json || {};\n  const missingLabels = data.missing_labels || [];\n  const existingLabels = data.existing_labels || {};\n  if (missingLabels.length === 0) results.push({ json: { label_name: '', existing_labels: existingLabels, _skip_creation: true } });\n  else for (const name of missingLabels) results.push({ json: { label_name: name, existing_labels: existingLabels, _skip_creation: false } });\n}\nif (results.length === 0) return [{ json: { label_name: '', existing_labels: {}, _skip_creation: true } }];\nreturn results;"
      },
      "typeVersion": 2
    },
    {
      "id": "28717cfd-d0dc-4040-b3fa-dd2982e05819",
      "name": "Filter Labels to Create",
      "type": "n8n-nodes-base.filter",
      "position": [
        7664,
        1664
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "boolean",
                "operation": "notEquals"
              },
              "leftValue": "={{ $json._skip_creation }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "5f7f79dc-0cb8-43b8-b938-6539881c02d8",
      "name": "Create Gmail Label",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        7888,
        1664
      ],
      "parameters": {
        "url": "https://gmail.googleapis.com/gmail/v1/users/me/labels",
        "method": "POST",
        "options": {},
        "jsonBody": "={\"name\": \"{{ $json.label_name }}\", \"labelListVisibility\": \"labelShow\", \"messageListVisibility\": \"show\"}",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "gmailOAuth2"
      },
      "typeVersion": 4.2
    },
    {
      "id": "13873411-e770-4f62-a957-f0dd3690b26d",
      "name": "Get New Label IDs",
      "type": "n8n-nodes-base.code",
      "position": [
        8112,
        1664
      ],
      "parameters": {
        "jsCode": "// Collect newly created label IDs\nconst allItems = $input.all();\nconst createdLabels = {};\nfor (const item of allItems) {\n  const data = item.json;\n  if (data.id && data.name) createdLabels[data.name] = data.id;\n}\nlet existingLabels = {};\ntry { const labelCheckData = $('Check Labels').first().json; existingLabels = labelCheckData.existing_labels || {}; } catch(e) {}\nreturn [{ json: { all_labels: { ...existingLabels, ...createdLabels } } }];"
      },
      "typeVersion": 2
    },
    {
      "id": "8ca950ac-7f74-4596-adf7-ca74e6b191d3",
      "name": "Pass Existing Labels",
      "type": "n8n-nodes-base.code",
      "position": [
        7440,
        1856
      ],
      "parameters": {
        "jsCode": "// Pass through existing labels when no creation needed\nconst items = $input.all();\nlet existingLabels = {};\nfor (const item of items) {\n  const data = item.json || {};\n  if (data.existing_labels) existingLabels = { ...existingLabels, ...data.existing_labels };\n}\nreturn [{ json: { all_labels: existingLabels } }];"
      },
      "typeVersion": 2
    },
    {
      "id": "bebcee89-dc11-4bab-8a9c-e72744778b07",
      "name": "Merge Label IDs",
      "type": "n8n-nodes-base.code",
      "position": [
        8112,
        1856
      ],
      "parameters": {
        "jsCode": "// Merge all label IDs\nconst items = $input.all();\nlet allLabels = {};\nfor (const item of items) {\n  const data = item.json || {};\n  const labels = data.all_labels || data.existing_labels || {};\n  allLabels = { ...allLabels, ...labels };\n}\nreturn [{ json: { all_labels: allLabels } }];"
      },
      "typeVersion": 2
    },
    {
      "id": "ec6d8a00-ade2-4047-84fa-2f448cceb979",
      "name": "Prepare Label Application",
      "type": "n8n-nodes-base.code",
      "position": [
        8320,
        1776
      ],
      "parameters": {
        "jsCode": "// ============================================================\n// PREPARE LABEL APPLICATION FOR EACH EMAIL\n// ============================================================\n// Customize the type-to-label mapping\n// ============================================================\n\nconst inputItems = $input.all();\nconst labelInfo = inputItems[0]?.json?.all_labels\n  ? inputItems[0].json\n  : (() => {\n      try { return $('Merge Label IDs').first().json; } catch (e) { return {}; }\n    })();\n\nlet emailItems = [];\ntry { emailItems = $('Extract Email IDs').all(); } catch (e) { emailItems = []; }\nif (emailItems.length === 0) {\n  try {\n    const filteredItems = $('Filter Valid Emails').all();\n    emailItems = filteredItems\n      .map(item => ({\n        json: {\n          email_id: item.json.id || '',\n          email_type: item.json._email_type || 'CategoryC'\n        }\n      }))\n      .filter(item => item.json.email_id);\n  } catch (e) {\n    emailItems = [];\n  }\n}\n\nconst allLabels = labelInfo.all_labels || {};\n\n// Map email types to label names (customize these)\nconst typeToLabel = { \n  'CategoryA': 'Processed_CategoryA', \n  'CategoryB': 'Processed_CategoryB', \n  'CategoryC': 'Processed_CategoryC' \n};\n\nconst results = [];\n\nfor (const item of emailItems) {\n  const emailType = item.json.email_type || 'CategoryC';\n  const labelName = typeToLabel[emailType] || 'Processed_CategoryC';\n  const labelId = allLabels[labelName];\n  if (labelId) {\n    results.push({\n      json: {\n        email_id: item.json.email_id,\n        email_type: emailType,\n        label_name: labelName,\n        label_id: labelId,\n        request_body: JSON.stringify({ addLabelIds: [labelId] })\n      }\n    });\n  }\n}\n\nif (results.length === 0) return [{ json: { _no_emails: true } }];\nreturn results;"
      },
      "typeVersion": 2
    },
    {
      "id": "b6a78bf7-4974-47ee-8500-440c30991c39",
      "name": "Has Emails?",
      "type": "n8n-nodes-base.filter",
      "position": [
        8544,
        1776
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 1,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "operator": {
                "type": "boolean",
                "operation": "notEquals"
              },
              "leftValue": "={{ $json._no_emails }}",
              "rightValue": true
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "f1e17c5b-24d1-4168-9b06-f24be083c53f",
      "name": "Apply Gmail Labels",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        8768,
        1776
      ],
      "parameters": {
        "url": "=https://gmail.googleapis.com/gmail/v1/users/me/messages/{{ $json.email_id }}/modify",
        "method": "POST",
        "options": {},
        "jsonBody": "={{ $json.request_body }}",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "gmailOAuth2"
      },
      "typeVersion": 4.2
    },
    {
      "id": "config-variables-note",
      "name": "Configuration Variables",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2384,
        2040
      ],
      "parameters": {
        "color": 4,
        "width": 500,
        "height": 300,
        "content": "###  Required Variables\n\nSet these in n8n Settings > Variables:\n\n| Variable | Description |\n|----------|-------------|\n| `S3_BUCKET_NAME` | Your AWS S3 bucket name |\n| `AWS_REGION` | AWS region (e.g., us-east-1) |\n| `GOOGLE_SHEET_ID` | Google Sheets document ID |\n\n### Required Credentials\n\n- **Gmail OAuth2** - For email access\n- **OpenAI API** - For GPT Vision\n- **AWS S3** - For image storage\n- **Google Sheets OAuth2** - For data output\n- **ConvertAPI HTTP Header Auth** - For file conversion"
      },
      "typeVersion": 1
    }
  ],
  "connections": {
    "Has Emails?": {
      "main": [
        [
          {
            "node": "Apply Gmail Labels",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Labels": {
      "main": [
        [
          {
            "node": "IF - Need Labels?",
            "type": "m
Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

Automatically extract structured information from emails using AI-powered document analysis. This workflow processes emails from specified domains, classifies them by type, and extracts structured data from various attachment formats.

Source: https://n8n.io/workflows/13004/ — original creator credit. Request a take-down →

More Email & Gmail workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Email & Gmail

What This Flow Does

Gmail, Google Sheets, HTTP Request +1
Email & Gmail

This n8n template allows you to automatically monitor your company's budget by comparing live Bexio accounting data against targets defined in Google Sheets, sending automated weekly email reports. It

Google Sheets, HTTP Request, Gmail
Email & Gmail

This workflow streamlines HR outreach by fetching contact data, validating emails, enforcing daily sending limits, and sending personalized emails with attachments, all while logging activity. Read HR

HTTP Request, Gmail, Google Sheets
Email & Gmail

Activate this workflow once and every Friday at 5PM it automatically pulls your week's meeting data from Fireflies, calculates seven metrics, and emails a formatted report to your manager inbox. It tr

HTTP Request, Google Sheets, Gmail
Email & Gmail

This workflow automatically monitors solar energy production every 2 hours by fetching data from the Energidataservice API. If the energy output falls below a predefined threshold, it instantly notifi

HTTP Request, Gmail, Google Sheets +1