{
  "id": "CEyPK3FFnZ8Y66GV",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "AI Company Data Enrichment Pipeline",
  "tags": [],
  "nodes": [
    {
      "id": "69da91e3-bcf5-4619-b6e3-d9724284f905",
      "name": "\ud83d\udccb Setup Instructions",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -192,
        208
      ],
      "parameters": {
        "color": 5,
        "width": 420,
        "height": 588,
        "content": "## \ud83c\udfd7\ufe0f SETUP REQUIRED\n\n**Before running this workflow:**\n\n1. **Source Sheet**: Connect your Google Sheets credential and set the `Read Source Sheet` node to point to your input spreadsheet.\n   - Required columns: `Company Name`, `Website`, `Enriched`\n   - Leave `Enriched` blank for new rows\n\n2. **Output Sheet**: Set the `Append Enriched Row` node to your destination spreadsheet.\n   - Required columns: `Company Name`, `Website`, `Industry`, `Description`, `Target Customers`, `Headquarters`, `Estimated Size`\n\n3. **AI Model**: Connect your Google Gemini API credential in the `Gemini Flash` node.\n\n4. **Search Tool**: Connect your SerpAPI credential in the `Web Search (SerpAPI)` node.\n\n5. **Mark as Enriched**: Point the `Mark as Enriched` node at your source spreadsheet (same as step 1). It sets the `Enriched` column to `Yes`, matching rows by `Company Name`, so company names should be unique.\n\n---\n\ud83d\udca1 *Tip: Duplicate your source sheet before the first run.*"
      },
      "typeVersion": 1
    },
    {
      "id": "04fd3c13-9559-42ee-913d-85cf6459346b",
      "name": "\ud83d\udccb Section: Read & Filter",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        304,
        368
      ],
      "parameters": {
        "color": 7,
        "width": 520,
        "height": 196,
        "content": "## \u2460 READ & FILTER\n\nReads all companies from your source Google Sheet, extracts only the relevant fields (`Company Name`, `Website`, `Enriched`), then skips any row already marked `Enriched = Yes`.\n\n**Customise:** Change the sheet/tab in `Read Source Sheet`."
      },
      "typeVersion": 1
    },
    {
      "id": "1f4cf9ae-8bbc-4f7b-b305-873a0a705e65",
      "name": "\ud83d\udccb Section: AI Enrichment",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1008,
        320
      ],
      "parameters": {
        "color": 6,
        "width": 520,
        "height": 252,
        "content": "## \u2461 AI ENRICHMENT\n\nFor each unenriched company, the AI Agent:\n- Searches the web via SerpAPI\n- Extracts: industry, description, target customers, HQ, and estimated size\n- Returns structured JSON via the Output Parser\n\n**Customise:** Swap `Gemini Flash` for any LangChain-compatible model (OpenAI, Claude, etc.)."
      },
      "typeVersion": 1
    },
    {
      "id": "a1e14b76-848e-4a34-a209-a19d6bf236fe",
      "name": "\ud83d\udccb Section: Write Output",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1568,
        368
      ],
      "parameters": {
        "color": 4,
        "width": 520,
        "height": 184,
        "content": "## \u2462 WRITE OUTPUT\n\nCleans the enriched data (converts the `target_customers` array to a comma-separated string), appends a new row to the output sheet, then marks the source row as `Enriched = Yes` to prevent duplicate processing.\n\n**Customise:** Add or remove columns in `Append Enriched Row`."
      },
      "typeVersion": 1
    },
    {
      "id": "b9585aca-aebd-4284-ad1a-d9e5d364bd52",
      "name": "\u25b6 Run Enrichment",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        304,
        608
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "47fd03d7-936d-4583-bbca-ad486e89bf7f",
      "name": "Read Source Sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        512,
        608
      ],
      "parameters": {
        "options": {},
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "id",
          "value": "YOUR_SOURCE_SHEET_ID"
        }
      },
      "typeVersion": 4.5
    },
    {
      "id": "8235c4cf-ba6a-45e5-b7b6-022152c634fe",
      "name": "Extract Relevant Fields",
      "type": "n8n-nodes-base.set",
      "position": [
        736,
        608
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "field-company-name",
              "name": "Company Name",
              "type": "string",
              "value": "={{ $json[\"Company Name\"] }}"
            },
            {
              "id": "field-website",
              "name": "Website",
              "type": "string",
              "value": "={{ $json.Website }}"
            },
            {
              "id": "field-enriched",
              "name": "Enriched",
              "type": "string",
              "value": "={{ $json.Enriched }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "c331d771-b80a-4c41-bda7-c7754ddd1869",
      "name": "Skip Already Enriched",
      "type": "n8n-nodes-base.if",
      "position": [
        912,
        608
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": false,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "check-not-enriched",
              "operator": {
                "type": "string",
                "operation": "notEquals"
              },
              "leftValue": "={{ $json.Enriched }}",
              "rightValue": "Yes"
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "5c45a5f6-803f-485e-aba3-c1b032b999a6",
      "name": "AI Enrichment Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "maxTries": 3,
      "position": [
        1152,
        592
      ],
      "parameters": {
        "text": "=Company Name: {{ $json[\"Company Name\"] }}\nWebsite URL: {{ $json.Website }}",
        "options": {
          "systemMessage": "You are a company data enrichment engine.\n\nYour task is to research the provided company using its name and website, then return ONLY valid JSON matching the required schema.\n\nRules:\n1. Return ONLY JSON: no markdown, no explanations, no extra text.\n2. Use exact field names from the schema.\n3. If a value cannot be determined with confidence, return an empty string (\"\"), or an empty array ([]) for target_customers. This takes precedence over the allowed-value lists below.\n4. Keep description under 20 words.\n5. target_customers must always be an array of strings.\n6. headquarters must be formatted as: \"City, Country\".\n7. estimated_size must be exactly one of: Startup | Small | Medium | Large | Enterprise\n8. industry must be exactly one of: AI | Software | Cybersecurity | FinTech | HealthTech | EdTech | E-commerce | Marketing | Manufacturing | Consulting | Logistics | Telecommunications | Media | Real Estate | Energy | Biotechnology | Finance | Healthcare | Retail | Other\n9. target_customers must only contain values from: Businesses | Enterprises | Startups | Developers | Researchers | Consumers | Government | Educational Institutions | Healthcare Providers | Financial Institutions | Nonprofits | Tech Companies | Small Businesses\n10. Always populate company_name and website from the input exactly as given.\n11. Never invent or hallucinate data. Research first, then respond."
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "retryOnFail": true,
      "typeVersion": 3,
      "waitBetweenTries": 2000
    },
    {
      "id": "95e8f953-5229-4ae3-ad60-8904a0fb1891",
      "name": "Structured Output Parser",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        1392,
        832
      ],
      "parameters": {
        "jsonSchemaExample": "{\n  \"company_name\": \"OpenAI\",\n  \"website\": \"openai.com\",\n  \"industry\": \"AI\",\n  \"description\": \"Develops AI models and products for businesses and consumers\",\n  \"target_customers\": [\n    \"Businesses\",\n    \"Developers\"\n  ],\n  \"headquarters\": \"San Francisco, United States\",\n  \"estimated_size\": \"Large\"\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "bdb816a4-d9e2-4ad1-ba66-270b10d6c71e",
      "name": "Web Search (SerpAPI)",
      "type": "@n8n/n8n-nodes-langchain.toolSerpApi",
      "position": [
        1216,
        832
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "5c6c2bc6-1013-4951-8062-1347ff613a49",
      "name": "Gemini Flash",
      "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
      "position": [
        1056,
        832
      ],
      "parameters": {
        "options": {},
        "modelName": "models/gemini-2.0-flash"
      },
      "typeVersion": 1
    },
    {
      "id": "e3ad40be-8f07-441b-889a-812ff53ecaa8",
      "name": "Normalize Output Data",
      "type": "n8n-nodes-base.code",
      "position": [
        1472,
        592
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// Flatten the agent's structured output into the flat columns the sheet expects.\n// Fall back to an empty object so a missing `output` does not throw.\nconst output = $json.output || {};\n\n// Join the target_customers array into a comma-separated string;\n// pass a plain string through unchanged.\nconst targets = Array.isArray(output.target_customers)\n  ? output.target_customers.join(\", \")\n  : (output.target_customers || \"\");\n\nreturn {\n  company_name:     output.company_name   || \"\",\n  website:          output.website        || \"\",\n  industry:         output.industry       || \"\",\n  description:      output.description    || \"\",\n  target_customers: targets,\n  // Also accept the misspelled key `headquaters` in case the model emits it.\n  headquarters:     output.headquarters   || output.headquaters || \"\",\n  estimated_size:   output.estimated_size || \"\"\n};\n"
      },
      "typeVersion": 2
    },
    {
      "id": "999f2a2a-9459-477a-9797-d760c258406e",
      "name": "Append Enriched Row",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1664,
        592
      ],
      "parameters": {
        "columns": {
          "value": {
            "Website": "={{ $json.website }}",
            "Industry": "={{ $json.industry }}",
            "Description": "={{ $json.description }}",
            "Company Name": "={{ $json.company_name }}",
            "Headquarters": "={{ $json.headquarters }}",
            "Estimated Size": "={{ $json.estimated_size }}",
            "Target Customers": "={{ $json.target_customers }}"
          },
          "schema": [
            {
              "id": "Company Name",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Company Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Website",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Website",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Industry",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Industry",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Description",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Description",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Target Customers",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Target Customers",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Headquarters",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Headquarters",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Estimated Size",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Estimated Size",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "Company Name"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "append",
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "Enriched"
        },
        "documentId": {
          "__rl": true,
          "mode": "id",
          "value": "YOUR_OUTPUT_SHEET_ID"
        }
      },
      "typeVersion": 4.5
    },
    {
      "id": "56044593-f73d-4e87-86c3-9b2291e4baf3",
      "name": "Mark as Enriched",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1856,
        592
      ],
      "parameters": {
        "columns": {
          "value": {
            "Enriched": "Yes",
            "Company Name": "={{ $('Normalize Output Data').item.json.company_name }}"
          },
          "schema": [
            {
              "id": "Company Name",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Company Name",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Enriched",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Enriched",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "Company Name"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "update",
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "id",
          "value": "YOUR_SOURCE_SHEET_ID"
        }
      },
      "typeVersion": 4.5
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "19fd4c5c-fc60-45b1-9a5b-6c3971453e99",
  "connections": {
    "Gemini Flash": {
      "ai_languageModel": [
        [
          {
            "node": "AI Enrichment Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Read Source Sheet": {
      "main": [
        [
          {
            "node": "Extract Relevant Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\u25b6 Run Enrichment": {
      "main": [
        [
          {
            "node": "Read Source Sheet",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "AI Enrichment Agent": {
      "main": [
        [
          {
            "node": "Normalize Output Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Append Enriched Row": {
      "main": [
        [
          {
            "node": "Mark as Enriched",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Web Search (SerpAPI)": {
      "ai_tool": [
        [
          {
            "node": "AI Enrichment Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "Normalize Output Data": {
      "main": [
        [
          {
            "node": "Append Enriched Row",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Skip Already Enriched": {
      "main": [
        [
          {
            "node": "AI Enrichment Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Relevant Fields": {
      "main": [
        [
          {
            "node": "Skip Already Enriched",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser": {
      "ai_outputParser": [
        [
          {
            "node": "AI Enrichment Agent",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    }
  }
}