AutomationFlowsWeb Scraping › Openai Gpt-3: Company Enrichment From Website Content

Openai Gpt-3: Company Enrichment From Website Content

ByLucas Perret @lempire on n8n.io

Enrich your company lists with OpenAI GPT-3 ↓

Event trigger★★★★☆ complexityAI-powered11 nodesHTTP RequestHtml ExtractOpenAIGoogle Sheets
Web Scraping Trigger: Event Nodes: 11 Complexity: ★★★★☆ AI nodes: yes Added:

This workflow corresponds to n8n.io template #1862 — we link there as the canonical source.

This workflow follows the Google Sheets → HTTP Request recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "nodes": [
    {
      "id": "04750e9b-6ce3-401b-89e7-f1f17f3a4a28",
      "name": "When clicking \"Execute Workflow\"",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -180,
        300
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "7a8bb997-5a2d-4ee0-a1ca-bebe9fe32bc2",
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        640,
        460
      ],
      "parameters": {
        "url": "=https://www.{{ $node[\"Split In Batches\"].json[\"Domain\"] }}",
        "options": {
          "redirect": {
            "redirect": {
              "followRedirects": true
            }
          }
        }
      },
      "typeVersion": 3,
      "continueOnFail": true
    },
    {
      "id": "6409f0c4-bf93-4a1d-a74c-e294fb39895f",
      "name": "HTML Extract",
      "type": "n8n-nodes-base.htmlExtract",
      "position": [
        820,
        460
      ],
      "parameters": {
        "options": {
          "trimValues": false
        },
        "extractionValues": {
          "values": [
            {
              "key": "body",
              "cssSelector": "html"
            }
          ]
        }
      },
      "typeVersion": 1,
      "continueOnFail": true
    },
    {
      "id": "f45fcc6a-9ccd-43c9-9eaf-1797768e1e62",
      "name": "OpenAI",
      "type": "n8n-nodes-base.openAi",
      "position": [
        1140,
        460
      ],
      "parameters": {
        "prompt": "=This is the content of the website {{ $node[\"Split In Batches\"].json[\"Domain\"] }}:\"{{ $json[\"contentShort\"] }}\"\n\nIn a JSON format:\n\n- Give me the value proposition of the company. In less than 25 words. In English. Casual Tone. Format is: \"[Company Name] helps [target audience] [achieve desired outcome] and [additional benefit]\"\n\n- Give me the industry of the company. (Classify using this industry list: [Agriculture, Arts, Construction, Consumer Goods, Education, Entertainment, Finance, Other, Health Care, Legal, Manufacturing, Media & Communications, Public Administration, Advertisements, Real Estate, Recreation & Travel, Retail, Software, Transportation & Logistics, Wellness & Fitness] if it's ambiguous between Sofware and Consumer Goods, prefer Consumer Goods)\n\n- Guess the target audience of each company.(Classify and choose 1 from this list: [sales teams, marketing teams, HR teams, customer Service teams, consumers, C-levels] Write it in lowercase)\n\n- Tell me if they are B2B or B2C\n\nformat should be:\n{\"value_proposition\": value_proposition,\n\"industry\": industry,\n\"target_audience\": target_audience, \n\"market\": market }\n\nJSON:",
        "options": {
          "topP": 1,
          "maxTokens": 120,
          "temperature": 0
        }
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1,
      "continueOnFail": true
    },
    {
      "id": "8de6c3d4-316f-4e00-a9f5-a4deefce90b3",
      "name": "Merge",
      "type": "n8n-nodes-base.merge",
      "position": [
        1600,
        320
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "combinationMode": "mergeByPosition"
      },
      "typeVersion": 2
    },
    {
      "id": "669f888e-1416-4291-a854-07ffbbbfcab1",
      "name": "Clean Content",
      "type": "n8n-nodes-base.code",
      "position": [
        980,
        460
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "if ($input.item.json.body){\n\n\n\n$input.item.json.content = $input.item.json.body.replaceAll('/^\\s+|\\s+$/g', '').replace('/(\\r\\n|\\n|\\r)/gm', \"\").replace(/\\s+/g, ' ')\n\n\n  $input.item.json.contentShort = $input.item.json.content.slice(0, 10000)\n}\n\n\n\n\nreturn $input.item"
      },
      "executeOnce": false,
      "typeVersion": 1,
      "continueOnFail": true,
      "alwaysOutputData": true
    },
    {
      "id": "dbd5f866-2f5e-4adf-b1b5-a27b08c0425a",
      "name": "Update Google Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1840,
        320
      ],
      "parameters": {
        "options": {},
        "fieldsUi": {
          "values": [
            {
              "column": "Market",
              "fieldValue": "={{ $json[\"market\"] }}"
            },
            {
              "column": "Industry",
              "fieldValue": "={{ $json[\"industry\"] }}"
            },
            {
              "column": "Value Proposition",
              "fieldValue": "={{ $json[\"value_proposition\"] }}"
            },
            {
              "column": "Target Audience",
              "fieldValue": "={{ $json[\"target_audience\"] }}"
            }
          ]
        },
        "operation": "update",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/13h8HPWKha5kZHDeKxAPQvQqAOonof5cgpxzh79tIQfY/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "url",
          "value": "https://docs.google.com/spreadsheets/d/18iZ59I0q2AeElqcEpyJECNlSv4M6iJll9PQzXQkqEUk/edit#gid=0",
          "__regex": "https:\\/\\/(?:drive|docs)\\.google\\.com\\/\\w+\\/d\\/([0-9a-zA-Z\\-_]+)(?:\\/.*|)"
        },
        "valueToMatchOn": "={{ $json[\"Domain\"] }}",
        "columnToMatchOn": "Domain"
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 3
    },
    {
      "id": "f8bf8b70-6070-447b-af22-4d4e1ffe3539",
      "name": "Parse JSON",
      "type": "n8n-nodes-base.code",
      "position": [
        1300,
        460
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "// Add a new field called 'myNewField' to the\n// JSON of the item\n$input.item.json.value_proposition=JSON.parse($input.item.json.text).value_proposition\n\n$input.item.json.industry=JSON.parse($input.item.json.text).industry\n\n$input.item.json.market=JSON.parse($input.item.json.text).market\n\n$input.item.json.target_audience=JSON.parse($input.item.json.text).target_audience\n\nreturn $input.item;"
      },
      "typeVersion": 1
    },
    {
      "id": "2754c6e1-9cf6-47d4-ad97-0797ec9155df",
      "name": "Read Google Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        40,
        300
      ],
      "parameters": {
        "options": {},
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/13h8HPWKha5kZHDeKxAPQvQqAOonof5cgpxzh79tIQfY/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "url",
          "value": "https://docs.google.com/spreadsheets/d/18iZ59I0q2AeElqcEpyJECNlSv4M6iJll9PQzXQkqEUk/edit#gid=0",
          "__regex": "https:\\/\\/(?:drive|docs)\\.google\\.com\\/\\w+\\/d\\/([0-9a-zA-Z\\-_]+)(?:\\/.*|)"
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 3
    },
    {
      "id": "c2b93428-0dcc-4c02-bb81-496c12442284",
      "name": "Split In Batches",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        260,
        300
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "eccf1dc8-a0bb-40f6-9471-95eac8020b02",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        2060,
        560
      ],
      "parameters": {
        "unit": "seconds"
      },
      "typeVersion": 1
    }
  ],
  "connections": {
    "Wait": {
      "main": [
        [
          {
            "node": "Split In Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Merge": {
      "main": [
        [
          {
            "node": "Update Google Sheets",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI": {
      "main": [
        [
          {
            "node": "Parse JSON",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse JSON": {
      "main": [
        [
          {
            "node": "Merge",
            "type": "main",
            "index": 1
          }
        ]
      ]
    },
    "HTML Extract": {
      "main": [
        [
          {
            "node": "Clean Content",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "HTML Extract",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Clean Content": {
      "main": [
        [
          {
            "node": "OpenAI",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split In Batches": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          },
          {
            "node": "Merge",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Read Google Sheets": {
      "main": [
        [
          {
            "node": "Split In Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Update Google Sheets": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \"Execute Workflow\"": {
      "main": [
        [
          {
            "node": "Read Google Sheets",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

Enrich your company lists with OpenAI GPT-3 ↓

Source: https://n8n.io/workflows/1862/ — original creator credit. Request a take-down →

More Web Scraping workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Web Scraping

This template can be used to find the content gaps in your competitors' discourse: identifying the topics they are not yet connecting and giving you an opportunity to fill in this gap with your conten

HTTP Request, Html Extract, Google Docs +3
Web Scraping

Wait Code. Uses manualTrigger, httpRequest, htmlExtract, openAi. Event-driven trigger; 11 nodes.

HTTP Request, Html Extract, OpenAI +1
Web Scraping

Googletasks Http. Uses splitInBatches, httpRequest, stickyNote, html. Event-driven trigger; 17 nodes.

HTTP Request, Google Tasks, OpenAI +1
Web Scraping

This workflow automates SEO analysis by comparing your website with a competitor’s site. It reads input URLs from Google Sheets, scrapes structured SEO data from both sites, and expands into important

Google Sheets, HTTP Request, Google Gemini
Web Scraping

This comprehensive workflow automates the entire testimonial collection and publishing process, from submission to social media-ready content. It uses AI to enhance testimonials, generates beautiful b

OpenAI, N8N Nodes Htmlcsstoimage, HTTP Request +3