Extract Website Intelligence & Classify Ecommerce Urls with Gemini &…

Original n8n title: Extract Website Intelligence & Classify Ecommerce Urls with Gemini & Firecrawl to Google Sheets

By Dinakar Selvakumar (@jamesdinakar) on n8n.io

This n8n template automates website analysis and ecommerce URL classification using AI. It scrapes a website, extracts business intelligence, maps all internal pages, and categorises them into products, categories, or non-commerce pages. All outputs are saved in Google Sheets…

Category: Web Scraping · Trigger: Event · Nodes: 28 · Complexity: ★★★★☆ · AI nodes: yes
Integrations: Firecrawl (@mendable/n8n-nodes-firecrawl), Form Trigger, HTTP Request, HTML Extract, Google Sheets, Google Gemini

This workflow corresponds to n8n.io template #12132 — we link there as the canonical source.

This workflow follows the Form Trigger → Google Gemini recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.
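Before importing, it can help to know which nodes will ask for credentials. The sketch below is a rough pre-import check, not part of the template: it reads a workflow JSON string and lists the nodes whose type usually needs a credential. The set `CREDENTIAL_NODE_TYPES` is an assumption based on the integrations named on this page (Firecrawl, Google Sheets, Google Gemini), and the `excerpt` is a trimmed copy of four nodes from the JSON below. The Firecrawl node comes from a community package, so it probably also has to be installed on the n8n instance first.

```python
import json

# Assumption: these node types need a credential after import.
# The list is inferred from the integrations named on this page.
CREDENTIAL_NODE_TYPES = {
    "@mendable/n8n-nodes-firecrawl.firecrawl",
    "n8n-nodes-base.googleSheets",
    "@n8n/n8n-nodes-langchain.googleGemini",
}


def nodes_needing_credentials(workflow_json: str) -> list[str]:
    """Return the sorted names of nodes whose type is in CREDENTIAL_NODE_TYPES."""
    workflow = json.loads(workflow_json)
    return sorted(
        node["name"]
        for node in workflow.get("nodes", [])
        if node.get("type") in CREDENTIAL_NODE_TYPES
    )


# Trimmed excerpt of the workflow JSON (names and types only).
excerpt = json.dumps({
    "nodes": [
        {"name": "Form Submission", "type": "n8n-nodes-base.formTrigger"},
        {"name": "Map a website and get urls",
         "type": "@mendable/n8n-nodes-firecrawl.firecrawl"},
        {"name": "Update Domain Scraper Sheet",
         "type": "n8n-nodes-base.googleSheets"},
        {"name": "Company Info Agent",
         "type": "@n8n/n8n-nodes-langchain.googleGemini"},
    ]
})

needs_credentials = nodes_needing_credentials(excerpt)
for name in needs_credentials:
    print(name)
```

To check the whole workflow, pass the contents of the saved .json file instead of `excerpt`.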

{
  "id": "mGUpZROsuMn6IyJo",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "7 AI Website Intelligence & Ecommerce URL Classifier",
  "tags": [],
  "nodes": [
    {
      "id": "a2bb1dc9-0b7d-4103-b211-a14ba4fd6f21",
      "name": "Map a website and get urls",
      "type": "@mendable/n8n-nodes-firecrawl.firecrawl",
      "position": [
        496,
        1568
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "4520ea76-50f0-4c71-a8e1-7225819c55c6",
      "name": "Form Submission",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        -1056,
        1568
      ],
      "parameters": {},
      "typeVersion": 2.3
    },
    {
      "id": "d7fbb4f6-0bb8-4a18-9cdb-ec81f5ddd9f7",
      "name": "Scrape Website URL",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -832,
        1568
      ],
      "parameters": {},
      "retryOnFail": true,
      "typeVersion": 3,
      "continueOnFail": true
    },
    {
      "id": "d209b22e-cc82-417c-b6ad-bcdfa583aa7a",
      "name": "Extract HTML",
      "type": "n8n-nodes-base.htmlExtract",
      "position": [
        -608,
        1568
      ],
      "parameters": {},
      "typeVersion": 1,
      "continueOnFail": true
    },
    {
      "id": "c82bee59-ecb5-4175-8ede-5a02950b3795",
      "name": "Clean HTML Content",
      "type": "n8n-nodes-base.code",
      "position": [
        -384,
        1568
      ],
      "parameters": {},
      "executeOnce": false,
      "typeVersion": 1,
      "continueOnFail": true,
      "alwaysOutputData": false
    },
    {
      "id": "b94ebd34-a194-4b5b-8456-8bedf865fa93",
      "name": "Parse JSON Data",
      "type": "n8n-nodes-base.code",
      "position": [
        80,
        1568
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "45ec2485-712c-4945-9c46-25bfd02cdf3c",
      "name": "Update Domain Scraper Sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        272,
        1568
      ],
      "parameters": {},
      "typeVersion": 3
    },
    {
      "id": "b897efbc-9c1a-418d-9f87-ac950115f554",
      "name": "Parse URLs with MetaData",
      "type": "n8n-nodes-base.code",
      "position": [
        720,
        1568
      ],
      "parameters": {},
      "typeVersion": 2
    },
    {
      "id": "79a44d25-83d3-4218-82c5-5990e2b42215",
      "name": "Parse Array URLs",
      "type": "n8n-nodes-base.code",
      "position": [
        944,
        1568
      ],
      "parameters": {},
      "typeVersion": 2
    },
    {
      "id": "beeacff3-2fe6-48d6-a191-c98fd6d37c53",
      "name": "Split in Batches",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        1168,
        1568
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "a4c436f7-1b78-4036-b345-5b41b125bf56",
      "name": "Categorising AI Agent",
      "type": "@n8n/n8n-nodes-langchain.googleGemini",
      "position": [
        -1072,
        880
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "74d817e2-39f3-4a0f-b074-52024d5a0362",
      "name": "Company Info Agent",
      "type": "@n8n/n8n-nodes-langchain.googleGemini",
      "position": [
        -208,
        1568
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "9e17db73-71ef-4a6d-badf-e58209f7ea1a",
      "name": "Parse All URLs with categories",
      "type": "n8n-nodes-base.code",
      "position": [
        -720,
        880
      ],
      "parameters": {},
      "typeVersion": 2,
      "alwaysOutputData": false
    },
    {
      "id": "b71f540b-0dd9-4952-84b0-9f4a5ba2a533",
      "name": "Append Categories",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        -208,
        944
      ],
      "parameters": {},
      "typeVersion": 4.7
    },
    {
      "id": "c7723efc-60ff-46b9-b49f-7fde98f4514e",
      "name": "Append Products",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        -208,
        1136
      ],
      "parameters": {},
      "typeVersion": 4.7
    },
    {
      "id": "f37415ef-2f1e-449f-a577-dc13f8dbeb2a",
      "name": "Parse Others",
      "type": "n8n-nodes-base.code",
      "position": [
        -496,
        1072
      ],
      "parameters": {},
      "typeVersion": 2
    },
    {
      "id": "a7a559b9-885a-4895-9316-5bf2e5676c05",
      "name": "Parse Products",
      "type": "n8n-nodes-base.code",
      "position": [
        -496,
        880
      ],
      "parameters": {},
      "typeVersion": 2
    },
    {
      "id": "aac0eb29-af46-4097-a6b8-c4125a6f660f",
      "name": "Parse Categories",
      "type": "n8n-nodes-base.code",
      "position": [
        -496,
        688
      ],
      "parameters": {},
      "typeVersion": 2
    },
    {
      "id": "b66ef363-2bad-4fad-abfc-a61d8beaecd7",
      "name": "Append Others",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        -208,
        1328
      ],
      "parameters": {},
      "typeVersion": 4.7
    },
    {
      "id": "c1cb2159-7903-45c1-9bba-a97b9df2e159",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1856,
        752
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "40362f69-2d80-44c3-a0b7-9d3c31eb6bd5",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1056,
        1440
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "3b01b935-c240-47a2-9011-4545cf7cdfce",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -608,
        1760
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "fdca9bad-1fea-44ee-8dd3-6a5b2ec31ec9",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -208,
        1760
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "76231bc0-29df-4c46-a715-865d1660b30e",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        464,
        1760
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "f5fc0dd2-e6d0-4aaa-bc4d-553ecd39749b",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        928,
        1760
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "b58852cb-2474-4814-b06e-0bff28675bdd",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1104,
        720
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "739d4c7a-1f82-4d2a-acbc-311da04d7683",
      "name": "Sticky Note7",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -576,
        528
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    },
    {
      "id": "4eb4e8bd-7db2-4d58-b11b-3bc8f8c47576",
      "name": "Sticky Note8",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -256,
        720
      ],
      "parameters": {
        "content": ""
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "755a2fe2-1ce6-46f1-bf92-fbfc21788406",
  "connections": {
    "Extract HTML": {
      "main": [
        [
          {
            "node": "Clean HTML Content",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse Others": {
      "main": [
        [
          {
            "node": "Append Others",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Append Others": {
      "main": [
        [
          {
            "node": "Split in Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse Products": {
      "main": [
        [
          {
            "node": "Append Products",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Append Products": {
      "main": [
        [
          {
            "node": "Split in Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Form Submission": {
      "main": [
        [
          {
            "node": "Scrape Website URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse JSON Data": {
      "main": [
        [
          {
            "node": "Update Domain Scraper Sheet",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse Array URLs": {
      "main": [
        [
          {
            "node": "Split in Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse Categories": {
      "main": [
        [
          {
            "node": "Append Categories",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split in Batches": {
      "main": [
        [
          {
            "node": "Categorising AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Append Categories": {
      "main": [
        [
          {
            "node": "Split in Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Clean HTML Content": {
      "main": [
        [
          {
            "node": "Company Info Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Company Info Agent": {
      "main": [
        [
          {
            "node": "Parse JSON Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Scrape Website URL": {
      "main": [
        [
          {
            "node": "Extract HTML",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Categorising AI Agent": {
      "main": [
        [
          {
            "node": "Parse All URLs with categories",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse URLs with MetaData": {
      "main": [
        [
          {
            "node": "Parse Array URLs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Map a website and get urls": {
      "main": [
        [
          {
            "node": "Parse URLs with MetaData",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Update Domain Scraper Sheet": {
      "main": [
        [
          {
            "node": "Map a website and get urls",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse All URLs with categories": {
      "main": [
        [
          {
            "node": "Parse Categories",
            "type": "main",
            "index": 0
          },
          {
            "node": "Parse Products",
            "type": "main",
            "index": 0
          },
          {
            "node": "Parse Others",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
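The "connections" object above is easier to follow as a graph. The sketch below copies those connections into a plain dictionary (`EDGES`) and walks them breadth-first from "Form Submission". The helper name `walk` is made up for this illustration. It also records every connection that points back to a node already reached, which in this workflow is how the three "Append" nodes hand control back to "Split in Batches" for the next batch.

```python
from collections import deque

# Connections copied from the workflow JSON: source node -> target nodes.
EDGES = {
    "Form Submission": ["Scrape Website URL"],
    "Scrape Website URL": ["Extract HTML"],
    "Extract HTML": ["Clean HTML Content"],
    "Clean HTML Content": ["Company Info Agent"],
    "Company Info Agent": ["Parse JSON Data"],
    "Parse JSON Data": ["Update Domain Scraper Sheet"],
    "Update Domain Scraper Sheet": ["Map a website and get urls"],
    "Map a website and get urls": ["Parse URLs with MetaData"],
    "Parse URLs with MetaData": ["Parse Array URLs"],
    "Parse Array URLs": ["Split in Batches"],
    "Split in Batches": ["Categorising AI Agent"],
    "Categorising AI Agent": ["Parse All URLs with categories"],
    "Parse All URLs with categories": [
        "Parse Categories", "Parse Products", "Parse Others",
    ],
    "Parse Categories": ["Append Categories"],
    "Parse Products": ["Append Products"],
    "Parse Others": ["Append Others"],
    "Append Categories": ["Split in Batches"],
    "Append Products": ["Split in Batches"],
    "Append Others": ["Split in Batches"],
}


def walk(edges, start):
    """Breadth-first walk; returns visit order and edges to already-seen nodes."""
    order, seen, revisits = [], {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for target in edges.get(node, []):
            if target in seen:
                revisits.append((node, target))  # here: loop back to the batcher
            else:
                seen.add(target)
                queue.append(target)
    return order, revisits


order, revisits = walk(EDGES, "Form Submission")
print(" -> ".join(order))
print("Loop-back connections:", revisits)
```

The walk reaches 19 nodes; the other 9 of the 28 listed nodes are sticky notes, which have no connections.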
About this workflow

This n8n template automates website analysis and ecommerce URL classification using AI. It scrapes a website, extracts business intelligence, maps all internal pages, and categorises them into products, categories, or non-commerce pages. All outputs are saved in Google Sheets…
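The final step described above, sorting pages into products, categories, or non-commerce pages, matches the three branches in the workflow JSON ("Parse Categories", "Parse Products", "Parse Others"). The node parameters are empty in this listing, so the template's actual prompt and output schema are not visible. The sketch below only illustrates the routing idea under assumed names: the `url` and `label` fields, the label values, and the function `route_by_label` are all hypothetical.

```python
def route_by_label(classified):
    """Group classified URL records into product, category, and other buckets.

    Assumption: each record is a dict with 'url' and 'label' keys. The real
    field names used by the template are not shown on this page.
    """
    buckets = {"product": [], "category": [], "other": []}
    for record in classified:
        label = str(record.get("label", "")).strip().lower()
        # Anything that is not clearly a product or category goes to "other".
        key = label if label in ("product", "category") else "other"
        buckets[key].append(record["url"])
    return buckets


# Hypothetical classifier output for three pages of one site.
sample = [
    {"url": "https://example.com/products/blue-mug", "label": "Product"},
    {"url": "https://example.com/collections/mugs", "label": "category"},
    {"url": "https://example.com/about", "label": "non-commerce"},
]

routed = route_by_label(sample)
for bucket, urls in routed.items():
    print(bucket, urls)
```

Sending unknown labels to the "other" bucket is a design choice in this sketch: it keeps an unexpected model answer from silently dropping a URL.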

Source: https://n8n.io/workflows/12132/ (original creator credit).
