Analyze Up to 100 URLs for On-page SEO and Export Results to CSV

By Siddharth Gupta (@siddharth) on n8n.io

Analyze up to 100 URLs in one run and export key on-page SEO data to CSV automatically.

Category: Web Scraping | Trigger: Chat trigger | Nodes: 44 | Complexity: ★★★★★ | AI nodes: yes | Integrations: HTTP Request, Chat Trigger, Item Lists, Chat

This workflow corresponds to n8n.io template #15756 — we link there as the canonical source.

This workflow follows the Chat → Chat Trigger recipe pattern.
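Before any page is fetched, the workflow's "Extract and Normalize URLs" Code node (included in the JSON further down) splits the chat message on whitespace and commas, rejects tokens that fail a loose URL pattern, prepends `https://` where no scheme is given, and removes duplicates. The sketch below restates that logic as plain Node.js so it can be tried outside n8n. The names `parseUrlInput`, `URL_PATTERN`, and `MAX_URLS` are illustrative and do not appear in the workflow. In the workflow itself the 100-URL check is done by a separate IF node rather than inside the Code node.

```javascript
// Standalone sketch of the input stage: split, validate, normalize, de-duplicate.
// parseUrlInput, URL_PATTERN and MAX_URLS are hypothetical names for illustration.
const MAX_URLS = 100;

// The same loose pattern used by the "Extract and Normalize URLs" node.
const URL_PATTERN = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/i;

function parseUrlInput(chatInput) {
  // Split on any run of whitespace or commas and drop empty tokens.
  const tokens = String(chatInput || '').split(/[\s,]+/).filter(Boolean);
  const validUrls = [];
  const skipped = [];

  for (const token of tokens) {
    if (!URL_PATTERN.test(token)) {
      skipped.push(token);
      continue;
    }
    // Mirrors the node: the whole URL is lowercased, including the path.
    let clean = token.toLowerCase();
    if (!clean.startsWith('http://') && !clean.startsWith('https://')) {
      clean = 'https://' + clean;
    }
    if (!validUrls.includes(clean)) {
      validUrls.push(clean);
    }
  }

  return { validUrls, skipped, withinLimit: validUrls.length <= MAX_URLS };
}

const result = parseUrlInput(
  'example.com, https://example.com\nnot_a_url https://example.org/Page'
);
console.log(result);
```

Because the node lowercases the full URL, two inputs that differ only in path casing collapse into one entry. That may matter for sites with case-sensitive paths.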

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.

{
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "nodes": [
    {
      "id": "b02b270d-4d79-4881-ba3e-2973c8e88710",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -320,
        -48
      ],
      "parameters": {
        "width": 736,
        "height": 464,
        "content": "# Workflow Technical Overview\n- **Purpose:** Automated bulk extraction of 27+ On-Page SEO metrics and HTTP header metadata.\n- **Core Constraint:** Hard-coded limit of 100 URLs per chat trigger to ensure stability.\n- **Logic Flow:** Sequential double-loop architecture for validation and extraction.\n\n## Who's it for\nTechnical SEO analysts requiring automated bulk on-page data extraction without external API dependencies.\n\n## How to use\n1. Open the chat trigger interface.\n2. Input a list of target URLs (maximum 100).\n3. Monitor chat notifications for validation alerts and processing status.\n4. Click the output link to download the generated CSV file.\n\n## Setup Requirements\n- Active n8n instance.\n- Outbound network access for HTTP requests."
      },
      "typeVersion": 1
    },
    {
      "id": "d3513989-a85b-42e9-a903-bc9bbbf00b4a",
      "name": "Loop Through URL Batches",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        2480,
        368
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 3
    },
    {
      "id": "39883d32-02bf-4a4c-b455-4516fdbfac8a",
      "name": "Fetch Page HTML Payload",
      "type": "n8n-nodes-base.httpRequest",
      "onError": "continueErrorOutput",
      "position": [
        2896,
        384
      ],
      "parameters": {
        "url": "={{ $json.url }}",
        "options": {
          "redirect": {
            "redirect": {
              "followRedirects": false
            }
          },
          "response": {
            "response": {
              "fullResponse": true
            }
          }
        },
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "User-Agent",
              "value": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0"
            }
          ]
        }
      },
      "executeOnce": false,
      "typeVersion": 4.1
    },
    {
      "id": "be5e7bed-7702-463f-85c2-3d3b963cd1b2",
      "name": "Verify HTTP 200 Status",
      "type": "n8n-nodes-base.if",
      "position": [
        3584,
        368
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "f6c29bd9-b913-46c2-af81-e62a46717b0e",
              "operator": {
                "type": "number",
                "operation": "equals"
              },
              "leftValue": "={{ $json.statusCode }}",
              "rightValue": 200
            }
          ]
        }
      },
      "typeVersion": 2.3,
      "alwaysOutputData": false
    },
    {
      "id": "ee2ae600-d0a0-432b-ac87-49e54017afef",
      "name": "Inspect HTML Content Structure",
      "type": "n8n-nodes-base.code",
      "position": [
        3968,
        352
      ],
      "parameters": {
        "jsCode": "return items.map(item => {\n  const html = String(item.json.body || item.json.data || item.json.html || '').trim();\n\n  const hasHtmlLike = /<(html|head|title|body|meta|link|h1)\\b/i.test(html);\n  const hasSeoSignals =\n    /<title\\b[^>]*>[\\s\\S]*?<\\/title>/i.test(html) ||\n    /<meta\\b[^>]+name=[\"']description[\"'][^>]*content=/i.test(html) ||\n    /<link\\b[^>]+rel=[\"']canonical[\"'][^>]*href=/i.test(html) ||\n    /<h1\\b[^>]*>[\\s\\S]*?<\\/h1>/i.test(html);\n\n  const blockedPattern =\n    /javascript is disabled|enable javascript to continue|verify you are human|captcha|access denied|request blocked|temporarily unavailable|sorry, you have been blocked|attention required/i.test(html);\n\n  const clearlyNotHtml =\n    !html ||\n    (!hasHtmlLike && !/<!doctype html/i.test(html));\n\n  let reason = 'ok';\n\n  if (!html) reason = 'empty-body';\n  else if (blockedPattern) reason = 'blocked-or-interstitial';\n  else if (clearlyNotHtml) reason = 'not-html';\n  else if (!hasSeoSignals && html.length < 150) reason = 'thin-non-seo-html';\n\n  item.json.html_compliant = reason === 'ok';\n  item.json.html_compliance_reason = reason;\n\n  return item;\n});"
      },
      "executeOnce": false,
      "typeVersion": 2
    },
    {
      "id": "5d93fec7-9650-4443-8a43-ad10f578fd4d",
      "name": "Validate HTML Compliance Status",
      "type": "n8n-nodes-base.if",
      "position": [
        4192,
        352
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "9724955d-7bee-4eca-a836-e8cb0ee7ecfe",
              "operator": {
                "type": "boolean",
                "operation": "true",
                "singleValue": true
              },
              "leftValue": "={{ $json.html_compliant === true }}",
              "rightValue": "={{   $json.body &&   /<html|<head|<title|<body/i.test($json.body) &&   !/javascript is disabled|enable javascript|not a robot|captcha|access denied|forbidden|temporarily unavailable|verify you are human|cloudflare|attention required/i.test($json.body) }}"
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "cbdaaf6e-65d9-4a0b-85c0-8faa815da49f",
      "name": "Receive Chat Message Input",
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "position": [
        512,
        208
      ],
      "parameters": {
        "options": {
          "responseMode": "responseNodes"
        }
      },
      "typeVersion": 1.4
    },
    {
      "id": "fec05fe9-1eb8-4abf-9d29-ca179c5dadd2",
      "name": "Extract and Normalize URLs",
      "type": "n8n-nodes-base.code",
      "position": [
        848,
        208
      ],
      "parameters": {
        "jsCode": "const rawInput = $input.first().json.chatInput || \"\";\nconst rawUrls = rawInput.split(/[\\s,]+/).filter(Boolean);\n\nlet validUrls = [];\nlet skipped = [];\nconst urlRegex = /^(https?:\\/\\/)?([\\da-z\\.-]+)\\.([a-z\\.]{2,6})([\\/\\w \\.-]*)*\\/?$/i;\n\nfor (let url of rawUrls) {\n  if (!urlRegex.test(url)) {\n    skipped.push(url);\n    continue;\n  }\n  let clean = url.toLowerCase();\n  if (!clean.startsWith('http://') && !clean.startsWith('https://')) {\n    clean = 'https://' + clean;\n  }\n  if (!validUrls.includes(clean)) {\n    validUrls.push(clean);\n  }\n}\n\nlet warning = \"\";\nif (validUrls.length > 10000) {\n  warning = \"\u26a0\ufe0f Only the first 10,000 URLs will be processed. Remaining URLs have been skipped.\";\n  validUrls = validUrls.slice(0, 10000);\n}\n\nreturn [{\n  json: {\n    valid_urls: validUrls,\n    skipped_input: skipped,\n    warning: warning,\n    total_valid: validUrls.length\n  }\n}];"
      },
      "typeVersion": 2
    },
    {
      "id": "ced5ec21-a5ba-473a-8ccd-786504eb0480",
      "name": "Check Empty URL List",
      "type": "n8n-nodes-base.if",
      "position": [
        1104,
        208
      ],
      "parameters": {
        "conditions": {
          "number": [
            {
              "value1": "={{ $json.total_valid }}",
              "operation": "equal"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "87aa1a6e-2eaa-4b54-96f1-769c6740549d",
      "name": "Verify Maximum URL Limit",
      "type": "n8n-nodes-base.if",
      "position": [
        1152,
        384
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "9eb3b4a9-e3ff-4dd6-9c18-bcfb55b7e4ce",
              "operator": {
                "type": "number",
                "operation": "lte"
              },
              "leftValue": "={{ $json.total_valid }}",
              "rightValue": 100
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "e3b349af-4f94-4a3e-85ab-b6a9051c9900",
      "name": "Format URL JSON Objects",
      "type": "n8n-nodes-base.code",
      "position": [
        1520,
        368
      ],
      "parameters": {
        "jsCode": "const urls = $json.valid_urls || [];\n\nreturn [\n  {\n    json: {\n      valid_urls: urls.map((url) => ({\n        url,\n        url_index: url\n      })),\n      skipped_input: $json.skipped_input ?? [],\n      warning: $json.warning ?? '',\n      total_valid: $json.total_valid ?? urls.length\n    }\n  }\n];"
      },
      "typeVersion": 2
    },
    {
      "id": "e0c3e125-5ea5-4c03-aea6-45189f90cc08",
      "name": "Split Into Individual Items",
      "type": "n8n-nodes-base.itemLists",
      "position": [
        1888,
        368
      ],
      "parameters": {
        "options": {
          "destinationFieldName": ""
        },
        "fieldToSplitOut": "valid_urls"
      },
      "typeVersion": 3
    },
    {
      "id": "215d7870-4713-4d7e-9b13-658b48da74bc",
      "name": "Check For Skipped Inputs",
      "type": "n8n-nodes-base.if",
      "position": [
        1488,
        224
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "02a29791-0c3d-484e-8dbf-f895191453ca",
              "operator": {
                "type": "array",
                "operation": "notEmpty",
                "singleValue": true
              },
              "leftValue": "={{ $json.skipped_input }}",
              "rightValue": ""
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "bc285d42-799d-4896-9c21-13bde8d956f5",
      "name": "Send Empty Input Error",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        2112,
        192
      ],
      "parameters": {
        "message": "\u274c No valid URLs were found in your input. Please enter at least one properly formatted URL (e.g. https://seobatter.com) and try again.",
        "options": {},
        "waitUserReply": false
      },
      "typeVersion": 1
    },
    {
      "id": "3245e1f6-03b5-48d0-97f4-ae5cd3d9ee87",
      "name": "Send Maximum Limit Error",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        1392,
        528
      ],
      "parameters": {
        "message": "More than 100 valid URLs were found. To prevent system abuse, please enter 100 or fewer URLs.",
        "options": {},
        "waitUserReply": false
      },
      "typeVersion": 1
    },
    {
      "id": "bdff6f86-45f0-4045-b253-88c5bff32722",
      "name": "Send Skipped Input Warning",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        1808,
        208
      ],
      "parameters": {
        "message": "=During initial validation, the following input was excluded from the analysis:\n{{ $json.skipped_input.join('\\n') }}",
        "options": {},
        "waitUserReply": false
      },
      "typeVersion": 1
    },
    {
      "id": "4aa82248-78c7-4622-ba42-de25dfcecd12",
      "name": "Remove Duplicate Records",
      "type": "n8n-nodes-base.removeDuplicates",
      "position": [
        2416,
        -112
      ],
      "parameters": {
        "compare": "selectedFields",
        "options": {},
        "fieldsToCompare": "address"
      },
      "typeVersion": 2
    },
    {
      "id": "305ff240-e54d-46dc-b6ed-cbb348cdec1b",
      "name": "Send Progress Update",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        2624,
        -112
      ],
      "parameters": {
        "message": "Processing data from the valid URLs. Please wait for the execution to complete.",
        "options": {},
        "waitUserReply": false
      },
      "typeVersion": 1
    },
    {
      "id": "de866c70-ddfc-462d-86fe-e57de8bcff86",
      "name": "Verify Processed URLs",
      "type": "n8n-nodes-base.if",
      "position": [
        2880,
        -112
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "1477eb10-5580-4e50-b69b-c20db5b24835",
              "operator": {
                "type": "string",
                "operation": "notEmpty",
                "singleValue": true
              },
              "leftValue": "={{ $json.data }}",
              "rightValue": ""
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "ebc293f8-a266-4dc3-8564-f52af846286f",
      "name": "Loop Tag Extraction",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        3216,
        -128
      ],
      "parameters": {
        "options": {}
      },
      "executeOnce": false,
      "typeVersion": 3
    },
    {
      "id": "191a564d-d639-4a58-b7be-aa1d0ae4c7e0",
      "name": "Aggregate Loop Data",
      "type": "n8n-nodes-base.merge",
      "position": [
        3744,
        -16
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "combineBy": "combineByPosition"
      },
      "typeVersion": 3.2
    },
    {
      "id": "d9b137a3-99bb-4e30-9ea8-36604eab2488",
      "name": "Format Final Dataset",
      "type": "n8n-nodes-base.set",
      "position": [
        4080,
        -144
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "7dc8b263-bc95-4d2a-b363-a9121dd97bf1",
              "name": "URL",
              "type": "string",
              "value": "={{ Array.isArray($json.address) ? $json.address.join('\\n') : $json.address }}"
            },
            {
              "id": "90b396fc-6f73-48d1-ace5-5ae7e0441032",
              "name": "HTTP Status Code",
              "type": "number",
              "value": "={{ Array.isArray($json.statusCode) ? $json.statusCode.join('\\n') : $json.statusCode }}"
            },
            {
              "id": "41f111b1-3804-4716-b8d8-35e5130233ee",
              "name": "Title",
              "type": "string",
              "value": "={{ Array.isArray($json.title) ? $json.title.join('\\n') : $json.title }}"
            },
            {
              "id": "961e22e2-c48a-4fe6-bd6b-28b9b94e7bc8",
              "name": "Meta Description",
              "type": "string",
              "value": "={{ Array.isArray($json.meta_description) ? $json.meta_description.join('\\n') : $json.meta_description }}"
            },
            {
              "id": "23f4bfc2-f8fa-4cbc-a2f7-4e16088f8ef8",
              "name": "H1",
              "type": "string",
              "value": "={{ Array.isArray($json.h1) ? $json.h1.join('\\n') : $json.h1 }}"
            },
            {
              "id": "ccec095c-2e91-44d8-a4ff-f95d2fae68b5",
              "name": "H2",
              "type": "string",
              "value": "={{ Array.isArray($json.h2) ? $json.h2.join('\\n') : $json.h2 }}"
            },
            {
              "id": "2190d715-1828-48ed-9e69-b64def1d631f",
              "name": "Meta Robots",
              "type": "string",
              "value": "={{ Array.isArray($json.meta_robots) ? $json.meta_robots.join('\\n') : $json.meta_robots }}"
            },
            {
              "id": "e98741e2-eb06-4346-b4ed-f3dcf4f6073e",
              "name": "x_robots_tag",
              "type": "string",
              "value": "={{ Array.isArray($json.x_robots_tag) ? $json.x_robots_tag.join('\\n') : $json.x_robots_tag }}"
            },
            {
              "id": "36812f35-cf7a-4235-9ac3-90ece700b474",
              "name": "meta_charset",
              "type": "string",
              "value": "={{ Array.isArray($json.meta_charset) ? $json.meta_charset.join('\\n') : $json.meta_charset }}"
            },
            {
              "id": "dc2527f1-ce73-4b78-b279-14b5d9bdf08d",
              "name": "meta_viewport",
              "type": "string",
              "value": "={{ Array.isArray($json.meta_viewport) ? $json.meta_viewport.join('\\n') : $json.meta_viewport }}"
            },
            {
              "id": "68d10bb4-a2bc-409a-b406-21a376d51e98",
              "name": "HTML Lang",
              "type": "string",
              "value": "={{ Array.isArray($json.html_lang) ? $json.html_lang.join('\\n') : $json.html_lang }}"
            },
            {
              "id": "d09d61ac-5d58-4f47-8288-c9201ca96d90",
              "name": "Canonical",
              "type": "string",
              "value": "={{ Array.isArray($json.canonical) ? $json.canonical.join('\\n') : $json.canonical }}"
            },
            {
              "id": "659e3283-0ff4-4d3f-9a58-df69db5985cc",
              "name": "Hreflang",
              "type": "string",
              "value": "={{ Array.isArray($json.hreflang) ? $json.hreflang.join('\\n') : $json.hreflang }}"
            },
            {
              "id": "6602787c-6399-4ffc-a845-0960df1b6076",
              "name": "Hreflang URL",
              "type": "string",
              "value": "={{ Array.isArray($json.hreflang_url) ? $json.hreflang_url.join('\\n') : $json.hreflang_url }}"
            },
            {
              "id": "d7b4e6b2-de5e-434a-9783-4aef2c5a082f",
              "name": "OG Title",
              "type": "string",
              "value": "={{ Array.isArray($json.og_title) ? $json.og_title.join('\\n') : $json.og_title }}"
            },
            {
              "id": "84d41988-7261-4ce8-869c-94c01b23b42b",
              "name": "OG Description",
              "type": "string",
              "value": "={{ Array.isArray($json.og_description) ? $json.og_description.join('\\n') : $json.og_description }}"
            },
            {
              "id": "62a48b0d-ee18-40c3-9818-686277b0419a",
              "name": "OG URL",
              "type": "string",
              "value": "={{ Array.isArray($json.og_url) ? $json.og_url.join('\\n') : $json.og_url }}"
            },
            {
              "id": "83bf1fb7-9e3e-4118-a9ed-f44154afe624",
              "name": "OG Type",
              "type": "string",
              "value": "={{ Array.isArray($json.og_type) ? $json.og_type.join('\\n') : $json.og_type }}"
            },
            {
              "id": "c8606b2d-6330-49c8-990c-0da8a5eeffd5",
              "name": "OG Site Name",
              "type": "string",
              "value": "={{ Array.isArray($json.og_site_name) ? $json.og_site_name.join('\\n') : $json.og_site_name }}"
            },
            {
              "id": "29b6e219-b20e-4d66-b600-1f6fd1a8af0a",
              "name": "OG Locale",
              "type": "string",
              "value": "={{ Array.isArray($json.og_locale) ? $json.og_locale.join('\\n') : $json.og_locale }}"
            },
            {
              "id": "e2712bb6-c6c5-4609-8226-c2e02ed82ed1",
              "name": "OG Image",
              "type": "string",
              "value": "={{ Array.isArray($json.og_image) ? $json.og_image.join('\\n') : $json.og_image }}"
            },
            {
              "id": "18e128c6-4b72-4879-bf9d-7d0dc78b58b7",
              "name": "Twitter Card",
              "type": "string",
              "value": "={{ Array.isArray($json.twitter_card) ? $json.twitter_card.join('\\n') : $json.twitter_card }}"
            },
            {
              "id": "c8fc32ce-71ff-4823-93ef-4585fe3e81ad",
              "name": "Twitter Title",
              "type": "string",
              "value": "={{ Array.isArray($json.twitter_title) ? $json.twitter_title.join('\\n') : $json.twitter_title }}"
            },
            {
              "id": "4e27e6ac-0d32-41de-93a2-3a7a37b7c74a",
              "name": "Twitter Description",
              "type": "string",
              "value": "={{ Array.isArray($json.twitter_description) ? $json.twitter_description.join('\\n') : $json.twitter_description }}"
            },
            {
              "id": "48dcc1ce-68e4-4863-8a22-de31a377c0fc",
              "name": "Twitter Image",
              "type": "string",
              "value": "={{ Array.isArray($json.twitter_image) ? $json.twitter_image.join('\\n') : $json.twitter_image }}"
            },
            {
              "id": "4062a4b8-70e7-4994-b689-8687532932d3",
              "name": "Twitter Site",
              "type": "string",
              "value": "={{ Array.isArray($json.twitter_site) ? $json.twitter_site.join('\\n') : $json.twitter_site }}"
            },
            {
              "id": "fbfa34fc-2c0a-4344-a4b5-bf83e3073c2d",
              "name": "Twitter Creator",
              "type": "string",
              "value": "={{ Array.isArray($json.twitter_creator) ? $json.twitter_creator.join('\\n') : $json.twitter_creator }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "f0b79ba0-8603-4732-a388-efcc598977bb",
      "name": "Generate CSV File",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        4288,
        -144
      ],
      "parameters": {
        "options": {}
      },
      "executeOnce": false,
      "typeVersion": 1.1
    },
    {
      "id": "cfa3fa90-f41f-49fd-843a-dadfac6cc0f5",
      "name": "Upload to Server",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        4480,
        -144
      ],
      "parameters": {
        "url": "https://uguu.se/upload",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "contentType": "multipart-form-data",
        "bodyParameters": {
          "parameters": [
            {
              "name": "files[]",
              "parameterType": "formBinaryData",
              "inputDataFieldName": "data"
            }
          ]
        }
      },
      "executeOnce": true,
      "typeVersion": 4,
      "continueOnFail": true
    },
    {
      "id": "689fe512-275e-40ee-ac3d-deeaf7d0a1d9",
      "name": "Extract Download URL",
      "type": "n8n-nodes-base.code",
      "position": [
        4720,
        -144
      ],
      "parameters": {
        "jsCode": "const url = $json.files?.[0]?.url;\nreturn [\n  {\n    json: {\n      message: `<${url}>`\n    }\n  }\n];"
      },
      "typeVersion": 2
    },
    {
      "id": "1f27a501-b0ef-49c3-a190-4195374d4f11",
      "name": "Send Download Link",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        4896,
        -144
      ],
      "parameters": {
        "message": "=The process is complete. Download your SEO On-page audit results here:\n{{ $json.message }}\n",
        "options": {},
        "waitUserReply": false
      },
      "executeOnce": true,
      "typeVersion": 1
    },
    {
      "id": "b37b8114-f866-4dba-8d80-94d251375042",
      "name": "Extract Header Data",
      "type": "n8n-nodes-base.set",
      "position": [
        3568,
        -112
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "a2c79cdd-16a9-4c00-a980-6e98019a8d62",
              "name": "address",
              "type": "string",
              "value": "={{ $json.address }}"
            },
            {
              "id": "3458f23b-c615-499d-b4d0-35656377bfda",
              "name": "x_robots_tag",
              "type": "string",
              "value": "={{$json.headers['x-robots-tag'] || $json.headers['X-Robots-Tag'] || ''}}"
            },
            {
              "id": "923a9675-048c-4e74-9186-31d909d48515",
              "name": "statusCode",
              "type": "number",
              "value": "={{ $json.statusCode }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "86eff2e8-b349-4e9a-85b2-017dfdfaf80a",
      "name": "Send Fetch Error",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        3152,
        496
      ],
      "parameters": {
        "message": "=URL Rejected: {{ $('Split Into Individual Items').item.json.url }}",
        "options": {},
        "waitUserReply": false
      },
      "executeOnce": false,
      "typeVersion": 1
    },
    {
      "id": "350b13db-1c63-4fd0-adbc-851a4755eac2",
      "name": "Map Failed Request Data",
      "type": "n8n-nodes-base.set",
      "position": [
        3376,
        496
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "97c84cac-a76e-40c7-98e8-a6d22964b522",
              "name": "address",
              "type": "string",
              "value": "={{ $json.url }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "fdbb82ed-a9e8-4a4b-8f13-920cfd8583b7",
      "name": "Send Status Code Error",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        3776,
        544
      ],
      "parameters": {
        "message": "=URL Rejected: {{ $('Split Into Individual Items').item.json.url }}",
        "options": {},
        "waitUserReply": false
      },
      "executeOnce": false,
      "typeVersion": 1
    },
    {
      "id": "a00e6911-89d3-40e5-ac0b-925c72946195",
      "name": "Map Status Error Data",
      "type": "n8n-nodes-base.set",
      "position": [
        3968,
        544
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "609c5844-7269-4d86-9d96-8e133c5c15cb",
              "name": "address",
              "type": "string",
              "value": "={{ $('Split Into Individual Items').item.json.url }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "938a1864-5936-4fd4-afa5-1aec47c4734d",
      "name": "Map Compliance Error Data",
      "type": "n8n-nodes-base.set",
      "position": [
        4560,
        576
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "7f0f48f2-2bd0-455b-b68e-3248e021920b",
              "name": "address",
              "type": "string",
              "value": "={{ $('Split Into Individual Items').item.json.url }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "ea14f997-5730-446d-bcca-60e506b12a0d",
      "name": "Format Valid Page Data",
      "type": "n8n-nodes-base.code",
      "position": [
        4896,
        336
      ],
      "parameters": {
        "jsCode": "return items.map(item => ({\n  json: {\n    address: $('Split Into Individual Items').item.json.url,\n    data: item.json.data,\n    headers: item.json.headers,\n    statusCode: item.json.statusCode\n  }\n}));"
      },
      "executeOnce": false,
      "typeVersion": 2
    },
    {
      "id": "c5bde9eb-c205-497a-889d-d77f864fe5d3",
      "name": "Send Success URL Notification",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        5008,
        624
      ],
      "parameters": {
        "message": "=URL Processed: {{ $('Split Into Individual Items').item.json.url }}",
        "options": {},
        "waitUserReply": false
      },
      "executeOnce": false,
      "typeVersion": 1
    },
    {
      "id": "1c991b45-cfd2-4cc2-8a1d-78f2fdd33285",
      "name": "Send Error URL",
      "type": "@n8n/n8n-nodes-langchain.chat",
      "position": [
        4400,
        576
      ],
      "parameters": {
        "message": "=URL Rejected: {{ $('Split Into Individual Items').item.json.url }}",
        "options": {},
        "waitUserReply": false
      },
      "executeOnce": false,
      "typeVersion": 1
    },
    {
      "id": "15c94976-d704-4008-9596-b6655d955cba",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        480,
        80
      ],
      "parameters": {
        "color": 7,
        "width": 1856,
        "height": 608,
        "content": "## 1\ufe0f\u20e3 Phase 1: Input & Validation\nProcesses chat input to extract URLs and enforces the 100-URL limit. Formats the data and splits it into independent items for loop processing."
      },
      "typeVersion": 1
    },
    {
      "id": "8f3cfcaf-11ea-4572-8ea0-31639d3c05aa",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2400,
        224
      ],
      "parameters": {
        "color": 7,
        "width": 2800,
        "height": 704,
        "content": "## 2\ufe0f\u20e3 Phase 2: Fetching & Compliance\nExecutes HTTP GET requests to retrieve raw HTML and response headers. Evaluates status codes and DOM structure to filter bot-protection triggers and map failed requests."
      },
      "typeVersion": 1
    },
    {
      "id": "46d3f7de-9e72-48f9-9bb5-a9795489c2fe",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2352,
        -288
      ],
      "parameters": {
        "color": 7,
        "width": 1616,
        "height": 496,
        "content": "## 3\ufe0f\u20e3 Phase 3: Extraction & Preservation\nParses HTML tags using CSS selectors and maps HTTP response headers. Aggregates the processed items into a single array."
      },
      "typeVersion": 1
    },
    {
      "id": "b3b5072d-808d-4231-8901-825f1618984e",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        4016,
        -272
      ],
      "parameters": {
        "color": 7,
        "width": 1072,
        "height": 320,
        "content": "## 4\ufe0f\u20e3 Phase 4: Compilation & Delivery\nFlattens the dataset to generate a binary CSV file. Uploads the file to an external server and delivers the download URL to the chat interface."
      },
      "typeVersion": 1
    },
    {
      "id": "272bfc44-9634-4156-8bb6-0c478a07f083",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -320,
        448
      ],
      "parameters": {
        "color": 3,
        "width": 736,
        "height": 304,
        "content": "## Workflow Limitations\n\n* **Execution Time:** Processing a 100-URL batch requires approximately 25-30 minutes. This duration results from sequential processing designed to avoid rate limits and manage network timeouts.\n* **Sequential Processing:** The workflow uses a batch size of one. This prevents parallel execution and increases the total time required for large lists.\n* **Security Rejections:** Target domains utilizing bot protection (such as Cloudflare or CAPTCHAs) may reject requests despite the compliance audit logic.\n* **Temporary Hosting:** CSV files are uploaded to an external service (Uguu.se). These download links are temporary and the files will be deleted according to the provider's retention policy.\n* **Resource Dependence:** Stability and speed are determined by the local hardware and network connection of the host machine.\n* **Input Ceiling:** There is a hard-coded limit of 100 URLs per execution to prevent system instability."
      },
      "typeVersion": 1
    },
    {
      "id": "82054043-0737-4afb-8a1f-76c0b238fd7e",
      "name": "Extract SEO Tags",
      "type": "n8n-nodes-base.html",
      "position": [
        3424,
        0
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "extractionValues": {
          "values": [
            {
              "key": "title",
              "cssSelector": "title"
            },
            {
              "key": "meta_description",
              "attribute": "content",
              "cssSelector": "meta[name=\"description\"]",
              "returnValue": "attribute"
            },
            {
              "key": "meta_robots",
              "attribute": "content",
              "cssSelector": "meta[name=\"robots\"]",
              "returnValue": "attribute"
            },
            {
              "key": "meta_charset",
              "attribute": "charset",
              "cssSelector": "meta[charset]",
              "returnValue": "attribute"
            },
            {
              "key": "meta_viewport",
              "attribute": "content",
              "cssSelector": "meta[name=\"viewport\"]",
              "returnValue": "attribute"
            },
            {
              "key": "html_lang",
              "attribute": "lang",
              "cssSelector": "html",
              "returnValue": "attribute"
            },
            {
              "key": "canonical",
              "attribute": "href",
              "cssSelector": "link[rel=\"canonical\"]",
              "returnValue": "attribute"
            },
            {
              "key": "hreflang",
              "attribute": "hreflang",
              "cssSelector": "link[rel=\"alternate\"][hreflang]",
              "returnArray": true,
              "returnValue": "attribute"
            },
            {
              "key": "hreflang_url",
              "attribute": "href",
              "cssSelector": "link[rel=\"alternate\"][hreflang]",
              "returnArray": true,
              "returnValue": "attribute"
            },
            {
              "key": "og_title",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:title\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_description",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:description\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_url",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:url\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_type",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:type\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_site_name",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:site_name\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_locale",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:locale\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_image",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:image\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_image_alt",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:image:alt\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_image_width",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:image:width\"]",
              "returnValue": "attribute"
            },
            {
              "key": "og_image_height",
              "attribute": "content",
              "cssSelector": "meta[property=\"og:image:height\"]",
              "returnValue": "attribute"
            },
            {
              "key": "twitter_card",
              "attribute": "content",
              "cssSelector": "meta[name=\"twitter:card\"]",
              "returnValue": "attribute"
            },
            {
              "key": "twitter_title",
              "attribute": "content",
              "cssSelector": "meta[name=\"twitter:title\"]",
              "returnValue": "attribute"
            },
            {
              "key": "twitter_description",
              "attribute": "content",
              "cssSelector": "meta[name=\"twitter:description\"]",
              "returnValue": "attribute"
            },
            {
              "key": "twitter_image",
              "attribute": "content",
              "cssSelector": "meta[name=\"twitter:image\"]",
              "returnValue": "attribute"
            },
            {
              "key": "twitter_site",
              "attribute": "content",
              "cssSelector": "meta[name=\"twitter:site\"]",
              "returnValue": "attribute"
            },
            {
              "key": "twitter_creator",
              "attribute": "content",
              "cssSelector": "meta[name=\"twitter:creator\"]",
              "returnValue": "attribute"
            },
            {
              "key": "h1",
              "cssSelector": "h1",
              "returnArray": true
            },
            {
              "key": "h2",
              "cssSelector": "h2",
              "returnArray": true
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "187984ae-d4cc-4519-9395-83d5903100e1",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2448,
        352
      ],
      "parameters": {
        "width": 592,
        "height": 256,
        "content": "\n\n\n\n\n\n\n\n\n\n\n\nManages single-batch iterations for HTTP GET requests to prevent rate-limit blocks and server timeouts during bulk URL fetching."
      },
      "typeVersion": 1
    },
    {
      "id": "84ab50e6-d13b-45bc-8691-fcaa33ea7c8b",
      "name": "Sticky Note7",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        3536,
        320
      ],
      "parameters": {
        "width": 1232,
        "height": 576,
        "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n**Response Routing & Error Mapping**\nEvaluates HTTP status codes and HTML DOM structure to filter bot-protection walls. Routes and maps failed requests."
      },
      "typeVersion": 1
    },
    {
      "id": "e7467fd2-3725-4dcc-9436-50efe2e91562",
      "name": "Sticky Note8",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        3152,
        -192
      ],
      "parameters": {
        "width": 752,
        "height": 368,
        "content": "**DOM Extraction & Data Aggregation**\nIterates over valid HTML payloads to parse CSS selectors, maps preserved HTTP headers, and aggregates items."
      },
      "typeVersion": 1
    }
  ],
  "connections": {
    "Send Error URL": {
      "main": [
        [
          {
            "node": "Map Compliance Error Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract SEO Tags": {
      "main": [
        [
          {
            "node": "Aggregate Loop Data",
            "type": "main",
            "index": 1
          }
        ]
      ]
    },
    "Send Fetch Error": {
      "main": [
        [
          {
            "node": "Map Failed Request Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Upload to Server": {
      "main": [
        [
          {
            "node": "Extract Download URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Generate CSV File": {
      "main": [
        [
          {
            "node": "Upload to Server",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Aggregate Loop Data": {
      "main": [
        [
          {
            "node": "Loop Tag Extraction",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Header Data": {
      "main": [
        [
          {
            "node": "Aggregate Loop Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Loop Tag Extraction": {
      "main": [
        [
          {
            "node": "Format Final Dataset",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Extract Header Data",
            "type": "main",
            "index": 0
          },
          {
            "node": "Extract SEO Tags",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Empty URL List": {
      "main": [
        [
          {
            "node": "Send Empty Input Error",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Check For Skipped Inputs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Download URL": {
      "main": [
        [
          {
            "node": "Send Download Link",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Format Final Dataset": {
      "main": [
        [
          {
            "node": "Generate CSV File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Send Progress Update": {
      "main": [
        [
          {
            "node": "Verify Processed URLs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Map Status Error Data": {
      "main": [
        [
          {
            "node": "Loop Through URL Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Verify Processed URLs": {
      "main": [
        [
          {
            "node": "Loop Tag Extraction",
            "type": "main",
            "index": 0
          }
        ],
        []
      ]
    },
    "Format Valid Page Data": {
      "main": [
        [
          {
            "node": "Send Success URL Notification",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Send Status Code Error": {
      "main": [
        [
          {
            "node": "Map Status Error Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Verify HTTP 200 Status": {
      "main": [
        [
          {
            "node": "Inspect HTML Content Structure",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Send Status Code Error",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Fetch Page HTML Payload": {
      "main": [
        [
          {
            "node": "Verify HTTP 200 Status",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Send Fetch Error",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Format URL JSON Objects": {
      "main": [
        [
          {
            "node": "Split Into Individual Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Map Failed Request Data": {
      "main": [
        [
          {
            "node": "Loop Through URL Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check For Skipped Inputs": {
      "main": [
        [
          {
            "node": "Send Skipped Input Warning",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Loop Through URL Batches": {
      "main": [
        [
          {
            "node": "Remove Duplicate Records",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Fetch Page HTML Payload",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Remove Duplicate Records": {
      "main": [
        [
          {
            "node": "Send Progress Update",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Verify Maximum URL Limit": {
      "main": [
        [
          {
            "node": "Format URL JSON Objects",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Send Maximum Limit Error",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Map Compliance Error Data": {
      "main": [
        [
          {
            "node": "Loop Through URL Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract and Normalize URLs": {
      "main": [
        [
          {
            "node": "Check Empty URL List",
            "type": "main",
            "index": 0
          },
          {
            "node": "Verify Maximum URL Limit",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Receive Chat Message Input": {
      "main": [
        [
          {
            "node": "Extract and Normalize URLs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Into Individual Items": {
      "main": [
        [
          {
            "node": "Loop Through URL Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Send Success URL Notification": {
      "main": [
        [
          {
            "node": "Loop Through URL Batches",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Inspect HTML Content Structure": {
      "main": [
        [
          {
            "node": "Validate HTML Compliance Status",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Validate HTML Compliance Status": {
      "main": [
        [
          {
            "node": "Format Valid Page Data",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Send Error URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

About this workflow

Analyze up to 100 URLs in one run and export key on-page SEO data to CSV automatically.
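
As a rough illustration of what the extraction and CSV steps do, the sketch below pulls four of the fields targeted by the "Extract SEO Tags" node (`title`, `meta[name="description"]`, `link[rel="canonical"]`, and `h1`) out of an HTML string and flattens them into one CSV row. It is a standalone Python approximation, not the workflow's own implementation: the class and function names, the sample HTML, and the ` | ` separator used to join multiple `h1` values are all assumptions made for this example.

```python
import csv
import io
from html.parser import HTMLParser


class SeoTagParser(HTMLParser):
    """Collects a small subset of the on-page fields the workflow extracts."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "meta_description": "", "canonical": "", "h1": []}
        self._capture = None  # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.fields["meta_description"] = attr.get("content") or ""
        elif tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.fields["canonical"] = attr.get("href") or ""
        elif tag in ("title", "h1"):
            self._capture = tag
            self._buffer = []

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._capture:
            # Collapse runs of whitespace so each value fits on one line.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.fields["title"] = text
            else:
                self.fields["h1"].append(text)
            self._capture = None


def extract_row(url, html):
    """Parse one page and return a flat dict suitable for a CSV row."""
    parser = SeoTagParser()
    parser.feed(html)
    parser.close()
    row = {"url": url, **parser.fields}
    # Assumed separator: join repeated headings so they fit in a single cell.
    row["h1"] = " | ".join(row["h1"])
    return row


def rows_to_csv(rows):
    """Serialize the flat rows to CSV text with a fixed column order."""
    out = io.StringIO()
    columns = ["url", "title", "meta_description", "canonical", "h1"]
    writer = csv.DictWriter(out, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical sample page used only for this illustration.
SAMPLE_HTML = """<html lang="en"><head><title> Example Page </title>
<meta name="description" content="A short summary.">
<link rel="canonical" href="https://example.com/page">
</head><body><h1>First</h1><h1>Second <em>heading</em></h1></body></html>"""

row = extract_row("https://example.com/page", SAMPLE_HTML)
print(rows_to_csv([row]))
```

Joining array-valued fields into a single string is one simple way to keep one row per URL; the actual workflow may flatten its arrays differently in its "Format Final Dataset" step.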

Source: https://n8n.io/workflows/15756/ (original creator credit).
