{
  "id": "n2iZmshLmcXubEpo",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Extract Website URLs from Sitemap.XML for SEO Analysis",
  "tags": [
    {
      "id": "MePRktFsL1ttwWdT",
      "name": "website",
      "createdAt": "2025-05-12T18:47:34.764Z",
      "updatedAt": "2025-05-12T18:47:34.764Z"
    },
    {
      "id": "xutdortHHmV1yNZB",
      "name": "SEO",
      "createdAt": "2025-03-24T16:18:45.828Z",
      "updatedAt": "2025-03-24T16:18:45.828Z"
    }
  ],
  "nodes": [
    {
      "id": "6d91a84e-bf2b-4118-9e35-5baecda1b14b",
      "name": "XML",
      "type": "n8n-nodes-base.xml",
      "position": [
        340,
        -40
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "65f12b51-d34c-4e87-b581-e29370eb0554",
      "name": "When clicking \u2018Test workflow\u2019",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -320,
        -40
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "b82a0bce-0dd4-4a64-b60a-64ea4021bee5",
      "name": "Split Out",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        560,
        -40
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "sitemapindex.sitemap"
      },
      "typeVersion": 1
    },
    {
      "id": "0e97abde-ba13-4889-979d-0f0e5b085dcb",
      "name": "Set URL",
      "type": "n8n-nodes-base.set",
      "notes": "Set full URL - not domain",
      "position": [
        -100,
        -40
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "fa078c97-4c7c-4c08-a011-5527661997c6",
              "name": "Domain",
              "type": "string",
              "value": "https://phu.io.vn/"
            }
          ]
        }
      },
      "notesInFlow": true,
      "typeVersion": 3.4
    },
    {
      "id": "146a5e34-d64a-450b-8354-770c90547325",
      "name": "Convert to File",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        1440,
        -40
      ],
      "parameters": {
        "options": {},
        "binaryPropertyName": "={{ $json.loc }}"
      },
      "typeVersion": 1.1
    },
    {
      "id": "0c1f8d18-ca4b-4996-9928-abbc6d45b227",
      "name": "Crawl sitemap",
      "type": "n8n-nodes-base.httpRequest",
      "notes": "or past sitemap URL at here",
      "position": [
        120,
        -40
      ],
      "parameters": {
        "url": "={{ $json.Domain }}sitemap.xml",
        "options": {
          "timeout": 10000
        },
        "responseFormat": "string"
      },
      "notesInFlow": true,
      "typeVersion": 1
    },
    {
      "id": "eaa43363-d059-4c66-8851-7e85d4fb5bd3",
      "name": "Crawl sitemap 2",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        780,
        -40
      ],
      "parameters": {
        "url": "={{ $json.loc }}",
        "options": {
          "timeout": 10000
        },
        "responseFormat": "string"
      },
      "typeVersion": 1
    },
    {
      "id": "692efb13-a6ce-4667-842b-614cf9ee8315",
      "name": "XML 2",
      "type": "n8n-nodes-base.xml",
      "position": [
        1000,
        -40
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "88a15568-352f-4997-ae2b-522a2713843d",
      "name": "Split Out 2",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        1220,
        -40
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "urlset.url"
      },
      "typeVersion": 1
    },
    {
      "id": "e0e0fb12-1dc5-4665-9530-6b53ed7dc593",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -160,
        -240
      ],
      "parameters": {
        "width": 440,
        "height": 360,
        "content": "## Set website URL at node 1 (or paste sitemap URL at node 2)"
      },
      "typeVersion": 1
    },
    {
      "id": "fc9d88ce-4ec0-4581-b1ff-7c007bdf5f0b",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1340,
        -200
      ],
      "parameters": {
        "color": 4,
        "width": 300,
        "height": 320,
        "content": "## Download file at here\n(or replace this node = Gooogle sheet node)\n"
      },
      "typeVersion": 1
    },
    {
      "id": "39f13360-e94d-4b33-b258-5e8837daab4f",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -980,
        -460
      ],
      "parameters": {
        "color": 6,
        "width": 600,
        "height": 940,
        "content": "# FAQ\n## Q: What happens if the sitemap is large or contains many sub-sitemaps?\n\nA: The workflow handles sitemap indexes by splitting and processing each sub-sitemap individually. For very large sitemaps, ensure your n8n instance has sufficient resources (memory and CPU) to avoid performance issues. See Scaling n8n for optimization tips.\n\n## Q: Can I use this workflow with a specific sitemap URL instead of a domain?\n\nA: Yes, in the Crawl sitemap node, replace the url parameter ({{ $json.Domain }}sitemap.xml) with the direct sitemap URL (e.g., https://example.com/sitemap.xml). Update the node\u2019s notes for clarity.\n\n## Q: Why am I getting a timeout error?\n\nA: The HTTP Request nodes have a default timeout of 10 seconds. If the target server is slow, increase the timeout value in the options parameter of the Crawl sitemap or Crawl sitemap 2 nodes.\n\n## Q: How can I save the URLs to Google Sheets instead of a file?\n\nA: Replace the Convert to File node with a Google Sheets node. Configure it with your Google Sheets credentials and map the loc field from the Split Out 2 node to the desired spreadsheet column. Refer to the Google Sheets node documentation.\n\n## Q: Is this workflow compatible with older n8n versions?\n\nA: The workflow uses nodes compatible with n8n version 1.0 and later. For older versions, check for deprecated features (e.g., MySQL support) in the n8n v1.0 migration guide."
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "d2ef17d5-a482-4b0d-b48a-83d5bd146b9f",
  "connections": {
    "XML": {
      "main": [
        [
          {
            "node": "Split Out",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "XML 2": {
      "main": [
        [
          {
            "node": "Split Out 2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set URL": {
      "main": [
        [
          {
            "node": "Crawl sitemap",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Out": {
      "main": [
        [
          {
            "node": "Crawl sitemap 2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Out 2": {
      "main": [
        [
          {
            "node": "Convert to File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Crawl sitemap": {
      "main": [
        [
          {
            "node": "XML",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Convert to File": {
      "main": [
        []
      ]
    },
    "Crawl sitemap 2": {
      "main": [
        [
          {
            "node": "XML 2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \u2018Test workflow\u2019": {
      "main": [
        [
          {
            "node": "Set URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}