Sitemap Crawler and Parser

Original n8n title: Stopanderror Wait

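Fetches a sitemap, splits it into individual URLs, and submits each one to the Google Indexing API, stopping with an error when the quota is reached. Uses manualTrigger, scheduleTrigger, splitInBatches, httpRequest. Started manually or on a schedule; 12 nodes.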

Category: General
Trigger: Event
Nodes: 12
Complexity: ★★★★☆
Integrations: HTTP Request, XML, Stop And Error

This workflow follows the HTTP Request → Stop And Error recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.

{
  "nodes": [
    {
      "id": "0788a3db-20c3-43b6-956a-394f688f7763",
      "name": "When clicking \"Execute Workflow\"",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        360,
        440
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "51460fab-a53c-46cd-a484-d2c038cd102d",
      "name": "Schedule Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        360,
        600
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "triggerAtHour": 1
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "5326416c-5715-4cc7-acfd-38a32f864bfb",
      "name": "loop",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        1360,
        600
      ],
      "parameters": {
        "options": {},
        "batchSize": 1
      },
      "typeVersion": 2
    },
    {
      "id": "fb0ca9f7-ff49-4a4b-9575-42b80594737e",
      "name": "sitemap_set",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        540,
        600
      ],
      "parameters": {
        "url": "https://bushidogym.fr/sitemap.xml",
        "options": {}
      },
      "typeVersion": 4.1
    },
    {
      "id": "150b47fe-f1c8-4dcb-b187-b459ee50c316",
      "name": "sitemap_convert",
      "type": "n8n-nodes-base.xml",
      "position": [
        700,
        600
      ],
      "parameters": {
        "options": {
          "trim": true,
          "normalize": true,
          "mergeAttrs": true,
          "ignoreAttrs": true,
          "normalizeTags": true
        }
      },
      "typeVersion": 1
    },
    {
      "id": "83cd19d6-81e7-46af-83a3-090cdd66b420",
      "name": "sitemap_parse",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        920,
        600
      ],
      "parameters": {
        "options": {
          "destinationFieldName": "url"
        },
        "fieldToSplitOut": "urlset.url"
      },
      "typeVersion": 1
    },
    {
      "id": "95c784d1-5756-4bf0-b2e5-e25a84c01b72",
      "name": "url_set",
      "type": "n8n-nodes-base.set",
      "position": [
        1140,
        600
      ],
      "parameters": {
        "values": {
          "string": [
            {
              "name": "url",
              "value": "={{ $json.url.loc }}"
            }
          ]
        },
        "options": {},
        "keepOnlySet": true
      },
      "typeVersion": 2
    },
    {
      "id": "43b62667-a37e-4bd1-bbb9-7a20a0914c97",
      "name": "url_index",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1560,
        580
      ],
      "parameters": {
        "url": "https://indexing.googleapis.com/v3/urlNotifications:publish",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "authentication": "predefinedCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "url",
              "value": "={{ $json.url }}"
            },
            {
              "name": "type",
              "value": "URL_UPDATED"
            }
          ]
        },
        "nodeCredentialType": "googleApi"
      },
      "credentials": {
        "googleApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4,
      "continueOnFail": true,
      "alwaysOutputData": true
    },
    {
      "id": "39ae8c01-64e4-44f5-be43-d5c402b00739",
      "name": "index_check",
      "type": "n8n-nodes-base.if",
      "position": [
        1780,
        580
      ],
      "parameters": {
        "conditions": {
          "string": [
            {
              "value1": "={{ $json.urlNotificationMetadata.latestUpdate.type }}",
              "value2": "URL_UPDATED"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "c4bf483b-af4b-451e-974b-d4abeb2c70f6",
      "name": "wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        2040,
        560
      ],
      "parameters": {
        "unit": "seconds",
        "amount": 2
      },
      "typeVersion": 1
    },
    {
      "id": "455955a8-c767-453b-805c-77c5b7d2e9bc",
      "name": "Stop and Error",
      "type": "n8n-nodes-base.stopAndError",
      "position": [
        2040,
        840
      ],
      "parameters": {
        "errorMessage": "You have reached the Google Indexing API limit (200/day by default)"
      },
      "typeVersion": 1
    },
    {
      "id": "275abdd5-be5d-458f-bc75-d9f72824c49f",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        340,
        180
      ],
      "parameters": {
        "width": 482.7089688834655,
        "height": 221.39109212934721,
        "content": "## Simple indexing workflow using the Google Indexing API\n\nThis workflow is the simplest indexing workflow. It simply extracts a sitemap, converts it to a JSON, and loops through each URL. It will output an error if your quota is reached.\n\n*Joachim*"
      },
      "typeVersion": 1
    }
  ],
  "connections": {
    "loop": {
      "main": [
        [
          {
            "node": "url_index",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "wait": {
      "main": [
        [
          {
            "node": "loop",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "url_set": {
      "main": [
        [
          {
            "node": "loop",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "url_index": {
      "main": [
        [
          {
            "node": "index_check",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "index_check": {
      "main": [
        [
          {
            "node": "wait",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Stop and Error",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "sitemap_set": {
      "main": [
        [
          {
            "node": "sitemap_convert",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "sitemap_parse": {
      "main": [
        [
          {
            "node": "url_set",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "sitemap_convert": {
      "main": [
        [
          {
            "node": "sitemap_parse",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Schedule Trigger": {
      "main": [
        [
          {
            "node": "sitemap_set",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \"Execute Workflow\"": {
      "main": [
        [
          {
            "node": "sitemap_set",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
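
The tail of the workflow above (loop → url_index → index_check → wait / Stop and Error) can be sketched in Python as a rough illustration. The names `submit_all`, `publish`, and `QuotaReachedError` are hypothetical and not part of n8n. `publish` stands in for the authenticated POST to the Indexing API and is assumed to return the parsed JSON response as a dict.

```python
import time
from typing import Callable, Iterable


class QuotaReachedError(RuntimeError):
    """Plays the role of the Stop and Error node (hypothetical name)."""


def build_notification(url: str) -> dict:
    # The same two body parameters that the url_index node sends.
    return {"url": url, "type": "URL_UPDATED"}


def was_updated(response: dict) -> bool:
    # Mirrors index_check: urlNotificationMetadata.latestUpdate.type == "URL_UPDATED".
    metadata = response.get("urlNotificationMetadata") or {}
    latest = metadata.get("latestUpdate") or {}
    return latest.get("type") == "URL_UPDATED"


def submit_all(
    urls: Iterable[str],
    publish: Callable[[dict], dict],  # assumption: caller supplies the HTTP call
    pause_seconds: float = 2.0,       # the wait node pauses for 2 seconds
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Submit URLs one at a time; stop at the first response that is not URL_UPDATED."""
    submitted = 0
    for url in urls:
        response = publish(build_notification(url))
        if not was_updated(response):
            raise QuotaReachedError(
                "You have reached the Google Indexing API limit (200/day by default)"
            )
        submitted += 1
        sleep(pause_seconds)
    return submitted
```

As in the workflow, any response that lacks a URL_UPDATED confirmation is treated as a quota error, even though other failures (for example, a permissions problem) would take the same branch.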

Credentials you'll need

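This workflow needs one credential: a Google API credential (googleApi) on the url_index node, which calls the Google Indexing API. n8n will prompt for it when you import. Credential IDs are stripped before publishing, so you'll add your own.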

How this works

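This workflow fetches a website's sitemap, converts the XML to JSON, splits it into individual URLs, and submits each URL to the Google Indexing API as a URL_UPDATED notification. It's aimed at SEO specialists and developers who want Google notified of new or changed pages without submitting them by hand. The sitemap is fetched with an HTTP Request node (sitemap_set), converted by the XML node (sitemap_convert), split on urlset.url (sitemap_parse), and reduced to each entry's loc value (url_set). A Split In Batches loop then posts one URL at a time, waits 2 seconds after each confirmed submission, and halts through the Stop and Error node as soon as a response does not report URL_UPDATED, which the workflow interprets as having reached the Indexing API quota (200 requests per day by default, according to the node's error message).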
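
The convert-and-split step can be sketched outside n8n as follows. This is a minimal illustration using Python's standard library, not the XML node's actual implementation. The sample sitemap and the function name `extract_urls` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real run would use the body returned by the sitemap request.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc> https://example.com/ </loc><lastmod>2024-01-01</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[dict[str, str]]:
    """Return one {"url": ...} item per <url><loc> entry, like the url_set node."""
    root = ET.fromstring(sitemap_xml)
    items = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        # Trim whitespace, comparable to the XML node's "trim" option.
        text = (loc.text or "").strip()
        if text:
            items.append({"url": text})
    return items


if __name__ == "__main__":
    for item in extract_urls(SAMPLE_SITEMAP):
        print(item["url"])
```

Only entries under a top-level urlset are read, which matches the workflow's fieldToSplitOut value of urlset.url. A sitemap index file would yield no items.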

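Use this workflow to resubmit a site's pages on a schedule, or run it manually as a one-off after a redesign or migration. The included Schedule Trigger sets triggerAtHour to 1, so scheduled runs start at 01:00. Because the loop stops at the first response that is not URL_UPDATED, a sitemap with more URLs than your daily Indexing API quota will not be fully processed in one run. Avoid it for dynamic sites without a static sitemap, where a custom scraping tool like Puppeteer might be better; the workflow only reads the urlset.url entries of a single sitemap file and does not crawl pages itself. Common variations include adding a filter to exclude certain URL patterns before the loop, or integrating with Google Sheets to log the submitted URLs.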
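
The URL-pattern filter mentioned as a variation could look like the sketch below, applied to the items before they reach the loop. The patterns and the names `keep_url` and `filter_items` are illustrative assumptions; in n8n the same logic would more likely live in a Filter or Code node.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hypothetical exclusion patterns, matched against the URL path only.
EXCLUDED_PATH_PATTERNS = ["/tag/*", "/wp-admin/*", "*.pdf"]


def keep_url(url: str, patterns: list[str] = EXCLUDED_PATH_PATTERNS) -> bool:
    """Return True when the URL's path matches none of the exclusion patterns."""
    path = urlparse(url).path or "/"
    return not any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_items(
    items: list[dict[str, str]],
    patterns: list[str] = EXCLUDED_PATH_PATTERNS,
) -> list[dict[str, str]]:
    """Keep only items whose "url" field passes keep_url, preserving order."""
    return [item for item in items if keep_url(item["url"], patterns)]
```

Filtering before the loop also saves quota, since excluded URLs never reach the Indexing API request.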

About this workflow

Source: https://github.com/Zie619/n8n-workflows (original creator credit).
