AutomationFlowsGeneral › CSV Bulk Validator & Sanitizer

CSV Bulk Validator & Sanitizer

Original n8n title: CSV Bulk Validator + Sanitizer

CSV Bulk Validator + Sanitizer. Uses stickyNote, httpRequest, respondToWebhook. Webhook trigger; 16 nodes.

Webhook trigger★★★★☆ complexity16 nodesHTTP Request
General Trigger: Webhook Nodes: 16 Complexity: ★★★★☆ Added:

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "name": "CSV Bulk Validator + Sanitizer",
  "nodes": [
    {
      "parameters": {
        "content": "## CSV Bulk Validator + Sanitizer\n\nWebhook accepts a CSV (raw text/csv or multipart upload). Validates each row against `VALIDATION_SCHEMA` env, sanitizes strings, separates valid from invalid, returns a structured per-row report.\n\n**Production patterns wired:**\n- HMAC verify on `x-csv-signature` (opt-in, `CSV_UPLOAD_SIGNING_SECRET`)\n- Rate limit (opt-in, `RATE_LIMIT_ENABLED=1`)\n- Idempotency on `sha256(rawBody)` (opt-in, `IDEMPOTENCY_ENABLED=1`)\n- Error branch with structured fallback\n- Hard `MAX_BODY_BYTES` cap inside the parser\n\nSee `README.md` for setup, env vars, and extension recipes.",
        "height": 320,
        "width": 400,
        "color": 6
      },
      "id": "note-intro",
      "name": "Sticky Note - Intro",
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        -200,
        -100
      ]
    },
    {
      "parameters": {
        "content": "### >> SET ME <<\n\n1. Set `VALIDATION_SCHEMA` to a JSON object describing required columns. See README for example.\n2. Optional: `CSV_DELIMITER=,` (default), `MAX_ROWS=10000`, `MAX_BODY_BYTES=5242880` (5MB).\n3. Optional: `CSV_UPLOAD_SIGNING_SECRET=<random-32-chars>` for HMAC verification, then your client sends `x-csv-signature: <hmac-sha256-hex(rawBody)>`.\n4. Optional: `SLACK_OPS_WEBHOOK` for batch summary alerts.\n5. Self-hosted n8n: set `NODE_FUNCTION_ALLOW_BUILTIN=crypto`.",
        "height": 320,
        "width": 380,
        "color": 5
      },
      "id": "note-setup",
      "name": "Sticky Note - Setup",
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        -200,
        240
      ]
    },
    {
      "parameters": {
        "content": "## Production Patterns\n\nFour patterns wired as opt-in nodes. Default-off via env vars so import boots clean.\n\n- **HMAC verify:** `CSV_UPLOAD_SIGNING_SECRET` + `WEBHOOK_INTEGRITY_CHECK_ENABLED=1` (header `x-csv-signature: <hmac-sha256-hex>`)\n- **Rate limit:** `RATE_LIMIT_ENABLED=1` (60 req / 5 min / IP)\n- **Idempotency:** `IDEMPOTENCY_ENABLED=1` (5-min window on `sha256(rawBody)`)\n- **Error branch:** always on. Slack alert + structured fallback response so the client never sees a hung connection.\n\n- **Respond-duplicate gateway:** when the Idempotency Check detects a duplicate it emits a `{ skipped: true }` sentinel that the `Skip If Duplicate` IF node routes to the dedicated `Respond Duplicate` `respondToWebhook` node (200 OK + `{ ok: true, deduped: true }`). This avoids the 30s connection-hang that would otherwise occur on `responseMode: responseNode` if the duplicate path returned `[]` and never reached a respond node.\n\nFor clustered n8n, swap the in-memory dedup for Redis SET NX EX 300. Snippet in the node's comments.",
        "height": 320,
        "width": 380,
        "color": 7
      },
      "id": "note-production-patterns",
      "name": "Sticky Note - Production Patterns",
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        840,
        -260
      ]
    },
    {
      "parameters": {
        "httpMethod": "POST",
        "path": "csv-validate",
        "responseMode": "responseNode",
        "options": {
          "rawBody": true
        }
      },
      "id": "csv-1-trigger",
      "name": "CSV Webhook",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 2,
      "position": [
        240,
        60
      ]
    },
    {
      "parameters": {
        "jsCode": "// HMAC-SHA256 signature verification on raw body, opt-in, with replay-window check.\n// Headers required when WEBHOOK_INTEGRITY_CHECK_ENABLED=1:\n//   x-csv-signature: <hex-hmac-sha256(timestamp + '.' + rawBody)>\n//   x-csv-timestamp: <unix-seconds>\n// Enable by setting CSV_UPLOAD_SIGNING_SECRET + WEBHOOK_INTEGRITY_CHECK_ENABLED=1.\n//\n// Replay-window default 5 min. Override with CSV_UPLOAD_REPLAY_WINDOW_S.\n\nconst secret = $env.CSV_UPLOAD_SIGNING_SECRET;\nconst integrityCheckEnabled = $env.WEBHOOK_INTEGRITY_CHECK_ENABLED === '1';\n\nif (!secret || !integrityCheckEnabled) {\n  return [$input.first()];\n}\n\nconst crypto = require('crypto');\nconst item = $input.first();\nconst rawBody = item.json.rawBody || (typeof item.json.body === 'string' ? item.json.body : JSON.stringify(item.json.body || {}));\nconst headers = item.json.headers || {};\nconst providedSig = headers['x-csv-signature'] || headers['X-CSV-Signature'];\nconst tsHeader = headers['x-csv-timestamp'] || headers['X-CSV-Timestamp'];\n\nif (!providedSig || typeof providedSig !== 'string') {\n  throw new Error('UNAUTHORIZED: missing x-csv-signature header');\n}\nif (!tsHeader || typeof tsHeader !== 'string') {\n  throw new Error('UNAUTHORIZED: missing x-csv-timestamp header (replay protection)');\n}\n\nconst tsNum = parseInt(tsHeader, 10);\nif (!Number.isFinite(tsNum)) {\n  throw new Error('UNAUTHORIZED: invalid x-csv-timestamp value');\n}\nconst nowSec = Math.floor(Date.now() / 1000);\nconst REPLAY_WINDOW_S = parseInt($env.CSV_UPLOAD_REPLAY_WINDOW_S || '300', 10);\nif (Math.abs(nowSec - tsNum) > REPLAY_WINDOW_S) {\n  throw new Error('UNAUTHORIZED: timestamp outside replay window (' + REPLAY_WINDOW_S + 's)');\n}\n\nconst signedPayload = tsHeader + '.' + rawBody;\nconst expected = crypto.createHmac('sha256', secret).update(signedPayload, 'utf8').digest('hex');\n\nif (providedSig.length !== expected.length) {\n  throw new Error('UNAUTHORIZED: signature length mismatch');\n}\n\nconst expBuf = Buffer.from(expected, 'utf8');\nconst provBuf = Buffer.from(providedSig, 'utf8');\nif (!crypto.timingSafeEqual(expBuf, provBuf)) {\n  throw new Error('UNAUTHORIZED: invalid signature');\n}\n\nreturn [$input.first()];"
      },
      "id": "csv-pp-1-verify",
      "name": "Verify Webhook (opt-in)",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        440,
        60
      ]
    },
    {
      "parameters": {
        "jsCode": "// Per-IP sliding-window rate limit, opt-in.\n\nif ($env.RATE_LIMIT_ENABLED !== '1') {\n  return [$input.first()];\n}\n\nconst LIMIT = 60;\nconst WINDOW_MS = 5 * 60 * 1000;\nconst MAX_KEYS = 5000;\n\nconst item = $input.first();\nconst headers = item.json.headers || {};\nconst rawIp = headers['x-forwarded-for'] || headers['x-real-ip'] || 'unknown';\nconst key = String(rawIp).split(',')[0].trim();\n\nconst data = $getWorkflowStaticData('global');\ndata.rateBuckets = data.rateBuckets || {};\nconst buckets = data.rateBuckets;\nconst now = Date.now();\n\nfor (const k of Object.keys(buckets)) {\n  buckets[k] = (buckets[k] || []).filter(t => now - t < WINDOW_MS);\n  if (buckets[k].length === 0) delete buckets[k];\n}\nif (Object.keys(buckets).length > MAX_KEYS) {\n  const oldest = Object.entries(buckets).sort((a, b) => (a[1][0] || 0) - (b[1][0] || 0)).slice(0, 100);\n  for (const [k] of oldest) delete buckets[k];\n}\n\nconst hits = buckets[key] || [];\nif (hits.length >= LIMIT) {\n  throw new Error('RATE_LIMIT_EXCEEDED: ' + LIMIT + ' requests per ' + Math.round(WINDOW_MS / 60000) + ' minutes for ' + key);\n}\nbuckets[key] = [...hits, now];\n\nreturn [$input.first()];"
      },
      "id": "csv-pp-2-ratelimit",
      "name": "Rate Limit (opt-in)",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        640,
        60
      ]
    },
    {
      "parameters": {
        "jsCode": "// 5-minute idempotency window on sha256(rawBody), opt-in.\n\nconst crypto = require('crypto');\n\nif ($env.IDEMPOTENCY_ENABLED !== '1') {\n  return [$input.first()];\n}\n\nconst WINDOW_MS = 5 * 60 * 1000;\nconst MAX_KEYS = 5000;\n\nconst item = $input.first();\nconst rawBody = item.json.rawBody || (typeof item.json.body === 'string' ? item.json.body : JSON.stringify(item.json.body || {}));\nconst dedupKey = crypto.createHash('sha256').update(rawBody, 'utf8').digest('hex').slice(0, 32);\n\nconst data = $getWorkflowStaticData('global');\ndata.seenKeys = data.seenKeys || {};\nconst seen = data.seenKeys;\nconst now = Date.now();\n\nfor (const k of Object.keys(seen)) {\n  if (now - seen[k] > WINDOW_MS) delete seen[k];\n}\nif (Object.keys(seen).length > MAX_KEYS) {\n  const oldest = Object.entries(seen).sort((a, b) => a[1] - b[1]).slice(0, 500);\n  for (const [k] of oldest) delete seen[k];\n}\n\nif (seen[dedupKey]) {\n  // Duplicate detected. Emit a sentinel item that the\n  // 'Skip If Duplicate' IF node routes to 'Respond Duplicate'\n  // (200 OK + { deduped: true }). Without that 200 the source\n  // provider would hold the HTTP connection until n8n's webhook\n  // timeout (default 30s) and mark delivery failed.\n  return [{ json: { skipped: true, reason: 'duplicate', dedupKey: String(dedupKey) } }];\n}\nseen[dedupKey] = now;\n\nreturn [$input.first()];\n\n// Redis variant for clustered n8n:\n// const result = await redis.set('csv-idem:' + dedupKey, '1', 'EX', 300, 'NX');\n// if (result === null) return [];"
      },
      "id": "csv-pp-3-idempotency",
      "name": "Idempotency Check (opt-in)",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        840,
        60
      ]
    },
    {
      "parameters": {
        "jsCode": "// Quote-aware CSV parser. Handles quoted fields with embedded commas, embedded newlines,\n// escaped quotes (RFC 4180). BOM stripping. Configurable delimiter.\n//\n// Hard MAX_BODY_BYTES cap as defense against accidental giant uploads.\n\nconst item = $input.first();\nconst rawBody = item.json.rawBody || (typeof item.json.body === 'string' ? item.json.body : '');\nif (typeof rawBody !== 'string') {\n  throw new Error('CSV body must be a string. Set Webhook node Options -> Raw Body = true.');\n}\n\nconst MAX_BODY_BYTES = parseInt($env.MAX_BODY_BYTES || '5242880', 10);\nif (Buffer.byteLength(rawBody, 'utf8') > MAX_BODY_BYTES) {\n  throw new Error('PAYLOAD_TOO_LARGE: body exceeds MAX_BODY_BYTES=' + MAX_BODY_BYTES);\n}\n\nconst MAX_ROWS = parseInt($env.MAX_ROWS || '10000', 10);\nconst delimRaw = $env.CSV_DELIMITER || ',';\nconst delim = delimRaw === '\\\\t' ? '\\t' : delimRaw;\n\n// Strip BOM\nlet text = rawBody.charCodeAt(0) === 0xFEFF ? rawBody.slice(1) : rawBody;\n\n// Quote-aware tokenizer\nconst rows = [];\nlet field = '';\nlet row = [];\nlet inQuotes = false;\nfor (let i = 0; i < text.length; i++) {\n  const c = text[i];\n  if (inQuotes) {\n    if (c === '\"') {\n      if (text[i + 1] === '\"') { field += '\"'; i++; }\n      else { inQuotes = false; }\n    } else {\n      field += c;\n    }\n  } else {\n    if (c === '\"') { inQuotes = true; }\n    else if (c === delim) { row.push(field); field = ''; }\n    else if (c === '\\r') { /* skip CR */ }\n    else if (c === '\\n') { row.push(field); rows.push(row); row = []; field = ''; if (rows.length > MAX_ROWS + 1) break; }\n    else { field += c; }\n  }\n}\n// Tail handling\nif (field.length > 0 || row.length > 0) { row.push(field); rows.push(row); }\n\nif (rows.length === 0) throw new Error('CSV is empty');\nif (rows.length - 1 > MAX_ROWS) throw new Error('TOO_MANY_ROWS: limit is ' + MAX_ROWS);\n\nconst headers = rows[0].map(h => String(h || '').trim());\nconst dataRows = rows.slice(1);\n\nconst out = [];\nfor (let r = 0; r < dataRows.length; r++) {\n  const cells = dataRows[r];\n  if (cells.length === 1 && cells[0] === '') continue; // skip blank line\n  const obj = {};\n  for (let c = 0; c < headers.length; c++) {\n    obj[headers[c]] = cells[c] != null ? cells[c] : '';\n  }\n  out.push({ json: { rowNumber: r + 2, raw: obj } }); // +2 because header is row 1, data starts at row 2\n}\nreturn out;"
      },
      "id": "csv-2-parse",
      "name": "Parse CSV",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1040,
        60
      ],
      "onError": "continueErrorOutput"
    },
    {
      "parameters": {
        "jsCode": "// Validate + sanitize each row against VALIDATION_SCHEMA env JSON.\n// Schema shape (see README for details):\n// { columns: [{ name, required, type, pattern, minLength, maxLength, min, max, enum, default }, ...] }\n//\n// Per-row: produce { valid: bool, sanitized: {...}, errors: [...] }.\n// Per-row errors are caught and reported, never thrown. A single bad row\n// does not break the batch.\n//\n// ReDoS protection: VALIDATION_SCHEMA is operator-controlled (env var, not user-supplied).\n// We still cap regex pattern length and reject patterns containing common catastrophic-\n// backtracking shapes ((a+)+, (a|a)*, nested quantifiers) before compiling. Compilation\n// failure or refusal returns one PATTERN_INVALID error per row, never a thrown crash.\n\nconst schemaRaw = $env.VALIDATION_SCHEMA || '{\"columns\":[]}';\nlet schema;\ntry { schema = JSON.parse(schemaRaw); }\ncatch (e) { throw new Error('VALIDATION_SCHEMA env is not valid JSON: ' + e.message); }\n\nconst columns = Array.isArray(schema.columns) ? schema.columns : [];\nconst MAX_PATTERN_LEN = 200;\n// Refuse common catastrophic-backtracking shapes before compile.\n// This is a heuristic, not a complete defense; the env-var-only contract is the primary mitigation.\nconst REDOS_HINT_RE = /(\\([^)]*[+*][^)]*\\))[+*]|(\\([^)]*\\|[^)]*\\))[+*]/;\n\nconst compileSafePattern = (raw) => {\n  if (typeof raw !== 'string' || raw.length === 0) return { error: 'pattern is empty' };\n  if (raw.length > MAX_PATTERN_LEN) return { error: 'pattern length exceeds ' + MAX_PATTERN_LEN + ' chars' };\n  if (REDOS_HINT_RE.test(raw)) return { error: 'pattern matches a known catastrophic-backtracking shape' };\n  try { return { regex: new RegExp(raw) }; }\n  catch (e) { return { error: 'pattern compile failed: ' + e.message }; }\n};\n\n// Pre-compile all column patterns once so we do not pay the cost per row.\nconst compiledPatterns = {};\nfor (const col of columns) {\n  if (col && col.pattern) compiledPatterns[col.name] = compileSafePattern(col.pattern);\n}\n\nconst sanitizeString = (v) => {\n  let s = String(v == null ? '' : v);\n  s = s.replace(/[\\u0000-\\u0008\\u000B\\u000C\\u000E-\\u001F\\u007F]/g, ''); // strip control chars\n  s = s.trim().replace(/\\s+/g, ' ');\n  return s;\n};\n\nconst checkRow = (raw, rowNumber) => {\n  const sanitized = {};\n  const errors = [];\n  for (const col of columns) {\n    const name = col.name;\n    let val = raw[name];\n    if ((val == null || val === '') && col.default != null) val = col.default;\n    if ((val == null || val === '') && col.required) {\n      errors.push({ field: name, code: 'REQUIRED', message: 'missing required value' });\n      continue;\n    }\n    if (val == null || val === '') { sanitized[name] = null; continue; }\n\n    // Type coercion\n    if (col.type === 'integer') {\n      const n = parseInt(val, 10);\n      if (!Number.isFinite(n)) { errors.push({ field: name, code: 'TYPE', message: 'expected integer' }); continue; }\n      if (col.min != null && n < col.min) errors.push({ field: name, code: 'MIN', message: 'below min ' + col.min });\n      if (col.max != null && n > col.max) errors.push({ field: name, code: 'MAX', message: 'above max ' + col.max });\n      sanitized[name] = n;\n    } else if (col.type === 'number') {\n      const n = parseFloat(val);\n      if (!Number.isFinite(n)) { errors.push({ field: name, code: 'TYPE', message: 'expected number' }); continue; }\n      if (col.min != null && n < col.min) errors.push({ field: name, code: 'MIN', message: 'below min ' + col.min });\n      if (col.max != null && n > col.max) errors.push({ field: name, code: 'MAX', message: 'above max ' + col.max });\n      sanitized[name] = n;\n    } else if (col.type === 'boolean') {\n      const sLower = String(val).trim().toLowerCase();\n      if (['true', 'yes', 'y', '1'].includes(sLower)) sanitized[name] = true;\n      else if (['false', 'no', 'n', '0'].includes(sLower)) sanitized[name] = false;\n      else errors.push({ field: name, code: 'TYPE', message: 'expected boolean' });\n    } else {\n      // string default\n      const s = sanitizeString(val);\n      if (col.minLength != null && s.length < col.minLength) errors.push({ field: name, code: 'MIN_LEN', message: 'below minLength ' + col.minLength });\n      if (col.maxLength != null && s.length > col.maxLength) errors.push({ field: name, code: 'MAX_LEN', message: 'above maxLength ' + col.maxLength });\n      if (col.enum && Array.isArray(col.enum) && !col.enum.includes(s)) errors.push({ field: name, code: 'ENUM', message: 'not in allowed enum' });\n      if (col.pattern) {\n        const compiled = compiledPatterns[name];\n        if (compiled && compiled.regex) {\n          if (!compiled.regex.test(s)) errors.push({ field: name, code: 'PATTERN', message: 'does not match required pattern' });\n        } else {\n          errors.push({ field: name, code: 'PATTERN_INVALID', message: 'schema pattern rejected: ' + (compiled && compiled.error || 'unknown') });\n        }\n      }\n      sanitized[name] = s;\n    }\n  }\n  return { rowNumber, valid: errors.length === 0, sanitized, errors };\n};\n\nreturn $input.all().map(item => {\n  const j = item.json || {};\n  try {\n    const result = checkRow(j.raw || {}, j.rowNumber);\n    return { json: result };\n  } catch (e) {\n    return { json: { rowNumber: j.rowNumber, valid: false, sanitized: null, errors: [{ field: '_row', code: 'PARSE_ERROR', message: e.message }] } };\n  }\n});"
      },
      "id": "csv-3-validate",
      "name": "Validate + Sanitize Rows",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1240,
        60
      ]
    },
    {
      "parameters": {
        "jsCode": "// Aggregate per-row results into a structured report.\n\nconst valid = [];\nconst invalid = [];\nfor (const item of $input.all()) {\n  const j = item.json || {};\n  if (j.valid) valid.push({ row: j.rowNumber, data: j.sanitized });\n  else invalid.push({ row: j.rowNumber, errors: j.errors });\n}\n\nconst summary = {\n  total: valid.length + invalid.length,\n  validCount: valid.length,\n  invalidCount: invalid.length,\n  validPct: (valid.length + invalid.length) > 0 ? Math.round((valid.length / (valid.length + invalid.length)) * 1000) / 10 : 0,\n  generatedAt: new Date().toISOString(),\n};\n\nreturn [{ json: { summary, valid, invalid } }];"
      },
      "id": "csv-4-report",
      "name": "Build Report",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1440,
        60
      ]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "={{ $env.SLACK_OPS_WEBHOOK }}",
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ JSON.stringify({ text: ':bar_chart: CSV batch ' + $json.summary.total + ' rows: ' + $json.summary.validCount + ' valid (' + $json.summary.validPct + '%), ' + $json.summary.invalidCount + ' invalid' }) }}",
        "options": {}
      },
      "id": "csv-5-slack",
      "name": "Slack Summary",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        1640,
        60
      ],
      "onError": "continueErrorOutput"
    },
    {
      "parameters": {
        "respondWith": "json",
        "responseBody": "={{ JSON.stringify($json) }}",
        "options": {
          "responseCode": 200
        }
      },
      "id": "csv-6-respond",
      "name": "Respond to Client",
      "type": "n8n-nodes-base.respondToWebhook",
      "typeVersion": 1.1,
      "position": [
        1840,
        60
      ]
    },
    {
      "parameters": {
        "jsCode": "// Fallback when the Slack notification fails. Still respond OK to the client.\n\nconst input = $input.first();\nconst err = (input.json && input.json.error) || input.error || {};\nconst report = ($('Build Report').first() && $('Build Report').first().json) || { summary: {} };\n\nreturn [{\n  json: {\n    ...report,\n    slackError: { message: err.message || 'unknown error', name: err.name || 'SlackError' },\n  },\n}];"
      },
      "id": "csv-err-fallback",
      "name": "Error Fallback",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1640,
        380
      ]
    },
    {
      "parameters": {
        "respondWith": "json",
        "responseBody": "={{ JSON.stringify($json) }}",
        "options": {
          "responseCode": 200
        }
      },
      "id": "csv-err-respond",
      "name": "Error Respond to Client",
      "type": "n8n-nodes-base.respondToWebhook",
      "typeVersion": 1.1,
      "position": [
        1840,
        380
      ]
    },
    {
      "parameters": {
        "conditions": {
          "options": {
            "caseSensitive": true,
            "leftValue": "",
            "typeValidation": "strict",
            "version": 2
          },
          "conditions": [
            {
              "id": "cond-10-csv-bulk-validator-skipped",
              "leftValue": "={{ $json.skipped }}",
              "rightValue": true,
              "operator": {
                "type": "boolean",
                "operation": "true",
                "singleValue": true
              }
            }
          ],
          "combinator": "and"
        },
        "options": {}
      },
      "id": "10-csv-if-skip-dup",
      "name": "Skip If Duplicate",
      "type": "n8n-nodes-base.if",
      "typeVersion": 2,
      "position": [
        1060,
        60
      ]
    },
    {
      "parameters": {
        "respondWith": "json",
        "responseBody": "={{ JSON.stringify({ ok: true, deduped: true, reason: \"duplicate\" }) }}",
        "options": {
          "responseCode": 200,
          "responseHeaders": {
            "entries": [
              {
                "name": "X-Dedup",
                "value": "1"
              }
            ]
          }
        }
      },
      "id": "10-csv-respond-duplicate",
      "name": "Respond Duplicate",
      "type": "n8n-nodes-base.respondToWebhook",
      "typeVersion": 1.1,
      "position": [
        1280,
        -120
      ]
    }
  ],
  "connections": {
    "CSV Webhook": {
      "main": [
        [
          {
            "node": "Verify Webhook (opt-in)",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Verify Webhook (opt-in)": {
      "main": [
        [
          {
            "node": "Rate Limit (opt-in)",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Rate Limit (opt-in)": {
      "main": [
        [
          {
            "node": "Idempotency Check (opt-in)",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Idempotency Check (opt-in)": {
      "main": [
        [
          {
            "node": "Skip If Duplicate",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse CSV": {
      "main": [
        [
          {
            "node": "Validate + Sanitize Rows",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Error Fallback",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Validate + Sanitize Rows": {
      "main": [
        [
          {
            "node": "Build Report",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Build Report": {
      "main": [
        [
          {
            "node": "Slack Summary",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Slack Summary": {
      "main": [
        [
          {
            "node": "Respond to Client",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Error Fallback",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Error Fallback": {
      "main": [
        [
          {
            "node": "Error Respond to Client",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Skip If Duplicate": {
      "main": [
        [
          {
            "node": "Respond Duplicate",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Parse CSV",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "settings": {
    "executionOrder": "v1"
  }
}
Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

How this works

Easily validate and sanitise large CSV files in bulk without manual effort, ensuring data integrity for imports into systems like CRMs or databases. This workflow suits data managers or developers handling frequent CSV uploads, preventing errors from invalid entries such as malformed emails or duplicates. It triggers via webhook to receive your CSV, then parses and checks each row against predefined rules using HTTP requests for external validations where needed, outputting a cleaned file ready for use.

Use this when processing user-submitted CSVs in web forms or APIs, especially for high-volume tasks requiring consistent formatting. Avoid it for tiny files under 100 rows, where simple tools like Excel suffice, or when real-time single-record validation is needed instead of batch. Common variations include adding custom rules for specific industries, like financial data checks, or integrating with Google Sheets for direct output.

About this workflow

CSV Bulk Validator + Sanitizer. Uses stickyNote, httpRequest, respondToWebhook. Webhook trigger; 16 nodes.

Source: https://github.com/studiomeyer-io/n8n-workflows/blob/main/templates/10-csv-bulk-validator/workflow.json — original creator credit. Request a take-down →

More General workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

General

Portfolio Orchestrator. Uses httpRequest. Webhook trigger; 59 nodes.

HTTP Request
General

jump-section: Comment Fix Pipeline. Uses httpRequest. Webhook trigger; 24 nodes.

HTTP Request
General

GitHub Issues Router (Linear / Jira / ClickUp). Uses stickyNote, httpRequest, respondToWebhook. Webhook trigger; 23 nodes.

HTTP Request
General

Form to CRM Lead Router (Pipedrive / HubSpot / Salesforce). Uses stickyNote, httpRequest, respondToWebhook. Webhook trigger; 22 nodes.

HTTP Request
General

Calendly to CRM Sync (Pipedrive / HubSpot / Salesforce). Uses stickyNote, httpRequest, respondToWebhook. Webhook trigger; 22 nodes.

HTTP Request