This tutorial covers error handling techniques in n8n workflows: try/catch-style error branches, error workflows, retry logic, dead-letter queues, alerting on failures, partial recovery, and idempotency. It is for technical users who already know the n8n basics, are building automations, and need concrete steps to implement these features.

Why this matters

In automation workflows, failures such as API downtime or invalid data can halt entire processes, leading to data loss, missed notifications, or cascading errors that are slow to debug. Proper error handling in n8n limits the damage by isolating faults, retrying transient problems, and enabling recovery, so workflows stay reliable under real-world conditions, where some failures are inevitable.

Step-by-step

Open your n8n instance and create a new workflow. Drag in a Schedule Trigger node to start the flow, then connect it to an HTTP Request node configured to fetch data from an API endpoint, such as https://api.example.com/data with method GET. This sets up a basic flow prone to network errors.
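To add basic error catching, first let the HTTP Request node pass failures on instead of stopping the workflow: open the node's Settings tab and set On Error to Continue (using error output); older n8n versions offer a Continue On Fail toggle instead. With the error output, failed items leave through a second connector that you can wire straight to an error branch. With Continue On Fail, failed items carry an error field, so insert a Switch node after the HTTP Request, set Mode to Rules, route items where {{$json.error}} is empty to the success path, and send everything else to a separate error branch. By default the HTTP Request node treats 4xx and 5xx responses as failures, so expect the error branch to activate for those status codes.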
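Implement try/catch-style handling at the workflow level. n8n has no dedicated try/catch node; the closest equivalents are the per-node error branch from the previous step and an error workflow. To set one up, create a separate workflow that starts with an Error Trigger node and add the handling you need after it (logging, alerting). Then open the main workflow's Workflow Settings, select that workflow under Error Workflow, and save. When a production execution of the main workflow fails, n8n runs the error workflow and passes it details about the failed execution and the error. Error workflows do not run for manual test executions, so verify this with an active workflow.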
For retry logic, open the HTTP Request node's Settings tab, enable Retry On Fail, set Max Tries to 3, and set Wait Between Tries (ms) to 5000 (5 seconds). Test by pointing the node at a temporarily unavailable endpoint; the node should retry automatically and only fail, or take the error path, after the last attempt.
Set up a dead-letter queue for unrecoverable errors. After the error branch, add a Set node that builds fields such as originalPayload and errorMessage from {{$json}}, then connect it to a Postgres or MongoDB node that inserts into a dedicated error table. Pass the values as query parameters rather than interpolating expressions into the SQL string, which breaks on quotes and invites SQL injection: use INSERT INTO dead_letter_queue (payload, error, timestamp) VALUES ($1, $2, NOW()); with {{JSON.stringify($json.originalPayload)}} and {{$json.errorMessage}} supplied as the two parameters. This queues failures for later review without blocking the workflow.
Add alerting on failure: in the error branch, insert a Send Email or Slack node. For Slack, select your credential, set Channel to #alerts, and Message to Workflow failed: {{$json.errorMessage}} at {{$now}}. Configure it to trigger only on the error path; test by forcing an error to confirm the alert fires.
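Enable partial recovery by bringing the success and error paths back together with a Merge node. Let the success path carry its items forward while the error path records the failure in a Set node, then combine both with the Merge node (Append mode keeps every item from both inputs). In a Code node after the merge (the Code node replaces the deprecated Function node), salvage what you can with logic such as const item = $input.first().json; if (item.error) { return [{json: {status: 'partial', recoveredData: item}}]; } return $input.all();.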
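Ensure idempotency: for nodes such as HTTP Request that create or update resources, send a unique idempotency key in a request header (Stripe, for example, uses Idempotency-Key). Generate the key once, upstream of the request, in a Code node, for example return [{json: {key: $json.id + '-' + $execution.id}}];, so that every retry of the same item reuses the same value. Avoid building the key from $now at request time, because a key that changes on each attempt defeats deduplication. Servers that support idempotency keys (e.g., Stripe APIs) return the original result for a repeated key instead of performing the operation again, which prevents double charges.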
Finally, test the full workflow: execute manually, introduce an error (e.g., invalid URL), and verify retries, alerting, queuing, and recovery in the executions panel. Adjust thresholds based on logs to fine-tune resilience.
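The error-branch decisions in the steps above can be sketched as plain JavaScript functions. This is a minimal sketch, not n8n's own API: the function names (isTransient, toDeadLetterRecord, routeFailure) and the choice of which status codes count as transient are assumptions made for illustration. The logic could be pasted into a Code node on the error branch and adapted to the fields your items actually carry.

```javascript
// Hypothetical helpers for the error branch; names and status-code
// choices are illustrative assumptions, not part of n8n.

// Treat timeouts, rate limits and server errors as transient (worth retrying).
function isTransient(statusCode) {
  return statusCode === 408 || statusCode === 429 ||
    (statusCode >= 500 && statusCode <= 599);
}

// Build the row that the dead-letter insert would store.
// Column names mirror the dead_letter_queue table used in the steps.
function toDeadLetterRecord(originalPayload, errorMessage, now = new Date()) {
  return {
    payload: JSON.stringify(originalPayload),
    error: String(errorMessage),
    timestamp: now.toISOString(),
  };
}

// Decide where a failed item goes: back to a retry path or into the queue.
function routeFailure(item, statusCode, errorMessage, now = new Date()) {
  if (isTransient(statusCode)) {
    return { route: 'retry', item };
  }
  return {
    route: 'deadLetter',
    record: toDeadLetterRecord(item, errorMessage, now),
  };
}

// Example: a permanent 400 goes to the queue, a 503 is retried.
console.log(routeFailure({ id: 7 }, 400, 'Bad Request').route);
console.log(routeFailure({ id: 7 }, 503, 'Service Unavailable').route);
```

Keeping the classification in one function means the Switch node's rules and the retry settings can be checked against a single definition of "transient".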

Worked example

Consider a common pattern: syncing customer data from a CRM like HubSpot to a database, which can fail due to rate limits or invalid records. The workflow starts with a Schedule Trigger running daily, connected to a HubSpot node set to Get All contacts with a limit of 100. This feeds into a Loop Over Items node to process each contact individually.

Inside the loop, a Code node validates each contact (e.g., if (!item.json.email || !item.json.email.includes('@')) throw new Error('Invalid email');), followed by an HTTP Request to your database API that upserts the contact. The request carries an idempotency header, X-Idempotency-Key: {{ $json.contactId + '-' + $execution.id }}, and has Retry On Fail enabled with 3 tries. Retry On Fail retries on any error, not only on 429 responses, so rate limits get dedicated handling in the error branch.
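The validation and key-building logic in this step can be sketched as two small functions. This is a sketch under assumptions: the function names are hypothetical, the contact shape ({contactId, email}) follows the fields named above, and the email check is the same deliberately loose '@' test, not a full address validator.

```javascript
// Hypothetical helpers for the loop body; not part of n8n.

// Validate one contact. Throwing makes the node fail, so the item
// takes the error path instead of reaching the upsert request.
function validateContact(contact) {
  const email = contact && contact.email;
  if (typeof email !== 'string' || !email.includes('@')) {
    throw new Error('Invalid email');
  }
  return contact;
}

// Build the value sent in the X-Idempotency-Key header.
// The same contact in the same execution always yields the same key,
// so node-level retries are deduplicated by the receiving API.
function idempotencyKey(contactId, executionId) {
  return `${contactId}-${executionId}`;
}

// Example usage with made-up values.
const contact = validateContact({ contactId: 42, email: 'ada@example.com' });
console.log(idempotencyKey(contact.contactId, '1001'));
```

Checking that email is a string before calling includes avoids a TypeError on contacts with no email at all, which would otherwise hide the real validation message.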

If validation or the API call fails, the item is routed to an error branch. A Switch node checks the error type: if it is transient (e.g., a rate limit), a Wait node pauses for 1 minute before the request is retried; otherwise, a Postgres node writes {contactId, error, timestamp} to the dead-letter queue, taking the contact ID and error message from the failed item and the timestamp from $now, and a Slack node sends an alert with the details. A Merge node then rejoins the main path, so a partial sync can continue (e.g., 95 of 100 contacts succeed). End to end, if 5 contacts fail, the workflow still completes, and a final Set node outputs the summary {success: 95, failures: 5, queued: 5}, preserving data integrity without halting the whole run.
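The closing summary can be computed from the merged items with a short function. This is a sketch that assumes each merged item has been reduced to a result object with a status of 'ok' or 'failed' and a queued flag set when the dead-letter insert succeeded; that shape and the function name summarize are illustrative, not something n8n produces on its own.

```javascript
// Hypothetical summary step; the result shape {status, queued} is an assumption.
function summarize(results) {
  const failed = results.filter((r) => r.status === 'failed');
  return {
    success: results.length - failed.length,
    failures: failed.length,
    // Count only failures that actually reached the dead-letter queue.
    queued: failed.filter((r) => r.queued).length,
  };
}

// Example: 95 contacts synced, 5 failed and were queued for review.
const results = [
  ...Array.from({ length: 95 }, () => ({ status: 'ok' })),
  ...Array.from({ length: 5 }, () => ({ status: 'failed', queued: true })),
];
console.log(summarize(results));
```

Counting queued separately from failures makes it visible when a failure was not captured by the dead-letter queue, which is the case most worth alerting on.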

Common pitfalls

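Symptom: Workflows fail silently, leaving nothing in the execution log to debug from. Fix: In Workflow Settings, make sure failed production executions are saved (and save manual executions while testing), and end every error branch with a node that records the failure, such as the dead-letter insert or an alert, so that errors are visible for quick triage. A No Operation node can mark the end of a branch for readability, but it does not change what n8n saves.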
Symptom: Retries exacerbate issues such as rate limiting and lead to longer bans. Fix: Use exponential backoff between retries: compute the wait in a Code node as Math.pow(2, attempt) * 1000 milliseconds, feed it to a Wait node, and check the status code first so that permanent failures such as 400 Bad Request are not retried at all.
Symptom: Partial failures lead to inconsistent data states across systems. Fix: Use transactions where the database node supports them (for example, a transaction option for batched queries in the Postgres node, if your version provides one), or use compensating actions in error branches, such as a cleanup HTTP Request that rolls back changes if the main operation fails midway.
Symptom: Alerts flood channels during testing or error bursts. Fix: Deduplicate before the alert node: track recent errors in a cache such as Redis and notify only when the same issue crosses a threshold (e.g., 3 occurrences in 5 minutes).
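Symptom: Idempotency keys collide on high-volume workflows, so distinct operations are mistaken for duplicates and dropped. Fix: Generate keys with sufficient entropy, for example by combining a UUID with the record ID in a Code node: const { v4: uuidv4 } = require('uuid'); return [{json: {key: uuidv4() + '-' + $json.id}}];. Generate the key once per logical operation and reuse it on every retry, since a fresh UUID per attempt defeats idempotency. On self-hosted instances, require('uuid') works only if the module has been allowed through the NODE_FUNCTION_ALLOW_EXTERNAL environment variable. Store used keys temporarily to detect reuse.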
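Two of the fixes above, exponential backoff and alert thresholds, can be sketched in plain JavaScript. This is a minimal sketch under stated assumptions: the function names are hypothetical, the 60-second cap on the backoff is an addition that is not in the formula above, and the in-memory Map stands in for a shared cache such as Redis, so it only works within a single process.

```javascript
// Hypothetical helpers; not part of n8n.

// Exponential backoff: 2^attempt seconds, capped (the cap is an assumption)
// so that waits stay bounded on long retry chains.
function backoffMs(attempt, capMs = 60000) {
  return Math.min(Math.pow(2, attempt) * 1000, capMs);
}

// Retry only rate limits and server errors, and only while tries remain.
// Permanent failures such as 400 Bad Request are never retried.
function shouldRetry(statusCode, attempt, maxTries = 3) {
  if (attempt >= maxTries) return false;
  return statusCode === 429 || statusCode >= 500;
}

// Alert gate: returns true exactly when an error key reaches `threshold`
// occurrences inside the sliding window, and stays quiet for further
// repeats until older occurrences have aged out.
function makeAlertGate(threshold = 3, windowMs = 5 * 60 * 1000) {
  const seen = new Map(); // error key -> timestamps (ms) of recent occurrences
  return function shouldAlert(errorKey, nowMs = Date.now()) {
    const recent = (seen.get(errorKey) || []).filter((t) => nowMs - t < windowMs);
    recent.push(nowMs);
    seen.set(errorKey, recent);
    return recent.length === threshold;
  };
}

// Example: the third occurrence within 5 minutes triggers the alert.
const shouldAlert = makeAlertGate();
console.log(shouldAlert('hubspot-429', 0));
console.log(shouldAlert('hubspot-429', 1000));
console.log(shouldAlert('hubspot-429', 2000));
```

In a real workflow the timestamps would live in Redis or a database table, because separate executions do not share the memory of a Code node.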

Related workflows in the catalog

Explore the n8n workflow catalog for importable templates such as "Error Handling with Retry and Notifications," which demonstrates try/catch-style handling with Slack alerts, or "Dead Letter Queue for API Syncs," which shows PostgreSQL queuing for failed integrations. With more than 14,000 workflows available, search for "error handling" or "retry logic" to find patterns for email alerting or partial recovery in ETL processes. These can be imported directly into your instance and customised to fit your automations.
