This workflow corresponds to n8n.io template #3866 — we link there as the canonical source.

This workflow follows the HTTP Request → Readwritefile recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json

{
  "id": "OjwmaLrXhW4pO5ph",
  "name": "Structured Bulk Data Extract with Bright Data Web Scraper",
  "tags": [
    {
      "id": "Kujft2FOjmOVQAmJ",
      "name": "Engineering",
      "createdAt": "2025-04-09T01:31:00.558Z",
      "updatedAt": "2025-04-09T01:31:00.558Z"
    },
    {
      "id": "ZOwtAMLepQaGW76t",
      "name": "Building Blocks",
      "createdAt": "2025-04-13T15:23:40.462Z",
      "updatedAt": "2025-04-13T15:23:40.462Z"
    }
  ],
  "nodes": [
    {
      "id": "1bdca5ae-1e56-4cf2-a8dc-e135a6a2dfec",
      "name": "When clicking \u2018Test workflow\u2019",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -900,
        -395
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "533968cd-1329-4a86-8875-478600ed82b7",
      "name": "If",
      "type": "n8n-nodes-base.if",
      "position": [
        200,
        -470
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "6a7e5360-4cb5-4806-892e-5c85037fa71c",
              "operator": {
                "type": "string",
                "operation": "equals"
              },
              "leftValue": "={{ $json.status }}",
              "rightValue": "ready"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "83991fdf-0402-4de3-bbb5-7050e3e9fb62",
      "name": "Set Snapshot Id",
      "type": "n8n-nodes-base.set",
      "position": [
        -240,
        -395
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "2c3369c6-9206-45d7-9349-f577baeaf189",
              "name": "snapshot_id",
              "type": "string",
              "value": "={{ $json.snapshot_id }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "408a36af-decb-49b3-a95e-a2df0b6eea5f",
      "name": "Download Snapshot",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        640,
        -520
      ],
      "parameters": {
        "url": "=https://api.brightdata.com/datasets/v3/snapshot/{{ $json.snapshot_id }}",
        "options": {
          "timeout": 10000
        },
        "sendQuery": true,
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "queryParameters": {
          "parameters": [
            {
              "name": "format",
              "value": "json"
            }
          ]
        }
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "9d6cd882-c287-46ca-bc1e-df6b995fc422",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        420,
        -295
      ],
      "parameters": {
        "amount": 30
      },
      "typeVersion": 1.1
    },
    {
      "id": "c9cf847a-6399-4c93-a901-30f1c0e7408a",
      "name": "Check on the errors",
      "type": "n8n-nodes-base.if",
      "position": [
        420,
        -520
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "b267071c-7102-407b-a98d-f613bcb1a106",
              "operator": {
                "type": "string",
                "operation": "equals"
              },
              "leftValue": "={{ $json.errors.toString() }}",
              "rightValue": "0"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "b648614e-c33e-4818-8348-e95df56928c7",
      "name": "Check Snapshot Status",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -20,
        -395
      ],
      "parameters": {
        "url": "=https://api.brightdata.com/datasets/v3/progress/{{ $json.snapshot_id }}",
        "options": {},
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "headerParameters": {
          "parameters": [
            {}
          ]
        }
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "408a1584-666f-471e-bfcd-c4d857319688",
      "name": "Initiate a Webhook Notification",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1080,
        -520
      ],
      "parameters": {
        "url": "https://webhook.site/daf9d591-a130-4010-b1d3-0c66f8fcf467",
        "options": {},
        "sendBody": true,
        "bodyParameters": {
          "parameters": [
            {
              "name": "response",
              "value": "={{ $json.data[0] }}"
            }
          ]
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "6548a794-a4fd-4050-b07d-bc7ca4517882",
      "name": "Aggregate JSON Response",
      "type": "n8n-nodes-base.aggregate",
      "position": [
        860,
        -520
      ],
      "parameters": {
        "options": {},
        "aggregate": "aggregateAllItemData"
      },
      "typeVersion": 1
    },
    {
      "id": "c84e195c-edd2-4f59-8986-516d116b7352",
      "name": "Set Dataset Id, Request URL",
      "type": "n8n-nodes-base.set",
      "position": [
        -680,
        -400
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "c16061c8-c829-4bd3-b335-e79c605665f2",
              "name": "dataset_id",
              "type": "string",
              "value": "gd_l7q7dkf244hwjntr0"
            },
            {
              "id": "a4594c55-e39e-4a9e-80d6-d39370001e20",
              "name": "request",
              "type": "string",
              "value": "[{     \"url\": \"https://www.amazon.com/Quencher-FlowState-Stainless-Insulated-Smoothie/dp/B0CRMZHDG8\"   }]"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "ceae108e-ed78-40c5-8e58-7013591ccaad",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -900,
        -700
      ],
      "parameters": {
        "width": 520,
        "height": 280,
        "content": "## Note\n\nDeals with the Amazon web scraping by utilizing Bright Data Web Scraper Product.\n\n\n**Please make sure to set the Bright Data \n -> Dataset Id, Request URL and update the Webhook Notification URL**\n\nRefer \n- https://brightdata.com/products/web-scraper/ai\n- https://brightdata.com/products/web-scraper"
      },
      "typeVersion": 1
    },
    {
      "id": "1f55cffa-abd9-437b-bc9d-3fe0d8b02454",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -120,
        -600
      ],
      "parameters": {
        "color": 5,
        "width": 720,
        "height": 500,
        "content": "## Wait until the Snapshot is ready"
      },
      "typeVersion": 1
    },
    {
      "id": "d8ba0f62-80a9-4e66-b70c-086ee5992df6",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -900,
        -220
      ],
      "parameters": {
        "color": 4,
        "width": 660,
        "content": "## Who can benefit?\nData analysts, scientists, engineers, and developers seeking efficient methods to collect and analyze web data for AI, ML, big data applications, and more will find Scraper APIs particularly beneficial."
      },
      "typeVersion": 1
    },
    {
      "id": "7fdffafd-f256-4760-b001-a42b5198dbad",
      "name": "Create a binary data",
      "type": "n8n-nodes-base.function",
      "position": [
        1100,
        -720
      ],
      "parameters": {
        "functionCode": "items[0].binary = {\n  data: {\n    data: new Buffer(JSON.stringify(items[0].json, null, 2)).toString('base64')\n  }\n};\nreturn items;"
      },
      "typeVersion": 1
    },
    {
      "id": "934ab31a-cfb9-4e97-8d86-92cd95dd219c",
      "name": "Write the file to disk",
      "type": "n8n-nodes-base.readWriteFile",
      "position": [
        1320,
        -720
      ],
      "parameters": {
        "options": {},
        "fileName": "d:\\bulk_data.json",
        "operation": "write"
      },
      "typeVersion": 1
    },
    {
      "id": "1130523a-b598-425e-acf1-417ae8699f66",
      "name": "HTTP Request to the specified URL",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -460,
        -395
      ],
      "parameters": {
        "url": "https://api.brightdata.com/datasets/v3/trigger",
        "method": "POST",
        "options": {},
        "jsonBody": "={{ $json.request }}",
        "sendBody": true,
        "sendQuery": true,
        "sendHeaders": true,
        "specifyBody": "json",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "queryParameters": {
          "parameters": [
            {
              "name": "dataset_id",
              "value": "={{ $json.dataset_id }}"
            },
            {
              "name": "format",
              "value": "json"
            },
            {
              "name": "uncompressed_webhook",
              "value": "true"
            }
          ]
        },
        "headerParameters": {
          "parameters": [
            {}
          ]
        }
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "8fb2eb85-ffd6-4632-9668-00f29bc91c34",
  "connections": {
    "If": {
      "main": [
        [
          {
            "node": "Check on the errors",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait": {
      "main": [
        [
          {
            "node": "Check Snapshot Status",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Snapshot Id": {
      "main": [
        [
          {
            "node": "Check Snapshot Status",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Download Snapshot": {
      "main": [
        [
          {
            "node": "Aggregate JSON Response",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check on the errors": {
      "main": [
        [
          {
            "node": "Download Snapshot",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Create a binary data": {
      "main": [
        [
          {
            "node": "Write the file to disk",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Snapshot Status": {
      "main": [
        [
          {
            "node": "If",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Aggregate JSON Response": {
      "main": [
        [
          {
            "node": "Initiate a Webhook Notification",
            "type": "main",
            "index": 0
          },
          {
            "node": "Create a binary data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Dataset Id, Request URL": {
      "main": [
        [
          {
            "node": "HTTP Request to the specified URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request to the specified URL": {
      "main": [
        [
          {
            "node": "Set Snapshot Id",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \u2018Test workflow\u2019": {
      "main": [
        [
          {
            "node": "Set Dataset Id, Request URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

httpHeaderAuth

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

The Async Structured Bulk Data Extract with Bright Data Web Scraper workflow is designed for data engineers, market researchers, competitive intelligence teams, and automation developers who need to programmatically collect and structure high-volume data from the web using…

Source: https://n8n.io/workflows/3866/ — original creator credit. Request a take-down →

More Web Scraping workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Web Scraping

Wttj Job Apply - Workflow2

WTTJ job apply - Workflow2. Uses telegramTrigger, postgres, agent, lmChatVercelAiGateway. Event-driven trigger; 24 nodes.

Telegram Trigger, Postgres, Agent +5

Web Scraping

Automate Data Extraction with Zyte AI (products, Jobs, Articles & More)

This workflow uses the Zyte API to automatically detect and extract structured data from E-commerce sites, Articles, Job Boards, and Search Engine Results (SERP) - no custom CSS selectors required.

Form Trigger, HTTP Request, Form

Web Scraping

Firecrawl URL List Mini-batch to Resilient Analyzer

04 - Firecrawl URL List Mini-Batch to Resilient Analyzer. Uses googleSheets, httpRequest, executeWorkflowTrigger. Event-driven trigger; 42 nodes.

Google Sheets, HTTP Request, Execute Workflow Trigger

Web Scraping

Extract & Enrich Linkedin Comments to Leads with Apify → Google Sheets/csv

Automate LinkedIn lead generation by scraping comments from targeted posts and enriching profiles with detailed data

Form Trigger, HTTP Request, Google Sheets

Web Scraping

Find Top-performing Instagram Reels & Save Insights to Notion with Gemini & Apify

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

Notion, @Apify/N8N Nodes Apify, HTTP Request

Bright Data Bulk Web Scraping & Webhook Alerts

The workflow JSON

Credentials you'll need

About this workflow

Related workflows