AutomationFlowsWeb Scraping › Scrape URL with HTTP Request

Scrape URL with HTTP Request

Original n8n title: Node - Scrape URL

Node - Scrape Url. Uses executeWorkflowTrigger, httpRequest. Event-driven trigger; 2 nodes.

Event trigger★☆☆☆☆ complexity2 nodesExecute Workflow TriggerHTTP Request
Web Scraping Trigger: Event Nodes: 2 Complexity: ★☆☆☆☆ Added:

This workflow follows the Execute Workflow Trigger → HTTP Request recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "name": "Node - Scrape Url",
  "nodes": [
    {
      "parameters": {
        "workflowInputs": {
          "values": [
            {
              "name": "url"
            }
          ]
        }
      },
      "type": "n8n-nodes-base.executeWorkflowTrigger",
      "typeVersion": 1.1,
      "position": [
        0,
        0
      ],
      "id": "54919f44-9c0e-4bb3-b9ae-55618cc14cad",
      "name": "workflow_trigger"
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://api.firecrawl.dev/v1/scrape",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={\n  \"url\": \"{{ $json.url }}\",\n  \"formats\": [\"json\", \"markdown\", \"rawHtml\", \"links\"],\n  \"excludeTags\": [\"iframe\", \"nav\", \"header\", \"footer\"],\n  \"onlyMainContent\": true,\n  \"jsonOptions\": {\n    \"prompt\": \"Identify the main content of the text (i.e., the article or newsletter body). Provide the exact text for that main content verbatim, without summarizing or rewriting any part of it. Exclude all non-essential elements such as banners, headers, footers, calls to action, ads, or purely navigational text. Format this output as markdown using appropriate '#' characters as heading levels. Exclude any promotional or sponsored content on your output. Additionally, you must identify and extract the image urls within this main content. These images must be inside the main content of the page so you must exclude small logo images, icons, avatars and other images which aren't a core part of the main content. The images you extract should at least have a width of 600 pixels (px) so it can be included on our content.\",\n    \"schema\": {\n    \"type\": \"object\",\n      \"properties\": {\n        \"content\": {\n          \"type\": \"string\",\n          \"description\": \"The exact verbatim main text content of the web page in markdown format.\"\n        },\n        \"main_content_image_urls\": {\n          \"type\": \"array\",\n          \"items\": {\n            \"type\": \"string\",\n            \"description\": \"An image url that appears within the main content of the web page. This image must be inside the main content of the page so you must exclude small logo images, icons, avatars and other images which aren't a core part of the main content. The image should be at least 600px in width.\"\n          },\n          \"description\": \"An array of the exact image urls that appear within the main content of the web page. Extra images such as icons and images not relevant to the main content MUST be excluded.\"\n        }\n      },\n      \"required\": [\"content\", \"main_content_image_urls\"]\n    }\n  }\n}",
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        380,
        0
      ],
      "id": "6846ab99-46e8-44f7-8eeb-afe6d82ff768",
      "name": "scrape_url",
      "retryOnFail": true,
      "maxTries": 3,
      "waitBetweenTries": 5000,
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "onError": "continueRegularOutput"
    }
  ],
  "connections": {
    "workflow_trigger": {
      "main": [
        [
          {
            "node": "scrape_url",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "scrape_url": {
      "main": [
        []
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1",
    "callerPolicy": "any"
  },
  "versionId": "fde7f25b-35dc-455b-9949-35c581be414d",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "id": "qVEM2rCD1jlJPeRs",
  "tags": []
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

Node - Scrape Url. Uses executeWorkflowTrigger, httpRequest. Event-driven trigger; 2 nodes.

Source: https://github.com/VasilisPlavos/Learn/blob/906c45384956c575c32f82e5baef5b2f4bfcc9bb/automations/n8n/lucaswalter-n8n-ai-automations/firecrawl_scrape_url.json — original creator credit. Request a take-down →

More Web Scraping workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Web Scraping

Tool - Serper Crawl URL. Uses httpRequest, executeWorkflowTrigger. Event-driven trigger; 3 nodes.

HTTP Request, Execute Workflow Trigger
Web Scraping

Tool - Serper Crawl URL. Uses httpRequest, executeWorkflowTrigger. Event-driven trigger; 3 nodes.

HTTP Request, Execute Workflow Trigger
Web Scraping

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

Execute Workflow Trigger, HTTP Request, Stop And Error +1
Web Scraping

Automate Sales Meeting Prep With Ai & Apify Sent To Whatsapp. Uses gmail, googleCalendar, lmChatOpenAi, informationExtractor. Event-driven trigger; 61 nodes.

Gmail, Google Calendar, OpenAI Chat +5
Web Scraping

This n8n template builds a meeting assistant that compiles timely reminders of upcoming meetings filled with email history and recent LinkedIn activity of other people on the invite. This is then disc

Gmail, Google Calendar, OpenAI Chat +5