AutomationFlowsAI & RAG › AI Webpage Scraper and Summarizer

AI Webpage Scraper and Summarizer

Original n8n title: Scrape and Summarize Webpages with AI

Scrape And Summarize Webpages With Ai. Uses manualTrigger, httpRequest, html, stickyNote. Event-driven trigger; 15 nodes.

Event trigger★★★★☆ complexityAI-powered15 nodesHTTP RequestDocument Default Data LoaderText Splitter Recursive Character Text SplitterOpenAI ChatChain Summarization
AI & RAG Trigger: Event Nodes: 15 Complexity: ★★★★☆ AI nodes: yes Added:

This workflow follows the Chainsummarization → HTTP Request recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "nodes": [
    {
      "id": "67850bd7-f9f4-4d5b-8c9e-bd1451247ba6",
      "name": "When clicking \"Execute Workflow\"",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -740,
        1000
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "0d9133f9-b6d3-4101-95c6-3cd24cdb70c3",
      "name": "Fetch essay list",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -520,
        1000
      ],
      "parameters": {
        "url": "http://www.paulgraham.com/articles.html",
        "options": {}
      },
      "typeVersion": 4.1
    },
    {
      "id": "ee634297-a456-4f70-a995-55b02950571e",
      "name": "Extract essay names",
      "type": "n8n-nodes-base.html",
      "position": [
        -300,
        1000
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "dataPropertyName": "=data",
        "extractionValues": {
          "values": [
            {
              "key": "essay",
              "attribute": "href",
              "cssSelector": "table table a",
              "returnArray": true,
              "returnValue": "attribute"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "83d75693-dbb8-44c4-8533-da06f611c59c",
      "name": "Fetch essay texts",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        360,
        1000
      ],
      "parameters": {
        "url": "=http://www.paulgraham.com/{{ $json.essay }}",
        "options": {}
      },
      "typeVersion": 4.1
    },
    {
      "id": "151022b5-8570-4176-bf3f-137f27ac7036",
      "name": "Extract title",
      "type": "n8n-nodes-base.html",
      "position": [
        700,
        700
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "extractionValues": {
          "values": [
            {
              "key": "title",
              "cssSelector": "title"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "07bcf095-3c4d-4a72-9bcb-341411750ff5",
      "name": "Clean up",
      "type": "n8n-nodes-base.set",
      "position": [
        1360,
        980
      ],
      "parameters": {
        "fields": {
          "values": [
            {
              "name": "title",
              "stringValue": "={{ $json.title }}"
            },
            {
              "name": "summary",
              "stringValue": "={{ $json.response.text }}"
            },
            {
              "name": "url",
              "stringValue": "=http://www.paulgraham.com/{{ $('Limit to first 3').item.json.essay }}"
            }
          ]
        },
        "include": "none",
        "options": {}
      },
      "typeVersion": 3
    },
    {
      "id": "11285de0-3c5d-4296-a322-9b7585af9acc",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -580,
        920
      ],
      "parameters": {
        "width": 1071.752021563343,
        "height": 285.66037735849045,
        "content": "## Scrape latest Paul Graham essays"
      },
      "typeVersion": 1
    },
    {
      "id": "c32f905d-dd7a-4b68-bbe0-dd8115ee0944",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        620,
        920
      ],
      "parameters": {
        "width": 465.3908355795153,
        "height": 606.7924528301882,
        "content": "## Summarize them with GPT"
      },
      "typeVersion": 1
    },
    {
      "id": "29d264f4-df6d-4a41-ab38-58e1b1becc9a",
      "name": "Split out into items",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        -80,
        1000
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "essay"
      },
      "typeVersion": 1
    },
    {
      "id": "ccfa3a1d-f170-44b4-a1f2-3573c88cae98",
      "name": "Limit to first 3",
      "type": "n8n-nodes-base.limit",
      "position": [
        140,
        1000
      ],
      "parameters": {
        "maxItems": 3
      },
      "typeVersion": 1
    },
    {
      "id": "c3d05068-9d1a-4ef5-8249-e7384dc617ee",
      "name": "Default Data Loader",
      "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
      "position": [
        820,
        1200
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "db75adad-cb16-4e72-b16e-34684a733b05",
      "name": "Recursive Character Text Splitter",
      "type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter",
      "position": [
        820,
        1340
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "022cc091-9b4c-45c2-bc8e-4037ec2d0d60",
      "name": "OpenAI Chat Model1",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        680,
        1200
      ],
      "parameters": {
        "model": "gpt-4o-mini",
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "cda47bb7-36c5-4d15-a1ef-0c66b1194825",
      "name": "Merge",
      "type": "n8n-nodes-base.merge",
      "position": [
        1160,
        980
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "combineBy": "combineByPosition"
      },
      "typeVersion": 3
    },
    {
      "id": "28144e4c-e425-428d-b3d1-f563bfd4e5b3",
      "name": "Summarization Chain",
      "type": "@n8n/n8n-nodes-langchain.chainSummarization",
      "position": [
        720,
        1000
      ],
      "parameters": {
        "options": {},
        "operationMode": "documentLoader"
      },
      "typeVersion": 2
    }
  ],
  "connections": {
    "Merge": {
      "main": [
        [
          {
            "node": "Clean up",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract title": {
      "main": [
        [
          {
            "node": "Merge",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Fetch essay list": {
      "main": [
        [
          {
            "node": "Extract essay names",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Limit to first 3": {
      "main": [
        [
          {
            "node": "Fetch essay texts",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Fetch essay texts": {
      "main": [
        [
          {
            "node": "Extract title",
            "type": "main",
            "index": 0
          },
          {
            "node": "Summarization Chain",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model1": {
      "ai_languageModel": [
        [
          {
            "node": "Summarization Chain",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Default Data Loader": {
      "ai_document": [
        [
          {
            "node": "Summarization Chain",
            "type": "ai_document",
            "index": 0
          }
        ]
      ]
    },
    "Extract essay names": {
      "main": [
        [
          {
            "node": "Split out into items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Summarization Chain": {
      "main": [
        [
          {
            "node": "Merge",
            "type": "main",
            "index": 1
          }
        ]
      ]
    },
    "Split out into items": {
      "main": [
        [
          {
            "node": "Limit to first 3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \"Execute Workflow\"": {
      "main": [
        [
          {
            "node": "Fetch essay list",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Recursive Character Text Splitter": {
      "ai_textSplitter": [
        [
          {
            "node": "Default Data Loader",
            "type": "ai_textSplitter",
            "index": 0
          }
        ]
      ]
    }
  }
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

How this works

This workflow enables you to quickly scrape content from webpages, such as essays or articles, and generate concise AI-powered summaries, saving hours of manual reading and note-taking. It is ideal for researchers, content creators, or analysts who need to process multiple online sources efficiently without sifting through dense text. The key step involves using the HTTP Request node to fetch webpage data, followed by OpenAI's language model to summarise the extracted content intelligently.

Use this workflow when dealing with event-driven tasks like on-demand research or periodic content monitoring, particularly for text-heavy sites that respond well to HTML extraction. Avoid it for dynamic, JavaScript-rendered pages or non-text media, where scraping may fail without additional tools. Common variations include adapting it to summarise news feeds via RSS integrations or chaining it with email nodes to distribute summaries automatically.

About this workflow

Scrape And Summarize Webpages With Ai. Uses manualTrigger, httpRequest, html, stickyNote. Event-driven trigger; 15 nodes.

Source: https://github.com/Zie619/n8n-workflows — original creator credit. Request a take-down →

More AI & RAG workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

AI & RAG

n8n-4-1: Qdrant. Uses vectorStoreQdrant, embeddingsOpenAi, textClassifier, chainSummarization. Event-driven trigger; 27 nodes.

Qdrant Vector Store, OpenAI Embeddings, Text Classifier +11
AI & RAG

This workflow transforms any webpage into an AI-narrated audio summary delivered via WhatsApp: Receive URL - WhatsApp Trigger captures incoming messages and passes them to URL extraction Extract & val

WhatsApp, OpenAI, WhatsApp Trigger +6
AI & RAG

This workflow integrates both web scraping and NLP functionalities. It uses HTML parsing to extract links, HTTP requests to fetch essay content, and AI-based summarization using GPT-4o. It's an excell

HTTP Request, OpenAI Chat, Chain Summarization +2
AI & RAG

scrape-and-summarize-webpages-with-ai. Uses httpRequest, lmChatOpenAi, chainSummarization, documentDefaultDataLoader. Event-driven trigger; 16 nodes.

HTTP Request, OpenAI Chat, Chain Summarization +2
AI & RAG

21-scrape-and-summarize-webpages-with-ai. Uses httpRequest, lmChatOpenAi, chainSummarization, documentDefaultDataLoader. Event-driven trigger; 16 nodes.

HTTP Request, OpenAI Chat, Chain Summarization +2