AI-Powered Web Scraper

Original n8n title: Web Scraper with AI

Web Scraper with AI. Uses a Manual Trigger, HTTP Request, Information Extractor, Google Gemini Chat Model, and Convert to File. 5 nodes.

Category: AI & RAG · Trigger: Event · Nodes: 5 · Complexity: ★★☆☆☆ · AI nodes: yes
Integrations: HTTP Request, Google Gemini Chat, Information Extractor

This workflow follows the HTTP Request → Information Extractor recipe pattern: one node fetches a page, and the next extracts structured data from the response.
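To make the fetch-then-extract pattern concrete, here is a minimal Python sketch of the same two steps outside n8n. It is not what the workflow runs: the Information Extractor node delegates extraction to a Gemini model, while this sketch swaps in a deterministic HTML parser. The sample markup (a book link inside an h3 element carrying a title attribute) and the names BookLinkParser and extract_books are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BookLinkParser(HTMLParser):
    """Collect title/url pairs from <a href=... title=...> tags inside <h3>."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.in_h3 = False
        self.books = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True
        elif tag == "a" and self.in_h3:
            attr = dict(attrs)
            if "href" in attr and "title" in attr:
                # Resolve relative links against the page URL.
                self.books.append(
                    {"title": attr["title"], "url": urljoin(self.base_url, attr["href"])}
                )

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False


def extract_books(html, base_url):
    """Return a list of {"title", "url"} dicts, the shape the workflow asks for."""
    parser = BookLinkParser(base_url)
    parser.feed(html)
    return parser.books


# Hypothetical sample standing in for the HTTP Request node's response body.
SAMPLE_HTML = """
<article class="product_pod">
  <h3><a href="catalogue/example-book_1/index.html" title="Example Book">Example ...</a></h3>
</article>
"""

books = extract_books(SAMPLE_HTML, "https://books.toscrape.com/")
print(books)
```

A parser like this breaks when the page layout changes. The workflow avoids that by letting the model read the raw HTML, at the cost of a model call per page.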

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and run it with Execute workflow.

{
  "name": "Web Scraper with AI",
  "nodes": [
    {
      "parameters": {},
      "type": "n8n-nodes-base.manualTrigger",
      "typeVersion": 1,
      "position": [
        192,
        -64
      ],
      "id": "fad83590-8568-42a9-b449-3fb77bd1b843",
      "name": "When clicking \u2018Execute workflow\u2019"
    },
    {
      "parameters": {
        "url": "https://books.toscrape.com",
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        416,
        -64
      ],
      "id": "fb3f547b-c6b8-4149-8fbd-a8e899d927c7",
      "name": "HTTP Request"
    },
    {
      "parameters": {
        "options": {}
      },
      "type": "n8n-nodes-base.convertToFile",
      "typeVersion": 1.1,
      "position": [
        992,
        -64
      ],
      "id": "c0d201b0-3726-4beb-91d2-de51e3ddb37c",
      "name": "Convert to File"
    },
    {
      "parameters": {
        "options": {}
      },
      "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
      "typeVersion": 1,
      "position": [
        720,
        160
      ],
      "id": "39dc544b-ca29-43f0-84df-a8e947c54a67",
      "name": "Google Gemini Chat Model",
      "credentials": {
        "googlePalmApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "text": "={{ $json.data }}",
        "schemaType": "fromJson",
        "jsonSchemaExample": "[\n  {\n    \"title\": \"Book Title 1\",\n    \"url\": \"https://example.com/book1\"\n  },\n  {\n    \"title\": \"Book Title 2\",\n    \"url\": \"https://example.com/book2\"\n  },\n  {\n    \"title\": \"Book Title 3\",\n    \"url\": \"https://example.com/book3\"\n  }\n]\n",
        "options": {
          "systemPromptTemplate": "You are an expert extraction algorithm.\nOnly extract relevant information from the text.\nIf you do not know the value of an attribute asked to extract, you may omit the attribute's value."
        }
      },
      "type": "@n8n/n8n-nodes-langchain.informationExtractor",
      "typeVersion": 1.2,
      "position": [
        640,
        -64
      ],
      "id": "b79f969d-ac95-4810-afd5-0c83349cd77d",
      "name": "Information Extractor"
    }
  ],
  "connections": {
    "When clicking \u2018Execute workflow\u2019": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "Information Extractor",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Google Gemini Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "Information Extractor",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Information Extractor": {
      "main": [
        [
          {
            "node": "Convert to File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "b154c0bb-38fb-4f25-9a50-3862710ee1b6",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "id": "mFx1MlePYGPZE2mA",
  "tags": []
}
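The connections object in the JSON above determines how data flows between nodes. As a rough sketch, the Python below walks a copy of that object to recover the main chain and the model attachment. The helper names edges and main_chain are hypothetical, and main_chain assumes a linear workflow where each node has at most one outgoing main edge, which holds here but not in general.

```python
# Copy of the "connections" object from the workflow JSON.
connections = {
    "When clicking \u2018Execute workflow\u2019": {
        "main": [[{"node": "HTTP Request", "type": "main", "index": 0}]]
    },
    "HTTP Request": {
        "main": [[{"node": "Information Extractor", "type": "main", "index": 0}]]
    },
    "Google Gemini Chat Model": {
        "ai_languageModel": [
            [{"node": "Information Extractor", "type": "ai_languageModel", "index": 0}]
        ]
    },
    "Information Extractor": {
        "main": [[{"node": "Convert to File", "type": "main", "index": 0}]]
    },
}


def edges(connections, kind):
    """Yield (source, target) node-name pairs for one connection type."""
    for source, outputs in connections.items():
        for branch in outputs.get(kind, []):
            for link in branch:
                yield source, link["node"]


def main_chain(connections):
    """Follow "main" edges from the start node (assumes a linear workflow)."""
    main_edges = dict(edges(connections, "main"))
    targets = set(main_edges.values())
    # The start node has an outgoing main edge but is no other node's target.
    start = next(name for name in main_edges if name not in targets)
    chain = [start]
    while chain[-1] in main_edges:
        chain.append(main_edges[chain[-1]])
    return chain


chain = main_chain(connections)
model_links = list(edges(connections, "ai_languageModel"))
print(" -> ".join(chain))
print(model_links)
```

The walk shows that the Gemini node is not part of the main data path. It attaches to the Information Extractor through a separate ai_languageModel connection.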

Credentials you'll need

The Google Gemini Chat Model node will prompt for a credential (googlePalmApi in the JSON) when you import. Credential IDs are stripped before publishing, so you add your own.
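To see which nodes in an exported workflow expect credentials before importing it, a short script can scan the nodes array. The sketch below uses a trimmed copy of two nodes from the JSON above; required_credentials is a hypothetical helper name, not part of n8n.

```python
# Trimmed copy of two entries from the workflow's "nodes" array.
nodes = [
    {"name": "HTTP Request", "type": "n8n-nodes-base.httpRequest"},
    {
        "name": "Google Gemini Chat Model",
        "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
        "credentials": {"googlePalmApi": {"name": "<your credential>"}},
    },
]


def required_credentials(nodes):
    """Map each node name to the sorted credential types it declares."""
    return {
        node["name"]: sorted(node["credentials"])
        for node in nodes
        if node.get("credentials")
    }


needed = required_credentials(nodes)
print(needed)
```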


About this workflow

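A manual trigger starts an HTTP Request node that fetches https://books.toscrape.com. The response is passed to an Information Extractor node, which uses a Google Gemini chat model to pull book titles and URLs into the structure given by the JSON schema example. A Convert to File node then turns the extracted items into a file.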
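Model output can drift from the requested shape, so it may help to check extracted records against the jsonSchemaExample from the workflow JSON. The sketch below is one way to do that in Python, assuming the records arrive as a plain list of objects. The exact output envelope of the Information Extractor node is not shown in the source, and matches_example is a hypothetical helper. Missing attributes are allowed because the system prompt permits omitting unknown values.

```python
import json

# First entry of the workflow's jsonSchemaExample, used as the reference shape.
SCHEMA_EXAMPLE = json.loads(
    '[{"title": "Book Title 1", "url": "https://example.com/book1"}]'
)


def matches_example(records, example=SCHEMA_EXAMPLE):
    """Return True if records is a list of dicts using only the example's keys
    with string values. Omitted keys are accepted."""
    expected_keys = set(example[0])
    if not isinstance(records, list):
        return False
    return all(
        isinstance(record, dict)
        and set(record) <= expected_keys
        and all(isinstance(value, str) for value in record.values())
        for record in records
    )


ok = matches_example([{"title": "A Book", "url": "https://example.com/a"}])
partial = matches_example([{"title": "A Book"}])
wrong_key = matches_example([{"name": "A Book"}])
not_a_list = matches_example({"title": "A Book"})
print(ok, partial, wrong_key, not_a_list)
```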

Source: https://github.com/021up/n8n-learning/blob/main/ITHome/d10/Web_Scraper_with_AI.json (original creator credit).
