AutomationFlowsAI & RAG › Crawl4AI Web Scraping Agent

Crawl4AI Web Scraping Agent

Original n8n title: Crawl4ai Agent

Crawl4AI Agent. Uses httpRequest, xml, documentDefaultDataLoader, textSplitterCharacterTextSplitter. Event-driven trigger; 23 nodes.

Event trigger★★★★☆ complexityAI-powered23 nodesHTTP RequestXMLDocument Default Data LoaderText Splitter Character Text SplitterOpenAI EmbeddingsChat TriggerOpenAI ChatMemory Postgres Chat
AI & RAG Trigger: Event Nodes: 23 Complexity: ★★★★☆ AI nodes: yes Added:

This workflow follows the Agent → Chat Trigger recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "name": "Crawl4AI Agent",
  "nodes": [
    {
      "parameters": {},
      "id": "b90d81a3-0920-419e-bdee-e227ac106383",
      "name": "When clicking \u2018Test workflow\u2019",
      "type": "n8n-nodes-base.manualTrigger",
      "typeVersion": 1,
      "position": [
        760,
        300
      ]
    },
    {
      "parameters": {
        "url": "https://ai.pydantic.dev/sitemap.xml",
        "options": {}
      },
      "id": "e6033590-2727-4b81-a27b-3dd45dcfbb79",
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        960,
        160
      ]
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "1db8ab3d-4594-45cf-8256-d05d7f7697f7",
      "name": "XML",
      "type": "n8n-nodes-base.xml",
      "typeVersion": 1,
      "position": [
        1160,
        360
      ]
    },
    {
      "parameters": {
        "fieldToSplitOut": "urlset.url",
        "options": {}
      },
      "id": "8eeca9bb-0c16-4ebf-b9f4-db481e963c41",
      "name": "Split Out",
      "type": "n8n-nodes-base.splitOut",
      "typeVersion": 1,
      "position": [
        1320,
        160
      ]
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "460dcd6c-d26c-407c-ad06-016cc800609d",
      "name": "Loop Over Items",
      "type": "n8n-nodes-base.splitInBatches",
      "typeVersion": 3,
      "position": [
        1560,
        260
      ]
    },
    {
      "parameters": {},
      "id": "8d35dd97-4926-4c22-989a-008d974f9799",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "typeVersion": 1.1,
      "position": [
        1980,
        260
      ]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://seashell-app-gvc6l.ondigitalocean.app/crawl",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendBody": true,
        "bodyParameters": {
          "parameters": [
            {
              "name": "urls",
              "value": "={{ $json.loc }}"
            },
            {
              "name": "priority",
              "value": "10"
            }
          ]
        },
        "options": {}
      },
      "id": "b644ca6c-33fd-40bb-902f-79025d65b1a3",
      "name": "HTTP Request1",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        1780,
        260
      ],
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "url": "=https://seashell-app-gvc6l.ondigitalocean.app/task/{{ $json.task_id }}",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "options": {
          "timeout": 5000
        }
      },
      "id": "f85a925e-348a-4616-ae61-9e5fa60ac8c2",
      "name": "HTTP Request2",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        2200,
        260
      ],
      "retryOnFail": true,
      "waitBetweenTries": 5000,
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "conditions": {
          "options": {
            "caseSensitive": true,
            "leftValue": "",
            "typeValidation": "strict",
            "version": 2
          },
          "conditions": [
            {
              "id": "9d90c1ce-590e-40a5-ae8c-d92326032975",
              "leftValue": "={{ $json.status }}",
              "rightValue": "completed",
              "operator": {
                "type": "string",
                "operation": "equals"
              }
            }
          ],
          "combinator": "and"
        },
        "options": {}
      },
      "id": "83e7b017-4196-4e59-b255-921f88b2c5ef",
      "name": "If",
      "type": "n8n-nodes-base.if",
      "typeVersion": 2.2,
      "position": [
        2420,
        260
      ]
    },
    {
      "parameters": {
        "jsonMode": "expressionData",
        "jsonData": "={{ $json.result.markdown }}",
        "options": {
          "metadata": {
            "metadataValues": [
              {
                "name": "page",
                "value": "={{ $json.result.url }}"
              }
            ]
          }
        }
      },
      "id": "57cd5ed4-0be0-47b7-b278-84192b28c1f0",
      "name": "Default Data Loader",
      "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
      "typeVersion": 1,
      "position": [
        2800,
        260
      ]
    },
    {
      "parameters": {
        "chunkSize": 5000
      },
      "id": "36b4f50d-2eef-419e-b424-e3203afc9980",
      "name": "Character Text Splitter",
      "type": "@n8n/n8n-nodes-langchain.textSplitterCharacterTextSplitter",
      "typeVersion": 1,
      "position": [
        2940,
        400
      ]
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "bd5af242-1d0c-445f-af84-07e337bfa768",
      "name": "Embeddings OpenAI",
      "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
      "typeVersion": 1.1,
      "position": [
        2640,
        260
      ],
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "assignments": {
          "assignments": [
            {
              "id": "f2bcdb54-e1fe-4670-99aa-6eec973bf5f1",
              "name": "task_id",
              "value": "={{ $('HTTP Request1').item.json.task_id }}",
              "type": "string"
            }
          ]
        },
        "options": {}
      },
      "id": "556103b7-217d-4247-8aa0-cb45ede76b3b",
      "name": "Edit Fields",
      "type": "n8n-nodes-base.set",
      "typeVersion": 3.4,
      "position": [
        2660,
        460
      ]
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "b4054e7f-b429-4f4c-9d55-3004bfd9e1ce",
      "name": "When chat message received",
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "typeVersion": 1.1,
      "position": [
        1420,
        -500
      ]
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "105b2092-36d3-45d6-9686-673f886933f8",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "typeVersion": 1,
      "position": [
        1580,
        -280
      ],
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {},
      "id": "1c33efa9-4492-4579-987d-ed0c9d56ecce",
      "name": "Postgres Chat Memory",
      "type": "@n8n/n8n-nodes-langchain.memoryPostgresChat",
      "typeVersion": 1.3,
      "position": [
        1720,
        -280
      ],
      "credentials": {
        "postgres": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "name": "pydantic_ai_docs",
        "description": "Retrieves data related to Pydantic AI using their documentation."
      },
      "id": "95207afc-1362-401f-9ea0-4d9c61bf9b13",
      "name": "Vector Store Tool",
      "type": "@n8n/n8n-nodes-langchain.toolVectorStore",
      "typeVersion": 1,
      "position": [
        2060,
        -380
      ]
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "5c1bfc8e-b208-4c22-93b5-4e66f45a495a",
      "name": "Embeddings OpenAI1",
      "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
      "typeVersion": 1.1,
      "position": [
        1820,
        -80
      ],
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "eb940dd5-56e6-4e04-9534-2f910b5d6c8e",
      "name": "OpenAI Chat Model1",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "typeVersion": 1,
      "position": [
        2200,
        -200
      ],
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "mode": "insert",
        "tableName": {
          "__rl": true,
          "value": "documents",
          "mode": "list",
          "cachedResultName": "documents"
        },
        "options": {
          "queryName": "match_documents"
        }
      },
      "id": "45ca72cc-063c-4dfa-9957-fd9d56647374",
      "name": "Supabase Vector Store",
      "type": "@n8n/n8n-nodes-langchain.vectorStoreSupabase",
      "typeVersion": 1,
      "position": [
        2660,
        40
      ],
      "credentials": {
        "supabaseApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "options": {}
      },
      "id": "c697f5fe-6d78-4b93-b4f9-a86b7b0aea03",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "typeVersion": 1.7,
      "position": [
        1680,
        -500
      ]
    },
    {
      "parameters": {
        "tableName": {
          "__rl": true,
          "value": "documents",
          "mode": "list",
          "cachedResultName": "documents"
        },
        "options": {
          "queryName": "match_documents"
        }
      },
      "id": "2d9d0203-c24b-4e5e-9c5f-d362bd85b966",
      "name": "Supabase Vector Store1",
      "type": "@n8n/n8n-nodes-langchain.vectorStoreSupabase",
      "typeVersion": 1,
      "position": [
        1860,
        -220
      ],
      "credentials": {
        "supabaseApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "content": "# n8n + Crawl4AI Agent\n\n## Author: [Cole Medin](https://www.youtube.com/@ColeMedin)\n\nThis AI agent demonstrates how to use a Docker deployment of Crawl4AI to leverage this incredible open source web scraping tool directly in n8n.\n\nThe prerequisite for this workflow is that you have Crawl4AI hosted in a Docker container following these [instructions in the their docs](https://docs.crawl4ai.com/core/docker-deploymeny/).\n\n## How to use this workflow\n\n1. Execute the bottom workflow by clicking on \"Test workflow\". This will ingest all the Pydantic AI documentation into the Supabase DB for RAG.\n\n2. Chat with the agent with the \"Chat\" button - it'll be able to answer questions about Pydantic AI using the documentation as its source!\n\n## Extend this workflow!\n\nThis is just a starting point showing you how to use Crawl4AI in n8n! Feel free to take this along with the Crawl4AI documentation to run wild with building RAG AI agents. The possibilities with this setup are endless!",
        "height": 613.6610941618816,
        "width": 589.875,
        "color": 6
      },
      "id": "71f6f900-60a0-4d15-85f1-88b30f56c492",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        660,
        -540
      ]
    }
  ],
  "connections": {
    "When clicking \u2018Test workflow\u2019": {
      "main": [
        [
          {
            "node": "HTTP Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request": {
      "main": [
        [
          {
            "node": "XML",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "XML": {
      "main": [
        [
          {
            "node": "Split Out",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Out": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Loop Over Items": {
      "main": [
        [],
        [
          {
            "node": "HTTP Request1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait": {
      "main": [
        [
          {
            "node": "HTTP Request2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request1": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP Request2": {
      "main": [
        [
          {
            "node": "If",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "If": {
      "main": [
        [
          {
            "node": "Supabase Vector Store",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Edit Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Default Data Loader": {
      "ai_document": [
        [
          {
            "node": "Supabase Vector Store",
            "type": "ai_document",
            "index": 0
          }
        ]
      ]
    },
    "Character Text Splitter": {
      "ai_textSplitter": [
        [
          {
            "node": "Default Data Loader",
            "type": "ai_textSplitter",
            "index": 0
          }
        ]
      ]
    },
    "Embeddings OpenAI": {
      "ai_embedding": [
        [
          {
            "node": "Supabase Vector Store",
            "type": "ai_embedding",
            "index": 0
          }
        ]
      ]
    },
    "Edit Fields": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When chat message received": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Postgres Chat Memory": {
      "ai_memory": [
        [
          {
            "node": "AI Agent",
            "type": "ai_memory",
            "index": 0
          }
        ]
      ]
    },
    "Vector Store Tool": {
      "ai_tool": [
        [
          {
            "node": "AI Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "Embeddings OpenAI1": {
      "ai_embedding": [
        [
          {
            "node": "Supabase Vector Store1",
            "type": "ai_embedding",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model1": {
      "ai_languageModel": [
        [
          {
            "node": "Vector Store Tool",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Supabase Vector Store": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Supabase Vector Store1": {
      "ai_vectorStore": [
        [
          {
            "node": "Vector Store Tool",
            "type": "ai_vectorStore",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "6c3e7c6e-591f-43ad-88d0-05888cc30c4d",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "id": "9NoV8Zpk9uLd9pzG",
  "tags": []
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

Crawl4AI Agent. Uses httpRequest, xml, documentDefaultDataLoader, textSplitterCharacterTextSplitter. Event-driven trigger; 23 nodes.

Source: https://github.com/DPabloFlores/ottomator-agents/blob/main/crawl4AI-agent/n8n-version/Crawl4AI_Agent.json — original creator credit. Request a take-down →

More AI & RAG workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

AI & RAG

Your AI workforce is ready. Are you?

Google Sheets Tool, Mcp Trigger, Google Drive +29
AI & RAG

This intelligent chatbot leverages cutting-edge financial APIs and AI-driven analysis to deliver comprehensive stock research reports. Get instant access to professional-grade investment analysis that

Tool Think, Supabase Vector Store, OpenAI Embeddings +15
AI & RAG

Who is this for? This workflow is ideal for HR teams, startups, and enterprises that want to handle employee interactions through WhatsApp and automate responses using LLM (OpenAI) and intelligent rou

WhatsApp Trigger, OpenAI, OpenAI Chat +13
AI & RAG

Automate Outreach Prospect automates finding, enriching, and messaging potential partners (like restaurants, malls, and bars) using Apify Google Maps scraping, Perplexity enrichment, OpenAI LLMs, Goog

@Devlikeapro/N8N Nodes Waha, Google Drive Trigger, @Apify/N8N Nodes Apify +14
AI & RAG

Chat with docs - 5minAI New version. Uses httpRequest, documentDefaultDataLoader, textSplitterRecursiveCharacterTextSplitter, embeddingsOpenAi. Event-driven trigger; 62 nodes.

HTTP Request, Document Default Data Loader, Text Splitter Recursive Character Text Splitter +10