OpenAI RAG System with Document Upload and Semantic Search

Original n8n title: Build an Openai RAG System with Document Upload, Semantic Search and Caching

By ResilNext (@rnair1996) on n8n.io

This workflow implements a complete Retrieval-Augmented Generation (RAG) system for document ingestion and intelligent querying.

Category: AI & RAG
Trigger: Webhook
Nodes: 33
Complexity: ★★★★★
AI nodes: yes
Integrations: Text Splitter Recursive Character Text Splitter, Document Default Data Loader, OpenAI Embeddings, Vector Store Pgvector, Postgres, OpenAI Chat, Agent, Tool Vector Store

This workflow corresponds to n8n.io template #14827, which we link to as the canonical source.

This workflow follows the Agent → Document Default Data Loader recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.
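
Once the workflow is active, the Webhook Trigger node listens for POST requests on the path `rag-system`, and the Route by Action node branches on `body.action` (`upload` or `query`). The sketch below only builds the JSON fields that the nodes read (`user_id`, `document_name`, `query`). The host name is a placeholder, the `/webhook/` prefix assumes n8n's default production webhook URL, and the helper names are hypothetical. How the PDF itself is attached to an upload request is not shown in the workflow JSON, so it is left out here.

```python
import json

# Assumption: replace with the base URL of your own n8n instance.
N8N_BASE_URL = "https://n8n.example.com"
# "rag-system" is the path configured on the Webhook Trigger node.
WEBHOOK_URL = f"{N8N_BASE_URL}/webhook/rag-system"


def build_upload_body(user_id: str, document_name: str) -> dict:
    """Fields read by the upload branch (body.action == "upload")."""
    return {
        "action": "upload",
        "user_id": user_id,
        "document_name": document_name,
    }


def build_query_body(user_id: str, query: str) -> dict:
    """Fields read by the query branch (body.action == "query")."""
    return {
        "action": "query",
        "user_id": user_id,
        "query": query,
    }


if __name__ == "__main__":
    # Print the request that a client would send; no network call is made.
    print("POST", WEBHOOK_URL)
    print(json.dumps(build_query_body("user-1", "What does the contract say about renewal?"), indent=2))
```

Any other value of `action` falls through to the Switch node's extra output, which has no connection in this workflow.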

{
  "nodes": [
    {
      "id": "772b7ee1-fa79-424e-91ee-b042a7c098c7",
      "name": "Webhook Trigger",
      "type": "n8n-nodes-base.webhook",
      "position": [
        -1392,
        128
      ],
      "parameters": {
        "path": "rag-system",
        "options": {},
        "httpMethod": "POST",
        "responseMode": "responseNode"
      },
      "typeVersion": 2.1
    },
    {
      "id": "f254f90f-fc8b-4e69-b680-2faa6ea14a2b",
      "name": "Workflow Configuration",
      "type": "n8n-nodes-base.set",
      "position": [
        -1120,
        128
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "id-1",
              "name": "chunkSize",
              "type": "number",
              "value": 1000
            },
            {
              "id": "id-2",
              "name": "chunkOverlap",
              "type": "number",
              "value": 200
            },
            {
              "id": "id-3",
              "name": "topK",
              "type": "number",
              "value": 5
            },
            {
              "id": "id-4",
              "name": "tableName",
              "type": "string",
              "value": "documents"
            },
            {
              "id": "id-5",
              "name": "cacheTableName",
              "type": "string",
              "value": "query_cache"
            }
          ]
        },
        "includeOtherFields": true
      },
      "typeVersion": 3.4
    },
    {
      "id": "de7dfd1a-89f5-4579-8723-a6025a788f51",
      "name": "Route by Action",
      "type": "n8n-nodes-base.switch",
      "position": [
        -736,
        128
      ],
      "parameters": {
        "rules": {
          "values": [
            {
              "outputKey": "Upload",
              "conditions": {
                "options": {
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "operator": {
                      "type": "string",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.body.action }}",
                    "rightValue": "upload"
                  }
                ]
              },
              "renameOutput": true
            },
            {
              "outputKey": "Query",
              "conditions": {
                "options": {
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "operator": {
                      "type": "string",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.body.action }}",
                    "rightValue": "query"
                  }
                ]
              },
              "renameOutput": true
            }
          ]
        },
        "options": {
          "fallbackOutput": "extra"
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "04c7dba5-11fe-4320-a72c-0dde1ec79bbd",
      "name": "Extract Text from Document",
      "type": "n8n-nodes-base.extractFromFile",
      "position": [
        -448,
        752
      ],
      "parameters": {
        "options": {},
        "operation": "pdf"
      },
      "typeVersion": 1.1
    },
    {
      "id": "3cc951b4-1db9-44f5-bc0e-94dba28aae25",
      "name": "Text Splitter",
      "type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter",
      "position": [
        -96,
        1184
      ],
      "parameters": {
        "options": {},
        "chunkSize": "={{ $('Workflow Configuration').first().json.chunkSize }}",
        "chunkOverlap": "={{ $('Workflow Configuration').first().json.chunkOverlap }}"
      },
      "typeVersion": 1
    },
    {
      "id": "8b938e6a-5a85-4271-aa1f-260802704c93",
      "name": "Document Loader",
      "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
      "position": [
        -160,
        1024
      ],
      "parameters": {
        "options": {},
        "jsonData": "={{ $json.text }}",
        "jsonMode": "expressionData",
        "textSplittingMode": "custom"
      },
      "typeVersion": 1.1
    },
    {
      "id": "2c678aa9-751c-47fd-a5f5-5c85e40a5b9e",
      "name": "OpenAI Embeddings",
      "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
      "position": [
        192,
        1056
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.2
    },
    {
      "id": "c0432b95-ccf1-4cdd-995e-7cb3af38437d",
      "name": "Store Embeddings in PGVector",
      "type": "@n8n/n8n-nodes-langchain.vectorStorePGVector",
      "position": [
        -112,
        752
      ],
      "parameters": {
        "mode": "insert",
        "options": {},
        "tableName": "={{ $('Workflow Configuration').first().json.tableName }}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "92c79b58-e8c9-4496-be91-fe7d0983c16c",
      "name": "Log Upload to Cache",
      "type": "n8n-nodes-base.postgres",
      "position": [
        528,
        752
      ],
      "parameters": {
        "query": "INSERT INTO upload_log (user_id, document_name, uploaded_at) VALUES ($1, $2, $3)",
        "options": {
          "queryReplacement": "={{ $('Webhook Trigger').first().json.body.user_id }},={{ $('Webhook Trigger').first().json.body.document_name }},={{ $now.toISO() }}"
        },
        "operation": "executeQuery"
      },
      "typeVersion": 2.6
    },
    {
      "id": "c5d4dcaa-0f64-44ee-93b9-6d6994a62397",
      "name": "Respond Upload Success",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        848,
        752
      ],
      "parameters": {
        "options": {},
        "respondWith": "json",
        "responseBody": "={\n  \"status\": \"success\",\n  \"message\": \"Document uploaded and processed\",\n  \"user_id\": \"{{ $('Webhook Trigger').first().json.body.user_id }}\",\n  \"document_name\": \"{{ $('Webhook Trigger').first().json.body.document_name }}\"\n}"
      },
      "typeVersion": 1.5
    },
    {
      "id": "d7e575a7-e112-4848-8521-c24b4adc50a6",
      "name": "Check Query Cache",
      "type": "n8n-nodes-base.postgres",
      "position": [
        -384,
        -64
      ],
      "parameters": {
        "query": "SELECT answer, created_at FROM query_cache WHERE user_id = $1 AND query_hash = MD5($2) AND created_at > NOW() - INTERVAL '1 hour'",
        "options": {
          "queryReplacement": "={{ $('Webhook Trigger').first().json.body.user_id }},={{ $('Webhook Trigger').first().json.body.query }}"
        },
        "operation": "executeQuery"
      },
      "typeVersion": 2.6
    },
    {
      "id": "a6cd4561-8ba5-43a6-a679-d6259f64078c",
      "name": "Cache Hit or Miss",
      "type": "n8n-nodes-base.switch",
      "position": [
        -112,
        -64
      ],
      "parameters": {
        "rules": {
          "values": [
            {
              "outputKey": "Cache Hit",
              "conditions": {
                "options": {
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "operator": {
                      "type": "number",
                      "operation": "gt"
                    },
                    "leftValue": "={{ $json.length }}",
                    "rightValue": 0
                  }
                ]
              },
              "renameOutput": true
            },
            {
              "outputKey": "Cache Miss",
              "conditions": {
                "options": {
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "strict"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "operator": {
                      "type": "number",
                      "operation": "equals"
                    },
                    "leftValue": "={{ $json.length }}",
                    "rightValue": 0
                  }
                ]
              },
              "renameOutput": true
            }
          ]
        },
        "options": {}
      },
      "typeVersion": 3.4
    },
    {
      "id": "1576da8b-432c-42c9-a0e4-487523bf314b",
      "name": "Retrieve Relevant Chunks",
      "type": "@n8n/n8n-nodes-langchain.vectorStorePGVector",
      "position": [
        432,
        1088
      ],
      "parameters": {
        "options": {
          "metadata": {
            "metadataValues": [
              {
                "name": "user_id",
                "value": "={{ $('Webhook Trigger').first().json.body.user_id }}"
              }
            ]
          }
        },
        "tableName": "={{ $('Workflow Configuration').first().json.tableName }}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "559a79ea-ce0a-4fa1-96e6-f9439799a48a",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        224,
        240
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4.1-mini"
        },
        "options": {},
        "builtInTools": {}
      },
      "typeVersion": 1.3
    },
    {
      "id": "4952c071-c20b-4e4e-b753-e0fe93413d26",
      "name": "Answer Query with Context",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        288,
        16
      ],
      "parameters": {
        "text": "={{ $('Webhook Trigger').first().json.body.query }}",
        "options": {
          "systemMessage": "You are a helpful AI assistant that answers questions based on the provided document context.\n\nYour task is to:\n1. Analyze the user's question carefully\n2. Review the relevant document chunks provided from the vector database\n3. Provide a clear, accurate answer based ONLY on the information in the documents\n4. If the answer is not in the provided context, say \"I don't have enough information to answer that question based on the available documents.\"\n5. Always cite which document or section your answer comes from when possible\n\nBe concise but thorough. Maintain a professional and helpful tone."
        },
        "promptType": "define"
      },
      "typeVersion": 3
    },
    {
      "id": "a7bea884-a935-4de3-8770-a52806d1abd4",
      "name": "Save to Query Cache",
      "type": "n8n-nodes-base.postgres",
      "position": [
        768,
        16
      ],
      "parameters": {
        "query": "INSERT INTO query_cache (user_id, query_hash, query, answer, created_at) VALUES ($1, MD5($2), $3, $4, NOW()) ON CONFLICT (user_id, query_hash) DO UPDATE SET answer = EXCLUDED.answer, created_at = NOW()",
        "options": {
          "queryReplacement": "={{ $('Webhook Trigger').first().json.body.user_id }},={{ $('Webhook Trigger').first().json.body.query }},={{ $('Webhook Trigger').first().json.body.query }},={{ $json.output }}"
        },
        "operation": "executeQuery"
      },
      "typeVersion": 2.6
    },
    {
      "id": "c380d8d7-769c-4bfe-994e-d54dffcc4c7a",
      "name": "Respond with Answer",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        1168,
        16
      ],
      "parameters": {
        "options": {},
        "respondWith": "json",
        "responseBody": "={\n  \"answer\": {{ JSON.stringify($('Answer Query with Context').first().json.output) }},\n  \"cached\": false,\n  \"user_id\": \"{{ $('Webhook Trigger').first().json.body.user_id }}\"\n}"
      },
      "typeVersion": 1.5
    },
    {
      "id": "efcc43e4-7a1d-4174-85dd-f73c55960ba6",
      "name": "Format Cached Response",
      "type": "n8n-nodes-base.set",
      "position": [
        416,
        -576
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "id-1",
              "name": "answer",
              "type": "string",
              "value": "={{ $json.answer }}"
            },
            {
              "id": "id-2",
              "name": "cached",
              "type": "boolean",
              "value": true
            },
            {
              "id": "id-3",
              "name": "user_id",
              "type": "string",
              "value": "={{ $('Webhook Trigger').first().json.body.user_id }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "767873ee-0740-4460-8d1a-79a9b37e1d8b",
      "name": "Respond with Cached Answer",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        880,
        -576
      ],
      "parameters": {
        "options": {},
        "respondWith": "json",
        "responseBody": "={\n  \"answer\": {{ JSON.stringify($json.answer) }},\n  \"cached\": true,\n  \"user_id\": \"{{ $json.user_id }}\"\n}"
      },
      "typeVersion": 1.5
    },
    {
      "id": "4f8320ed-a474-4f49-b6d6-baf644f3c937",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1472,
        48
      ],
      "parameters": {
        "color": 7,
        "width": 592,
        "height": 240,
        "content": "## Input Layer\nReceive upload or query via webhook and Set chunking, topK, and database tables"
      },
      "typeVersion": 1
    },
    {
      "id": "0aafeb7b-06f6-45c9-bfe8-322e95550240",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -800,
        16
      ],
      "parameters": {
        "color": 7,
        "width": 304,
        "height": 288,
        "content": "## Action Routing\nRoute request to upload or query flow"
      },
      "typeVersion": 1
    },
    {
      "id": "34da2fb8-619f-4e21-b2d1-4b7e0192f2ae",
      "name": "Answer questions with a vector store",
      "type": "@n8n/n8n-nodes-langchain.toolVectorStore",
      "position": [
        368,
        240
      ],
      "parameters": {},
      "typeVersion": 1.1
    },
    {
      "id": "479de263-883a-4197-9ae2-09ed26b05b5c",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -448,
        -192
      ],
      "parameters": {
        "color": 7,
        "width": 470,
        "height": 288,
        "content": "## Cache Check\nCheck if query result exists in cache"
      },
      "typeVersion": 1
    },
    {
      "id": "afface0a-2435-4831-aee3-0c5683218a53",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        336,
        -704
      ],
      "parameters": {
        "color": 7,
        "width": 272,
        "height": 320,
        "content": "## Cache Response Formatting\nFormat cached query results into a consistent response structure."
      },
      "typeVersion": 1
    },
    {
      "id": "2fe09f60-1e2b-42fd-9115-34af03fcfee3",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        768,
        -736
      ],
      "parameters": {
        "color": 7,
        "width": 304,
        "height": 320,
        "content": "## Response Layer\nReturn answer or cached result via webhook"
      },
      "typeVersion": 1
    },
    {
      "id": "0adeaf21-cd46-4e59-96a7-2285c91d79a9",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1056,
        -64
      ],
      "parameters": {
        "color": 7,
        "width": 304,
        "height": 240,
        "content": "## Response Layer\nReturn answer or cached result via webhook"
      },
      "typeVersion": 1
    },
    {
      "id": "9efa23a4-2f27-4b7b-b729-c9a0c02194ce",
      "name": "Sticky Note7",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        656,
        -64
      ],
      "parameters": {
        "color": 7,
        "width": 304,
        "height": 272,
        "content": "## Cache Storage\nStore query and response for reuse"
      },
      "typeVersion": 1
    },
    {
      "id": "75a4c942-f12b-42ae-96a1-6a36542acf80",
      "name": "Sticky Note8",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        176,
        -80
      ],
      "parameters": {
        "color": 7,
        "width": 464,
        "height": 448,
        "content": "## Answer Generation\nGenerate answer using context + AI model"
      },
      "typeVersion": 1
    },
    {
      "id": "02e378e4-dd93-4800-ae87-1d14490b2727",
      "name": "Sticky Note9",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -512,
        672
      ],
      "parameters": {
        "color": 7,
        "width": 304,
        "height": 240,
        "content": "## Document Processing\nExtract text from uploaded documents"
      },
      "typeVersion": 1
    },
    {
      "id": "f21786fe-b403-4aa0-b484-caa8bcd67e64",
      "name": "Sticky Note10",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -160,
        640
      ],
      "parameters": {
        "color": 7,
        "width": 304,
        "height": 272,
        "content": "## Chunking Engine\nSplit text into overlapping chunks"
      },
      "typeVersion": 1
    },
    {
      "id": "6a5abc30-754c-4e9b-8213-f79186b80a35",
      "name": "Sticky Note11",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        400,
        976
      ],
      "parameters": {
        "color": 7,
        "width": 304,
        "height": 240,
        "content": "## Embeddings Storage\nGenerate embeddings and store in PGVector"
      },
      "typeVersion": 1
    },
    {
      "id": "7a1009b0-7395-4041-bde5-06d382318b30",
      "name": "Sticky Note12",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        368,
        624
      ],
      "parameters": {
        "color": 7,
        "width": 672,
        "height": 272,
        "content": "## Upload Logging\nTrack document uploads in database"
      },
      "typeVersion": 1
    },
    {
      "id": "32a6aac8-009a-4cac-829b-ca853c9f2f38",
      "name": "Sticky Note13",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2224,
        -256
      ],
      "parameters": {
        "width": 448,
        "height": 576,
        "content": "## How it works\nThis workflow implements a complete Retrieval-Augmented Generation (RAG) system for document ingestion and intelligent querying.\n\nUsers can upload documents or send queries via webhook. Uploaded files are processed by extracting text, splitting it into chunks, generating embeddings, and storing them in a vector database (PGVector).\n\nWhen a query is received, the workflow first checks a cache for recent answers. If no cache is found, it retrieves relevant document chunks using vector search and generates a contextual answer using an AI model.\n\nResponses are cached for faster future queries and returned to the user via webhook.\n\n## Setup steps\n- Configure webhook endpoint for upload and query actions\n- Add OpenAI API credentials for embeddings and chat\n- Set up Postgres with PGVector extension\n- Create tables for documents and query cache\n- Adjust chunk size, overlap, and topK settings"
      },
      "typeVersion": 1
    }
  ],
  "connections": {
    "Text Splitter": {
      "ai_textSplitter": [
        [
          {
            "node": "Document Loader",
            "type": "ai_textSplitter",
            "index": 0
          }
        ]
      ]
    },
    "Document Loader": {
      "ai_document": [
        [
          {
            "node": "Store Embeddings in PGVector",
            "type": "ai_document",
            "index": 0
          }
        ]
      ]
    },
    "Route by Action": {
      "main": [
        [
          {
            "node": "Extract Text from Document",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Check Query Cache",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Webhook Trigger": {
      "main": [
        [
          {
            "node": "Workflow Configuration",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Cache Hit or Miss": {
      "main": [
        [
          {
            "node": "Format Cached Response",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Answer Query with Context",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Query Cache": {
      "main": [
        [
          {
            "node": "Cache Hit or Miss",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "Answer Query with Context",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Embeddings": {
      "ai_embedding": [
        [
          {
            "node": "Store Embeddings in PGVector",
            "type": "ai_embedding",
            "index": 0
          },
          {
            "node": "Retrieve Relevant Chunks",
            "type": "ai_embedding",
            "index": 0
          }
        ]
      ]
    },
    "Log Upload to Cache": {
      "main": [
        [
          {
            "node": "Respond Upload Success",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Save to Query Cache": {
      "main": [
        [
          {
            "node": "Respond with Answer",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Format Cached Response": {
      "main": [
        [
          {
            "node": "Respond with Cached Answer",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Workflow Configuration": {
      "main": [
        [
          {
            "node": "Route by Action",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Retrieve Relevant Chunks": {
      "ai_vectorStore": [
        [
          {
            "node": "Answer questions with a vector store",
            "type": "ai_vectorStore",
            "index": 0
          }
        ]
      ]
    },
    "Answer Query with Context": {
      "main": [
        [
          {
            "node": "Save to Query Cache",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Text from Document": {
      "main": [
        [
          {
            "node": "Store Embeddings in PGVector",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Store Embeddings in PGVector": {
      "main": [
        [
          {
            "node": "Log Upload to Cache",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Answer questions with a vector store": {
      "ai_tool": [
        [
          {
            "node": "Answer Query with Context",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    }
  }
}
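
The Workflow Configuration node in the JSON above sets `chunkSize` to 1000 and `chunkOverlap` to 200, and the Text Splitter node reads both values. As a rough illustration of what those two numbers mean, the sketch below uses a plain fixed-width sliding window. This is a simplification and not the Recursive Character Text Splitter that the workflow actually uses, which also tries to break on separators such as paragraphs and sentences. The function name is hypothetical.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-width chunks where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2500))
    parts = sliding_window_chunks(sample)
    print([len(p) for p in parts])
```

With the defaults, the window advances 800 characters at a time, so the last 200 characters of one chunk are repeated at the start of the next. A larger overlap keeps more context across chunk boundaries at the cost of storing more embeddings.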

About this workflow

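This workflow implements a complete Retrieval-Augmented Generation (RAG) system for document ingestion and querying. Users upload documents or send queries through a single webhook. Uploaded files have their text extracted, split into overlapping chunks, embedded with OpenAI, and stored in PGVector. For a query, the workflow first checks a Postgres cache for an answer less than one hour old. On a cache miss, an agent retrieves relevant chunks through vector search, generates an answer, and saves it to the cache before responding.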
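
The query cache in the workflow JSON is keyed on `user_id` plus the MD5 hash of the query text, entries are treated as fresh for one hour, and the `ON CONFLICT (user_id, query_hash)` clause implies a unique constraint on that pair. The sketch below reproduces that logic with Python's built-in `sqlite3` so it runs without a Postgres server. The column types, the in-memory database, and the function names are assumptions for illustration; the workflow itself runs the equivalent SQL through Postgres nodes.

```python
import hashlib
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional

CACHE_TTL = timedelta(hours=1)  # mirrors INTERVAL '1 hour' in the Check Query Cache node


def query_hash(query: str) -> str:
    """Hex MD5 of the query text (assumes UTF-8, as Postgres MD5() would on a UTF-8 database)."""
    return hashlib.md5(query.encode("utf-8")).hexdigest()


def open_cache() -> sqlite3.Connection:
    """Create an in-memory stand-in for the query_cache table (column types are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE query_cache ("
        " user_id TEXT NOT NULL,"
        " query_hash TEXT NOT NULL,"
        " query TEXT NOT NULL,"
        " answer TEXT NOT NULL,"
        " created_at TEXT NOT NULL,"
        " PRIMARY KEY (user_id, query_hash))"
    )
    return conn


def save_answer(conn: sqlite3.Connection, user_id: str, query: str, answer: str, now: datetime) -> None:
    """Insert or overwrite the cached answer for this user and query."""
    conn.execute(
        "INSERT OR REPLACE INTO query_cache VALUES (?, ?, ?, ?, ?)",
        (user_id, query_hash(query), query, answer, now.isoformat()),
    )


def lookup(conn: sqlite3.Connection, user_id: str, query: str, now: datetime) -> Optional[str]:
    """Return the cached answer if it is younger than CACHE_TTL, otherwise None."""
    row = conn.execute(
        "SELECT answer, created_at FROM query_cache WHERE user_id = ? AND query_hash = ?",
        (user_id, query_hash(query)),
    ).fetchone()
    if row is None:
        return None
    answer, created_at = row
    if now - datetime.fromisoformat(created_at) >= CACHE_TTL:
        return None  # entry exists but is stale
    return answer


if __name__ == "__main__":
    conn = open_cache()
    t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
    save_answer(conn, "user-1", "What is the notice period?", "30 days", t0)
    print(lookup(conn, "user-1", "What is the notice period?", t0 + timedelta(minutes=30)))
    print(lookup(conn, "user-1", "What is the notice period?", t0 + timedelta(hours=2)))
```

Because the key is a hash of the exact query string, two differently worded questions never share a cache entry, and a second user asking the same question gets a separate entry.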

Source: https://n8n.io/workflows/14827/ (original creator credit).
