Build a RAG Chat System Using Aryn DocParse, AWS S3, Pinecone and GPT-4o

By Austin Lee (@austin-aryn-ai) on n8n.io

Provide your S3 bucket containing documents such as PDFs and MS Word in the "Get Files from S3" node. You will need to provide AWS credentials that will allow the node to access the bucket and download the files in the specified location. Choose document processing options in…

AI & RAG · Trigger: Event · Nodes: 16 · Complexity: ★★★★☆ · AI nodes: yes
Integrations: AWS S3, OpenAI Embeddings, Document Default Data Loader, Recursive Character Text Splitter, Chat Trigger, Agent, OpenAI Chat, Pinecone Vector Store

This workflow corresponds to n8n.io template #12531 — we link there as the canonical source.

This workflow follows the Agent → Chat Trigger recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.
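If the JSON was copied by hand, a quick check for links that point at missing nodes can catch an incomplete paste before import. The following is a minimal sketch, not part of the template: the function name `find_dangling_links` and the deliberately truncated sample are hypothetical, and it assumes only the `nodes` and `connections` layout visible in the export on this page.

```python
import json


def find_dangling_links(workflow: dict) -> list[str]:
    """List connections whose source or target is not in the "nodes" array."""
    names = {node["name"] for node in workflow.get("nodes", [])}
    problems = []
    for source, by_type in workflow.get("connections", {}).items():
        if source not in names:
            problems.append(f"unknown source node: {source}")
        # Each connection type ("main", "ai_tool", ...) holds one list per output.
        for outputs in by_type.values():
            for output in outputs:
                for link in output:
                    if link["node"] not in names:
                        problems.append(f"{source} -> unknown node: {link['node']}")
    return problems


# Hypothetical, deliberately truncated paste: one target node is missing.
sample = json.loads("""
{
  "nodes": [{"name": "Get Files from S3"}, {"name": "Loop Over Items"}],
  "connections": {
    "Get Files from S3": {"main": [[{"node": "Loop Over Items"}]]},
    "Loop Over Items": {"main": [[], [{"node": "Download Files from AWS"}]]}
  }
}
""")
problems = find_dangling_links(sample)
print(problems)
```

An empty list means every link resolves to a node that is present. To check a saved copy instead, load it with `json.load` and pass the resulting dictionary to the same function.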

{
  "id": "EThIwEtH9OYGVi4j_6V7p",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "ArynPineconeChatWithData",
  "tags": [
    {
      "id": "0ubyApGjX61dI60x",
      "name": "unstructured data",
      "createdAt": "2026-01-07T00:50:22.077Z",
      "updatedAt": "2026-01-07T00:50:22.077Z"
    },
    {
      "id": "b4VgVPwNG90ToWfc",
      "name": "property extraction",
      "createdAt": "2026-01-07T00:50:22.073Z",
      "updatedAt": "2026-01-07T00:50:22.073Z"
    }
  ],
  "nodes": [
    {
      "id": "44190ee7-f6a8-4f92-a53f-020b05b04b03",
      "name": "When clicking \u2018Test workflow\u2019",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        0,
        0
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "049b2da3-3345-4d98-8d81-64660936ac30",
      "name": "Loop Over Items",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        496,
        0
      ],
      "parameters": {
        "options": {},
        "batchSize": "={{ $json.Key.length }}"
      },
      "typeVersion": 3
    },
    {
      "id": "152dbf02-ca77-46e8-8de7-787c06ba998a",
      "name": "Download Files from AWS",
      "type": "n8n-nodes-base.awsS3",
      "position": [
        768,
        0
      ],
      "parameters": {
        "fileKey": "={{ $json.Key }}"
      },
      "credentials": {
        "aws": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 2
    },
    {
      "id": "214a13b3-03a2-4731-9848-d9fc3b2bc757",
      "name": "Get Files from S3",
      "type": "n8n-nodes-base.awsS3",
      "position": [
        272,
        0
      ],
      "parameters": {
        "options": {
          "folderKey": ""
        },
        "operation": "getAll"
      },
      "credentials": {
        "aws": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 2
    },
    {
      "id": "59a3755e-8594-4dc2-850c-45b4be4afde8",
      "name": "Embeddings OpenAI",
      "type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
      "position": [
        1168,
        160
      ],
      "parameters": {
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "6e29afe4-a0a7-42ea-8864-19efc321fec4",
      "name": "Default Data Loader",
      "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
      "position": [
        1472,
        144
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "14f7f8b6-44ea-4299-ba69-1481c82c0d66",
      "name": "Recursive Character Text Splitter",
      "type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter",
      "position": [
        1328,
        160
      ],
      "parameters": {
        "options": {
          "splitCode": "python"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "80e86044-a70f-4266-9a43-ec0336a5f3a0",
      "name": "When chat message received",
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "position": [
        272,
        352
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.1
    },
    {
      "id": "80ace102-6953-4d9e-8fa5-a9fe333ece52",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        480,
        352
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.9
    },
    {
      "id": "7517e47d-6339-4dbe-91a1-3f27a687bb24",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        480,
        544
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "3699d2be-77e0-49f3-85b6-1307d705881d",
      "name": "Pinecone Vector Store Tool",
      "type": "@n8n/n8n-nodes-langchain.vectorStorePinecone",
      "position": [
        832,
        336
      ],
      "parameters": {
        "mode": "retrieve-as-tool",
        "topK": 100,
        "options": {},
        "pineconeIndex": {
          "__rl": true,
          "mode": "list",
          "value": "n8n",
          "cachedResultName": "n8n"
        },
        "toolDescription": "Contains data about Pinecone releases."
      },
      "credentials": {
        "pineconeApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "9155bb41-4855-4099-9f24-267ac9ffdadd",
      "name": "Pinecone Vector Store",
      "type": "@n8n/n8n-nodes-langchain.vectorStorePinecone",
      "position": [
        1280,
        0
      ],
      "parameters": {
        "mode": "insert",
        "options": {},
        "pineconeIndex": {
          "__rl": true,
          "mode": "list",
          "value": "n8n",
          "cachedResultName": "n8n"
        }
      },
      "credentials": {
        "pineconeApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "95be5178-d276-4dd2-b429-6fb851163c22",
      "name": "Aryn",
      "type": "@aryn-ai/n8n-nodes-aryn.aryn",
      "position": [
        976,
        0
      ],
      "parameters": {},
      "credentials": {
        "arynApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 0
    },
    {
      "id": "45dfa5fd-79c5-4412-ad7a-409f1d7d82c7",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -64,
        -624
      ],
      "parameters": {
        "width": 912,
        "height": 528,
        "content": "## Build an awesome RAG system using Aryn DocParse to extract text, images, tables and properties from your documents in your ingestion pipeline\n \n### How it works\n1. Provide your S3 bucket containing documents such as PDFs and MS Word in the \"Get Files from S3\" node.  You will need to provide AWS credentials that will allow the node to access the bucket and download the files in the specified location.\n2. Choose document processing options in the Aryn node.  The main options are for text and table extraction.  You can also provide a JSON schema for property extraction.  You can refer to https://docs.aryn.ai/docparse/processing_options for details on these options.  You will also need an Aryn API key which you can obtain by going to https://aryn.ai/signup.  Please note that use of vision models for OCR and table extraction is restricted to paid tiers.\n3. The resulting content of parsing and extraction is then chunked and ingested into Pinecone.\n4. Once at least one document has been ingested into a Pinecone index, you can start asking questions about anything that may be found in ingested documents in the chat box.\n\n### Setup steps\n1. For data retrieval, you will need a \"folder\" in a bucket on AWS S3 as well as valid AWS credentials with permission to fetch those files.\n2. For document parsing, you will need to obtain an Aryn API key.  You can sign up for free at https://aryn.ai/signup.\n3. For the Pinecone vector database, head over to https://pinecone.io and create an account and create a sample index for free.  You will also need to generate an API key.\n4. For the AI agent and RAG, you will also need an OpenAI API key.  Please go to https://openai.com and get a free API key."
      },
      "typeVersion": 1
    },
    {
      "id": "567405c1-1299-445c-9816-07a0734ab4a5",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -80,
        -96
      ],
      "parameters": {
        "color": 7,
        "width": 1888,
        "height": 416,
        "content": "## Document parsing and ingestion\n"
      },
      "typeVersion": 1
    },
    {
      "id": "6bb15427-c7c6-4833-b974-5b869aa9b145",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        208,
        336
      ],
      "parameters": {
        "color": 7,
        "width": 912,
        "height": 336,
        "content": "## RAG\n"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "availableInMCP": false,
    "executionOrder": "v1"
  },
  "versionId": "3818110d-89fa-4f3b-924e-1664e2a3c2df",
  "connections": {
    "Aryn": {
      "main": [
        [
          {
            "node": "Pinecone Vector Store",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Loop Over Items": {
      "main": [
        [],
        [
          {
            "node": "Download Files from AWS",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Embeddings OpenAI": {
      "ai_embedding": [
        [
          {
            "node": "Pinecone Vector Store Tool",
            "type": "ai_embedding",
            "index": 0
          },
          {
            "node": "Pinecone Vector Store",
            "type": "ai_embedding",
            "index": 0
          }
        ]
      ]
    },
    "Get Files from S3": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Default Data Loader": {
      "ai_document": [
        [
          {
            "node": "Pinecone Vector Store",
            "type": "ai_document",
            "index": 0
          }
        ]
      ]
    },
    "Download Files from AWS": {
      "main": [
        [
          {
            "node": "Aryn",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Pinecone Vector Store Tool": {
      "ai_tool": [
        [
          {
            "node": "AI Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "When chat message received": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Recursive Character Text Splitter": {
      "ai_textSplitter": [
        [
          {
            "node": "Default Data Loader",
            "type": "ai_textSplitter",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \u2018Test workflow\u2019": {
      "main": [
        [
          {
            "node": "Get Files from S3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
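
The `connections` object above defines two separate paths: a manually triggered ingestion path and a chat path. As a rough illustration, the sketch below follows only the `main` links from each trigger. The helper names `main_targets` and `trace` are hypothetical, and `CONNECTIONS` is a trimmed copy of the export that keeps just the node names.

```python
from collections import deque

# Trimmed copy of the "connections" object from the export (node names only).
CONNECTIONS = {
    "When clicking \u2018Test workflow\u2019": {"main": [[{"node": "Get Files from S3"}]]},
    "Get Files from S3": {"main": [[{"node": "Loop Over Items"}]]},
    # Output 0 of the loop node is empty; output 1 feeds the download node.
    "Loop Over Items": {"main": [[], [{"node": "Download Files from AWS"}]]},
    "Download Files from AWS": {"main": [[{"node": "Aryn"}]]},
    "Aryn": {"main": [[{"node": "Pinecone Vector Store"}]]},
    "When chat message received": {"main": [[{"node": "AI Agent"}]]},
    "OpenAI Chat Model": {"ai_languageModel": [[{"node": "AI Agent"}]]},
}


def main_targets(connections: dict, node: str) -> list[str]:
    """Names of the nodes that `node` feeds through its "main" outputs."""
    outputs = connections.get(node, {}).get("main", [])
    return [link["node"] for output in outputs for link in output]


def trace(connections: dict, start: str) -> list[str]:
    """Nodes reachable from `start` over "main" links, in breadth-first order."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for target in main_targets(connections, node):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


ingestion_path = trace(CONNECTIONS, "When clicking \u2018Test workflow\u2019")
chat_path = trace(CONNECTIONS, "When chat message received")
print(" -> ".join(ingestion_path))
print(" -> ".join(chat_path))
```

Sub-node links such as `ai_languageModel`, `ai_embedding`, and `ai_tool` are skipped on purpose: they attach models and tools to a parent node rather than passing items along the main path.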

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
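
The export above references four credential types. A small sketch like the following can list which nodes ask for each one. `NODES` is a trimmed copy of the `nodes` array that keeps only names and credential keys, and `credentials_by_type` is a hypothetical helper, not an n8n API.

```python
from collections import defaultdict

# Trimmed copy of the "nodes" array from the export: name and credential keys only.
NODES = [
    {"name": "Download Files from AWS", "credentials": {"aws": {}}},
    {"name": "Get Files from S3", "credentials": {"aws": {}}},
    {"name": "Embeddings OpenAI", "credentials": {"openAiApi": {}}},
    {"name": "OpenAI Chat Model", "credentials": {"openAiApi": {}}},
    {"name": "Pinecone Vector Store Tool", "credentials": {"pineconeApi": {}}},
    {"name": "Pinecone Vector Store", "credentials": {"pineconeApi": {}}},
    {"name": "Aryn", "credentials": {"arynApi": {}}},
    {"name": "AI Agent"},  # no credentials entry of its own
]


def credentials_by_type(nodes: list[dict]) -> dict[str, list[str]]:
    """Group node names by the credential type each node references."""
    needed = defaultdict(list)
    for node in nodes:
        for cred_type in node.get("credentials", {}):
            needed[cred_type].append(node["name"])
    return dict(needed)


needed = credentials_by_type(NODES)
for cred_type, names in sorted(needed.items()):
    print(f"{cred_type}: {', '.join(names)}")
```

Each credential type is shared by the nodes listed next to it, so one AWS, one OpenAI, one Pinecone, and one Aryn credential should cover the whole workflow.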

Source: https://n8n.io/workflows/12531/ (original creator credit).
