This workflow corresponds to n8n.io template #13501, which is the canonical source.
It follows the Agent → Document Default Data Loader recipe pattern, pairing the AI Agent node with the documentDefaultDataLoader node.
The workflow JSON
Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.
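Before importing, it may help to see how the workflow keys its Pinecone cache. The Code node "Split Uploaded Files + Build Deal ID" derives a deal ID from the uploaded filenames, and that ID is later used as the Pinecone namespace. The following is a minimal standalone sketch of the same derivation, assuming Node.js; the function name buildDealId is hypothetical and does not appear in the workflow.

```javascript
// Standalone sketch of the deal ID derivation used by the
// "Split Uploaded Files + Build Deal ID" Code node.
// buildDealId is a hypothetical helper name, not part of the workflow.
function buildDealId(filenames) {
  // Sort a copy so the caller's array is not mutated and upload order does not matter.
  const combinedNames = [...filenames].sort().join('|');
  // Base64-encode, drop non-alphanumeric characters (+, /, =), keep at most 20 characters.
  return Buffer.from(combinedNames)
    .toString('base64')
    .replace(/[^a-zA-Z0-9]/g, '')
    .slice(0, 20);
}

const first = buildDealId(['financials.pdf', 'cim.pdf']);
const second = buildDealId(['cim.pdf', 'financials.pdf']);
console.log(first === second); // the same file set yields the same deal ID
console.log(first);
```

Because only the first 20 characters of the encoded string are kept, two different file sets whose sorted names share a long common prefix can end up with the same ID, and therefore the same namespace. An empty filename list produces an empty ID. Distinct, descriptive filenames per deal avoid both cases.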
{
"id": "xdh3Ffr5asrTbRIQ",
"meta": {
"templateCredsSetupCompleted": true
},
"name": "Due Diligence Automation v2.0.1",
"tags": [],
"nodes": [
{
"id": "b3b60773-8f6e-4631-9e69-6862ed6e3009",
"name": "Retrieve Parsed Content",
"type": "n8n-nodes-base.httpRequest",
"position": [
896,
-160
],
"parameters": {
"url": "=https://api.cloud.llamaindex.ai/api/v1/parsing/job/{{ $json.id }}/result/markdown",
"options": {},
"sendHeaders": true,
"authentication": "genericCredentialType",
"genericAuthType": "httpHeaderAuth",
"headerParameters": {
"parameters": [
{
"name": "accept",
"value": "application/json"
}
]
}
},
"credentials": {
"httpHeaderAuth": {
"name": "<your credential>"
}
},
"typeVersion": 4.2
},
{
"id": "32b46e46-2f09-4106-8a8a-4961ae42bcaf",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1264,
-752
],
"parameters": {
"width": 1072,
"height": 416,
"content": "## Automated Due Diligence Report Generator\n\n### How it works\n1. Receive a POST webhook at /dd-ai with multipart due-diligence files; split files and create a unified deal ID.\n2. Check Pinecone for an existing namespace for the deal; if missing, parse files with LlamaIndex, generate embeddings, and insert them into Pinecone.\n3. Run multi-query retrieval from Pinecone and call the AI model (gpt-5-mini) to synthesize a structured due-diligence JSON: company profile, financials, risks, customer concentration, and investment thesis.\n4. Transform the structured output into HTML, render a PDF, upload it to S3, produce a public URL, and return the link to the caller.\n\n### Setup\n- [ ] Configure the webhook endpoint and set path to /dd-ai.\n- [ ] Add OpenAI API key and select model gpt-5-mini.\n- [ ] Connect Pinecone and ensure index name \"poc\" exists.\n- [ ] Add LlamaIndex parsing credentials.\n- [ ] Connect AWS S3 and create or select bucket \"poc\".\n- [ ] Test by POSTing multipart files with a filenames array and binary file fields."
},
"typeVersion": 1
},
{
"id": "3652d095-82b9-43e4-a55f-8d7e0c4eda13",
"name": "Receive Upload Request",
"type": "n8n-nodes-base.webhook",
"position": [
-1200,
-160
],
"parameters": {
"path": "5aed3279-e874-468b-91b6-902d99338f51",
"options": {},
"httpMethod": "POST",
"responseMode": "responseNode"
},
"typeVersion": 2.1
},
{
"id": "0f0e3174-43e8-4bd6-8059-be3953d584db",
"name": "Split Uploaded Files + Build Deal ID",
"type": "n8n-nodes-base.code",
"position": [
-976,
-160
],
"parameters": {
"jsCode": "// Split webhook binary files into separate items\n// Generate unified dealId from all filenames\n\nconst item = $input.first();\nconst body = item.json.body || {};\nconst binary = item.binary || {};\n\n// Parse filenames from body\nlet filenames = [];\ntry {\n filenames = JSON.parse(body.filenames || '[]');\n} catch (e) {\n // Not valid JSON; use the binary fallback below\n filenames = [];\n}\n\n// Fall back to binary file names when the body gave no usable list,\n// so the deal ID is never derived from an empty string\nif (!Array.isArray(filenames) || filenames.length === 0) {\n filenames = Object.values(binary).map(b => b.fileName);\n}\n\n// Generate unified deal ID from sorted filenames\nconst combinedNames = filenames.sort().join('|');\nconst dealId = Buffer.from(combinedNames).toString('base64').replace(/[^a-zA-Z0-9]/g, '').slice(0, 20);\n\n// Get all binary keys\nconst binaryKeys = Object.keys(binary).filter(k => k.startsWith('data'));\n\nif (binaryKeys.length === 0) {\n throw new Error('No binary files found in webhook request');\n}\n\n// Create one item per file\nconst items = binaryKeys.map((key, index) => {\n const binaryData = binary[key];\n const extension = (binaryData.fileExtension || '').toLowerCase();\n \n return {\n json: {\n dealId: dealId,\n sourceFile: binaryData.fileName,\n fileType: extension,\n mimeType: binaryData.mimeType,\n fileIndex: index,\n totalFiles: binaryKeys.length\n },\n binary: {\n data: binaryData\n }\n };\n});\n\nreturn items;"
},
"typeVersion": 2
},
{
"id": "e84d333f-dead-474c-800f-c8eae4d68326",
"name": "Iterate Files for Parsing",
"type": "n8n-nodes-base.splitInBatches",
"position": [
-96,
-144
],
"parameters": {
"options": {}
},
"typeVersion": 3
},
{
"id": "0b4c3de6-6724-4a96-ac2a-63796a83eb37",
"name": "Get Pinecone Index Stats",
"type": "n8n-nodes-base.httpRequest",
"position": [
-736,
-160
],
"parameters": {
"url": "=[credentials.pineconeApi.environment] /describe_index_stats",
"method": "POST",
"options": {},
"sendHeaders": true,
"headerParameters": {
"parameters": [
{
"name": "Content-Type",
"value": "application/json"
}
]
}
},
"typeVersion": 4.3
},
{
"id": "6bf36498-9c1a-4ca6-a2bf-773d5555e432",
"name": "Upsert Chunks to Pinecone",
"type": "@n8n/n8n-nodes-langchain.vectorStorePinecone",
"position": [
1632,
-160
],
"parameters": {
"mode": "insert",
"options": {
"clearNamespace": false,
"pineconeNamespace": "={{ $('Iterate Files for Parsing').item.json.dealId }}"
},
"pineconeIndex": {
"__rl": true,
"mode": "list",
"value": "poc",
"cachedResultName": "poc"
}
},
"credentials": {
"pineconeApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.3
},
{
"id": "2fd24b26-f690-451e-b51a-9b4838ea8b89",
"name": "Generate Embeddings (Ingest)",
"type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
"position": [
1584,
-32
],
"parameters": {
"options": {}
},
"credentials": {
"openAiApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.2
},
{
"id": "548bda77-79c1-422b-940e-3674835784fd",
"name": "Prepare Parsed Text Document",
"type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
"position": [
1792,
-32
],
"parameters": {
"options": {
"metadata": {
"metadataValues": [
{
"name": "deal_id",
"value": "={{ $json.dealId }}"
},
{
"name": "source_file",
"value": "={{ $json.sourceFile }}"
},
{
"name": "file_type",
"value": "={{ $json.fileType }}"
},
{
"name": "timestamp",
"value": "={{ $now.toUTC() }}"
}
]
}
}
},
"typeVersion": 1.1
},
{
"id": "0ec9a189-cc57-492c-841f-affd1d44a2c2",
"name": "Collect Ingested Deal IDs",
"type": "n8n-nodes-base.aggregate",
"position": [
32,
-576
],
"parameters": {
"options": {},
"fieldsToAggregate": {
"fieldToAggregate": [
{
"fieldToAggregate": "metadata.deal_id"
}
]
}
},
"typeVersion": 1
},
{
"id": "c5ee4aea-d116-4576-a4d3-903e84bd4d32",
"name": "Prepare Analysis Context",
"type": "n8n-nodes-base.code",
"position": [
272,
-576
],
"parameters": {
"jsCode": "// Handle both paths: from cache hit or from aggregated embeddings\n const items = $input.all();\n\n let dealId;\n\n // Check if coming from cache hit path\n if (items[0]?.json?.cacheHit === true) {\n dealId = items[0].json.dealId;\n } else {\n // Coming from aggregate embeddings path\n const dealIdArray = items[0]?.json?.deal_id;\n dealId = Array.isArray(dealIdArray) ? dealIdArray[0] : (dealIdArray || 'unknown');\n }\n\n return [{\n json: {\n dealId,\n filesProcessed: items[0]?.json?.vectorCount || (Array.isArray(items[0]?.json?.deal_id)\n ? items[0].json.deal_id.length : 1),\n fromCache: items[0]?.json?.cacheHit || false,\n timestamp: new Date().toISOString()\n }\n }];"
},
"typeVersion": 2
},
{
"id": "91d65143-09f7-4911-8fd0-f7caf7fdaa4c",
"name": "Run Due Diligence AI Analysis",
"type": "@n8n/n8n-nodes-langchain.agent",
"position": [
512,
-576
],
"parameters": {
"text": "=You are a Senior Investment Analyst & Due Diligence Officer.\n\nMANDATORY RETRIEVAL STRATEGY:\nYou MUST make MULTIPLE Pinecone queries to gather ALL required data. Do\nNOT rely on a single query.\n\nREQUIRED QUERIES (execute ALL before generating output):\n 1. \"company name headquarters location employees industry overview\"\n 2. \"revenue financial performance FY2021 FY2022 FY2023 FY2024 FY2025 yearly results\"\n 3. \"gross margin net margin EBITDA margin profitability percentage\"\n 4. \"risk factors key risks challenges threats founder dependency labor market client concentration\"\n 5. \"customer concentration top clients revenue breakdown percentage\"\n 6. \"business model investment thesis value proposition growth strategy\"\n\nSTRICT RULES:\n - Execute ALL 6 queries before generating JSON output\n - Combine and synthesize evidence from ALL queries\n - Extract ALL years of financial data (2021-2025), not just recent year\n - Include ALL risks mentioned in documents (typically 5-7 risks)\n - For customer concentration, include specific percentages (Top 3, Top 5, Top 10)\n - If data is genuinely missing after all queries, use \"Not Available\" (string) or 0 (number)\n - Do not hallucinate - only use retrieved evidence\n\nOUTPUT FORMAT:\nReturn ONLY valid JSON matching the parser schema. No markdown, no explanation.\n\nOUTPUT TYPES:\n - String fields: use \"Not Available\" when missing\n - Numeric fields (employee_count, year, amount, ebitda): use 0 when missing\n - key_risks: MUST be array of strings with ALL identified risks\n - revenue_history: MUST include ALL available years (up to 5 years)",
"options": {},
"promptType": "define",
"hasOutputParser": true
},
"typeVersion": 3.1
},
{
"id": "7bf0b702-726b-4c79-99b1-9bb87ff82a88",
"name": " OpenAI Chat Model (5-mini)",
"type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
"position": [
512,
-432
],
"parameters": {
"model": {
"__rl": true,
"mode": "list",
"value": "gpt-5-mini",
"cachedResultName": "gpt-5-mini"
},
"options": {},
"builtInTools": {}
},
"credentials": {
"openAiApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.3
},
{
"id": "8a6592fe-8b2b-4f76-8027-38121ab2540c",
"name": "Parse Structured Analysis JSON",
"type": "@n8n/n8n-nodes-langchain.outputParserStructured",
"position": [
1024,
-464
],
"parameters": {
"jsonSchemaExample": "{\n \"company_profile\": {\n \"company_name\": \"Example Corp\",\n \"industry\": \"Manufacturing\",\n \"location\": \"Jakarta, Indonesia\",\n \"employee_count\": 120\n },\n \"financials\": {\n \"revenue_history\": [\n { \"year\": 2023, \"amount\": 1200000, \"currency\": \"USD\" },\n { \"year\": 2024, \"amount\": 1400000, \"currency\": \"USD\" },\n { \"year\": 2025, \"amount\": 1600000, \"currency\": \"USD\" }\n ],\n \"ebitda\": 250000,\n \"margins\": {\n \"gross_margin\": \"42%\",\n \"net_margin\": \"11%\",\n \"ebitda_margin\": \"18%\"\n }\n },\n \"analysis\": {\n \"business_model\": \"B2B recurring contracts\",\n \"investment_thesis\": \"Strong growth with expanding margins\",\n \"key_risks\": [\n \"Customer concentration\",\n \"FX exposure\"\n ],\n \"customer_concentration\": \"Top 3 customers contribute 55% revenue\"\n }\n}"
},
"typeVersion": 1.3
},
{
"id": "f43d2168-d066-4475-a9ed-3a782d53d9d5",
"name": "Retrieve Context from Pinecone",
"type": "@n8n/n8n-nodes-langchain.vectorStorePinecone",
"position": [
752,
-464
],
"parameters": {
"mode": "retrieve-as-tool",
"topK": 100,
"options": {
"pineconeNamespace": "={{ $json.dealId }}"
},
"pineconeIndex": {
"__rl": true,
"mode": "list",
"value": "poc",
"cachedResultName": "poc"
},
"toolDescription": "=Search the due diligence documents for specific information. You MUST call this tool MULTIPLE times with different focused queries:\n\nREQUIRED QUERIES:\n- Company profile: \"company name location headquarters employees industry\"\n- Financial history: \"revenue FY2021 FY2022 FY2023 FY2024 FY2025 financial performance\"\n- Margins: \"gross margin net margin EBITDA margin profitability\"\n- Risks: \"risk factors key risks founder dependency labor market concentration\"\n- Customer data: \"customer concentration top clients revenue percentage breakdown\"\n- Business analysis: \"business model investment thesis growth strategy\"\n\nEach query should focus on ONE category. Combine results from all queries for complete analysis."
},
"credentials": {
"pineconeApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.3
},
{
"id": "3f6e0076-4f85-417e-8f81-6d73d37012f7",
"name": " Generate Embeddings (Retrieval)",
"type": "@n8n/n8n-nodes-langchain.embeddingsOpenAi",
"position": [
656,
-432
],
"parameters": {
"options": {}
},
"credentials": {
"openAiApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.2
},
{
"id": "e11c084d-36e6-4044-9de3-c318e6153131",
"name": "Map Analysis to Report Fields",
"type": "n8n-nodes-base.code",
"position": [
1168,
-576
],
"parameters": {
"jsCode": "// Transform AI Agent output to Due Diligence HTML template format\nconst raw = $input.first()?.json ?? {};\nconst data = raw.output && typeof raw.output === 'object' ? raw.output : raw;\n\nfunction isMissing(value) {\n return value === null || value === undefined || value === '' || (typeof value === 'string' && value.trim() === '');\n}\n\nfunction escapeHtml(value) {\n return String(value)\n .replace(/&/g, '&amp;')\n .replace(/</g, '&lt;')\n .replace(/>/g, '&gt;')\n .replace(/\"/g, '&quot;')\n .replace(/'/g, '&#39;');\n}\n\nfunction textValue(value, fallback = 'Not Available') {\n return isMissing(value) ? fallback : String(value);\n}\n\nfunction numberValue(value, fallback = 'Not Available') {\n if (typeof value === 'number') {\n if (!Number.isFinite(value) || value === 0) return fallback;\n return value.toLocaleString('en-US');\n }\n\n if (typeof value === 'string') {\n const trimmed = value.trim();\n if (!trimmed || trimmed === '0') return fallback;\n return trimmed;\n }\n\n return fallback;\n}\n\nfunction generateRevenueRows(revenueHistory) {\n if (!Array.isArray(revenueHistory) || revenueHistory.length === 0) {\n return '<tr><td colspan=\"3\" class=\"na-value\">No revenue data available</td></tr>';\n }\n\n return revenueHistory.map((row) => {\n const year = textValue(row?.year, 'N/A');\n const amount = numberValue(row?.amount);\n const currency = textValue(row?.currency, 'USD');\n\n return `\n <tr>\n <td>${escapeHtml(year)}</td>\n <td>${escapeHtml(amount)}</td>\n <td>${escapeHtml(currency)}</td>\n </tr>\n `;\n }).join('');\n}\n\nfunction generateRiskItems(risks) {\n const cleanRisks = Array.isArray(risks)\n ? risks.filter((risk) => !isMissing(risk)).map((risk) => String(risk))\n : [];\n\n if (cleanRisks.length === 0) {\n return '<li class=\"risk-item\"><span class=\"risk-icon\">-</span><span class=\"risk-text\">No specific risks identified in the document</span></li>';\n }\n\n return cleanRisks.map((risk, index) => `\n <li class=\"risk-item\">\n <span class=\"risk-icon\">${index + 1}</span>\n <span class=\"risk-text\">${escapeHtml(risk)}</span>\n </li>\n `).join('');\n}\n\nconst companyProfile = data.company_profile || {};\nconst financials = data.financials || {};\nconst analysis = data.analysis || {};\nconst margins = financials.margins || {};\n\nconst dealId = $('Prepare Analysis Context').first()?.json?.dealId || 'N/A';\nconst reportDate = DateTime.now().toFormat(\"MMMM dd, yyyy 'at' HH:mm\");\n\nreturn [{\n json: {\n companyName: textValue(companyProfile.company_name),\n industry: textValue(companyProfile.industry),\n location: textValue(companyProfile.location),\n employeeCount: numberValue(companyProfile.employee_count),\n\n revenueTableRows: generateRevenueRows(financials.revenue_history),\n ebitda: numberValue(financials.ebitda),\n grossMargin: textValue(margins.gross_margin),\n netMargin: textValue(margins.net_margin),\n ebitdaMargin: textValue(margins.ebitda_margin),\n\n businessModel: textValue(analysis.business_model),\n investmentThesis: textValue(analysis.investment_thesis),\n riskListItems: generateRiskItems(analysis.key_risks),\n customerConcentration: textValue(analysis.customer_concentration),\n\n dealId: textValue(dealId, 'N/A'),\n reportDate,\n },\n}];"
},
"typeVersion": 2
},
{
"id": "baf80a35-7ac0-4ca2-ae29-378b5d71ed96",
"name": "Render DD Report HTML",
"type": "n8n-nodes-base.html",
"position": [
1472,
-576
],
"parameters": {
"html": "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n <meta charset=\"UTF-8\">\n <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n <title>Due Diligence Report - {{ $json.companyName }}</title>\n <style>\n * { margin: 0; padding: 0; box-sizing: border-box; }\n body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Arial, sans-serif; background-color: #ffffff; color: #1e293b; line-height: 1.6; padding: 40px; max-width: 800px; margin: 0 auto; }\n .header { border-bottom: 3px solid #0f766e; padding-bottom: 24px; margin-bottom: 32px; }\n .header h1 { font-size: 28px; color: #0f766e; margin-bottom: 8px; }\n .header .subtitle { font-size: 14px; color: #64748b; }\n .header .date { font-size: 12px; color: #94a3b8; margin-top: 4px; }\n .section { margin-bottom: 32px; }\n .section-title { font-size: 18px; font-weight: 700; color: #0f766e; border-left: 4px solid #0f766e; padding-left: 12px; margin-bottom: 16px; }\n .card { background-color: #f8fafc; border: 1px solid #e2e8f0; border-radius: 8px; padding: 20px; margin-bottom: 16px; }\n .info-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; }\n .info-item { padding: 12px; background-color: #ffffff; border-radius: 6px; border: 1px solid #e2e8f0; }\n .info-label { font-size: 11px; text-transform: uppercase; color: #64748b; letter-spacing: 0.5px; margin-bottom: 4px; }\n .info-value { font-size: 16px; font-weight: 600; color: #1e293b; }\n table { width: 100%; border-collapse: collapse; margin-top: 12px; }\n th { background-color: #0f766e; color: #ffffff; font-size: 12px; font-weight: 600; text-align: left; padding: 12px; text-transform: uppercase; letter-spacing: 0.5px; }\n td { padding: 12px; border-bottom: 1px solid #e2e8f0; font-size: 14px; }\n tr:nth-child(even) { background-color: #f8fafc; }\n .metric-row { display: flex; justify-content: space-between; padding: 12px 0; border-bottom: 1px solid #e2e8f0; }\n .metric-row:last-child { border-bottom: none; }\n .metric-label { color: #475569; font-size: 14px; }\n .metric-value { font-weight: 600; color: #1e293b; }\n .text-content { font-size: 14px; color: #475569; line-height: 1.8; white-space: pre-line; }\n .risk-list { list-style: none; padding: 0; }\n .risk-item { display: flex; align-items: flex-start; padding: 12px; background-color: #fef2f2; border-left: 4px solid #ef4444; margin-bottom: 8px; border-radius: 0 6px 6px 0; }\n .risk-icon { width: 20px; height: 20px; background-color: #ef4444; color: #ffffff; border-radius: 50%; display: flex; align-items: center; justify-content: center; font-size: 12px; font-weight: 700; margin-right: 12px; flex-shrink: 0; }\n .risk-text { font-size: 14px; color: #7f1d1d; }\n .highlight-box { background: linear-gradient(135deg, #ecfdf5 0%, #d1fae5 100%); border: 1px solid #10b981; border-radius: 8px; padding: 20px; }\n .highlight-content { font-size: 14px; color: #047857; line-height: 1.7; }\n .footer { margin-top: 40px; padding-top: 20px; border-top: 1px solid #e2e8f0; text-align: center; font-size: 11px; color: #94a3b8; }\n .na-value { color: #94a3b8; font-style: italic; }\n </style>\n</head>\n<body>\n <div class=\"header\">\n <h1>Due Diligence Report</h1>\n <div class=\"subtitle\">{{ $json.companyName }}</div>\n <div class=\"date\">Generated: {{ $json.reportDate }}</div>\n </div>\n <div class=\"section\">\n <h2 class=\"section-title\">Company Overview</h2>\n <div class=\"card\">\n <div class=\"info-grid\">\n <div class=\"info-item\"><div class=\"info-label\">Company Name</div><div class=\"info-value\">{{ $json.companyName }}</div></div>\n <div class=\"info-item\"><div class=\"info-label\">Industry</div><div class=\"info-value\">{{ $json.industry }}</div></div>\n <div class=\"info-item\"><div class=\"info-label\">Location</div><div class=\"info-value\">{{ $json.location }}</div></div>\n <div class=\"info-item\"><div class=\"info-label\">Employee Count</div><div class=\"info-value\">{{ $json.employeeCount }}</div></div>\n </div>\n </div>\n </div>\n <div class=\"section\">\n <h2 class=\"section-title\">Financial Summary</h2>\n <div class=\"card\">\n <h3 style=\"font-size: 14px; color: #475569; margin-bottom: 12px;\">Revenue History</h3>\n <table><thead><tr><th>Year</th><th>Revenue</th><th>Currency</th></tr></thead><tbody>{{ $json.revenueTableRows }}</tbody></table>\n </div>\n <div class=\"card\">\n <h3 style=\"font-size: 14px; color: #475569; margin-bottom: 12px;\">Key Metrics</h3>\n <div class=\"metric-row\"><span class=\"metric-label\">EBITDA</span><span class=\"metric-value\">{{ $json.ebitda }}</span></div>\n <div class=\"metric-row\"><span class=\"metric-label\">Gross Margin</span><span class=\"metric-value\">{{ $json.grossMargin }}</span></div>\n <div class=\"metric-row\"><span class=\"metric-label\">Net Margin</span><span class=\"metric-value\">{{ $json.netMargin }}</span></div>\n <div class=\"metric-row\"><span class=\"metric-label\">EBITDA Margin</span><span class=\"metric-value\">{{ $json.ebitdaMargin }}</span></div>\n </div>\n </div>\n <div class=\"section\">\n <h2 class=\"section-title\">Business Model</h2>\n <div class=\"card\"><div class=\"text-content\">{{ $json.businessModel }}</div></div>\n </div>\n <div class=\"section\">\n <h2 class=\"section-title\">Investment Thesis</h2>\n <div class=\"highlight-box\"><div class=\"highlight-content\">{{ $json.investmentThesis }}</div></div>\n </div>\n <div class=\"section\">\n <h2 class=\"section-title\">Risk Analysis</h2>\n <div class=\"card\">\n <h3 style=\"font-size: 14px; color: #475569; margin-bottom: 12px;\">Key Risks</h3>\n <ul class=\"risk-list\">{{ $json.riskListItems }}</ul>\n </div>\n <div class=\"card\">\n <h3 style=\"font-size: 14px; color: #475569; margin-bottom: 12px;\">Customer Concentration</h3>\n <div class=\"text-content\">{{ $json.customerConcentration }}</div>\n </div>\n </div>\n <div class=\"footer\">\n <p>This report was generated automatically using AI-powered document analysis.</p>\n <p>Deal ID: {{ $json.dealId }}</p>\n </div>\n</body>\n</html>"
},
"typeVersion": 1.2
},
{
"id": "ce064d94-2c3f-48b9-bfc6-e353f42f8cbb",
"name": "Render PDF from HTML",
"type": "n8n-nodes-puppeteer.puppeteer",
"position": [
1696,
-576
],
"parameters": {
"options": {},
"operation": "runCustomScript",
"scriptCode": "=// Get HTML from previous node\nconst html = $json.html || '';\n\nif (!html) {\n throw new Error('HTML content is empty. Check the Render DD Report HTML node output.');\n}\n\n// Render HTML with longer timeout\nawait $page.setContent(html, {\n waitUntil: 'networkidle0',\n timeout: 60000,\n});\n\n// Generate PDF\nconst pdfArray = await $page.pdf({\n format: 'A4',\n printBackground: true,\n margin: {\n top: '40px',\n right: '40px',\n bottom: '40px',\n left: '40px',\n },\n scale: 0.95,\n timeout: 60000,\n});\n\n// Return base64 for next node\nconst pdfBase64 = Buffer.from(pdfArray).toString('base64');\nreturn [{ json: { pdfBase64 } }];"
},
"typeVersion": 1
},
{
"id": "234bb525-e16f-429c-a5bc-39f749cfcfe8",
"name": "Convert PDF Base64 to Binary File",
"type": "n8n-nodes-base.convertToFile",
"position": [
1920,
-576
],
"parameters": {
"options": {
"fileName": "={{ $('Map Analysis to Report Fields').item.json.companyName }}-Analysis.pdf"
},
"operation": "toBinary",
"sourceProperty": "pdfBase64"
},
"typeVersion": 1.1
},
{
"id": "2f93d577-19ab-454a-ad8c-333eb8d18330",
"name": "Upload Report PDF to S3",
"type": "n8n-nodes-base.s3",
"position": [
2448,
-576
],
"parameters": {
"fileName": "={{ $json.fileName }}",
"operation": "upload",
"bucketName": "poc",
"additionalFields": {}
},
"credentials": {
"s3": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "42dbc003-e2da-4005-bac0-745404990b40",
"name": "Build Public Report URL",
"type": "n8n-nodes-base.code",
"position": [
2672,
-576
],
"parameters": {
"jsCode": "const baseUrl = 'https://poc.atlr.dev';\n const fileName = $('Prepare S3 File Metadata').first().json.fileName;\n const encodedFileName = encodeURIComponent(fileName);\n const publicUrl = `${baseUrl}/${encodedFileName}`;\n\n return {\n json: {\n success: true,\n fileName: fileName,\n publicUrl: publicUrl\n }\n };"
},
"typeVersion": 2
},
{
"id": "b0d10349-c402-4adf-a06e-af5dcbc2645b",
"name": " Merge Analysis + Report URL",
"type": "n8n-nodes-base.merge",
"position": [
2944,
-592
],
"parameters": {
"mode": "combine",
"options": {},
"combineBy": "combineByPosition"
},
"typeVersion": 3.2
},
{
"id": "909faa92-6fc8-4f00-8571-4630bd4de26d",
"name": "Is Parsing Job Complete?",
"type": "n8n-nodes-base.if",
"position": [
496,
-144
],
"parameters": {
"options": {},
"conditions": {
"options": {
"version": 2,
"leftValue": "",
"caseSensitive": true,
"typeValidation": "strict"
},
"combinator": "and",
"conditions": [
{
"id": "921ff875-817d-47fd-bd47-530ebdc21902",
"operator": {
"name": "filter.operator.equals",
"type": "string",
"operation": "equals"
},
"leftValue": "={{ $json.status }}",
"rightValue": "SUCCESS"
}
]
}
},
"typeVersion": 2.2
},
{
"id": "cc4c89d2-c586-48d1-a32f-b719e1f86beb",
"name": "Upload File to LlamaParse",
"type": "n8n-nodes-base.httpRequest",
"position": [
96,
-144
],
"parameters": {
"url": "https://api.cloud.llamaindex.ai/api/v1/parsing/upload",
"method": "POST",
"options": {},
"sendBody": true,
"contentType": "multipart-form-data",
"sendHeaders": true,
"authentication": "genericCredentialType",
"bodyParameters": {
"parameters": [
{
"name": "file",
"parameterType": "formBinaryData",
"inputDataFieldName": "data"
}
]
},
"genericAuthType": "httpHeaderAuth",
"headerParameters": {
"parameters": [
{
"name": "accept",
"value": "application/json"
}
]
}
},
"credentials": {
"httpHeaderAuth": {
"name": "<your credential>"
}
},
"typeVersion": 4.2
},
{
"id": "7341754f-86ea-4a5a-a967-880a711895c4",
"name": "Check LlamaParse Job Status",
"type": "n8n-nodes-base.httpRequest",
"position": [
304,
-144
],
"parameters": {
"url": "=https://api.cloud.llamaindex.ai/api/v1/parsing/job/{{ $json.id }}",
"options": {},
"sendHeaders": true,
"authentication": "genericCredentialType",
"genericAuthType": "httpHeaderAuth",
"headerParameters": {
"parameters": [
{
"name": "accept",
"value": "application/json"
}
]
}
},
"credentials": {
"httpHeaderAuth": {
"name": "<your credential>"
}
},
"typeVersion": 4.2
},
{
"id": "d5164ab7-159b-48dd-8d3a-3b8fa28415a6",
"name": "Wait 10s Before Recheck",
"type": "n8n-nodes-base.wait",
"position": [
688,
-128
],
"parameters": {
"amount": 10
},
"typeVersion": 1.1
},
{
"id": "bcec3773-43fe-475f-b2a0-a60b3f1ec2e8",
"name": "Return API Response",
"type": "n8n-nodes-base.respondToWebhook",
"position": [
3200,
-592
],
"parameters": {
"options": {}
},
"typeVersion": 1.5
},
{
"id": "7f640ee8-584f-4d97-b6af-e301070e309c",
"name": "Normalize Parsed Text Payload",
"type": "n8n-nodes-base.code",
"position": [
1136,
-160
],
"parameters": {
"jsCode": "const loopItem = $('Iterate Files for Parsing').item.json;\n const parsedContent = $json.markdown || $json.text || '';\n\n return [{\n json: {\n dealId: loopItem.dealId,\n sourceFile: loopItem.sourceFile,\n fileType: loopItem.fileType,\n parsedText: parsedContent\n }\n }];"
},
"typeVersion": 2
},
{
"id": "118de47b-e23b-490c-a0e7-cfcea140255c",
"name": " Check Deal Namespace Cache",
"type": "n8n-nodes-base.code",
"position": [
-528,
-160
],
"parameters": {
"jsCode": "const stats = $input.first().json;\n const originalItems = $('Split Uploaded Files + Build Deal ID').all();\n const dealId = originalItems[0]?.json?.dealId;\n\n // Check if namespace exists\n const namespaceExists = stats.namespaces && stats.namespaces[dealId];\n const vectorCount = namespaceExists ? stats.namespaces[dealId].vectorCount : 0;\n\n if (vectorCount > 0) {\n // Cache HIT - return single item for analysis path\n return [{\n json: {\n dealId: dealId,\n cacheHit: true,\n vectorCount: vectorCount,\n message: `Cache HIT: ${vectorCount} vectors found`\n }\n }];\n } else {\n // Cache MISS - return ALL original items WITH binary\n return originalItems.map(item => ({\n json: {\n ...item.json,\n cacheHit: false\n },\n binary: item.binary\n }));\n }"
},
"typeVersion": 2
},
{
"id": "7a292f76-9126-4db2-8eed-6410ecf15e94",
"name": "Cache Hit?",
"type": "n8n-nodes-base.if",
"position": [
-352,
-160
],
"parameters": {
"options": {},
"conditions": {
"options": {
"version": 3,
"leftValue": "",
"caseSensitive": true,
"typeValidation": "strict"
},
"combinator": "and",
"conditions": [
{
"id": "a45d92c0-a88f-49fc-b393-b482a3585d21",
"operator": {
"type": "boolean",
"operation": "equals"
},
"leftValue": "={{ $json.cacheHit }}",
"rightValue": true
}
]
}
},
"typeVersion": 2.3
},
{
"id": "1e26091a-1b9d-4f6a-ac36-fdd96a362351",
"name": "Prepare S3 File Metadata",
"type": "n8n-nodes-base.code",
"position": [
2256,
-576
],
"parameters": {
"jsCode": "const companyName = $('Map Analysis to Report Fields').first().json.companyName;\n const timestamp = Date.now();\n const fileName = `${companyName}-assessment-${timestamp}.pdf`;\n\n return [{\n json: {\n fileName: fileName\n },\n binary: $input.first().binary\n }];"
},
"typeVersion": 2
},
{
"id": "bc94aed6-353b-4203-9cd8-f8c45ac858ca",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
-1264,
-320
],
"parameters": {
"color": 7,
"width": 448,
"height": 336,
"content": "## Intake & Request Normalization \nReceives multipart upload, parses filenames/binaries, and creates one normalized item per file with a shared dealId.\n"
},
"typeVersion": 1
},
{
"id": "cfeb2a7e-4863-4462-9de7-111907e8e31a",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
-800,
-320
],
"parameters": {
"color": 7,
"width": 608,
"height": 336,
"content": "## Pinecone Cache Check\nChecks namespace stats in Pinecone to detect cache hit/miss and routes flow to direct analysis or document parsing.\n"
},
"typeVersion": 1
},
{
"id": "6e2eadb0-3f1b-4983-bd0d-18cfd9ade098",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
-176,
-272
],
"parameters": {
"color": 7,
"width": 1536,
"height": 432,
"content": " ## Document Parsing Loop\nUploads each file for parsing, polls async status until complete, fetches markdown output, and normalizes parsed text payload."
},
"typeVersion": 1
},
{
"id": "5df3e070-5fe6-4b67-98fa-4766a68fb83b",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
1376,
-272
],
"parameters": {
"color": 7,
"width": 672,
"height": 432,
"content": " ## Vector Ingestion \nConverts parsed content into documents, generates embeddings, upserts vectors into deal namespace, and aggregates ingestion results.\n"
},
"typeVersion": 1
},
{
"id": "866571b6-5fd2-455c-8712-6f084b38c8b3",
"name": "Sticky Note5",
"type": "n8n-nodes-base.stickyNote",
"position": [
-176,
-752
],
"parameters": {
"color": 7,
"width": 1536,
"height": 464,
"content": "## AI Due Diligence Analysis\nBuilds analysis context, retrieves supporting evidence from Pinecone, runs LLM agent, and enforces structured JSON output."
},
"typeVersion": 1
},
{
"id": "b2be995c-d818-4986-8a82-e4e4a43e80de",
"name": "Sticky Note6",
"type": "n8n-nodes-base.stickyNote",
"position": [
1376,
-752
],
"parameters": {
"color": 7,
"width": 784,
"height": 464,
"content": "## Report Rendering (HTML to PDF) \n Maps AI output to report fields, renders HTML template, generates PDF, and converts output to binary file."
},
"typeVersion": 1
},
{
"id": "6451d4ea-9229-4a6a-980b-b92204d2c02d",
"name": "Sticky Note7",
"type": "n8n-nodes-base.stickyNote",
"position": [
2176,
-752
],
"parameters": {
"color": 7,
"width": 1200,
"height": 464,
"content": "## Delivery & Webhook Response\nPrepares filename metadata, uploads PDF to S3, builds public URL, merges outputs, and returns final API response."
},
"typeVersion": 1
}
],
"active": true,
"settings": {
"callerPolicy": "workflowsFromSameOwner",
"availableInMCP": false,
"executionOrder": "v1"
},
"versionId": "f7682e18-ad85-4a3e-92c9-bb97bd2d1472",
"connections": {
"Cache Hit?": {
"main": [
[
{
"node": "Prepare Analysis Context",
"type": "main",
"index": 0
}
],
[
{
"node": "Iterate Files for Parsing",
"type": "main",
"index": 0
}
]
]
},
"Render PDF from HTML": {
"main": [
[
{
"node": "Convert PDF Base64 to Binary File",
"type": "main",
"index": 0
}
]
]
},
"Render DD Report HTML": {
"main": [
[
{
"node": "Render PDF from HTML",
"type": "main",
"index": 0
}
]
]
},
"Receive Upload Request": {
"main": [
[
{
"node": "Split Uploaded Files + Build Deal ID",
"type": "main",
"index": 0
}
]
]
},
"Build Public Report URL": {
"main": [
[
{
"node": " Merge Analysis + Report URL",
"type": "main",
"index": 0
}
]
]
},
"Retrieve Parsed Content": {
"main": [
[
{
"node": "Normalize Parsed Text Payload",
"type": "main",
"index": 0
}
]
]
},
"Upload Report PDF to S3": {
"main": [
[
{
"node": "Build Public Report URL",
"type": "main",
"index": 0
}
]
]
},
"Wait 10s Before Recheck": {
"main": [
[
{
"node": "Check LlamaParse Job Status",
"type": "main",
"index": 0
}
]
]
},
"Get Pinecone Index Stats": {
"main": [
[
{
"node": " Check Deal Namespace Cache",
"type": "main",
"index": 0
}
]
]
},
"Is Parsing Job Complete?": {
"main": [
[
{
"node": "Retrieve Parsed Content",
"type": "main",
"index": 0
}
],
[
{
"node": "Wait 10s Before Recheck",
"type": "main",
"index": 0
}
]
]
},
"Prepare Analysis Context": {
"main": [
[
{
"node": "Run Due Diligence AI Analysis",
"type": "main",
"index": 0
}
]
]
},
"Prepare S3 File Metadata": {
"main": [
[
{
"node": "Upload Report PDF to S3",
"type": "main",
"index": 0
}
]
]
},
"Collect Ingested Deal IDs": {
"main": [
[
{
"node": "Prepare Analysis Context",
"type": "main",
"index": 0
}
]
]
},
"Iterate Files for Parsing": {
"main": [
[
{
"node": "Collect Ingested Deal IDs",
"type": "main",
"index": 0
}
],
[
{
"node": "Upload File to LlamaParse",
"type": "main",
"index": 0
}
]
]
},
"Upload File to LlamaParse": {
"main": [
[
{
"node": "Check LlamaParse Job Status",
"type": "main",
"index": 0
}
]
]
},
"Upsert Chunks to Pinecone": {
"main": [
[
{
"node": "Iterate Files for Parsing",
"type": "main",
"index": 0
}
]
]
},
" Check Deal Namespace Cache": {
"main": [
[
{
"node": "Cache Hit?",
"type": "main",
"index": 0
}
]
]
},
" OpenAI Chat Model (5-mini)": {
"ai_languageModel": [
[
{
"node": "Run Due Diligence AI Analysis",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Check LlamaParse Job Status": {
"main": [
[
{
"node": "Is Parsing Job Complete?",
"type": "main",
"index": 0
}
]
]
},
" Merge Analysis + Report URL": {
"main": [
[
{
"node": "Return API Response",
"type": "main",
"index": 0
}
]
]
},
"Generate Embeddings (Ingest)": {
"ai_embedding": [
[
{
"node": "Upsert Chunks to Pinecone",
"type": "ai_embedding",
"index": 0
}
]
]
},
"Prepare Parsed Text Document": {
"ai_document": [
[
{
"node": "Upsert Chunks to Pinecone",
"type": "ai_document",
"index": 0
}
]
]
},
"Map Analysis to Report Fields": {
"main": [
[
{
"node": "Render DD Report HTML",
"type": "main",
"index": 0
}
]
]
},
"Normalize Parsed Text Payload": {
"main": [
[
{
"node": "Upsert Chunks to Pinecone",
"type": "main",
"index": 0
}
]
]
},
"Run Due Diligence AI Analysis": {
"main": [
[
{
"node": "Map Analysis to Report Fields",
"type": "main",
"index": 0
},
{
"node": " Merge Analysis + Report URL",
"type": "main",
"index": 1
}
]
]
},
"Parse Structured Analysis JSON": {
"ai_outputParser": [
[
{
"node": "Run Due Diligence AI Analysis",
"type": "ai_outputParser",
"index": 0
}
]
]
},
"Retrieve Context from Pinecone": {
"ai_tool": [
[
{
"node": "Run Due Diligence AI Analysis",
"type": "ai_tool",
"index": 0
}
]
]
},
" Generate Embeddings (Retrieval)": {
"ai_embedding": [
[
{
"node": "Retrieve Context from Pinecone",
"type": "ai_embedding",
"index": 0
}
]
]
},
"Convert PDF Base64 to Binary File": {
"main": [
[
{
"node": "Prepare S3 File Metadata",
"type": "main",
"index": 0
}
]
]
},
"Split Uploaded Files + Build Deal ID": {
"main": [
[
{
"node": "Get Pinecone Index Stats",
"type": "main",
"index": 0
}
]
]
}
}
}
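The `connections` object above maps each source node name to its outputs, and each output to a list of target nodes. As a rough aid for reading it, here is a minimal Python sketch that flattens the `main` connections into readable edges. The helper name `main_edges` and the trimmed `sample` dictionary are illustrative assumptions, not part of the template; in practice you would load the full exported JSON instead.

```python
from typing import List, Tuple


def main_edges(workflow: dict) -> List[Tuple[str, int, str]]:
    """Return (source, output_index, target) for every 'main' connection.

    AI sub-node links (ai_embedding, ai_tool, ...) are skipped on purpose.
    """
    edges = []
    for source, outputs in workflow.get("connections", {}).items():
        for out_idx, targets in enumerate(outputs.get("main", [])):
            for target in targets or []:
                edges.append((source, out_idx, target["node"]))
    return edges


# Trimmed excerpt of the connections shown above (assumption: the real
# file would be loaded with json.load instead of being inlined).
sample = {
    "connections": {
        "Is Parsing Job Complete?": {
            "main": [
                [{"node": "Retrieve Parsed Content", "type": "main", "index": 0}],
                [{"node": "Wait 10s Before Recheck", "type": "main", "index": 0}],
            ]
        },
        "Wait 10s Before Recheck": {
            "main": [
                [{"node": "Check LlamaParse Job Status", "type": "main", "index": 0}]
            ]
        },
        "Generate Embeddings (Ingest)": {
            "ai_embedding": [
                [{"node": "Upsert Chunks to Pinecone", "type": "ai_embedding", "index": 0}]
            ]
        },
    }
}

for source, out_idx, target in main_edges(sample):
    print(f"{source} [output {out_idx}] -> {target}")
```

Read this way, the polling loop is easy to spot: output 1 of "Is Parsing Job Complete?" leads to "Wait 10s Before Recheck", which feeds back into "Check LlamaParse Job Status".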
Credentials you'll need
Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
httpHeaderAuth, openAiApi, pineconeApi, s3
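To check which credential types a workflow file expects before importing it, one option is to scan each node's `credentials` object. The sketch below is an assumption-laden illustration: `credential_types` is a hypothetical helper, and the `sample` nodes are trimmed stand-ins (the second node's name is invented for the example), not a copy of the full template.

```python
def credential_types(workflow: dict) -> list:
    """Collect the distinct credential type keys used by the workflow's nodes."""
    found = set()
    for node in workflow.get("nodes", []):
        # Nodes without credentials (e.g. sticky notes) simply have no such key.
        found.update(node.get("credentials", {}).keys())
    return sorted(found)


# Trimmed, partly hypothetical node list for illustration only.
sample = {
    "nodes": [
        {
            "name": "Retrieve Parsed Content",
            "type": "n8n-nodes-base.httpRequest",
            "credentials": {"httpHeaderAuth": {"name": "<your credential>"}},
        },
        {
            "name": "Example Chat Model",  # hypothetical node name
            "credentials": {"openAiApi": {"name": "<your credential>"}},
        },
        {"name": "Sticky Note", "type": "n8n-nodes-base.stickyNote"},
    ]
}

print(credential_types(sample))
```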
About this workflow
Streamline M&A due diligence with AI. This n8n workflow automatically parses financial documents using LlamaIndex, embeds data into Pinecone, and generates comprehensive, AI-driven reports with GPT-5-mini, saving hours of manual review and ensuring consistent, data-backed…
Source: https://n8n.io/workflows/13501/ (original creator credit).
Related workflows
Workflows that share integrations, category, or trigger type with this one. All free to copy and import.
Turn unstructured pitch decks and investment memos into polished Due Diligence PDF reports automatically. This n8n workflow handles everything from document ingestion to final delivery, combining inte
Transform raw investment memorandums and financial decks into comprehensive, professional Due Diligence (DD) PDF reports. This workflow automates document parsing via LlamaParse, enriches internal dat
This Workflow simulates an AI-powered phone agent with RetellAI with two main functions: 📅 Appointment Booking – It can schedule appointments directly into Google Calendar. 🧠 RAG-based Information Ret
AI Phone Agent with RetellAI. Uses lmChatOpenAi, outputParserStructured, vectorStoreQdrant, embeddingsOpenAi. Webhook trigger; 36 nodes.
Indoor Farming Agent. Uses lmChatOpenAi, documentDefaultDataLoader, embeddingsOpenAi, toolVectorStore. Webhook trigger; 36 nodes.