This workflow corresponds to n8n.io template #11232 — we link there as the canonical source.
This workflow follows the Agent → Execute Workflow Trigger recipe pattern.
The workflow JSON
Copy or download the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.
{
"id": "k2Tspf5WURvgp7Xj",
"meta": {
"templateCredsSetupCompleted": true
},
"name": "AI-Powered Report Discovery Agent",
"tags": [],
"nodes": [
{
"id": "0d42d70c-807a-4a8b-9e9d-65fdb38826c9",
"name": "Sticky Note - Introduction",
"type": "n8n-nodes-base.stickyNote",
"position": [
-352,
80
],
"parameters": {
"width": 504,
"height": 676,
"content": "## AI-Powered Report Discovery Agent\n\nUse AI to browse publication websites and identify the latest relevant downloadable reports.\n\n### How it works \n- **Trigger Sources**: Initiates from manual trigger, scheduled daily, or another workflow. \n- **Source Data**: Reads active report sources from Google Sheets (e.g., \"Report Sources\"). \n- **Process Pages**: Loops over sources, fetches HTML publication pages, and converts to Markdown. \n- **AI Extraction**: Uses AI to identify and extract the most recent, relevant downloadable report (PDF, DOCX, etc.). \n- **Validation**: Verifies the report's validity (direct download link, correct format, etc.). \n- **Save/Log**: Saves valid reports to \"Discovered Reports\" in Google Sheets. Logs no report found to \"Discovery Log\". \n- **Completion Summary**: Records the number of sources processed and the timestamp of completion.\n\n### Setup steps \n1. **Google Sheets Integration**: Connect your Google Sheets account and provide credentials. \n2. **Configure Sheets**: Set sheet names for \"Report Sources\", \"Discovered Reports\", and \"Discovery Log\". \n3. **Configure Trigger**: Choose how to trigger the workflow (manual, scheduled, or from another workflow). \n4. **Run Workflow**: Activate and monitor the workflow for discovering reports and logging the results.\n"
},
"typeVersion": 1
},
{
"id": "b7415f77-8fda-4849-8d1c-2a9331302bdc",
"name": "Manual Trigger",
"type": "n8n-nodes-base.manualTrigger",
"position": [
224,
176
],
"parameters": {},
"typeVersion": 1
},
{
"id": "fa175f95-4bb3-465b-a38a-c51b39077523",
"name": "Schedule (Daily)",
"type": "n8n-nodes-base.scheduleTrigger",
"position": [
224,
368
],
"parameters": {
"rule": {
"interval": [
{}
]
}
},
"typeVersion": 1.2
},
{
"id": "ef6fd252-30d6-4ece-863f-976aeed79a42",
"name": "Called by Another Workflow",
"type": "n8n-nodes-base.executeWorkflowTrigger",
"position": [
224,
560
],
"parameters": {
"inputSource": "passthrough"
},
"typeVersion": 1.1
},
{
"id": "65f6ead5-c16c-44c1-af09-7fc51341fa34",
"name": "Read Active Sources",
"type": "n8n-nodes-base.googleSheets",
"position": [
544,
368
],
"parameters": {
"sheetName": {
"__rl": true,
"mode": "list",
"value": "",
"cachedResultUrl": "",
"cachedResultName": "Report Sources"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "",
"cachedResultUrl": "",
"cachedResultName": "YOUR_SPREADSHEET"
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"name": "<your credential>"
}
},
"typeVersion": 4.7
},
{
"id": "961d9fa1-2cf5-447a-93c4-676708f6759c",
"name": "Loop Over Sources",
"type": "n8n-nodes-base.splitInBatches",
"position": [
816,
368
],
"parameters": {
"options": {
"reset": false
}
},
"typeVersion": 3
},
{
"id": "43b9d8cd-3ded-48a8-95a4-511da002cbe7",
"name": "Fetch Publication Page",
"type": "n8n-nodes-base.httpRequest",
"onError": "continueRegularOutput",
"position": [
1264,
384
],
"parameters": {
"url": "={{ $json.Source_URL }}",
"options": {
"timeout": 30000
},
"sendHeaders": true,
"headerParameters": {
"parameters": [
{
"name": "User-Agent",
"value": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/0.0.0.0 Safari/537.36"
},
{
"name": "Accept",
"value": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
},
{
"name": "Accept-Language",
"value": "en-US,en;q=0.5"
}
]
}
},
"typeVersion": 4.2
},
{
"id": "18b5b93b-572a-48c6-9f47-f9ba63fa51e5",
"name": "Convert HTML to Markdown",
"type": "n8n-nodes-base.markdown",
"onError": "continueRegularOutput",
"position": [
1488,
384
],
"parameters": {
"html": "={{ $json.data }}",
"options": {}
},
"typeVersion": 1
},
{
"id": "c34ea7e5-ca8d-425d-b233-14d6f1ac229d",
"name": "AI Report Discovery Agent",
"type": "@n8n/n8n-nodes-langchain.agent",
"position": [
1712,
256
],
"parameters": {
"text": "={{ $json.data }}",
"options": {
"systemMessage": "# \ud83d\udd0d Report Discovery Agent\n\nYou are a **Report Discovery Agent** operating inside an n8n AI workflow.\nYour role is to analyze publication pages and identify the **latest downloadable reports, research papers, or data files**.\n\n---\n\n## \ud83c\udfaf OBJECTIVE\n\nAnalyze the provided page content and extract information about the **most recent and relevant downloadable report**.\n\n**You must return exactly ONE report** \u2014 the most recent and relevant one on the page.\n\n---\n\n## \ud83d\udccb EXTRACTION RULES\n\n### 1. **Identify Downloadable Content**\n- Look for links to: PDFs, Excel files, Word documents, PowerPoint presentations\n- Common patterns: \"Download\", \"Report\", \"Research\", \"Analysis\", \"White Paper\", \"Study\"\n- File extensions: `.pdf`, `.xlsx`, `.xls`, `.doc`, `.docx`, `.pptx`\n\n### 2. **Prioritize Recency**\n- Select the **most recently published** report\n- Look for date indicators: publication dates, issue numbers, version numbers\n- If multiple reports exist, choose the one at the top of the list (usually newest)\n\n### 3. **Validate Links**\n- Ensure the link is a **direct download URL** (not a landing page)\n- Convert relative URLs to absolute URLs using the base domain\n- Exclude navigation links, category pages, or signup pages\n\n### 4. **Extract Metadata**\n- **title**: The report's official title or headline\n- **link**: Full absolute URL to download the file\n- **file_type**: The file format (pdf, xlsx, doc, pptx)\n- **description**: Brief 1-2 sentence summary of what the report covers\n\n---\n\n## \u26a0\ufe0f IMPORTANT RULES\n\n1. **Always return valid JSON** matching the schema\n2. **Never return null for required fields** \u2014 use \"Unknown\" if not found\n3. **Links must be absolute URLs** starting with http:// or https://\n4. **Only return ONE report** \u2014 the best/newest match\n5. **If no valid report found**, return the schema with \"No report found\" as title\n\n---\n\n## \ud83d\udcd6 EXAMPLES\n\n**Good Output:**\n```json\n{\n \"source\": \"Industry Research Corp\",\n \"title\": \"Q4 2024 Market Analysis Report\",\n \"link\": \"https://example.com/reports/q4-2024-analysis.pdf\",\n \"file_type\": \"pdf\",\n \"description\": \"Comprehensive analysis of Q4 2024 market trends and forecasts.\"\n}\n```\n\n**If No Report Found:**\n```json\n{\n \"source\": \"Industry Research Corp\",\n \"title\": \"No report found\",\n \"link\": \"\",\n \"file_type\": \"\",\n \"description\": \"No downloadable reports were found on this page.\"\n}\n```"
},
"promptType": "define",
"hasOutputParser": true
},
"typeVersion": 2.2
},
{
"id": "5bb62ec3-d3e9-4523-8d9b-8459680a260b",
"name": "Structured Output Parser",
"type": "@n8n/n8n-nodes-langchain.outputParserStructured",
"position": [
1856,
480
],
"parameters": {
"jsonSchemaExample": "{\n \"source\": \"Publisher Name\",\n \"title\": \"Report Title\",\n \"link\": \"https://example.com/report.pdf\",\n \"file_type\": \"pdf\",\n \"description\": \"Brief description of the report content\"\n}"
},
"typeVersion": 1.3
},
{
"id": "d6edbc05-c767-49c3-ab7b-cb93d9116036",
"name": "Validate & Normalize Output",
"type": "n8n-nodes-base.code",
"position": [
2064,
256
],
"parameters": {
"jsCode": "// Extract and validate AI output\nconst results = [];\n\nfor (const item of items) {\n const output = item.json.output || item.json;\n const sourceData = $('Loop Over Sources').item.json;\n \n // Get values with fallbacks\n const source = output.source || sourceData.Source_Name || \"Unknown\";\n const title = output.title || \"No title\";\n const link = output.link || \"\";\n const fileType = output.file_type || \"\";\n const description = output.description || \"\";\n \n // Validate the result. Boolean() keeps isValid a real boolean so the\n // strict boolean check in the following If node does not see an empty string.\n const isValid = Boolean(link) && \n link.startsWith(\"http\") && \n title !== \"No report found\" &&\n title !== \"\";\n \n // Determine status. The extensions mirror the list in the agent's system prompt,\n // and the check is case-insensitive.\n const lowerLink = link.toLowerCase();\n const knownExtensions = [\".pdf\", \".xlsx\", \".xls\", \".docx\", \".doc\", \".pptx\"];\n let status = \"Discovered\";\n if (!isValid) {\n status = \"No Report Found\";\n } else if (!knownExtensions.some((ext) => lowerLink.includes(ext))) {\n status = \"Link May Not Be Direct Download\";\n }\n \n results.push({\n json: {\n source: source,\n title: title,\n link: link,\n fileType: fileType,\n description: description,\n sourceUrl: sourceData.Source_URL || \"\",\n category: sourceData.Category || \"General\",\n discoveredAt: new Date().toISOString(),\n status: status,\n isValid: isValid\n }\n });\n}\n\nreturn results;"
},
"typeVersion": 2
},
{
"id": "c065a81e-0f3b-49fa-87d5-6a189c3d43af",
"name": "Valid Report Found?",
"type": "n8n-nodes-base.if",
"position": [
2288,
256
],
"parameters": {
"options": {},
"conditions": {
"options": {
"version": 2,
"leftValue": "",
"caseSensitive": true,
"typeValidation": "strict"
},
"combinator": "and",
"conditions": [
{
"id": "valid-check",
"operator": {
"type": "boolean",
"operation": "equals"
},
"leftValue": "={{ $json.isValid }}",
"rightValue": true
}
]
}
},
"typeVersion": 2.2
},
{
"id": "1759a157-62dd-4973-afa8-57547b808457",
"name": "Save Discovered Report",
"type": "n8n-nodes-base.googleSheets",
"position": [
2768,
480
],
"parameters": {
"operation": "appendOrUpdate",
"sheetName": {
"__rl": true,
"mode": "list",
"value": "",
"cachedResultUrl": "",
"cachedResultName": "Discovered Reports"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "",
"cachedResultUrl": "",
"cachedResultName": "YOUR_SPREADSHEET"
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"name": "<your credential>"
}
},
"typeVersion": 4.7
},
{
"id": "e6c67ab6-4c8e-4e27-b720-90a33b091c79",
"name": "Log No Report Found",
"type": "n8n-nodes-base.googleSheets",
"position": [
2528,
480
],
"parameters": {
"operation": "appendOrUpdate",
"sheetName": {
"__rl": true,
"mode": "list",
"value": "",
"cachedResultUrl": "",
"cachedResultName": "Discovery Log"
},
"documentId": {
"__rl": true,
"mode": "list",
"value": "",
"cachedResultUrl": "",
"cachedResultName": "YOUR_SPREADSHEET"
}
},
"credentials": {
"googleSheetsOAuth2Api": {
"name": "<your credential>"
}
},
"typeVersion": 4.7
},
{
"id": "312b3ff5-753a-4f31-aadd-74484b799d0f",
"name": "Completion Summary",
"type": "n8n-nodes-base.set",
"position": [
1024,
256
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "count",
"name": "sourcesChecked",
"type": "number",
"value": "={{ $items().length }}"
},
{
"id": "timestamp",
"name": "completedAt",
"type": "string",
"value": "={{ $now.toISO() }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "5e572710-10e7-43ba-9cd3-acf97cb80c30",
"name": "OpenAI GPT-5.1",
"type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
"position": [
1712,
480
],
"parameters": {
"model": {
"__rl": true,
"mode": "list",
"value": "gpt-5.1",
"cachedResultName": "gpt-5.1"
},
"options": {
"temperature": 0.1
}
},
"credentials": {
"openAiApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.2
},
{
"id": "6ce4884b-3b66-40de-9c4b-cdf858a81b1f",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
448,
112
],
"parameters": {
"color": 7,
"width": 272,
"height": 560,
"content": "## Read the Source URLs"
},
"typeVersion": 1
},
{
"id": "e96b1aee-6d93-4596-82db-167abfa514ba",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
1216,
112
],
"parameters": {
"color": 7,
"width": 400,
"height": 560,
"content": "## Fetch the Publication and Convert to Markdown"
},
"typeVersion": 1
},
{
"id": "255ff434-2189-4e76-8cbb-dde9d6d29bf0",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
1664,
112
],
"parameters": {
"color": 7,
"width": 320,
"height": 560,
"content": "## Process the Publication using the LLM"
},
"typeVersion": 1
},
{
"id": "b3495bdd-01ae-47f2-bde4-c64341692c0c",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
2016,
112
],
"parameters": {
"color": 7,
"width": 432,
"height": 560,
"content": "## Validate the AI Output\n"
},
"typeVersion": 1
},
{
"id": "9e6d8ea9-7de5-4225-bacf-b300611a6eb9",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
2480,
112
],
"parameters": {
"color": 7,
"width": 448,
"height": 560,
"content": "## Log the Results in Google Sheets\n"
},
"typeVersion": 1
}
],
"active": false,
"settings": {
"executionOrder": "v1"
},
"versionId": "f3ec8d20-c4e7-4c50-9f3c-ae87725de7e6",
"connections": {
"Manual Trigger": {
"main": [
[
{
"node": "Read Active Sources",
"type": "main",
"index": 0
}
]
]
},
"OpenAI GPT-5.1": {
"ai_languageModel": [
[
{
"node": "AI Report Discovery Agent",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Schedule (Daily)": {
"main": [
[
{
"node": "Read Active Sources",
"type": "main",
"index": 0
}
]
]
},
"Loop Over Sources": {
"main": [
[
{
"node": "Completion Summary",
"type": "main",
"index": 0
}
],
[
{
"node": "Fetch Publication Page",
"type": "main",
"index": 0
}
]
]
},
"Log No Report Found": {
"main": [
[
{
"node": "Loop Over Sources",
"type": "main",
"index": 0
}
]
]
},
"Read Active Sources": {
"main": [
[
{
"node": "Loop Over Sources",
"type": "main",
"index": 0
}
]
]
},
"Valid Report Found?": {
"main": [
[
{
"node": "Save Discovered Report",
"type": "main",
"index": 0
}
],
[
{
"node": "Log No Report Found",
"type": "main",
"index": 0
}
]
]
},
"Fetch Publication Page": {
"main": [
[
{
"node": "Convert HTML to Markdown",
"type": "main",
"index": 0
}
]
]
},
"Save Discovered Report": {
"main": [
[
{
"node": "Loop Over Sources",
"type": "main",
"index": 0
}
]
]
},
"Convert HTML to Markdown": {
"main": [
[
{
"node": "AI Report Discovery Agent",
"type": "main",
"index": 0
}
]
]
},
"Structured Output Parser": {
"ai_outputParser": [
[
{
"node": "AI Report Discovery Agent",
"type": "ai_outputParser",
"index": 0
}
]
]
},
"AI Report Discovery Agent": {
"main": [
[
{
"node": "Validate & Normalize Output",
"type": "main",
"index": 0
}
]
]
},
"Called by Another Workflow": {
"main": [
[
{
"node": "Read Active Sources",
"type": "main",
"index": 0
}
]
]
},
"Validate & Normalize Output": {
"main": [
[
{
"node": "Valid Report Found?",
"type": "main",
"index": 0
}
]
]
}
}
}
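The agent's system prompt asks for absolute links, while the Validate & Normalize Output node only checks that the link starts with "http". As a minimal sketch, assuming you would rather resolve a relative link against the source page than discard it, a helper along these lines could be pasted into that Code node. The name `toAbsoluteUrl` is hypothetical and not part of the template; it relies only on the standard `URL` class available in Node.js.

```javascript
// Hypothetical helper (not part of the template): resolve a possibly relative
// link against the page it was found on, using the standard WHATWG URL class.
function toAbsoluteUrl(link, pageUrl) {
  if (!link) return "";
  try {
    const resolved = new URL(link, pageUrl);
    // Keep only web links; schemes such as mailto: are not downloadable reports.
    if (resolved.protocol !== "http:" && resolved.protocol !== "https:") {
      return "";
    }
    return resolved.href;
  } catch (err) {
    // new URL() throws when the link is relative and the page URL is unparseable.
    return "";
  }
}

// Example: a relative link found on a publications page.
console.log(toAbsoluteUrl("/reports/q4.pdf", "https://example.com/publications"));
```

Inside the node, the second argument would plausibly be `sourceData.Source_URL`, which the existing code already reads.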
Credentials you'll need
Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
googleSheetsOAuth2Api, openAiApi
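To double-check which credential types an export expects before importing it, you can scan each node's `credentials` object. The sketch below assumes the export has the shape shown above (a top-level `nodes` array whose entries may carry a `credentials` object); the function name `requiredCredentialTypes` and the trimmed-down `sample` object are illustrative only.

```javascript
// Collect the distinct credential type names referenced by a workflow export.
function requiredCredentialTypes(workflow) {
  const types = new Set();
  for (const node of workflow.nodes ?? []) {
    // Trigger, logic, and sticky-note nodes have no "credentials" key.
    for (const type of Object.keys(node.credentials ?? {})) {
      types.add(type);
    }
  }
  return [...types].sort();
}

// Trimmed-down stand-in for the export above (only the fields the function reads).
const sample = {
  nodes: [
    { name: "Manual Trigger" },
    {
      name: "Read Active Sources",
      credentials: { googleSheetsOAuth2Api: { name: "<your credential>" } },
    },
    {
      name: "OpenAI GPT-5.1",
      credentials: { openAiApi: { name: "<your credential>" } },
    },
  ],
};

console.log(requiredCredentialTypes(sample));
```

For this sample the function returns the two types listed above.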
About this workflow
AI-Powered Content Analysis - Uses advanced language models (GPT-4/GPT-5.1) to understand page context and identify downloadable reports, even when links aren't explicitly labeled, handling complex page layouts and dynamic content.
Structured Output Parsing - Enforces JSON schema…
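As a rough illustration of what the structured output buys you downstream, the sketch below checks an object for the five string fields used in the Structured Output Parser's example (`source`, `title`, `link`, `file_type`, `description`). It is an assumption-laden stand-in for testing outside n8n, not a description of how the parser node works internally, and `missingReportFields` is a hypothetical name.

```javascript
// Field names taken from the Structured Output Parser example in the workflow JSON.
const REQUIRED_FIELDS = ["source", "title", "link", "file_type", "description"];

// Return the names of required fields that are absent or not strings.
function missingReportFields(output) {
  if (output === null || typeof output !== "object") {
    return [...REQUIRED_FIELDS];
  }
  return REQUIRED_FIELDS.filter((field) => typeof output[field] !== "string");
}

// The "no report found" shape from the system prompt still has every field.
const noReport = {
  source: "Industry Research Corp",
  title: "No report found",
  link: "",
  file_type: "",
  description: "No downloadable reports were found on this page.",
};

console.log(missingReportFields(noReport));
console.log(missingReportFields({ title: "Only a title" }));
```

An empty result means the object has the expected shape; whether the link is actually usable is still decided by the validation step in the workflow.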
Source: https://n8n.io/workflows/11232/ (original creator credit).