This workflow corresponds to n8n.io template #12938, which we link to as the canonical source.

This workflow follows the Agent → Chat Trigger recipe pattern.

The workflow JSON

Copy the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.

{
  "id": "kVnbIQaqYFI6xtCw",
  "name": "arXiv Autonomous Research Agent",
  "tags": [],
  "nodes": [
    {
      "id": "a0d67119-0e16-4fc8-908b-3e04aac838c6",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2064,
        -448
      ],
      "parameters": {
        "width": 420,
        "height": 1344,
        "content": "## Try It Out!\n\n### This n8n template demonstrates how to perform intelligent literature search and summarization using AI, grounded in real arXiv papers.\n\nIt allows you to ask research questions and get **accurate, cited answers** based on the latest publications.\n\nUse cases include:\n- Quickly surveying recent papers in a field\n- Summarizing research trends\n- Collecting references for academic writing or review articles\n\n---\n\n### How it works\n\n* A **user query** is received via the Chat Trigger node.\n* The **Planning Agent** decides whether the question requires a general knowledge answer or a research-oriented response.\n* If a research query is detected, the **arXiv Search** node queries the arXiv API and retrieves recent relevant papers.\n* **JSON Parsers** process the API response and extract metadata such as titles, abstracts, and links.\n* The **arXiv-Grounded Agent** summarizes each paper and generates a final answer to the user question based strictly on retrieved content.\n* The final response includes **summaries and clickable citations** from arXiv.\n\n---\n\n### How to use\n\n* The **Chat Trigger node** is used for demonstration, but you can replace it with **webhooks, forms, or other triggers**.\n* You can adjust the **number of papers retrieved** or refine the search query logic to suit your needs.\n* The workflow does not require a vector database \u2014 it works entirely with metadata and abstracts.\n\n---\n\n### Requirements\n\n* **OpenAI account** for LLM access\n* **n8n** installed via Docker or cloud\n* (Optional) **.env file** for configuration of ports\n\n---\n\n### Need Help?\n\nSee self-hosting instructions on [GitHub](https://github.com/msilaev/n8n-arxiv-search-agent)\n\nAsk in the [n8n Forum](https://community.n8n.io/)!\n\nHappy Researching!\n"
      },
      "typeVersion": 1
    },
    {
      "id": "c5e753f0-c9e2-4e12-8fdb-6d1266693e58",
      "name": "arXiv Search",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -496,
        0
      ],
      "parameters": {
        "url": "=https://export.arxiv.org/api/query?search_query={{ \n  $json.search_query.map(phrase => {\n    // Split the phrase into individual words and prefix each with 'all:'\n    // Example: \"diffusion models audio\" becomes (all:diffusion+AND+all:models+AND+all:audio)\n    const terms = phrase.split(' ').map(word => 'all:' + encodeURIComponent(word.replace(/-/g, ' ')));\n    return '(' + terms.join('+AND+') + ')';\n  }).join('+OR+') \n}}&sortBy=submittedDate&sortOrder=descending&max_results={{ $json.max_papers || 15 }}\n",
        "options": {}
      },
      "typeVersion": 4
    },
    {
      "id": "95bf0587-7bb0-4591-9609-3aa25b02b4a7",
      "name": "Normalize arXiv",
      "type": "n8n-nodes-base.function",
      "position": [
        -208,
        0
      ],
      "parameters": {
        "functionCode": "const xml = $json;\nreturn [{ json: { raw: xml } }];"
      },
      "typeVersion": 1
    },
    {
      "id": "3aac8650-bb64-4a9c-8f83-64be4a2a287c",
      "name": "Planning Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -1360,
        96
      ],
      "parameters": {
        "options": {
          "systemMessage": "=You are a STRICT research planning agent.\n\nYour ONLY task is to decide whether NEW academic papers from arXiv\nmust be fetched to answer the user's last question.\n\nYou MUST assume that:\n- An internal vector database exists\n- It may already contain relevant papers\n- It may be INCOMPLETE\n\nYou must NOT:\n- Answer the question\n- Summarize content\n- Use prior agent responses\n- Ask follow-up questions\n\nDecision rules (follow strictly):\n\nSet need_search = false IF the question:\n- is definitional or introductory\n- is widely known\n- can likely be answered using general knowledge\n- can likely be answered using existing vector database content\n\nSet need_search = true IF the question:\n- concerns recent advances or state-of-the-art\n- involves mechanisms, architectures, or theory\n- requires comparison of research methods\n- might require academic citations\n- might NOT be sufficiently answered by the existing database\n\nIf unsure, choose need_search = true.\n\nOutput ONLY valid JSON.\nNo explanations.\nNo extra text.\n\nJSON schema:\n{\n  \"need_search\": true | false,\n  \"topic\": \"short topic label\",\n  \"search_query\": [\"query1\", \"query2\", \"query3\"],\n  \"max_papers\": 15,\n  \"iterate\": true | false,\n  \"initial user question\": [\"question\"]\n}\n"
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "3a0d1395-72a6-49ce-a33c-bc66fb6b787d",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -1288,
        320
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4.1-mini"
        },
        "options": {
          "topP": 0.9,
          "maxTokens": 5000,
          "maxRetries": 10,
          "temperature": 0.7
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "68a58dbc-0b04-481d-9d0d-cb170cf27e18",
      "name": "Json Parser 1",
      "type": "n8n-nodes-base.code",
      "position": [
        -1008,
        96
      ],
      "parameters": {
        "jsCode": "const raw = $json.output;\n\n// Remove markdown fences\nconst cleaned = raw\n  .replace(/```json/g, '')\n  .replace(/```/g, '')\n  .trim();\n\nreturn [\n  {\n    json: JSON.parse(cleaned)\n  }\n];\n\n"
      },
      "typeVersion": 2
    },
    {
      "id": "8289b228-b8bc-4f12-8030-e5f2e0ffb468",
      "name": "Check research q",
      "type": "n8n-nodes-base.if",
      "position": [
        -784,
        96
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "ab646c26-5a01-4b6a-9a4d-4531ac1843f2",
              "operator": {
                "type": "boolean",
                "operation": "true",
                "singleValue": true
              },
              "leftValue": "={{$json.need_search}}",
              "rightValue": "=true"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "b2a8ca67-f360-42bb-88bd-e7d48aec6cad",
      "name": "Json Parser 2",
      "type": "n8n-nodes-base.code",
      "position": [
        16,
        0
      ],
      "parameters": {
        "jsCode": "// Input: $json.raw.data\nconst xml = $json.raw.data;\n\n// Helper to extract single tag text\nfunction extractTag(tag, text) {\n  const match = text.match(new RegExp(`<${tag}[^>]*>([\\\\s\\\\S]*?)</${tag}>`, 'i'));\n  return match ? match[1].trim() : null;\n}\n\n// Helper to extract multiple tags\nfunction extractAllTags(tag, text) {\n  const regex = new RegExp(`<${tag}[^>]*>([\\\\s\\\\S]*?)</${tag}>`, 'gi');\n  let matches = [];\n  let m;\n  while ((m = regex.exec(text)) !== null) {\n    matches.push(m[1].trim());\n  }\n  return matches;\n}\n\n// Split entries\nconst entryBlocks = xml.split('<entry>').slice(1);\n\n// Get original user prompt safely\nconst originalPrompt =\n  $node[\"Check research q\"]?.json?.[\"initial user question\"] ?? null;\n\n// Build papers as n8n items\nconst papers = entryBlocks.map(entryXml => {\n  // Authors\n  const authorBlocks = entryXml.split('<author>').slice(1);\n  const authors = authorBlocks.map(a => extractTag('name', a)).filter(Boolean);\n\n  // Links\n  const linkTags = entryXml.match(/<link[^>]+\\/>/gi) || [];\n  let pdf_link = null;\n  let arxiv_link = null;\n\n  linkTags.forEach(l => {\n    const href = l.match(/href=\"([^\"]+)\"/)?.[1];\n    const rel = l.match(/rel=\"([^\"]+)\"/)?.[1];\n    const title = l.match(/title=\"([^\"]+)\"/)?.[1];\n\n    if (title === 'pdf' && href) pdf_link = href;\n    if (rel === 'alternate' && href) arxiv_link = href;\n  });\n\n  // Categories\n  const categories = extractAllTags('category', entryXml)\n    .map(c => c.match(/term=\"([^\"]+)\"/)?.[1])\n    .filter(Boolean);\n\n  return {\n    json: {\n      title: extractTag('title', entryXml),      \n      summary: extractTag('summary', entryXml),\n      arxiv_link      \n    }\n  };\n});\n\nreturn [\n  {\n    json: {\n      papers,\n      original_prompt: originalPrompt\n    }\n  }\n];\n\n"
      },
      "typeVersion": 2
    },
    {
      "id": "bc7d470d-f728-4f25-91ca-b8da17c0c6ec",
      "name": "General Reasoning Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -560,
        192
      ],
      "parameters": {
        "text": "={{ $json['initial user question'][0] }}",
        "options": {
          "maxIterations": 5,
          "systemMessage": "=You are a general knowledge assistant.\n\nYour task:\n\nExplicitly state that the user\u2019s question is general and that your answer is based on general knowledge.\n\nProvide a direct, concise answer using general knowledge.\n\nRules:\n\nDo NOT answer the user question using external sources, papers, or unpublished data.\n\nDo NOT speculate or invent information.\n\nDo NOT rephrase the user\u2019s question.\n\nIf no general knowledge is available to answer the question, respond exactly: \"No relevant stored papers found.\""
        },
        "promptType": "define"
      },
      "typeVersion": 2.2
    },
    {
      "id": "784fa2a5-5781-46fd-9c76-1a13bb9fd676",
      "name": "Arxiv Grounded Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        240,
        0
      ],
      "parameters": {
        "text": "={{ $json.original_prompt[0] }}\n\n{{ $json.papers.map(d => [\n    d.json.title || \"No title\",\n    d.json.summary || \"No summary\",\n    d.json.arxiv_link || \"No link\"\n].join('\\n')).join('\\n\\n') }}\n\n",
        "options": {
          "maxIterations": 10,
          "systemMessage": "=You are a summarization research assistant.\n\nYour task:\n\nSummarize the academic content from the provided documents.\n\nGroup content by paper when possible.\n\nInclude references derived from the input metadata.\n\nRules (follow strictly):\n\nDo NOT answer the user\u2019s question with your own knowledge.\n\nDo NOT add background information.\n\nDo NOT speculate or infer beyond the provided text.\n\nDo NOT restate or rephrase the user\u2019s question.\n\nDo NOT invent paper titles, authors, or citations.\n\nUse a paper title only if it is explicitly present in the input.\n\nIf multiple excerpts belong to the same paper, merge them into one summary.\n\nInclude references if a DOI, arXiv link, or other explicit reference is provided.\n\nIf no reference is present for a paper, include only the summary.\n\nSeparate summaries of different papers with two empty lines.\n\nEmpty-input rule:\n\nIf the input contains no papers, or all documents are empty or lack substantive academic content, output exactly:\n\nNo relevant stored papers found.\n\n\nOutput format:\nFor each paper, write:\n\nTitle: [paper title or leave blank if none]\n\nSummary: [summary text based strictly on the input]\n\nReference: [DOI, arXiv, or other reference if present; otherwise leave blank]\n\n\nSeparate each paper summary by two blank lines.\n\nAnswer the original user question using only the information explicitly provided in the input.\n\n\nMake the output easy to read, not JSON."
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 2.2
    },
    {
      "id": "509b660a-e37c-42af-bff8-74adf3b9d13e",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1402,
        -80
      ],
      "parameters": {
        "color": 7,
        "width": 756,
        "height": 336,
        "content": "## General or research question?\n\n"
      },
      "typeVersion": 1
    },
    {
      "id": "fdf953f1-b2a9-4701-b0cc-c550acaa86df",
      "name": "User asks a question",
      "type": "@n8n/n8n-nodes-langchain.chatTrigger",
      "position": [
        -1584,
        96
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.1
    },
    {
      "id": "e9a01b4e-7265-478f-a005-bc9632671b49",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -538,
        -176
      ],
      "parameters": {
        "color": 7,
        "width": 692,
        "height": 336,
        "content": "## API request to arXiv and response parsing\n"
      },
      "typeVersion": 1
    },
    {
      "id": "c8960751-0fc5-4b26-898f-9af09bfc9c1e",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        158,
        -176
      ],
      "parameters": {
        "color": 7,
        "width": 388,
        "height": 336,
        "content": "## Reply to Question Based on Documents Retrieved from arXiv\n"
      },
      "typeVersion": 1
    },
    {
      "id": "5584be47-5f13-4e59-a44c-6b6fc81201be",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -592,
        176
      ],
      "parameters": {
        "color": 7,
        "width": 388,
        "height": 336,
        "content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n## Reply to Question Based on General Knowledge\n"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "4990cb83-3301-4b27-881b-bd5fb65a662c",
  "connections": {
    "arXiv Search": {
      "main": [
        [
          {
            "node": "Normalize arXiv",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Json Parser 1": {
      "main": [
        [
          {
            "node": "Check research q",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Json Parser 2": {
      "main": [
        [
          {
            "node": "Arxiv Grounded Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Planning Agent": {
      "main": [
        [
          {
            "node": "Json Parser 1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Normalize arXiv": {
      "main": [
        [
          {
            "node": "Json Parser 2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check research q": {
      "main": [
        [
          {
            "node": "arXiv Search",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "General Reasoning Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "General Reasoning Agent",
            "type": "ai_languageModel",
            "index": 0
          },
          {
            "node": "Arxiv Grounded Agent",
            "type": "ai_languageModel",
            "index": 0
          },
          {
            "node": "Planning Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "User asks a question": {
      "main": [
        [
          {
            "node": "Planning Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
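
The URL expression in the arXiv Search node above can be hard to read inside an escaped JSON string. The following is a minimal sketch of the same logic as a standalone function. The name `buildArxivUrl` is hypothetical and not part of the workflow, and the `filter(Boolean)` step that skips empty words is an addition the original expression does not have.

```javascript
// Standalone reading of the URL expression in the "arXiv Search" node.
// buildArxivUrl is a hypothetical helper name used only for this sketch.
function buildArxivUrl(searchQueries, maxPapers) {
  const query = searchQueries
    .map((phrase) => {
      // Prefix each word with 'all:' and AND the words of one phrase together.
      // filter(Boolean) drops empty words from repeated spaces (an addition).
      const terms = phrase
        .split(' ')
        .filter(Boolean)
        .map((word) => 'all:' + encodeURIComponent(word.replace(/-/g, ' ')));
      return '(' + terms.join('+AND+') + ')';
    })
    // Alternative phrases from the Planning Agent are ORed together.
    .join('+OR+');

  return (
    'https://export.arxiv.org/api/query?search_query=' + query +
    '&sortBy=submittedDate&sortOrder=descending&max_results=' + (maxPapers || 15)
  );
}

// Usage: two phrases, five papers.
// search_query becomes
// (all:diffusion+AND+all:models+AND+all:audio)+OR+(all:text%20to%20speech)
console.log(buildArxivUrl(['diffusion models audio', 'text-to-speech'], 5));
```

Because results are sorted by `submittedDate` in descending order, the query favors the newest papers over the most relevant ones, which matches the workflow's stated goal of surveying recent publications.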


About this workflow

A user query is received via the Chat Trigger node. The Planning Agent decides whether the question requires a general knowledge answer or a research-oriented response. If a research query is detected, the arXiv Search node queries the arXiv API and retrieves recent relevant…
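
The planning and routing step described above can be sketched outside n8n as follows. This is an illustration under assumptions, not the workflow's own code: `parsePlan` and `routeFor` are hypothetical names, and the type checks are an addition motivated by the strict boolean comparison in the workflow's IF node.

```javascript
// parsePlan and routeFor are hypothetical names used only for this sketch.
const FENCE = '`'.repeat(3); // Markdown code fence the model may emit

function parsePlan(rawOutput) {
  // Strip Markdown fences (with or without a "json" tag) around the JSON.
  const cleaned = String(rawOutput)
    .replace(new RegExp(FENCE + '(?:json)?', 'g'), '')
    .trim();
  const plan = JSON.parse(cleaned); // throws SyntaxError on malformed JSON

  // Added validation: the IF node compares need_search as a strict boolean,
  // so a string such as "true" would not route as intended.
  if (typeof plan.need_search !== 'boolean') {
    throw new TypeError('need_search must be a boolean');
  }
  if (!Array.isArray(plan.search_query)) {
    throw new TypeError('search_query must be an array of strings');
  }
  return plan;
}

function routeFor(plan) {
  // The true branch goes to the arXiv search, the false branch to the
  // general knowledge agent.
  return plan.need_search ? 'arXiv Search' : 'General Reasoning Agent';
}

// Usage with a fenced model reply.
const reply =
  FENCE + 'json\n{"need_search": true, "search_query": ["graph transformers"]}\n' + FENCE;
console.log(routeFor(parsePlan(reply)));
```

Failing early on a wrongly typed `need_search` makes a misbehaving planner visible at the parsing step rather than as a silent wrong branch later.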

Source: https://n8n.io/workflows/12938/ (original creator credit).


Answer Research Questions Using Openai Gpt-4.1 and Arxiv Papers
