arXiv Paper Summarization with ChatGPT

By Teddy (@teddysbeach) on n8n.io

This workflow is designed for researchers, students, and professionals who frequently read academic papers and need concise summaries. It is useful for anyone who wants to quickly extract key information from research papers hosted on arXiv.

Category: AI & RAG · Trigger: Webhook · Nodes: 12 · Complexity: ★★★★☆ · AI nodes: yes
Integrations: Chain Summarization, OpenAI Chat, HTTP Request, OpenAI, Information Extractor

This workflow corresponds to n8n.io template #2904 — we link there as the canonical source.

This workflow follows the Chain Summarization → HTTP Request recipe pattern.

The workflow JSON

Copy or download the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.

{
  "id": "5IAbyLhZX99QS1ff",
  "name": "Webhook | Paper Summarization",
  "tags": [],
  "nodes": [
    {
      "id": "cf2bfb6d-c5ae-46e8-9382-274f37129291",
      "name": "Summarization Chain",
      "type": "@n8n/n8n-nodes-langchain.chainSummarization",
      "position": [
        1000,
        0
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 2
    },
    {
      "id": "b1ab5c2c-f7df-4f2b-bf2d-e3f11a76b691",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        1000,
        140
      ],
      "parameters": {
        "model": "gpt-3.5-turbo",
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "da17d97e-218a-466d-b356-ccdb63120626",
      "name": "Request to Paper Page",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        340,
        0
      ],
      "parameters": {
        "url": "=https://arxiv.org/html/{{ $json.query.id }}",
        "options": {}
      },
      "typeVersion": 4.2
    },
    {
      "id": "034a05a8-78ef-4037-a41d-e2ae7e747378",
      "name": "Extract Contents",
      "type": "n8n-nodes-base.html",
      "position": [
        500,
        0
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "extractionValues": {
          "values": [
            {
              "key": "abstract",
              "cssSelector": "div.ltx_abstract"
            },
            {
              "key": "sections",
              "cssSelector": "div.ltx_para",
              "returnArray": true
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "88cae164-4e9c-40ab-bbd7-b070863d222d",
      "name": "Split out All Sections",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        660,
        0
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "sections"
      },
      "typeVersion": 1
    },
    {
      "id": "047ca863-0ffc-40e2-ac1e-57cdea011063",
      "name": "Remove useless links",
      "type": "n8n-nodes-base.set",
      "position": [
        840,
        0
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "4a821a88-9adc-4f9e-8a29-25b03bf7f5a3",
              "name": "sections",
              "type": "string",
              "value": "={{ $json.sections.replaceAll(/\\[.*?\\]/g, '')}}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "3ccb4f31-fd75-4eac-be93-08b22c48ea7f",
      "name": "Aggregate summarzied content",
      "type": "n8n-nodes-base.aggregate",
      "position": [
        1300,
        0
      ],
      "parameters": {
        "options": {},
        "fieldsToAggregate": {
          "fieldToAggregate": [
            {
              "fieldToAggregate": "response.text"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "f836d2c0-a28c-4704-a2cc-ec13e3dabfd7",
      "name": "Reorganize Paper Summary",
      "type": "@n8n/n8n-nodes-langchain.openAi",
      "position": [
        1440,
        0
      ],
      "parameters": {
        "modelId": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-3.5-turbo",
          "cachedResultName": "GPT-3.5-TURBO"
        },
        "options": {},
        "messages": {
          "values": [
            {
              "role": "system",
              "content": "=Based on the provided research paper text, generate a summary divided into the following four sections. Each section must include the required details listed below:\n\nAbstract Overview:\n\nSummarize the research topic, objectives, and methodology.\nHighlight the main results and conclusions.\nConvey the overall core message of the paper.\nIntroduction:\n\nOutline the background and motivation behind the research.\nDiscuss existing literature, emphasizing differences or gaps.\nClearly state the necessity and objectives of the study.\nResults:\n\nPresent the key experimental results and data analysis.\nHighlight significant findings, including any important figures or data points if applicable.\nProvide a brief interpretation of the results.\nConclusion:\n\nSummarize the implications and significance of the findings.\nMention any limitations of the study.\nOffer suggestions for future research and state the final conclusions.\nEnsure that each section includes all critical details while avoiding unnecessary elaboration. The summary should flow logically from one section to the next, reflecting the overall structure and content of the original paper."
            },
            {
              "content": "={{ $json.text.join('|') }}"
            }
          ]
        },
        "simplify": false
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.7
    },
    {
      "id": "99d0edeb-5265-45c8-ad5f-7158bbd0e5d8",
      "name": "Content Extractor",
      "type": "@n8n/n8n-nodes-langchain.informationExtractor",
      "position": [
        1740,
        0
      ],
      "parameters": {
        "text": "={{ $json.choices[0].message.content }}",
        "options": {},
        "attributes": {
          "attributes": [
            {
              "name": "Abstract Overview",
              "required": true,
              "description": "the abstract overview in short"
            },
            {
              "name": "Introduction",
              "required": true,
              "description": "Describe the context, motivation, and problem statement, indeed."
            },
            {
              "name": "Results",
              "required": true,
              "description": "Outline the main results or findings of the study, indeed."
            },
            {
              "name": "Conclusion",
              "required": true,
              "description": "Conclude with the overall achievements and contributions of the paper, indeed."
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "71e4f88d-a13f-4d1e-a6af-73c3570dc1b7",
      "name": "OpenAI Chat Model1",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        1920,
        140
      ],
      "parameters": {
        "model": "gpt-3.5-turbo",
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "b82b133e-b2e5-4ed6-9157-f0b4cdd3e4d3",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "position": [
        140,
        0
      ],
      "parameters": {
        "path": "paper-summarization",
        "options": {},
        "responseMode": "responseNode"
      },
      "typeVersion": 2
    },
    {
      "id": "1789c207-0d0d-4aaa-a6a7-0185972ce8ad",
      "name": "Respond to Webhook",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        2060,
        0
      ],
      "parameters": {
        "options": {},
        "respondWith": "json",
        "responseBody": "={{ $json.output }}"
      },
      "typeVersion": 1.1
    }
  ],
  "active": true,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "04055cba-004d-47ed-8a08-c1c3bb47debd",
  "connections": {
    "Webhook": {
      "main": [
        [
          {
            "node": "Request to Paper Page",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Contents": {
      "main": [
        [
          {
            "node": "Split out All Sections",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Content Extractor": {
      "main": [
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "Summarization Chain",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model1": {
      "ai_languageModel": [
        [
          {
            "node": "Content Extractor",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Summarization Chain": {
      "main": [
        [
          {
            "node": "Aggregate summarzied content",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Remove useless links": {
      "main": [
        [
          {
            "node": "Summarization Chain",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Request to Paper Page": {
      "main": [
        [
          {
            "node": "Extract Contents",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split out All Sections": {
      "main": [
        [
          {
            "node": "Remove useless links",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Reorganize Paper Summary": {
      "main": [
        [
          {
            "node": "Content Extractor",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Aggregate summarzied content": {
      "main": [
        [
          {
            "node": "Reorganize Paper Summary",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
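
Two inline expressions in the JSON above do the text plumbing around the model calls. "Remove useless links" deletes every bracketed span from each extracted paragraph, and "Reorganize Paper Summary" joins the per-paragraph summaries with `|` before sending them to the model. The following is a rough Python equivalent for illustration only: n8n evaluates the originals as JavaScript expressions, and the function names and sample sentence here are hypothetical.

```python
import re

# Non-greedy match of anything between square brackets on a single line,
# mirroring the workflow's /\[.*?\]/g expression.
BRACKETED = re.compile(r"\[.*?\]")


def remove_bracketed(section: str) -> str:
    """Drop bracketed spans such as citation markers like [12]."""
    return BRACKETED.sub("", section)


def join_summaries(summaries: list[str]) -> str:
    """Concatenate per-section summaries the way `$json.text.join('|')` does."""
    return "|".join(summaries)


# Hypothetical sample paragraph, not taken from any real paper.
sample = "Transformers [1] outperform RNNs [2, 3] on this task."
print(remove_bracketed(sample))
print(join_summaries(["first summary", "second summary"]))
```

Because the pattern matches any bracketed text, it would also strip non-citation content such as interval notation like `[0, 1]`, and it leaves the surrounding whitespace untouched.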

Credentials you'll need

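Each integration node will prompt for credentials when you import. We strip credential IDs before publishing, so you'll add your own. In this workflow, the three OpenAI nodes (OpenAI Chat Model, OpenAI Chat Model1, and Reorganize Paper Summary) each expect an OpenAI API credential.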
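
To see up front which nodes will ask for credentials, one option is to scan the exported JSON for `credentials` entries before importing. The sketch below assumes the node layout shown in the workflow JSON; the embedded excerpt is trimmed to three nodes for brevity, and the helper name is hypothetical.

```python
import json

# Trimmed excerpt with the same shape as the workflow JSON on this page.
WORKFLOW_EXCERPT = """
{
  "nodes": [
    {"name": "Summarization Chain",
     "type": "@n8n/n8n-nodes-langchain.chainSummarization"},
    {"name": "OpenAI Chat Model",
     "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
     "credentials": {"openAiApi": {"name": "<your credential>"}}},
    {"name": "Reorganize Paper Summary",
     "type": "@n8n/n8n-nodes-langchain.openAi",
     "credentials": {"openAiApi": {"name": "<your credential>"}}}
  ]
}
"""


def nodes_needing_credentials(workflow: dict) -> dict[str, list[str]]:
    """Map each node name to the credential types it declares."""
    return {
        node["name"]: sorted(node["credentials"])
        for node in workflow.get("nodes", [])
        if node.get("credentials")
    }


needed = nodes_needing_credentials(json.loads(WORKFLOW_EXCERPT))
for name, kinds in needed.items():
    print(f"{name}: {', '.join(kinds)}")
```

Run against the full file instead of the excerpt, the same function would list every node that carries a `<your credential>` placeholder.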

About this workflow

This workflow is designed for researchers, students, and professionals who frequently read academic papers and need concise summaries. It is useful for anyone who wants to quickly extract key information from research papers hosted on arXiv.
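
In practice, a client triggers the workflow by calling the Webhook node with the arXiv ID in an `id` query parameter, which the "Request to Paper Page" node substitutes into `https://arxiv.org/html/{id}`. Here is a minimal sketch of the two URLs involved. The host name and paper ID are hypothetical, and the `/webhook/` prefix assumes n8n's default production webhook path on your instance.

```python
from urllib.parse import urlencode

# Path taken from the Webhook node's "path" parameter in the workflow JSON.
WEBHOOK_PATH = "paper-summarization"


def build_webhook_url(base_url: str, paper_id: str) -> str:
    """Build the URL that triggers the workflow for one arXiv paper.

    `base_url` is an assumption: it depends on where your n8n instance runs.
    """
    query = urlencode({"id": paper_id})
    return f"{base_url.rstrip('/')}/webhook/{WEBHOOK_PATH}?{query}"


def arxiv_html_url(paper_id: str) -> str:
    """The page the 'Request to Paper Page' node fetches for this ID."""
    return f"https://arxiv.org/html/{paper_id}"


# Hypothetical host and paper ID, for illustration only.
print(build_webhook_url("https://n8n.example.com", "2401.00001"))
print(arxiv_html_url("2401.00001"))
```

The response body is the Information Extractor's `output`, which the workflow configures with four attributes: Abstract Overview, Introduction, Results, and Conclusion. arXiv does not serve an HTML version for every paper, so the request may fail for some IDs.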

Source: https://n8n.io/workflows/2904/ (original creator credit).
