Paperless-ngx Document Chat

Paperless-ngx Document Chat. Uses httpRequest, extractPdfContent, openAi. Webhook trigger; 9 nodes.

Category: AI & RAG | Trigger: Webhook | Nodes: 9 | Complexity: ★★★★☆ | AI nodes: yes

This workflow follows the HTTP Request → OpenAI recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate the workflow.

{
  "name": "Paperless-ngx Document Chat",
  "nodes": [
    {
      "parameters": {
        "httpMethod": "POST",
        "path": "paperless-chat",
        "responseMode": "responseNode",
        "options": {}
      },
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 1,
      "position": [
        240,
        300
      ],
      "notes": "This webhook accepts POST requests with a document ID and a user query in the request body. Example: { \"documentId\": 123, \"query\": \"What is the main topic of this document?\" }"
    },
    {
      "parameters": {
        "functionCode": "// Extract request data\nconst documentId = $input.item.json.body.documentId;\nconst userQuery = $input.item.json.body.query;\n\n// Validate inputs\nif (!documentId) {\n  throw new Error('Document ID is required');\n}\n\nif (!userQuery) {\n  throw new Error('Query is required');\n}\n\n// Return the data for the next nodes\nreturn {\n  documentId,\n  userQuery\n};"
      },
      "name": "Extract Request Data",
      "type": "n8n-nodes-base.function",
      "typeVersion": 1,
      "position": [
        460,
        300
      ],
      "notes": "Extracts and validates the document ID and user query from the incoming webhook request."
    },
    {
      "parameters": {
        "requestMethod": "GET",
        "url": "={{ $env.PAPERLESS_API_URL }}/api/documents/{{ $json.documentId }}/",
        "options": {
          "allowUnauthorizedCerts": true
        },
        "headerParametersUi": {
          "parameter": [
            {
              "name": "Authorization",
              "value": "=Token {{ $env.PAPERLESS_TOKEN }}"
            },
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        }
      },
      "name": "Fetch Document",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 2,
      "position": [
        680,
        300
      ],
      "notes": "Fetches document metadata from Paperless-ngx API. Uses environment variables for API URL and token. Set these in your n8n environment variables or workflow variables."
    },
    {
      "parameters": {
        "requestMethod": "GET",
        "url": "={{ $env.PAPERLESS_API_URL }}/api/documents/{{ $('Extract Request Data').item.json.documentId }}/download/",
        "options": {
          "allowUnauthorizedCerts": true,
          "response": {
            "binaryPropertyName": "document"
          }
        },
        "headerParametersUi": {
          "parameter": [
            {
              "name": "Authorization",
              "value": "=Token {{ $env.PAPERLESS_TOKEN }}"
            }
          ]
        }
      },
      "name": "Download Document",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 2,
      "position": [
        900,
        300
      ],
      "notes": "Downloads the actual document file from Paperless-ngx and stores it as binary data for processing."
    },
    {
      "parameters": {
        "dataPropertyName": "document",
        "options": {}
      },
      "name": "Extract Document Text",
      "type": "n8n-nodes-base.extractPdfContent",
      "typeVersion": 1,
      "position": [
        1120,
        300
      ],
      "notes": "Extracts text content from the PDF document. If your documents are not PDFs, you may need to use a different node or custom function to extract text."
    },
    {
      "parameters": {
        "functionCode": "// Combine document metadata and content\n// Metadata comes from the Fetch Document node; this node's direct input is the PDF extraction output\nconst documentData = $('Fetch Document').item.json;\nconst textContent = $('Extract Document Text').item.json.text;\nconst userQuery = $('Extract Request Data').item.json.userQuery;\n\n// Get metadata fields\nconst title = documentData.title || 'Untitled';\nconst created_date = documentData.created_date || 'Unknown';\nconst correspondent = documentData.correspondent_name || 'Unknown';\nconst tags = documentData.tags || [];\n\n// Prepare prompt context\nconst documentContext = {\n  title,\n  created_date,\n  correspondent,\n  tags,\n  content: textContent,\n  userQuery\n};\n\nreturn documentContext;"
      },
      "name": "Prepare Document Context",
      "type": "n8n-nodes-base.function",
      "typeVersion": 1,
      "position": [
        1340,
        300
      ],
      "notes": "Combines document metadata (title, date, tags, etc.) with the extracted text content and user query to create a comprehensive context for the AI."
    },
    {
      "parameters": {
        "authentication": "apiKey",
        "apiKey": "={{ $env.OPENAI_API_KEY }}",
        "baseUrl": "https://api.openai.com",
        "model": "gpt-4",
        "prompt": {
          "promptType": "user",
          "message": "=I'm going to provide you with a document and a question about that document. Please respond to the question based only on the information contained in the document.\n\nDocument Title: {{ $json.title }}\nCreated Date: {{ $json.created_date }}\nCorrespondent: {{ $json.correspondent }}\nTags: {{ $json.tags }}\n\nDocument Content:\n{{ $json.content }}\n\nUser Question: {{ $json.userQuery }}\n\nPlease provide a concise, accurate response to the question based only on the contents of this document. If the document doesn't contain information to answer the question, please state that clearly."
        },
        "options": {
          "temperature": 0.2,
          "maxTokens": 1500
        }
      },
      "name": "Process with AI",
      "type": "n8n-nodes-base.openAi",
      "typeVersion": 1,
      "position": [
        1560,
        300
      ],
      "notes": "Processes the document content and user query with OpenAI's GPT model. The prompt instructs the AI to only use information from the document to answer the question. You can adjust the model, temperature, and max tokens as needed."
    },
    {
      "parameters": {
        "keepOnlySet": true,
        "values": {
          "string": [
            {
              "name": "documentId",
              "value": "={{ $('Extract Request Data').item.json.documentId }}"
            },
            {
              "name": "query",
              "value": "={{ $('Extract Request Data').item.json.userQuery }}"
            },
            {
              "name": "documentTitle",
              "value": "={{ $('Fetch Document').item.json.title }}"
            },
            {
              "name": "response",
              "value": "={{ $('Process with AI').item.json.response }}"
            }
          ]
        },
        "options": {}
      },
      "name": "Format Response",
      "type": "n8n-nodes-base.set",
      "typeVersion": 2,
      "position": [
        1780,
        300
      ],
      "notes": "Formats the final response with relevant information including document ID, title, original query, and the AI's response."
    },
    {
      "parameters": {
        "respondWithJson": true,
        "responseData": "={{ $json }}",
        "options": {}
      },
      "name": "Respond to Webhook",
      "type": "n8n-nodes-base.respondToWebhook",
      "typeVersion": 1,
      "position": [
        2000,
        300
      ],
      "notes": "Sends the formatted response back to the original webhook caller with the AI's answer and relevant document information."
    }
  ],
  "connections": {
    "Webhook": {
      "main": [
        [
          {
            "node": "Extract Request Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Request Data": {
      "main": [
        [
          {
            "node": "Fetch Document",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Fetch Document": {
      "main": [
        [
          {
            "node": "Download Document",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Download Document": {
      "main": [
        [
          {
            "node": "Extract Document Text",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Document Text": {
      "main": [
        [
          {
            "node": "Prepare Document Context",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Prepare Document Context": {
      "main": [
        [
          {
            "node": "Process with AI",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Process with AI": {
      "main": [
        [
          {
            "node": "Format Response",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Format Response": {
      "main": [
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "settings": {
    "executionOrder": "v1",
    "saveManualExecutions": true,
    "callerPolicy": "workflowCredentialUser",
    "errorWorkflow": ""
  },
  "staticData": null,
  "versionId": "",
  "triggerCount": 0,
  "tags": [
    "paperless-ngx",
    "document-management",
    "ai",
    "chat"
  ],
  "notes": "# Paperless-ngx Document Chat Workflow\n\n## Overview\nThis workflow allows users to chat with documents stored in Paperless-ngx by sending queries through a webhook. The workflow fetches the document from Paperless-ngx, extracts its content, processes the user's query with an AI model, and returns a relevant response.\n\n## Setup Requirements\n1. Configure the following environment variables in n8n:\n   - `PAPERLESS_API_URL`: URL of your Paperless-ngx instance (e.g., http://paperless.local:8000)\n   - `PAPERLESS_TOKEN`: API token from Paperless-ngx\n   - `OPENAI_API_KEY`: Your OpenAI API key\n\n2. Activate the webhook to receive document queries.\n\n## Using the Workflow\nSend a POST request to the webhook URL with a JSON body containing:\n```json\n{\n  \"documentId\": 123,\n  \"query\": \"What is the main topic of this document?\"\n}\n```\n\nThe workflow will respond with the AI's analysis of the document content related to your query.\n\n## Customization Options\n- Modify the AI prompt in the \"Process with AI\" node to change how the AI interprets documents\n- Adjust the OpenAI model and parameters like temperature and max tokens\n- Add additional document processing steps for different file types\n\n## Integration with MCP\nThis workflow works with the Paperless-ngx MCP integration by accessing documents directly through the Paperless-ngx API. If you're using the Model Context Protocol server from this repository, you can modify this workflow to use the MCP endpoints instead."
}
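
The Fetch Document and Download Document nodes in the JSON above call two Paperless-ngx endpoints with a `Token` authorization header. The following is a minimal sketch of the same two requests outside n8n, useful for checking the API URL and token before importing the workflow. It assumes Node.js 18+ (global `fetch`); the function names `documentUrls`, `paperlessHeaders`, and `fetchDocument` are hypothetical, and stripping a trailing slash from the base URL is an added convenience rather than something the workflow does.

```javascript
// Builds the two Paperless-ngx URLs the workflow requests for one document.
function documentUrls(baseUrl, documentId) {
  // Assumption: tolerate a trailing slash in PAPERLESS_API_URL.
  const root = baseUrl.replace(/\/+$/, '');
  const id = encodeURIComponent(String(documentId));
  return {
    metadata: `${root}/api/documents/${id}/`,
    download: `${root}/api/documents/${id}/download/`,
  };
}

// Same header format as the workflow's HTTP Request nodes.
function paperlessHeaders(token) {
  return { Authorization: `Token ${token}` };
}

// Fetches the metadata (JSON) and the file (binary) for one document.
async function fetchDocument(baseUrl, token, documentId) {
  const urls = documentUrls(baseUrl, documentId);
  const headers = paperlessHeaders(token);

  const metaRes = await fetch(urls.metadata, { headers });
  if (!metaRes.ok) {
    throw new Error(`Metadata request failed with HTTP ${metaRes.status}`);
  }

  const fileRes = await fetch(urls.download, { headers });
  if (!fileRes.ok) {
    throw new Error(`Download request failed with HTTP ${fileRes.status}`);
  }

  return {
    metadata: await metaRes.json(),
    file: Buffer.from(await fileRes.arrayBuffer()),
  };
}
```

Unlike the workflow's nodes, which set `allowUnauthorizedCerts` to true, this sketch leaves TLS certificate verification at its default.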

About this workflow

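This workflow receives a document ID and a question through a webhook, fetches the document's metadata and file from Paperless-ngx, extracts the text from the PDF, asks an OpenAI model to answer the question using only that text, and returns the answer as JSON. It uses the httpRequest, extractPdfContent, and openAi nodes, 9 nodes in total.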
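
The workflow's notes describe sending a POST request whose JSON body contains `documentId` and `query`, and the Format Response node returns `documentId`, `query`, `documentTitle`, and `response`. A client for that exchange might look like the sketch below. It assumes Node.js 18+ (global `fetch`); `buildChatRequest` and `askDocument` are hypothetical helper names, and the URL in the usage comment assumes a local n8n instance on its default port, so substitute the URL n8n shows for your Webhook node.

```javascript
// Hypothetical helper: builds the fetch() arguments for the workflow's webhook.
// Mirrors the checks done by the "Extract Request Data" node so bad input
// fails before any request is sent.
function buildChatRequest(webhookUrl, documentId, query) {
  if (documentId === undefined || documentId === null || documentId === '') {
    throw new Error('Document ID is required');
  }
  if (!query) {
    throw new Error('Query is required');
  }
  return {
    url: webhookUrl,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ documentId, query }),
    },
  };
}

// Sends the question to the webhook and returns the parsed JSON reply.
async function askDocument(webhookUrl, documentId, query) {
  const { url, options } = buildChatRequest(webhookUrl, documentId, query);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Webhook returned HTTP ${res.status}`);
  }
  return res.json();
}

// Usage (assumed URL; not executed here):
// askDocument('http://localhost:5678/webhook/paperless-chat', 123,
//   'What is the main topic of this document?')
//   .then((reply) => console.log(reply.documentTitle, reply.response));
```

Keeping request construction separate from the network call makes the payload easy to inspect or test without a running n8n instance.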

Source: https://github.com/PDangelmaier/paperless-ngx-n8n-integration/blob/0b459403e64703b66cd1792c0ed39fc6b0c5b3b0/examples/paperless_document_chat_workflow.json (original creator credit).
