{
  "id": "ica4EwxtFqquNriy",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Set Paperless Document Title w/Tag Removal",
  "tags": [],
  "nodes": [
    {
      "id": "dac1caa7-ca26-4712-bbcf-f26f04b2cd52",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -80,
        832
      ],
      "parameters": {
        "width": 480,
        "height": 656,
        "content": "## Set Paperless Document Title w/Tag Removal\n\n### How it works\n\n1. The workflow is triggered by a webhook when a title needs updating.\n2. Validates the incoming URL provided in the webhook.\n3. Fetches document content based on the valid URL.\n4. Uses AI to generate a new document name.\n5. Removes unnecessary tags and updates the document title.\n\n### Setup steps\n\n- [ ] Configure the Webhook node to receive update requests.\n- [ ] Set up the API authentication details for all HTTP Request nodes.\n- [ ] Ensure the AI Agent has access to required models like OpenAI Chat.\n\n### Customization\n\nAdjustment of tags to remove can be customized in the \"Filter Out Tag to Remove\" code node."
      },
      "typeVersion": 1
    },
    {
      "id": "41e686d7-717f-4e22-a0bd-bbb2738aedf0",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        480,
        928
      ],
      "parameters": {
        "color": 7,
        "width": 640,
        "height": 464,
        "content": "## Trigger and validation\n\nStarts workflow on webhook and validates the URL."
      },
      "typeVersion": 1
    },
    {
      "id": "03ff65d0-e543-4ab2-9abd-848f2508cbcb",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1152,
        928
      ],
      "parameters": {
        "color": 7,
        "width": 640,
        "height": 368,
        "content": "## Fetch and evaluate document\n\nFetches document content and checks if it needs processing."
      },
      "typeVersion": 1
    },
    {
      "id": "e377cc84-6766-4b24-8db5-f59dfef5f4b2",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1584,
        832
      ],
      "parameters": {
        "color": 7,
        "width": 576,
        "height": 480,
        "content": "## AI document name generation\n\nGenerates a new document name using AI."
      },
      "typeVersion": 1
    },
    {
      "id": "fb80dceb-53f1-4aa9-9d12-9cb1bd539856",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2176,
        832
      ],
      "parameters": {
        "color": 7,
        "width": 640,
        "height": 272,
        "content": "## Tag removal and update\n\nIdentifies and removes unnecessary tags, then updates the document."
      },
      "typeVersion": 1
    },
    {
      "id": "7501a8b0-010d-4f12-a938-c35bce1553a3",
      "name": "Fetch Document Content",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1200,
        1040
      ],
      "parameters": {
        "url": "={{ $json.scheme }}://{{ $json.server }}/api/documents/{{ $json.documentId }}/",
        "options": {
          "response": {
            "response": {
              "fullResponse": true
            }
          }
        },
        "authentication": "genericCredentialType",
        "genericAuthType": "httpBasicAuth"
      },
      "credentials": {
        "httpBasicAuth": {
          "name": "<your credential>"
        }
      },
      "retryOnFail": true,
      "typeVersion": 4.3
    },
    {
      "id": "eae50ac5-0899-497c-88ee-3c9c1b5ada13",
      "name": "Apply Document Guardrails",
      "type": "@n8n/n8n-nodes-langchain.guardrails",
      "position": [
        1648,
        944
      ],
      "parameters": {
        "text": "={{ $json.body.content }}",
        "operation": "sanitize",
        "guardrails": {
          "pii": {
            "value": {
              "type": "all"
            }
          }
        }
      },
      "typeVersion": 2
    },
    {
      "id": "29811062-625a-4362-a0f4-001a80d4d5ba",
      "name": "Generate Title AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        1872,
        944
      ],
      "parameters": {
        "text": "={{ $json.guardrailsInput.substring(0, 10000) }}",
        "options": {
          "systemMessage": "Role: You are an expert Document Archivist and Information Architect. Your task is to analyze the text provided, which has been generated via OCR (Optical Character Recognition) from scanned documents.\n\nTask: Generate a concise, descriptive, and professional title for the document based on its primary subject matter, intent, and key entities (e.g., companies, dates, or specific project names).\n\nGuidelines:\n\n    Contextual Accuracy: Prioritize formal headers, dates, and recurring keywords to determine the document type (e.g., Invoice, Technical Specification, Meeting Minutes, Contract).\n\n    OCR Resilience: Ignore \"noise\" typical of OCR, such as garbled characters, misread page numbers, or fragmented headers.\n\n    Format: Output only the suggested title. Limit the length to no more than 60 characters. Do not include introductory text like \"The title is...\" or \"I suggest...\"\n\n    Naming Convention: Use a standard format: [YYYY-MM-DD] - [Document Type] - [Main Subject/Entity]. If a date is not found, omit it.\n\nTone: Professional and objective."
        },
        "promptType": "define"
      },
      "notesInFlow": false,
      "typeVersion": 3.1,
      "alwaysOutputData": false
    },
    {
      "id": "c4e951f8-75b7-4c35-8663-2ec94dfd1f87",
      "name": "OpenAI GPT-4 Mini",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        1944,
        1168
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini",
          "cachedResultName": "GPT-4o Mini"
        },
        "options": {
          "maxTokens": 4096
        },
        "builtInTools": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.3
    },
    {
      "id": "be586407-249a-4843-a02b-aa7e120010d9",
      "name": "Interrupt on Invalid URL",
      "type": "n8n-nodes-base.stopAndError",
      "position": [
        976,
        1232
      ],
      "parameters": {
        "errorMessage": "Input value for doc_url is not a url."
      },
      "typeVersion": 1
    },
    {
      "id": "c81573f7-2c8c-4f90-b440-bed570249b73",
      "name": "Set URL Components",
      "type": "n8n-nodes-base.set",
      "position": [
        976,
        1040
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "a734fa97-0bca-485b-8ce3-ae328c3d0e8d",
              "name": "server",
              "type": "string",
              "value": "={{ $json.body['x-doc-url'].extractDomain() }}"
            },
            {
              "id": "b2d29825-7452-4657-aed2-da4413f72cb7",
              "name": "scheme",
              "type": "string",
              "value": "={{ $json.body['x-doc-url'].split(':')[0] }}"
            },
            {
              "id": "853b2b68-2389-4a31-82c9-a95ad9a51f4e",
              "name": "documentId",
              "type": "string",
              "value": "={{ (() => {\nconst path = $json.body['x-doc-url'].extractUrlPath().split('/');\nconst id = path[path.length - 2];\nreturn id;\n})() }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "9890ada3-e94a-462a-bead-4a9639b61701",
      "name": "Check URL Validity",
      "type": "n8n-nodes-base.if",
      "position": [
        752,
        1136
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "187466b4-2c51-40ca-bb84-b53209fa24a8",
              "operator": {
                "type": "boolean",
                "operation": "true",
                "singleValue": true
              },
              "leftValue": "={{ $json.body['x-doc-url'].isUrl() }}",
              "rightValue": ""
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "07ecb804-889a-4f81-b687-048a4a5e0c29",
      "name": "Fetch Removal Tag ID",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        2224,
        944
      ],
      "parameters": {
        "url": "={{ $('Set URL Components').first().json.scheme }}://{{ $('Set URL Components').first().json.server }}/api/tags/?name__iexact={{ $('When Title Update Requested').item.json.body['x-tag-to-remove'] }}",
        "options": {},
        "authentication": "genericCredentialType",
        "genericAuthType": "httpBasicAuth"
      },
      "credentials": {
        "httpBasicAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.4
    },
    {
      "id": "5b685cba-fc5a-4d95-8a86-c39f9806f4a1",
      "name": "Execute Tag Removal",
      "type": "n8n-nodes-base.code",
      "position": [
        2448,
        944
      ],
      "parameters": {
        "jsCode": "// Using ?. to safely navigate and ?? to provide a default empty array\nconst tagIdToRemove = $('Fetch Removal Tag ID').first()?.json?.results?.[0]?.id;\nconst currentTags = $('Fetch Document Content').first()?.json?.body?.tags ?? [];\n\n// If tagIdToRemove is undefined, the filter simply returns the full currentTags array\nconst updatedTags = currentTags.filter(id => id !== tagIdToRemove);\n\nreturn {\n  updatedTags: updatedTags\n};"
      },
      "typeVersion": 2
    },
    {
      "id": "4b6ca323-c425-4f11-8cc9-17b67c26bb8e",
      "name": "Terminate Process",
      "type": "n8n-nodes-base.noOp",
      "position": [
        1648,
        1136
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "4001cfee-0f64-4485-9482-b3676d6e31ab",
      "name": "Apply Document Updates",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        2672,
        944
      ],
      "parameters": {
        "url": "={{ $('Set URL Components').first().json.scheme }}://{{ $('Set URL Components').first().json.server }}/api/documents/{{ $('Set URL Components').last().json.documentId }}/",
        "method": "PATCH",
        "options": {},
        "sendBody": true,
        "authentication": "genericCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "title",
              "value": "={{ $('Generate Title AI Agent').first().json.output }}"
            },
            {
              "name": "tags",
              "value": "={{ $json.updatedTags }}"
            }
          ]
        },
        "genericAuthType": "httpBasicAuth"
      },
      "credentials": {
        "httpBasicAuth": {
          "name": "<your credential>"
        }
      },
      "retryOnFail": true,
      "typeVersion": 4.3
    },
    {
      "id": "a5c92cd2-7a5f-43c0-a7eb-dc3361fa2726",
      "name": "If Document Content Exists",
      "type": "n8n-nodes-base.if",
      "position": [
        1424,
        1040
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 3,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "or",
          "conditions": [
            {
              "id": "5fbec441-1f29-4e12-8cc1-c2a48e814e2c",
              "operator": {
                "type": "string",
                "operation": "exists",
                "singleValue": true
              },
              "leftValue": "={{ $json.body.content }}",
              "rightValue": ""
            }
          ]
        }
      },
      "typeVersion": 2.3
    },
    {
      "id": "ce5ef9c8-ae58-478b-8e8d-9905321ad780",
      "name": "When Title Update Requested",
      "type": "n8n-nodes-base.webhook",
      "position": [
        528,
        1136
      ],
      "parameters": {
        "path": "update-document-title",
        "options": {},
        "httpMethod": "POST",
        "authentication": "headerAuth"
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 2.1
    }
  ],
  "active": true,
  "settings": {
    "binaryMode": "separate",
    "callerPolicy": "workflowsFromSameOwner",
    "availableInMCP": false,
    "executionOrder": "v1"
  },
  "versionId": "aea605f3-fc39-48ca-948d-159ec0695e03",
  "connections": {
    "OpenAI GPT-4 Mini": {
      "ai_languageModel": [
        [
          {
            "node": "Generate Title AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Check URL Validity": {
      "main": [
        [
          {
            "node": "Set URL Components",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Interrupt on Invalid URL",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set URL Components": {
      "main": [
        [
          {
            "node": "Fetch Document Content",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Execute Tag Removal": {
      "main": [
        [
          {
            "node": "Apply Document Updates",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Fetch Removal Tag ID": {
      "main": [
        [
          {
            "node": "Execute Tag Removal",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Fetch Document Content": {
      "main": [
        [
          {
            "node": "If Document Content Exists",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Generate Title AI Agent": {
      "main": [
        [
          {
            "node": "Fetch Removal Tag ID",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Apply Document Guardrails": {
      "main": [
        [
          {
            "node": "Generate Title AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "If Document Content Exists": {
      "main": [
        [
          {
            "node": "Apply Document Guardrails",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Terminate Process",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When Title Update Requested": {
      "main": [
        [
          {
            "node": "Check URL Validity",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}