Automate Document Processing with Dify & Gemini

Original n8n title: Document Processing - DocuSearch

Document Processing - DocuSearch. Uses executeWorkflowTrigger, httpRequest. Event-driven trigger; 14 nodes.

Category: Web Scraping · Trigger: Event · Nodes: 14 · Complexity: ★★★★☆ · Integrations: Execute Workflow Trigger, HTTP Request

This workflow follows the Execute Workflow Trigger → HTTP Request recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.

{
  "name": "Document Processing - DocuSearch",
  "nodes": [
    {
      "parameters": {},
      "id": "workflow-trigger",
      "name": "\u30ef\u30fc\u30af\u30d5\u30ed\u30fc\u958b\u59cb",
      "type": "n8n-nodes-base.executeWorkflowTrigger",
      "typeVersion": 1,
      "position": [
        0,
        0
      ]
    },
    {
      "parameters": {
        "jsCode": "// Prepare the document metadata\nconst item = $input.first();\nconst binaryData = item.binary?.data;\n\nif (!binaryData) {\n  return [{\n    json: {\n      error: 'No binary data found',\n      success: false\n    }\n  }];\n}\n\n// Get the file name\nconst fileName = binaryData.fileName || item.json.name || 'unknown';\n\n// Build the full path from binaryData.directory\nconst directory = binaryData.directory || item.json.directory || '/watch/documents';\nlet originalPath = `${directory}/${fileName}`;\n\n// Use the path passed by the caller, if any\nif (item.json.path) {\n  originalPath = item.json.path;\n}\n\n// Derive the relative path (strip the /watch/ prefix)\nlet relativePath = item.json.relativePath;\nif (!relativePath) {\n  relativePath = originalPath.replace(/^\\/watch\\//, '');\n}\n\n// Get the file extension\nconst ext = fileName.split('.').pop()?.toLowerCase() || '';\n\n// Determine the file type\nconst textExtensions = ['txt', 'md', 'mdx', 'csv', 'json', 'xml', 'html', 'htm', 'yaml', 'yml', 'properties'];\nconst isPdf = ext === 'pdf';\nconst isTextFile = textExtensions.includes(ext);\n\n// File type: 'text', 'pdf', or 'binary'\nlet fileType = 'binary';\nif (isTextFile) fileType = 'text';\nelse if (isPdf) fileType = 'pdf';\n\n// Read the binary buffer\nconst binaryBuffer = await this.helpers.getBinaryDataBuffer(0, 'data');\nconst base64String = binaryBuffer.toString('base64');\n\n// Decode text files as UTF-8\nlet textContent = '';\nif (isTextFile) {\n  try {\n    textContent = binaryBuffer.toString('utf-8');\n  } catch (e) {\n    textContent = '';\n  }\n}\n\nreturn [{\n  json: {\n    originalPath: originalPath,\n    relativePath: relativePath,\n    fileName: fileName,\n    extension: ext,\n    fileType: fileType,\n    isTextFile: isTextFile,\n    isPdf: isPdf,\n    textContent: textContent,\n    base64Data: base64String,\n    mimeType: binaryData.mimeType || 'application/octet-stream',\n    fileSize: binaryData.fileSize || 0\n  },\n  binary: item.binary\n}];"
      },
      "id": "prepare-document",
      "name": "\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u6e96\u5099",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        220,
        0
      ]
    },
    {
      "parameters": {
        "conditions": {
          "string": [
            {
              "value1": "={{ $json.fileType }}",
              "value2": "text"
            }
          ]
        }
      },
      "id": "check-is-text",
      "name": "\u30c6\u30ad\u30b9\u30c8\u5224\u5b9a",
      "type": "n8n-nodes-base.if",
      "typeVersion": 1,
      "position": [
        440,
        0
      ]
    },
    {
      "parameters": {
        "conditions": {
          "string": [
            {
              "value1": "={{ $json.fileType }}",
              "value2": "pdf"
            }
          ]
        }
      },
      "id": "check-is-pdf",
      "name": "PDF\u5224\u5b9a",
      "type": "n8n-nodes-base.if",
      "typeVersion": 1,
      "position": [
        660,
        100
      ]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "http://dify-api:5001/v1/datasets/b6e67602-dbd0-4777-bf44-de9ab18a8a11/document/create-by-text",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={\n  \"name\": \"{{ $json.relativePath }}\",\n  \"text\": {{ JSON.stringify('\u25a0\u30d5\u30a1\u30a4\u30eb\u540d: ' + $json.fileName + '\\n\u25a0\u30d5\u30a1\u30a4\u30eb\u30d1\u30b9: ' + $json.relativePath + '\\n\u25a0\u30d5\u30a1\u30a4\u30ebURL: http://localhost/docs/' + encodeURI($json.relativePath) + '\\n\\n\u25a0\u5185\u5bb9:\\n' + $json.textContent) }},\n  \"indexing_technique\": \"high_quality\",\n  \"process_rule\": {\n    \"mode\": \"automatic\"\n  }\n}",
        "options": {
          "timeout": 120000
        }
      },
      "id": "dify-upload-text",
      "name": "Dify\u30c6\u30ad\u30b9\u30c8\u767b\u9332",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.1,
      "position": [
        660,
        -100
      ],
      "retryOnFail": true,
      "maxTries": 3,
      "waitBetweenTries": 5000,
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "jsCode": "// Build the Gemini API request payload from the PDF's Base64 data\nconst item = $input.first();\nconst base64Data = item.json.base64Data;\nconst fileName = item.json.fileName;\n\nif (!base64Data) {\n  throw new Error('No base64Data found for PDF');\n}\n\n// Build the Gemini API request payload\n// The Japanese prompt asks for a structured extraction: document type, property details (when applicable), amounts, dates, parties, a 2-3 sentence summary, search keywords, and the full text\nconst requestPayload = {\n  contents: [{\n    parts: [\n      {\n        text: `\u3053\u306ePDF\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u306e\u5185\u5bb9\u3092\u8a73\u7d30\u306b\u62bd\u51fa\u3057\u3066\u304f\u3060\u3055\u3044\u3002\n\n## \u51fa\u529b\u5f62\u5f0f\n\u4ee5\u4e0b\u306e\u5f62\u5f0f\u3067\u69cb\u9020\u5316\u3057\u3066\u51fa\u529b\u3057\u3066\u304f\u3060\u3055\u3044\uff1a\n\n\u3010\u6587\u66f8\u7a2e\u5225\u3011\uff08\u5951\u7d04\u66f8\u3001\u898b\u7a4d\u66f8\u3001\u7269\u4ef6\u60c5\u5831\u3001\u8b70\u4e8b\u9332\u3001\u8acb\u6c42\u66f8\u3001\u305d\u306e\u4ed6\uff09\n\n\u3010\u7269\u4ef6\u60c5\u5831\u306e\u5834\u5408\u3011\u203b\u8a72\u5f53\u3059\u308b\u5834\u5408\u306e\u307f\n\u30fb\u8cc3\u6599: \u25cb\u25cb\u5186/\u6708\uff08\u7ba1\u7406\u8cbb\u8fbc\u307f\u306e\u5834\u5408\u306f\u5408\u8a08\u984d\u3082\uff09\n\u30fb\u7ba1\u7406\u8cbb/\u5171\u76ca\u8cbb: \u25cb\u25cb\u5186/\u6708\n\u30fb\u6577\u91d1: \u25cb\u30f6\u6708\u5206\n\u30fb\u793c\u91d1: \u25cb\u30f6\u6708\u5206\n\u30fb\u6240\u5728\u5730: \u90fd\u9053\u5e9c\u770c\u304b\u3089\u756a\u5730\u307e\u3067\n\u30fb\u6700\u5bc4\u99c5: \u25cb\u25cb\u7dda\u25cb\u25cb\u99c5 \u5f92\u6b69\u25cb\u5206\n\u30fb\u9593\u53d6\u308a: \u25cbLDK\u7b49\n\u30fb\u5c02\u6709\u9762\u7a4d: \u25cb\u25cb\u33a1\n\u30fb\u7bc9\u5e74\u6570: \u25cb\u5e74\uff08\u7bc9\u5e74\u6708\u304c\u308f\u304b\u308c\u3070\u8a18\u8f09\uff09\n\u30fb\u968e\u6570: \u25cb\u968e/\u25cb\u968e\u5efa\u3066\n\u30fb\u69cb\u9020: RC\u9020\u3001\u9244\u9aa8\u9020\u3001\u6728\u9020\u306a\u3069\n\u30fb\u8a2d\u5099: \u30a8\u30a2\u30b3\u30f3\u3001\u30d0\u30b9\u30c8\u30a4\u30ec\u5225\u3001\u30aa\u30fc\u30c8\u30ed\u30c3\u30af\u7b49\n\n\u3010\u91d1\u984d\u60c5\u5831\u3011\u203b\u7269\u4ef6\u4ee5\u5916\u306e\u6587\u66f8\u306e\u5834\u5408\n\u30fb\u5408\u8a08\u91d1\u984d: \u25cb\u25cb\u5186\n\u30fb\u5185\u8a33\u304c\u3042\u308c\u3070\u8a18\u8f09\n\n\u3010\u65e5\u4ed8\u60c5\u5831\u3011\n\u30fb\u4f5c\u6210\u65e5/\u5951\u7d04\u65e5/\u6709\u52b9\u671f\u9650\u306a\u3069\n\n\u3010\u95a2\u4fc2\u8005\u60c5\u5831\u3011\n\u30fb\u4f1a\u793e\u540d\u3001\u62c5\u5f53\u8005\u540d\u3001\u9023\u7d61\u5148\u306a\u3069\n\n\u3010\u6587\u66f8\u306e\u8981\u7d04\u3011\n\u5185\u5bb9\u30922-3\u6587\u3067\u8981\u7d04\n\n\u3010\u691c\u7d22\u30ad\u30fc\u30ef\u30fc\u30c9\u3011\n\u3053\u306e\u6587\u66f8\u3092\u691c\u7d22\u3059\u308b\u306e\u306b\u5f79\u7acb\u3064\u30ad\u30fc\u30ef\u30fc\u30c9\u3092\u30ab\u30f3\u30de\u533a\u5207\u308a\u3067\u5217\u6319\n\n\u3010\u5168\u6587\u30c6\u30ad\u30b9\u30c8\u3011\n\u8aad\u307f\u53d6\u308c\u308b\u5168\u3066\u306e\u6587\u5b57\u60c5\u5831\u3092\u6f0f\u308c\u306a\u304f\u8a18\u8f09`\n      },\n      {\n        inline_data: {\n          mime_type: \"application/pdf\",\n          data: base64Data\n        }\n      }\n    ]\n  }],\n  generationConfig: {\n    temperature: 0.3,\n    maxOutputTokens: 8192\n  }\n};\n\nreturn [{\n  json: {\n    ...item.json,\n    geminiPayload: requestPayload\n  }\n}];"
      },
      "id": "prepare-gemini-request",
      "name": "Gemini\u30ea\u30af\u30a8\u30b9\u30c8\u6e96\u5099",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        880,
        0
      ]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent",
        "sendQuery": true,
        "queryParameters": {
          "parameters": [
            {
              "name": "key",
              "value": "={{ $env.GEMINI_API_KEY }}"
            }
          ]
        },
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ JSON.stringify($json.geminiPayload) }}",
        "options": {
          "timeout": 120000
        }
      },
      "id": "gemini-pdf",
      "name": "Gemini PDF\u51e6\u7406",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.1,
      "position": [
        1100,
        0
      ],
      "retryOnFail": true,
      "maxTries": 3,
      "waitBetweenTries": 5000
    },
    {
      "parameters": {
        "jsCode": "// Extract text from the Gemini response\nconst item = $input.first();\nconst geminiResponse = item.json;\nconst prevData = $('\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u6e96\u5099').first().json;\n\n// Read the text of the first candidate; fall back to an error note if the shape is unexpected\nlet extractedText = '';\ntry {\n  extractedText = geminiResponse.candidates[0].content.parts[0].text || '';\n} catch (e) {\n  extractedText = '\u30c6\u30ad\u30b9\u30c8\u62bd\u51fa\u30a8\u30e9\u30fc: ' + e.message;\n}\n\n// Build the file URL (served via Nginx)\nconst fileUrl = `http://localhost/docs/${encodeURI(prevData.relativePath)}`;\n\n// Assemble the document text\nconst parts = [];\nparts.push(`\u25a0\u30d5\u30a1\u30a4\u30eb\u540d: ${prevData.fileName}`);\nparts.push(`\u25a0\u30d5\u30a1\u30a4\u30eb\u30d1\u30b9: ${prevData.relativePath}`);\nparts.push(`\u25a0\u30d5\u30a1\u30a4\u30ebURL: ${fileUrl}`);\nparts.push('');\nparts.push('\u25a0\u62bd\u51fa\u5185\u5bb9:');\nparts.push(extractedText);\n\nconst documentText = parts.join('\\n');\n\nreturn [{\n  json: {\n    ...prevData,\n    fileUrl: fileUrl,\n    extractedText: extractedText,\n    documentText: documentText\n  }\n}];"
      },
      "id": "build-pdf-document",
      "name": "PDF\u6587\u66f8\u69cb\u7bc9",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1320,
        0
      ]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "http://dify-api:5001/v1/datasets/b6e67602-dbd0-4777-bf44-de9ab18a8a11/document/create-by-text",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={\n  \"name\": \"{{ $json.relativePath }}\",\n  \"text\": {{ JSON.stringify($json.documentText) }},\n  \"indexing_technique\": \"high_quality\",\n  \"process_rule\": {\n    \"mode\": \"automatic\"\n  }\n}",
        "options": {
          "timeout": 120000
        }
      },
      "id": "dify-upload-pdf-text",
      "name": "Dify\u306bPDF\u30c6\u30ad\u30b9\u30c8\u767b\u9332",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.1,
      "position": [
        1540,
        0
      ],
      "retryOnFail": true,
      "maxTries": 3,
      "waitBetweenTries": 5000,
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "jsCode": "// Rewrite the binary file name into an underscore-separated form\nconst item = $input.first();\nconst relativePath = item.json.relativePath;\n\n// Replace slashes with underscores\nconst safeFileName = relativePath.replace(/\\//g, '_');\n\n// Update the file name on the binary data\nif (item.binary && item.binary.data) {\n  item.binary.data.fileName = safeFileName;\n}\n\nreturn [{\n  json: {\n    ...item.json,\n    safeFileName: safeFileName\n  },\n  binary: item.binary\n}];"
      },
      "id": "rename-file",
      "name": "\u30d5\u30a1\u30a4\u30eb\u540d\u5909\u66f4",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        880,
        200
      ]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "http://dify-api:5001/v1/datasets/b6e67602-dbd0-4777-bf44-de9ab18a8a11/document/create_by_file",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendBody": true,
        "contentType": "multipart-form-data",
        "bodyParameters": {
          "parameters": [
            {
              "parameterType": "formBinaryData",
              "name": "file",
              "inputDataFieldName": "data"
            },
            {
              "parameterType": "formData",
              "name": "data",
              "value": "{\"indexing_technique\":\"high_quality\",\"process_rule\":{\"mode\":\"automatic\"}}"
            }
          ]
        },
        "options": {
          "timeout": 120000
        }
      },
      "id": "dify-upload-file",
      "name": "Dify\u30d5\u30a1\u30a4\u30eb\u767b\u9332",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.1,
      "position": [
        1100,
        200
      ],
      "retryOnFail": true,
      "maxTries": 3,
      "waitBetweenTries": 5000,
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "jsCode": "const item = $input.first();\nconst prevData = $('\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u6e96\u5099').first().json;\nconst difyResponse = item.json;\n\nconst documentId = difyResponse.document?.id || null;\nconst batch = difyResponse.batch || null;\n\nreturn [{\n  json: {\n    success: true,\n    documentId: documentId,\n    batch: batch,\n    fileName: prevData.fileName,\n    originalPath: prevData.originalPath,\n    relativePath: prevData.relativePath,\n    fileType: 'text',\n    status: 'completed',\n    processedAt: new Date().toISOString()\n  }\n}];"
      },
      "id": "result-text",
      "name": "\u30c6\u30ad\u30b9\u30c8\u7d50\u679c",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        880,
        -100
      ]
    },
    {
      "parameters": {
        "jsCode": "const item = $input.first();\nconst prevData = $('\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u6e96\u5099').first().json;\nconst difyResponse = item.json;\n\nconst documentId = difyResponse.document?.id || null;\nconst batch = difyResponse.batch || null;\n\nreturn [{\n  json: {\n    success: true,\n    documentId: documentId,\n    batch: batch,\n    fileName: prevData.fileName,\n    originalPath: prevData.originalPath,\n    relativePath: prevData.relativePath,\n    fileType: 'pdf',\n    status: 'completed',\n    processedAt: new Date().toISOString()\n  }\n}];"
      },
      "id": "result-pdf",
      "name": "PDF\u7d50\u679c",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1760,
        0
      ]
    },
    {
      "parameters": {
        "jsCode": "const item = $input.first();\nconst prevData = $('\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u6e96\u5099').first().json;\nconst difyResponse = item.json;\n\nconst documentId = difyResponse.document?.id || null;\nconst batch = difyResponse.batch || null;\n\nreturn [{\n  json: {\n    success: true,\n    documentId: documentId,\n    batch: batch,\n    fileName: prevData.fileName,\n    originalPath: prevData.originalPath,\n    relativePath: prevData.relativePath,\n    fileType: 'binary',\n    status: 'completed',\n    processedAt: new Date().toISOString()\n  }\n}];"
      },
      "id": "result-file",
      "name": "\u30d5\u30a1\u30a4\u30eb\u7d50\u679c",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1320,
        200
      ]
    }
  ],
  "connections": {
    "\u30ef\u30fc\u30af\u30d5\u30ed\u30fc\u958b\u59cb": {
      "main": [
        [
          {
            "node": "\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u6e96\u5099",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\u30c9\u30ad\u30e5\u30e1\u30f3\u30c8\u6e96\u5099": {
      "main": [
        [
          {
            "node": "\u30c6\u30ad\u30b9\u30c8\u5224\u5b9a",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\u30c6\u30ad\u30b9\u30c8\u5224\u5b9a": {
      "main": [
        [
          {
            "node": "Dify\u30c6\u30ad\u30b9\u30c8\u767b\u9332",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "PDF\u5224\u5b9a",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "PDF\u5224\u5b9a": {
      "main": [
        [
          {
            "node": "Gemini\u30ea\u30af\u30a8\u30b9\u30c8\u6e96\u5099",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "\u30d5\u30a1\u30a4\u30eb\u540d\u5909\u66f4",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Gemini\u30ea\u30af\u30a8\u30b9\u30c8\u6e96\u5099": {
      "main": [
        [
          {
            "node": "Gemini PDF\u51e6\u7406",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Dify\u30c6\u30ad\u30b9\u30c8\u767b\u9332": {
      "main": [
        [
          {
            "node": "\u30c6\u30ad\u30b9\u30c8\u7d50\u679c",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Gemini PDF\u51e6\u7406": {
      "main": [
        [
          {
            "node": "PDF\u6587\u66f8\u69cb\u7bc9",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "PDF\u6587\u66f8\u69cb\u7bc9": {
      "main": [
        [
          {
            "node": "Dify\u306bPDF\u30c6\u30ad\u30b9\u30c8\u767b\u9332",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Dify\u306bPDF\u30c6\u30ad\u30b9\u30c8\u767b\u9332": {
      "main": [
        [
          {
            "node": "PDF\u7d50\u679c",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "\u30d5\u30a1\u30a4\u30eb\u540d\u5909\u66f4": {
      "main": [
        [
          {
            "node": "Dify\u30d5\u30a1\u30a4\u30eb\u767b\u9332",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Dify\u30d5\u30a1\u30a4\u30eb\u767b\u9332": {
      "main": [
        [
          {
            "node": "\u30d5\u30a1\u30a4\u30eb\u7d50\u679c",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "settings": {
    "executionOrder": "v1"
  },
  "tags": [
    "docusearch",
    "document-processing"
  ]
}
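
The routing in the JSON above depends on a file-type decision made in the first Code node. As a minimal sketch of that logic outside n8n, the helpers below reproduce the extension check and the path handling. The function names classifyFile, toRelativePath, and toSafeFileName are hypothetical and do not appear in the workflow.

```javascript
// Extensions the workflow treats as plain text (copied from its first Code node).
const TEXT_EXTENSIONS = [
  'txt', 'md', 'mdx', 'csv', 'json', 'xml',
  'html', 'htm', 'yaml', 'yml', 'properties',
];

// Hypothetical helper: classify a file name into 'text', 'pdf', or 'binary'.
function classifyFile(fileName) {
  const ext = fileName.split('.').pop()?.toLowerCase() || '';
  if (TEXT_EXTENSIONS.includes(ext)) return { ext, fileType: 'text' };
  if (ext === 'pdf') return { ext, fileType: 'pdf' };
  return { ext, fileType: 'binary' };
}

// Hypothetical helper: strip the leading /watch/ prefix, as the workflow does.
function toRelativePath(originalPath) {
  return originalPath.replace(/^\/watch\//, '');
}

// Hypothetical helper: flatten slashes to underscores for the binary upload branch.
function toSafeFileName(relativePath) {
  return relativePath.replace(/\//g, '_');
}
```

Text files take the create-by-text branch, PDFs go through Gemini first, and everything else is renamed with toSafeFileName-style flattening before the file upload.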

Credentials you'll need

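Each integration node prompts for credentials on import. Credential IDs are stripped before publishing, so you add your own: the three Dify HTTP Request nodes use an HTTP Header Auth credential (shown as <your credential> in the JSON), and the Gemini node reads its API key from the GEMINI_API_KEY environment variable.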
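
To illustrate what the Dify nodes send once a credential is attached, here is a rough sketch of the create-by-text request as plain JavaScript. It assumes the HTTP Header Auth credential supplies an Authorization: Bearer header; the workflow JSON does not show the header name, so treat that as an assumption. buildDifyTextRequest and the placeholder values in the usage note are hypothetical.

```javascript
// Hypothetical helper: build the request that the workflow's create-by-text nodes post.
// Assumption: the credential adds "Authorization: Bearer <key>" (not visible in the JSON).
function buildDifyTextRequest({ baseUrl, datasetId, apiKey, name, text }) {
  return {
    url: `${baseUrl}/v1/datasets/${datasetId}/document/create-by-text`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      // Same body fields as the workflow's jsonBody parameter.
      body: JSON.stringify({
        name,
        text,
        indexing_technique: 'high_quality',
        process_rule: { mode: 'automatic' },
      }),
    },
  };
}
```

With Node 18 or later the result could be passed to fetch(url, options), for example with baseUrl set to http://dify-api:5001 as in the workflow. The retries and 120-second timeout configured on the n8n nodes are not reproduced here.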

About this workflow

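Document Processing - DocuSearch is a 14-node sub-workflow started by an Execute Workflow Trigger. It classifies the incoming file by extension and registers it in a Dify dataset: text files are posted directly through create-by-text, PDFs are first sent to Gemini (gemini-2.5-flash) for structured extraction and the result is posted as text, and all other files are uploaded as binaries through create_by_file. Each branch returns a result item with the Dify document ID and batch.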
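
The PDF branch of this workflow sends a base64-encoded PDF to Gemini as inline_data and reads the reply from candidates[0].content.parts[0].text. The sketch below shows the same payload shape and a defensive way to read the response. buildGeminiPayload and extractGeminiText are hypothetical names, and joining all parts (rather than reading only the first, as the workflow does) is a choice made for this example.

```javascript
// Hypothetical helper: same payload shape as the workflow's Gemini request node.
function buildGeminiPayload(base64Pdf, prompt) {
  return {
    contents: [{
      parts: [
        { text: prompt },
        { inline_data: { mime_type: 'application/pdf', data: base64Pdf } },
      ],
    }],
    generationConfig: { temperature: 0.3, maxOutputTokens: 8192 },
  };
}

// Hypothetical helper: read text from a generateContent-style response.
// Returns an empty string when the expected fields are missing.
function extractGeminiText(response) {
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? '').join('');
}
```

Returning an empty string on a malformed response lets the caller decide whether to skip the upload, whereas the workflow stores an error note as the document text.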

Source: https://github.com/taiki-aoi/DocuSearch_AI/blob/b341b84f7e1aacb7860e63149ca8759c4d260d52/n8n/workflows/document-processing.json (original creator credit).

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Web Scraping

This template is a powerful, reusable utility for managing stateful, long-running processes. It allows a main workflow to be paused indefinitely at "checkpoints" and then be resumed by external, async

HTTP Request, Execute Workflow Trigger
Web Scraping

Upload files from any source to your account Kommo or AmoCRM with a simple and reusable workflow. It can split a large file into small ones and upload chunks. Works for Kommo and amoCRM There are 3 re

HTTP Request, Execute Workflow Trigger, Stop And Error
Web Scraping

Remixed Backup your workflows to GitHub from Solomon's work. Check out his templates.

HTTP Request, GitHub, Execute Workflow Trigger +1
Web Scraping

Remixed Backup your workflows to GitHub from Solomon's work. Check out his templates.

Execute Workflow Trigger, HTTP Request, GitHub
Web Scraping

This workflow audits your SharePoint Online environment for external sharing risks by identifying files and folders that are shared with anonymous links or external/guest users. It is designed to trav

HTTP Request, Execute Workflow Trigger