Extract & Curate Newsletter PDFs from Gmail

Original n8n title: Newsletter PDF Extractor & Curator

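This workflow watches Gmail for emails with PDF attachments, analyzes each PDF with PDF Vector, logs a summary to Google Sheets, and shares a digest in Slack. It uses the gmailTrigger, n8n-nodes-pdfvector, googleSheets, and slack nodes. The Gmail trigger polls every hour; the workflow has 9 nodes, two of which are sticky notes.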

Category: AI & RAG · Trigger: Event · Nodes: 9 · Complexity: ★★★★☆ · Integrations: Gmail Trigger, N8N Nodes Pdfvector, Google Sheets, Slack

This workflow follows the Gmail Trigger → Google Sheets recipe pattern.
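
In this pattern, fields derived from the incoming email end up as one appended spreadsheet row. As a rough sketch of that mapping outside n8n, the function below mirrors the column expressions of the "Log Newsletter" node in the workflow JSON. The name `toSheetRow` and the sample values are assumptions made for illustration; they are not part of the workflow.

```javascript
// Hypothetical helper: builds the row that the "Log Newsletter" node appends.
// Column names match the Google Sheet described in the workflow's setup note.
function toSheetRow(item) {
  return {
    "Newsletter": item.newsletterName,
    "Subject": item.subject,
    "Sender": item.senderEmail,
    "Received Date": item.receivedDate,
    "Topics": item.topics,
    "Key Stories": item.keyStories,
    "Stats & Data": item.stats,
    "Trends": item.trends,
    "Full Analysis": item.fullAnalysis,
    // Keep only the date part of the ISO timestamp.
    "Processed Date": item.processedAt.split("T")[0],
  };
}

// Made-up sample item, shaped like the output of the "Format Analysis" node.
const sampleRow = toSheetRow({
  newsletterName: "Example Weekly",
  subject: "Issue 1",
  senderEmail: "news@example.com",
  receivedDate: "2024-05-01T08:00:00.000Z",
  topics: "Chips, Cloud",
  keyStories: "Chip prices fell.",
  stats: "12% drop in prices.",
  trends: "Consolidation.",
  fullAnalysis: "Shortened analysis text",
  processedAt: "2024-05-01T10:30:00.000Z",
});

console.log(sampleRow["Processed Date"]);
```

Only the "Processed Date" column is transformed; every other column is passed through unchanged, so the sheet's header row has to match these ten names exactly.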

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate the workflow.

{
  "name": "Newsletter PDF Extractor & Curator",
  "nodes": [
    {
      "parameters": {
        "content": "## \ud83d\udcf0 Newsletter PDF Extractor & Curator\n\n### What this workflow does\n1. Watches Gmail for emails with PDF attachments\n2. Downloads the PDF attachment\n3. Uses PDF Vector to analyze the content\n4. Extracts topics, stories, stats, and trends\n5. Logs summary to Google Sheets\n6. Shares digest to Slack\n\n### Setup steps\n1. Connect Gmail account (OAuth2)\n2. Get PDF Vector API key from pdfvector.com/api-keys\n3. Create Google Sheet with columns:\n   Newsletter, Subject, Sender, Received Date, Topics, Key Stories, Stats & Data, Trends, Full Analysis, Processed Date\n4. Update spreadsheet ID in Google Sheets node\n5. Connect Slack and set content channel\n\n### Customize the Gmail filter\nEdit the search query to match your newsletters:\n- has:attachment filename:pdf\n- from:(example.com) has:attachment\n\n### Perfect for\n- Industry reports & whitepapers\n- PDF newsletters & digests\n- Research publications",
        "height": 580,
        "width": 380,
        "color": 5
      },
      "id": "4b72a33e-1fdf-4865-a925-6534f1d050b8",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        -816,
        960
      ]
    },
    {
      "parameters": {
        "content": "## \ud83d\udcca Extracted Info\n\n- Newsletter name\n- Main topics covered\n- Key stories & summaries\n- Statistics & data points\n- Emerging trends",
        "height": 180,
        "width": 200
      },
      "id": "c35a3db8-93c5-49e8-92ac-f71f44332db1",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "typeVersion": 1,
      "position": [
        0,
        912
      ]
    },
    {
      "parameters": {
        "pollTimes": {
          "item": [
            {
              "mode": "everyHour"
            }
          ]
        },
        "simple": false,
        "filters": {
          "includeSpamTrash": false
        },
        "options": {
          "downloadAttachments": true
        }
      },
      "id": "6e5601e9-34df-4c5e-8865-6f0e870519b9",
      "name": "Gmail Trigger",
      "type": "n8n-nodes-base.gmailTrigger",
      "typeVersion": 1.1,
      "position": [
        -400,
        1200
      ],
      "credentials": {
        "gmailOAuth2": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "conditions": {
          "options": {
            "caseSensitive": true,
            "leftValue": "",
            "typeValidation": "strict"
          },
          "conditions": [
            {
              "id": "has-pdf",
              "leftValue": "={{ $binary && Object.keys($binary).length > 0 && Object.values($binary).some(att => att.mimeType === 'application/pdf') }}",
              "rightValue": true,
              "operator": {
                "type": "boolean",
                "operation": "equals"
              }
            }
          ],
          "combinator": "and"
        },
        "options": {}
      },
      "id": "3e26ca13-8e5d-496a-bd80-c44bffa31c1b",
      "name": "Has PDF Attachment?",
      "type": "n8n-nodes-base.if",
      "typeVersion": 2.2,
      "position": [
        -208,
        1200
      ]
    },
    {
      "parameters": {
        "jsCode": "const email = $input.first().json;\nconst binary = $input.first().binary || {};\n\n// Find the first PDF attachment (handles various naming patterns)\nlet pdfKey = null;\nlet pdfFileName = 'attachment.pdf';\n\nfor (const key of Object.keys(binary)) {\n  if (binary[key] && binary[key].mimeType === 'application/pdf') {\n    pdfKey = key;\n    pdfFileName = binary[key].fileName || 'attachment.pdf';\n    break;\n  }\n}\n\nif (!pdfKey) {\n  throw new Error('No PDF attachment found in email');\n}\n\n// Get sender info\nlet senderName = 'Unknown';\nlet senderEmail = 'unknown@example.com';\n\nif (email.from) {\n  if (email.from.value && email.from.value[0]) {\n    senderName = email.from.value[0].name || email.from.value[0].address;\n    senderEmail = email.from.value[0].address;\n  } else if (typeof email.from === 'string') {\n    senderEmail = email.from;\n    senderName = email.from.split('@')[0];\n  }\n}\n\nreturn [{\n  json: {\n    emailId: email.id,\n    subject: email.subject || 'No Subject',\n    senderName: senderName,\n    senderEmail: senderEmail,\n    receivedDate: email.date || new Date().toISOString(),\n    pdfFileName: pdfFileName,\n    pdfKey: pdfKey\n  },\n  binary: {\n    data: binary[pdfKey]\n  }\n}];"
      },
      "id": "0be04adf-599c-4e1f-ad1f-527440a23f4c",
      "name": "Extract PDF Attachment",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        0,
        1104
      ]
    },
    {
      "parameters": {
        "operation": "ask",
        "inputType": "file",
        "question": "Analyze this newsletter/document and provide a structured summary:\n\n1. DOCUMENT INFO: What is this document? Who published it?\n\n2. MAIN TOPICS (list 3-5): What are the primary subjects covered?\n\n3. KEY STORIES: Summarize the 2-3 most important stories or sections in 1-2 sentences each.\n\n4. STATS & DATA: List any specific numbers, statistics, or data points mentioned.\n\n5. TRENDS: What emerging trends or themes does this document highlight?\n\nFormat your response with clear section headers."
      },
      "id": "abcbe0d0-56d3-4b23-a7a8-a01d4e5d2c04",
      "name": "PDF Vector - Analyze Content",
      "type": "n8n-nodes-pdfvector.pdfVector",
      "typeVersion": 2,
      "position": [
        208,
        1104
      ]
    },
    {
      "parameters": {
        "jsCode": "const emailData = $('Extract PDF Attachment').item.json;\nconst analysis = $input.first().json.markdown || $input.first().json.answer || '';\n\n// Parse sections from analysis\nfunction extractSection(text, patterns) {\n  for (const pattern of patterns) {\n    const regex = new RegExp(pattern + '[:\\\\s]*([\\\\s\\\\S]*?)(?=\\\\n\\\\d\\\\.|\\\\n[A-Z]{2,}[:\\s]|$)', 'i');\n    const match = text.match(regex);\n    if (match && match[1]) {\n      return match[1].trim().substring(0, 500);\n    }\n  }\n  return 'N/A';\n}\n\nconst topics = extractSection(analysis, ['MAIN TOPICS', 'TOPICS', '2\\\\.']);\nconst keyStories = extractSection(analysis, ['KEY STORIES', 'STORIES', '3\\\\.']);\nconst stats = extractSection(analysis, ['STATS', 'DATA', 'STATISTICS', '4\\\\.']);\nconst trends = extractSection(analysis, ['TRENDS', 'EMERGING', '5\\\\.']);\n\n// Truncate full analysis for Sheets\nconst shortAnalysis = analysis.substring(0, 1000);\n\nreturn [{\n  json: {\n    newsletterName: emailData.senderName,\n    subject: emailData.subject,\n    senderEmail: emailData.senderEmail,\n    receivedDate: emailData.receivedDate,\n    pdfFileName: emailData.pdfFileName,\n    topics: topics,\n    keyStories: keyStories,\n    stats: stats,\n    trends: trends,\n    fullAnalysis: shortAnalysis,\n    analysisComplete: analysis,\n    processedAt: new Date().toISOString()\n  }\n}];"
      },
      "id": "8675204a-eb91-4982-9b3b-11ee2d280468",
      "name": "Format Analysis",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        400,
        1104
      ]
    },
    {
      "parameters": {
        "operation": "append",
        "documentId": {
          "__rl": true,
          "value": "YOUR_SPREADSHEET_ID",
          "mode": "list",
          "cachedResultName": "Newsletter Log"
        },
        "sheetName": {
          "__rl": true,
          "value": "gid=0",
          "mode": "list",
          "cachedResultName": "Newsletters"
        },
        "columns": {
          "mappingMode": "defineBelow",
          "value": {
            "Newsletter": "={{ $json.newsletterName }}",
            "Subject": "={{ $json.subject }}",
            "Sender": "={{ $json.senderEmail }}",
            "Received Date": "={{ $json.receivedDate }}",
            "Topics": "={{ $json.topics }}",
            "Key Stories": "={{ $json.keyStories }}",
            "Stats & Data": "={{ $json.stats }}",
            "Trends": "={{ $json.trends }}",
            "Full Analysis": "={{ $json.fullAnalysis }}",
            "Processed Date": "={{ $json.processedAt.split('T')[0] }}"
          },
          "matchingColumns": [],
          "schema": [
            {
              "id": "Newsletter",
              "displayName": "Newsletter",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Subject",
              "displayName": "Subject",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Sender",
              "displayName": "Sender",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Received Date",
              "displayName": "Received Date",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Topics",
              "displayName": "Topics",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Key Stories",
              "displayName": "Key Stories",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Stats & Data",
              "displayName": "Stats & Data",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Trends",
              "displayName": "Trends",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Full Analysis",
              "displayName": "Full Analysis",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            },
            {
              "id": "Processed Date",
              "displayName": "Processed Date",
              "required": false,
              "defaultMatch": false,
              "display": true,
              "type": "string",
              "canBeUsedToMatch": true
            }
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {}
      },
      "id": "04aea0fa-e4cf-4501-aeaf-d9c993fcd1f6",
      "name": "Log Newsletter",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.4,
      "position": [
        608,
        1104
      ]
    },
    {
      "parameters": {
        "authentication": "oAuth2",
        "select": "channel",
        "channelId": {
          "__rl": true,
          "value": "YOUR_SLACK_CHANNEL_ID",
          "mode": "list",
          "cachedResultName": "content-digest"
        },
        "text": "=\ud83d\udcf0 *Newsletter PDF Digest*\n\n*From:* {{ $('Format Analysis').item.json.newsletterName }}\n*Subject:* {{ $('Format Analysis').item.json.subject }}\n*File:* {{ $('Format Analysis').item.json.pdfFileName }}\n\n---\n\n\ud83d\udccb *Topics Covered:*\n{{ $('Format Analysis').item.json.topics }}\n\n---\n\n\ud83d\udcd6 *Key Stories:*\n{{ $('Format Analysis').item.json.keyStories }}\n\n---\n\n\ud83d\udcca *Stats & Data:*\n{{ $('Format Analysis').item.json.stats }}\n\n---\n\n\ud83d\udcc8 *Trends:*\n{{ $('Format Analysis').item.json.trends }}",
        "otherOptions": {}
      },
      "id": "1cfcccbe-d733-4d33-978b-5c23d4377388",
      "name": "Share Digest",
      "type": "n8n-nodes-base.slack",
      "typeVersion": 2.2,
      "position": [
        800,
        1104
      ]
    }
  ],
  "connections": {
    "Gmail Trigger": {
      "main": [
        [
          {
            "node": "Has PDF Attachment?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Has PDF Attachment?": {
      "main": [
        [
          {
            "node": "Extract PDF Attachment",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract PDF Attachment": {
      "main": [
        [
          {
            "node": "PDF Vector - Analyze Content",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "PDF Vector - Analyze Content": {
      "main": [
        [
          {
            "node": "Format Analysis",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Format Analysis": {
      "main": [
        [
          {
            "node": "Log Newsletter",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Log Newsletter": {
      "main": [
        [
          {
            "node": "Share Digest",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1",
    "availableInMCP": false
  },
  "meta": {
    "templateCredsSetupCompleted": false
  },
  "tags": []
}
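
The "Format Analysis" Code node splits the PDF Vector answer into the Topics, Key Stories, Stats & Data, and Trends fields with a regular expression. The following is a minimal standalone sketch of that idea, not the node's exact code: it assumes the answer keeps the numbered headers requested in the prompt (`2. MAIN TOPICS`, `3. KEY STORIES`, and so on), and `escapeRegExp` and the sample answer are hypothetical additions.

```javascript
// Escape regex metacharacters so a header such as "STATS & DATA" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the text after the first matching header, up to the next numbered header.
function extractSection(text, headers, maxLength = 500) {
  for (const header of headers) {
    // Backslashes are doubled because the pattern is built from a string:
    // "\\s" reaches RegExp as \s, while a single backslash would leave a plain "s".
    const regex = new RegExp(
      escapeRegExp(header) + "[:\\s]*([\\s\\S]*?)(?=\\n\\s*\\d\\.\\s|$)",
      "i"
    );
    const match = text.match(regex);
    if (match && match[1].trim()) {
      return match[1].trim().substring(0, maxLength);
    }
  }
  return "N/A";
}

// Made-up answer that follows the numbered layout requested in the prompt.
const sampleAnswer = [
  "1. DOCUMENT INFO: Weekly digest from Example Corp.",
  "2. MAIN TOPICS:\n- Chips\n- Cloud",
  "3. KEY STORIES: Chip prices fell.",
  "4. STATS & DATA: 12% drop in prices.",
  "5. TRENDS: Consolidation.",
].join("\n\n");

const topics = extractSection(sampleAnswer, ["MAIN TOPICS", "TOPICS"]);
const stats = extractSection(sampleAnswer, ["STATS & DATA", "STATS"]);
const missing = extractSection(sampleAnswer, ["GLOSSARY"]);

console.log({ topics, stats, missing });
```

This remains a heuristic: a numbered list inside a section would end the match early, and an answer that drops the numbered headers falls through to "N/A".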

Credentials you'll need

Each integration node prompts for credentials after import. Credential IDs are stripped before publishing, so you add your own: Gmail (OAuth2), a PDF Vector API key, Google Sheets, and Slack (OAuth2).
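
Besides credentials, the published JSON leaves two placeholder values to fill in: `YOUR_SPREADSHEET_ID` in the "Log Newsletter" node and `YOUR_SLACK_CHANNEL_ID` in the "Share Digest" node. One possible way to set them before import is sketched below; `fillPlaceholders` and the example IDs are hypothetical, and choosing the spreadsheet and channel in the n8n editor after import achieves the same result.

```javascript
// Hypothetical helper: returns a copy of the workflow in which every string value
// that exactly equals a placeholder is swapped for the supplied replacement.
function fillPlaceholders(node, values) {
  if (typeof node === "string") {
    return Object.prototype.hasOwnProperty.call(values, node) ? values[node] : node;
  }
  if (Array.isArray(node)) {
    return node.map((child) => fillPlaceholders(child, values));
  }
  if (node !== null && typeof node === "object") {
    return Object.fromEntries(
      Object.entries(node).map(([key, child]) => [key, fillPlaceholders(child, values)])
    );
  }
  return node; // numbers, booleans, and null pass through unchanged
}

// Trimmed-down stand-in for the workflow JSON; only the relevant fields are kept.
const workflow = {
  nodes: [
    { name: "Log Newsletter", parameters: { documentId: { value: "YOUR_SPREADSHEET_ID" } } },
    { name: "Share Digest", parameters: { channelId: { value: "YOUR_SLACK_CHANNEL_ID" } } },
  ],
};

const filled = fillPlaceholders(workflow, {
  YOUR_SPREADSHEET_ID: "example-sheet-id", // made-up value
  YOUR_SLACK_CHANNEL_ID: "example-channel-id", // made-up value
});

console.log(JSON.stringify(filled, null, 2));
```

Walking the parsed object, rather than doing a text replace on the raw JSON, keeps the output valid even if a replacement value contains quotes or backslashes.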

About this workflow

Source: https://github.com/khanhduyvt0101/workflows/blob/0153ee2efc0f692c931b9bb4c2a04abf11756822/n8n-workflows/newsletter-pdf-extractor.json (original creator credit).

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

AI & RAG: Extract title deed data and score risk factors with AI. Uses googleDriveTrigger, googleDrive, n8n-nodes-pdfvector, googleSheets. Event-driven trigger; 10 nodes.

AI & RAG: W11 - Meeting Notes & Action Item Extractor. Uses googleDriveTrigger, googleDrive, n8n-nodes-pdfvector, googleSheets. Event-driven trigger; 9 nodes.

AI & RAG: Invoice Data Extraction Automation. Uses googleDriveTrigger, googleDrive, n8n-nodes-pdfvector, googleSheets. Event-driven trigger; 8 nodes.

AI & RAG: Customer Feedback Loop Analyzer. Uses formTrigger, lmChatGoogleGemini, gmail, slack. Event-driven trigger; 11 nodes.

AI & RAG: 13. Insurance Pre-Authorization. Uses gmailTrigger, gmail, n8n-nodes-pdfvector, googleSheets. Event-driven trigger; 12 nodes.