{
  "id": "y0Yk7da21T4u9zlp",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Regulatory News Tracker with Telegram and Redis",
  "tags": [],
  "nodes": [
    {
      "id": "b27fe227-cf59-4449-ab53-3e3d2f2171f9",
      "name": "Start Workflow",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        608,
        400
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "bc36271b-f3b8-4c96-86f6-e3b60279e098",
      "name": "Define Sources",
      "type": "n8n-nodes-base.code",
      "position": [
        800,
        400
      ],
      "parameters": {
        "jsCode": "// List of regulatory news pages\nconst urls = [\n  { url: 'https://www.sec.gov/news/pressreleases', source: 'SEC Press Releases' },\n  { url: 'https://www.fca.org.uk/news', source: 'UK FCA News' },\n  { url: 'https://www.esma.europa.eu/press-news/esma-news', source: 'ESMA News' }\n];\nreturn urls.map(item => ({ json: item }));"
      },
      "typeVersion": 2
    },
    {
      "id": "46684705-ff32-4765-9e4b-8337153c7052",
      "name": "Split Sources",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        1008,
        400
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 3
    },
    {
      "id": "a724832a-25cd-4b60-8019-d7fbbe425966",
      "name": "Scrape Regulatory Data",
      "type": "n8n-nodes-scrapegraphai.scrapegraphAi",
      "position": [
        1232,
        400
      ],
      "parameters": {
        "userPrompt": "=Extract all recent regulatory news items from the page. For each item return JSON with: title, date, summary, url, sourceName = \"{{ $json.source }}\". Limit to the first 10 items.",
        "websiteUrl": "={{ $json.url }}"
      },
      "typeVersion": 1
    },
    {
      "id": "b65f46e9-ff75-4163-9ff4-bc8e82cc8ca1",
      "name": "Merge Results",
      "type": "n8n-nodes-base.merge",
      "position": [
        1456,
        400
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "mergeByFields": {
          "values": [
            {}
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "04f30659-7a51-49fc-abfe-b72d961355ae",
      "name": "Format & Deduplicate",
      "type": "n8n-nodes-base.code",
      "position": [
        1744,
        400
      ],
      "parameters": {
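        "jsCode": "// Normalise field names, flag important items, add IDs and drop duplicates\nconst items = $input.all();\n// Word boundaries stop 'act' from matching words such as 'impact' or 'contract'\nconst keyword = /\\b(rules?|regulations?|directives?|acts?)\\b/i;\nconst scrapedAt = new Date().toISOString();\nconst seen = new Set();\nconst results = [];\nfor (const item of items) {\n  const dataArray = Array.isArray(item.json) ? item.json : [item.json];\n  for (const d of dataArray) {\n    const title = d.title || d.headline || '';\n    const summary = d.summary || d.description || '';\n    // Use the tail of the base64 string: URLs from one site share a prefix, so the first 24 characters would collide\n    const id = Buffer.from(String(d.url || title)).toString('base64').slice(-24);\n    if (seen.has(id)) continue; // skip duplicates within this run\n    seen.add(id);\n    results.push({\n      json: {\n        ...d,\n        title,\n        summary,\n        dedupId: id,\n        isImportant: keyword.test(title) || keyword.test(summary),\n        scrapedAt,\n        source: d.sourceName || item.json.source || 'unknown'\n      }\n    });\n  }\n}\nreturn results;"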
      },
      "typeVersion": 2
    },
    {
      "id": "0f55bce8-2288-4565-a199-635232665f3c",
      "name": "New Important Update?",
      "type": "n8n-nodes-base.if",
      "position": [
        1920,
        400
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "boolean": [
            {
              "value1": "={{ $json.isImportant }}",
              "operation": "true"
            }
          ]
        }
      },
      "typeVersion": 2
    },
    {
      "id": "601eaecd-bbd7-4b54-ae54-867da7bd8d37",
      "name": "Prepare Telegram Message",
      "type": "n8n-nodes-base.set",
      "position": [
        2256,
        320
      ],
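      "parameters": {
        "mode": "manual",
        "fields": {
          "values": [
            {
              "name": "text",
              "stringValue": "=*{{ $json.source }}*\n{{ $json.title || $json.headline }}\nDate: {{ $json.date }}\n{{ $json.url }}"
            }
          ]
        },
        "options": {}
      },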
      "typeVersion": 3
    },
    {
      "id": "2b2eb5b6-bdb4-4c02-a4fd-a8280494b333",
      "name": "Send Telegram Alert",
      "type": "n8n-nodes-base.telegram",
      "position": [
        2448,
        336
      ],
      "parameters": {
        "text": "={{ $json.text }}",
        "chatId": "={{ $env.TELEGRAM_CHAT_ID || 'YOUR_CHAT_ID' }}",
        "additionalFields": {
          "parse_mode": "Markdown",
          "disable_notification": false
        }
      },
      "typeVersion": 1
    },
    {
      "id": "dc3f2aa2-e94a-4a88-994b-bb842fbf3355",
      "name": "Save to Redis",
      "type": "n8n-nodes-base.redis",
      "position": [
        2288,
        528
      ],
      "parameters": {
        "key": "={{ 'reg_update:' + $json.dedupId }}",
        "ttl": 604800,
        "value": "={{ JSON.stringify($json) }}",
        "expire": true,
        "operation": "set"
      },
      "typeVersion": 1
    },
    {
      "id": "268b1249-b6ef-4bfe-8156-a7780ab4a72c",
      "name": "Error Handler",
      "type": "n8n-nodes-base.code",
      "position": [
        1440,
        576
      ],
      "parameters": {
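        "jsCode": "// Wrap the first incoming item as a timestamped error record\nconst [first] = $input.all();\nconst err = first ? first.json : {};\n// Fall back to the serialised payload when no message string is present\nconst message = typeof err.message === 'string' ? err.message : JSON.stringify(err);\nreturn [{ json: { error: true, message, time: new Date().toISOString() } }];"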
      },
      "typeVersion": 2
    },
    {
      "id": "d4a94949-a6d1-4a5a-a520-41cb7a723409",
      "name": "Workflow Overview",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -96,
        0
      ],
      "parameters": {
        "width": 550,
        "height": 770,
        "content": "## How it works\n\nThis workflow lets compliance analysts manually launch a daily sweep across several official regulatory news feeds. When you click \u2018Execute\u2019, a Code node produces a list of URLs for the SEC, FCA and ESMA press-release pages (add more if you monitor other jurisdictions). The list is split so each page is scraped in its own run of ScrapeGraphAI. The AI extracts headline, date, summary and link for up to ten notices per page. A Merge node collects the results from all sources, then a follow-up Code node normalises the data, flags posts that look like new rules or directives and assigns a deduplication key derived from each item's URL. All items are stored in Redis for seven days so you keep a lightweight archive, while anything flagged as important triggers an instant Telegram message.\n\n## Setup steps\n\n1. Add a ScrapeGraphAI API credential  \n2. Add a Redis credential (ensure write access)  \n3. Add a Telegram Bot credential and set the `TELEGRAM_CHAT_ID` environment variable (or replace `YOUR_CHAT_ID` in the Telegram node)  \n4. Open the \u201cDefine Sources\u201d Code node and list all URLs you need  \n5. Optional: tweak the keyword list in \u201cFormat & Deduplicate\u201d  \n6. Click Execute to test, then connect a Schedule Trigger if you want unattended runs"
      },
      "typeVersion": 1
    },
    {
      "id": "05d24e6b-a802-4f0f-90e1-5920f27edb76",
      "name": "Section \u2013 Source & Split",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        656,
        -48
      ],
      "parameters": {
        "color": 7,
        "width": 482,
        "height": 766,
        "content": "## Source & Split\n\nThis section contains the Manual Trigger, the **Define Sources** Code node and the **Split Sources** node. The trigger is kept manual to simplify testing; you can connect a Schedule Trigger later if you want unattended runs. The Code node returns an array of JavaScript objects, each holding the URL of an official regulator press-release page and a human-readable label. Because the URLs live in code, you can add or remove sources without touching any downstream logic. The **Split Sources** node (a Split in Batches node) then iterates through that array with a batch size of one, so each source is handled as its own batch and a failure can be traced to a single page. Running sources individually also gives ScrapeGraphAI a predictable, single-URL payload, which keeps memory overhead low and makes troubleshooting easier. Together these nodes prepare a clean, one-by-one stream of inputs for the AI scraper that follows."
      },
      "typeVersion": 1
    },
    {
      "id": "45fadc37-9525-4cd5-89df-16b33b7ed8c6",
      "name": "Section \u2013 Scraping Layer",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1168,
        -112
      ],
      "parameters": {
        "color": 7,
        "width": 450,
        "height": 862,
        "content": "## Scraping Layer\n\nThe **Scrape Regulatory Data** node is the heart of the workflow. For every incoming URL it calls ScrapeGraphAI with a concise yet specific prompt instructing the model to extract the headline, publication date, summary paragraph and canonical link of up to ten items. Because the node receives only one URL at a time, the full timeout is available to a single site. The LLM approach means you are not writing fragile CSS selectors; minor layout tweaks on government portals rarely break the extraction. Error handling is wired through the node\u2019s built-in error output and funnels directly to an **Error Handler** Code node, ensuring that individual failures are recorded without halting the job. Successful responses pass to the Merge node, where they are held until every source has been scraped. Keeping the scraping concerns isolated in this dedicated layer makes the rest of the workflow easier to maintain and extend."
      },
      "typeVersion": 1
    },
    {
      "id": "812b7b2d-5a21-46f6-8112-fb905ef1024d",
      "name": "Section \u2013 Aggregation & Processing",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1712,
        -128
      ],
      "parameters": {
        "color": 7,
        "width": 450,
        "height": 782,
        "content": "## Aggregation & Processing\n\nOnce each source has finished scraping, the results converge in the **Merge Results** node, which waits until all input streams arrive. Downstream logic therefore receives the complete set of scraped items for that execution. The subsequent **Format & Deduplicate** Code node does four things: it accepts `headline` and `description` as fallbacks for `title` and `summary`; it attaches an ISO 8601 time stamp; it builds a 24-character base64 key from the item's URL (or its title when no URL is present) that acts as a deduplication key; and it tags the record as important if keywords like \u201crule\u201d or \u201cdirective\u201d appear in the title or summary. The **New Important Update?** IF node then branches the flow. Items flagged important head toward Telegram for immediate alerting, while all items, regardless of importance, continue to storage. This separation lets you fine-tune what gets pushed to noisy channels without losing the full history in your database."
      },
      "typeVersion": 1
    },
    {
      "id": "27d627c3-8dc3-469a-b2c8-fb44cf28c5a9",
      "name": "Section \u2013 Storage & Alerts",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2176,
        -112
      ],
      "parameters": {
        "color": 7,
        "width": 450,
        "height": 782,
        "content": "## Storage & Alerts\n\nThe last cluster includes **Save to Redis**, **Prepare Telegram Message** and **Send Telegram Alert**. Redis is chosen because it is fast and well suited to simple key-value caching. Each update is stored under the key `reg_update:<dedupId>` with a seven-day TTL (604800 seconds), so your datastore remains compact while still offering a rolling one-week archive. Storing all records, including ones that are not flagged important, means you can build dashboards or run later analytics without scraping again. For urgent items, a Set node assembles a concise Markdown message that gives the compliance team everything they need at a glance: source, headline, date and link. The Telegram node posts this straight into your chosen chat as a push notification. Because storage and alerting are separate branches, you can extend either one, for example by adding a database sink or an alternate alert channel, without affecting the other."
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "f9fb96f6-e6d2-42ce-9253-40c92e1f031a",
  "connections": {
    "Merge Results": {
      "main": [
        [
          {
            "node": "Format & Deduplicate",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Sources": {
      "main": [
        [
          {
            "node": "Scrape Regulatory Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Define Sources": {
      "main": [
        [
          {
            "node": "Split Sources",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Start Workflow": {
      "main": [
        [
          {
            "node": "Define Sources",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Format & Deduplicate": {
      "main": [
        [
          {
            "node": "New Important Update?",
            "type": "main",
            "index": 0
          },
          {
            "node": "Save to Redis",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "New Important Update?": {
      "main": [
        [
          {
            "node": "Prepare Telegram Message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Scrape Regulatory Data": {
      "main": [
        [
          {
            "node": "Merge Results",
            "type": "main",
            "index": 0
          },
          {
            "node": "Merge Results",
            "type": "main",
            "index": 1
          },
          {
            "node": "Error Handler",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Prepare Telegram Message": {
      "main": [
        [
          {
            "node": "Send Telegram Alert",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}