Daily AI Lead Discovery Engine

Original n8n title: Founder's Discovery Engine

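Founder's Discovery Engine is a daily lead-discovery workflow. It reads an ICP (ideal customer profile) from Google Sheets and a voice.md tone file from Google Drive, searches Hacker News, Reddit, and Lobsters for matching posts, scrapes the top five leads with Firecrawl, drafts outreach emails with Claude, and saves them as Gmail drafts for human review. Uses googleSheets, googleDrive, httpRequest, gmail. Scheduled trigger; 18 nodes.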

Category: AI & RAG · Trigger: Cron / scheduled · Nodes: 18 · Complexity: ★★★★☆ · Integrations: Google Sheets, Google Drive, HTTP Request, Gmail

This workflow follows the Gmail → Google Drive recipe pattern.
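
The schedule trigger in the workflow JSON uses the cron expression `0 13 * * *`, and its note lists the hour to use for other zones (PT=14, ET=11, UK=06, CET=05). The sketch below shows the arithmetic behind those values. It assumes, as that note does, that the cron expression is evaluated in UTC and that the zone has a whole-hour offset. The function names `utcHourFor` and `dailyCronUtc` are hypothetical helpers, not part of the workflow.

```javascript
// Convert a local wall-clock hour to the UTC hour used in the cron expression.
// utcOffsetHours is the zone's CURRENT offset from UTC, e.g. -6 for MDT, +1 for BST.
// Assumption: whole-hour offsets only.
function utcHourFor(localHour, utcOffsetHours) {
  // Subtract the offset, then wrap the result into the range 0..23.
  return (((localHour - utcOffsetHours) % 24) + 24) % 24;
}

// Build a daily "minute hour * * *" cron expression that fires at localHour.
function dailyCronUtc(localHour, utcOffsetHours) {
  return `0 ${utcHourFor(localHour, utcOffsetHours)} * * *`;
}

console.log(dailyCronUtc(7, -6)); // prints: 0 13 * * *
```

Offsets change with daylight saving time, so the hour needs adjusting twice a year unless the instance is configured to evaluate the schedule in local time.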

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.
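
The JSON ships with two placeholder values, `REPLACE_WITH_YOUR_SHEET_ID` and `REPLACE_WITH_VOICE_MD_FILE_ID`, that must be replaced before the workflow can run. You can edit them in the n8n UI after import. As an alternative, here is a minimal sketch that fills them in on the parsed JSON first. The helper name `fillPlaceholders`, the trimmed sample object, and the example IDs are all hypothetical.

```javascript
// Recursively copy a parsed workflow object, swapping any string value that
// exactly equals a placeholder key for its replacement. The input is not mutated.
function fillPlaceholders(node, replacements) {
  if (typeof node === 'string') {
    return Object.prototype.hasOwnProperty.call(replacements, node)
      ? replacements[node]
      : node;
  }
  if (Array.isArray(node)) {
    return node.map((child) => fillPlaceholders(child, replacements));
  }
  if (node && typeof node === 'object') {
    return Object.fromEntries(
      Object.entries(node).map(([key, value]) => [key, fillPlaceholders(value, replacements)])
    );
  }
  return node; // numbers, booleans, null
}

// Trimmed-down stand-in for the real workflow JSON (illustration only).
const sample = {
  nodes: [
    { name: 'Read ICP from Sheets',
      parameters: { documentId: { __rl: true, value: 'REPLACE_WITH_YOUR_SHEET_ID', mode: 'id' } } },
    { name: 'Read voice.md from Drive',
      parameters: { fileId: { __rl: true, value: 'REPLACE_WITH_VOICE_MD_FILE_ID', mode: 'id' } } },
  ],
};

const filled = fillPlaceholders(sample, {
  REPLACE_WITH_YOUR_SHEET_ID: 'my-sheet-id',         // hypothetical ID
  REPLACE_WITH_VOICE_MD_FILE_ID: 'my-voice-file-id', // hypothetical ID
});

console.log(JSON.stringify(filled, null, 2));
```

Matching whole string values, rather than doing a text search-and-replace on the raw file, avoids producing invalid JSON if a replacement contains quotes or backslashes.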

{
  "name": "Founder's Discovery Engine",
  "nodes": [
    {
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "cronExpression",
              "expression": "0 13 * * *"
            }
          ]
        }
      },
      "id": "trigger-cron",
      "name": "Cron \u00b7 Daily 7am MDT",
      "type": "n8n-nodes-base.scheduleTrigger",
      "typeVersion": 1.2,
      "position": [
        240,
        200
      ],
      "notes": "Wakes the agent up daily at 7am Boulder/Denver (13:00 UTC during MDT). Adjust UTC values for your timezone \u2014 PT=14, ET=11, UK=06, CET=05."
    },
    {
      "parameters": {
        "httpMethod": "POST",
        "path": "discovery-engine-manual",
        "responseMode": "lastNode"
      },
      "id": "trigger-manual",
      "name": "Manual \u00b7 Webhook",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 2,
      "position": [
        240,
        360
      ],
      "notes": "Manual trigger for testing and live demos. Hit the webhook URL to fire the agent on demand."
    },
    {
      "parameters": {
        "documentId": {
          "__rl": true,
          "value": "REPLACE_WITH_YOUR_SHEET_ID",
          "mode": "id"
        },
        "sheetName": {
          "__rl": true,
          "value": "ICP",
          "mode": "name"
        },
        "options": {}
      },
      "id": "read-icp",
      "name": "Read ICP from Sheets",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.5,
      "position": [
        460,
        280
      ],
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "notes": "CONFIG: ICP tab columns are icp_description, signal_keywords, subreddits (comma-separated). Edit the Sheet to change agent behavior \u2014 no redeploy needed."
    },
    {
      "parameters": {
        "operation": "download",
        "fileId": {
          "__rl": true,
          "value": "REPLACE_WITH_VOICE_MD_FILE_ID",
          "mode": "id"
        },
        "options": {
          "binaryPropertyName": "voiceMd",
          "googleFileConversion": {
            "conversion": {
              "docsToFormat": "text/plain"
            }
          }
        }
      },
      "id": "read-voice",
      "name": "Read voice.md from Drive",
      "type": "n8n-nodes-base.googleDrive",
      "typeVersion": 3,
      "position": [
        460,
        460
      ],
      "credentials": {
        "googleDriveOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "notes": "Pulls the voice.md tone-of-voice file. Edit voice.md in Drive and the agent inherits the change next run."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "// HN + Reddit discovery. ICP-agnostic: context filter derives from the user's\n// own signal_keywords plus a generic-pain vocabulary. No sales jargon hardcoded.\n\nconst icp = $input.first().json;\n\nconst STOP_WORDS = new Set([\n  'the','a','an','and','or','but','for','with','too','very',\n  'of','in','on','at','to','is','our','we','i','my','your',\n  'that','this','these','those','it','as','by','from','be','are','was'\n]);\n\n// Generic pain/intent vocabulary \u2014 domain-agnostic.\n// Drops a Windows firewall comment that just mentions \"outbound\" because\n// it has none of these signals; keeps a \"n8n is too expensive\" complaint.\nconst GENERIC_PAIN = [\n  'expensive','cheap','overpriced','too much','too many',\n  'alternative','replace','replaced','switching','switched','migrating','migrated',\n  'hate','sucks','annoying','frustrating','frustrated',\n  'broken','issue','problem','bug','stuck','headache',\n  'wish','need','want','looking for','any tool','any way','anyone','tried','tried out',\n  'recommend','suggest','evaluating','comparing','vs ','versus',\n  'cost','pricing','price','bill','invoice','seat','per user','per month','quota','credits',\n  'self-host','self host','open source','open-source','free tier'\n];\n\nconst phrases = (icp.signal_keywords || '')\n  .split(',').map(s => s.trim()).filter(Boolean);\n\n// Build MULTI-WORD search queries. HN Algolia and Reddit both treat\n// space as implicit AND, so a query like \"hired SDR\" only returns posts\n// containing BOTH words \u2014 drastically better precision than \"hired\" alone.\n//\n// Strategy per phrase:\n//   - Drop stop words\n//   - Keep up to 2 most-distinctive tokens (longer = more distinctive proxy)\n//   - If only 1 token remains and it's <5 chars or a generic verb, skip\n//     (prevents searching for \"hired\", \"replaced\", \"running\" alone)\nconst GENERIC_VERBS = new Set([\n  'hired','replaced','running','looking','tried','using','used','want','need',\n  'made','built','building','found','find','got','get','make','done','have'\n]);\n\nfunction buildQuery(phrase) {\n  const tokens = phrase.split(/\\s+/)\n    .map(t => t.replace(/[,.;]+$/, ''))\n    .filter(t => t.length > 2 && !STOP_WORDS.has(t.toLowerCase()));\n  if (tokens.length === 0) return null;\n  if (tokens.length === 1) {\n    const t = tokens[0];\n    // Reject solo generic verbs and short common words\n    if (GENERIC_VERBS.has(t.toLowerCase())) return null;\n    if (t.length < 5 && !/[A-Z]/.test(t) && !/\\d/.test(t)) return null;\n    return t;\n  }\n  // Sort by distinctiveness: contains digit/dot/capital first, then by length desc\n  const sorted = [...tokens].sort((a, b) => {\n    const aD = (/[0-9.]/.test(a) ? 2 : 0) + (/[A-Z]/.test(a) ? 1 : 0);\n    const bD = (/[0-9.]/.test(b) ? 2 : 0) + (/[A-Z]/.test(b) ? 1 : 0);\n    if (aD !== bD) return bD - aD;\n    return b.length - a.length;\n  });\n  return sorted.slice(0, 2).join(' ');\n}\n\nconst queries = [];\nfor (const phrase of phrases) {\n  const q = buildQuery(phrase);\n  if (q && !queries.includes(q)) queries.push(q);\n}\nconst keywords = queries.slice(0, 8);\n\n// For quality filter: collect every non-stop token from ICP phrases as required-context vocab.\nconst icpContextSet = new Set();\nfor (const phrase of phrases) {\n  const tokens = phrase.toLowerCase().split(/\\s+/).filter(w => w.length > 2 && !STOP_WORDS.has(w));\n  for (const t of tokens) icpContextSet.add(t);\n}\nconst CONTEXT_WORDS = Array.from(new Set([...icpContextSet, ...GENERIC_PAIN]));\n\nconst subreddits = (icp.Subreddits || icp.subreddits || 'SaaS,Entrepreneur,AI_Agents,ChatGPTCoding,LocalLLaMA')\n  .split(',').map(s => s.trim()).filter(Boolean);\n\nconst diag = {\n  _diagnostic: true,\n  search_terms: keywords,\n  context_words_sample: CONTEXT_WORDS.slice(0, 12),\n  subreddits: subreddits,\n  hn_raw: 0, hn_filtered: 0,\n  reddit_raw: 0, reddit_filtered: 0,\n  results_per_keyword: {},\n  errors: []\n};\n\nif (keywords.length === 0) {\n  diag.step = 'no_keywords';\n  return [{ json: diag }];\n}\n\nconst thirtyDaysAgo = Math.floor(Date.now() / 1000) - (30 * 24 * 3600);\nconst SKIP_DOMAINS = ['ycombinator.com','news.ycombinator.com','reddit.com','old.reddit.com','redd.it','twitter.com','x.com','linkedin.com','github.com','medium.com','substack.com','youtube.com','youtu.be'];\n\nfunction isUsefulHost(host) {\n  if (!host) return false;\n  return !SKIP_DOMAINS.some(d => host === d || host.endsWith('.' + d));\n}\n\n// A multi-word keyword like \"hired SDR\" matches iff EVERY word is present in text.\nfunction keywordMatches(keyword, lowerText) {\n  const words = keyword.toLowerCase().split(/\\s+/);\n  return words.every(w => lowerText.includes(w));\n}\n\nfunction passesQualityCheck(text) {\n  if (!text) return false;\n  const lower = text.toLowerCase();\n  if (text.length < 80) return false;\n\n  const matchedQueries = keywords.filter(k => keywordMatches(k, lower));\n  if (matchedQueries.length === 0) return false;\n\n  // Any multi-word query that fully matched = strong signal (API already ANDed)\n  if (matchedQueries.some(k => k.includes(' '))) return true;\n\n  // 2+ distinct single-word queries matched = strong\n  if (matchedQueries.length >= 2) return true;\n\n  // Single word match \u2014 require ICP-derived context or generic pain signal\n  const hasContext = CONTEXT_WORDS.some(c => lower.includes(c));\n  return hasContext;\n}\n\nfunction deriveSignalType(text, isShowOrAskPost) {\n  const lower = text.toLowerCase();\n  if (lower.includes('hiring') || lower.includes('hired ')) return 'hiring';\n  if (lower.includes('expensive') || lower.includes('overpriced') || lower.includes('too much') || lower.includes('rip off') || lower.includes('hate')) return 'complaint';\n  if (isShowOrAskPost || lower.includes('alternative') || lower.includes('looking for') || lower.includes('anyone') || lower.includes('recommend') || lower.includes('any tool')) return 'tool_ask';\n  return 'pain';\n}\n\nfunction computeScore(text, points, hasCompanyUrl) {\n  const lower = text.toLowerCase();\n  const matchedQueries = keywords.filter(k => keywordMatches(k, lower));\n  const multiWordHits = matchedQueries.filter(k => k.includes(' ')).length;\n  const singleWordHits = matchedQueries.filter(k => !k.includes(' ')).length;\n  const contextHits = CONTEXT_WORDS.filter(c => lower.includes(c)).length;\n\n  let score = 3\n    + Math.min(multiWordHits * 2, 4)   // multi-word match = high precision, big boost\n    + Math.min(singleWordHits, 2)\n    + Math.min(contextHits, 2);\n  if (points > 10) score += 1;\n  if (points > 50) score += 1;\n  if (hasCompanyUrl) score += 1;\n  return Math.min(10, score);\n}\n\nconst allLeads = [];\nconst self = this;\n\n// Helper: bounded HTTP GET with short timeout\nasync function getJson(url, headers, timeoutMs) {\n  return await self.helpers.httpRequest({\n    method: 'GET',\n    url: url,\n    headers: headers || {},\n    json: true,\n    timeout: timeoutMs || 8000\n  });\n}\n\n// \u2500\u2500\u2500 HN via Algolia (parallel) \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nconst hnHits = new Map();\nawait Promise.all(keywords.map(async (keyword) => {\n  const url = 'https://hn.algolia.com/api/v1/search_by_date'\n    + '?query=' + encodeURIComponent(keyword)\n    + '&tags=(story,comment)'\n    + '&numericFilters=created_at_i>' + thirtyDaysAgo\n    + '&hitsPerPage=20';\n  try {\n    const data = await getJson(url, {}, 8000);\n    const hits = (data && data.hits) || [];\n    diag.results_per_keyword['hn:' + keyword] = hits.length;\n    for (const hit of hits) {\n      if (!hnHits.has(hit.objectID)) hnHits.set(hit.objectID, { hit, matchedKeyword: keyword });\n    }\n  } catch (e) {\n    diag.results_per_keyword['hn:' + keyword] = 'ERROR';\n    diag.errors.push('hn:' + keyword + ': ' + e.message);\n  }\n}));\ndiag.hn_raw = hnHits.size;\n\nfor (const { hit, matchedKeyword } of hnHits.values()) {\n  const rawText = hit.story_text || hit.comment_text || hit.title || '';\n  const text = rawText.replace(/<[^>]+>/g, '').trim();\n  if (!passesQualityCheck(text)) continue;\n\n  const tags = hit._tags || [];\n  const isComment = tags.includes('comment');\n  const title = hit.title || '';\n  const isShowHN = /^show hn[:\\s]/i.test(title);\n  const isAskHN = /^ask hn[:\\s]/i.test(title);\n\n  let company = null, companyUrl = null;\n  if (!isComment) {\n    if (isShowHN) {\n      const m = title.match(/^Show HN:\\s*([^\u2013\\-:]+)/i);\n      company = m ? m[1].trim() : title.replace(/^Show HN:\\s*/i, '').trim();\n      companyUrl = hit.url || null;\n    } else if (!isAskHN) {\n      company = title || null;\n      companyUrl = hit.url || null;\n    }\n  }\n\n  if (companyUrl) {\n    try {\n      const u = new URL(companyUrl);\n      if (!isUsefulHost(u.hostname.replace(/^www\\./, ''))) companyUrl = null;\n    } catch (e) { companyUrl = null; }\n  }\n\n  allLeads.push({\n    person: '@' + (hit.author || 'unknown'),\n    signal_type: deriveSignalType(text, isShowHN || isAskHN),\n    source_url: 'https://news.ycombinator.com/item?id=' + hit.objectID,\n    evidence_quote: text.slice(0, 250),\n    score: computeScore(text, hit.points || 0, !!companyUrl),\n    company: company,\n    company_url: companyUrl,\n    matched_keyword: matchedKeyword,\n    source: 'hn',\n    post_type: isComment ? 'comment' : (isShowHN ? 'show_hn' : (isAskHN ? 'ask_hn' : 'story'))\n  });\n}\ndiag.hn_filtered = allLeads.length;\n\n// \u2500\u2500\u2500 Reddit: official hosts + public proxies as fallback \u2500\u2500\u2500\u2500\u2500\n// Reddit blocks cloud-provider IPs aggressively. Public redlib/safereddit/libreddit\n// instances proxy Reddit content and usually accept cloud traffic.\nconst REDDIT_UA = 'Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0';\nconst REDDIT_OFFICIAL = ['old.reddit.com', 'api.reddit.com', 'www.reddit.com'];\nconst REDDIT_PROXIES = ['safereddit.com', 'redlib.catsarch.com', 'libreddit.privacydev.net'];\n\nfunction normalizeRedditPost(post, host) {\n  // Official Reddit returns { data: { children: [{ data: post }] } }; many proxies match.\n  return {\n    id: post.id || post.name || (post.permalink || '').split('/').filter(Boolean).pop(),\n    title: post.title || '',\n    selftext: post.selftext || post.body || '',\n    author: post.author || 'unknown',\n    permalink: post.permalink || '',\n    url: post.url || post.link || '',\n    score: post.score || post.ups || 0,\n    subreddit: post.subreddit || ''\n  };\n}\n\n// Step 1: probe all hosts in parallel against the FIRST subreddit. First host\n// to return >0 posts wins; we reuse it for the rest. Bounded total time \u2248 8s.\nasync function probeRedditHost(firstSub) {\n  const candidates = [\n    ...REDDIT_OFFICIAL.map(h => ({ host: h, suffix: '?limit=100&raw_json=1' })),\n    ...REDDIT_PROXIES.map(h => ({ host: h, suffix: '?limit=100' }))\n  ];\n  const probes = candidates.map(({ host, suffix }) => (async () => {\n    const url = 'https://' + host + '/r/' + encodeURIComponent(firstSub) + '/new.json' + suffix;\n    try {\n      const data = await getJson(url, { 'User-Agent': REDDIT_UA, 'Accept': 'application/json,text/json,*/*' }, 7000);\n      const children = (data && data.data && data.data.children) || [];\n      if (children.length > 0) return { host, posts: children.map(c => normalizeRedditPost(c.data || c, host)) };\n      throw new Error(host + ' empty');\n    } catch (e) {\n      throw new Error(host + ' ' + (e.statusCode || '') + ' ' + (e.message || '').slice(0, 60));\n    }\n  })());\n\n  // Promise.any returns the first to FULFILL. If all reject, throws AggregateError.\n  try {\n    return await Promise.any(probes);\n  } catch (agg) {\n    const errs = (agg && agg.errors) ? agg.errors.map(e => e.message).join(' | ') : 'all hosts rejected';\n    throw new Error('reddit probe failed: ' + errs);\n  }\n}\n\nconst redditHits = new Map();\nlet redditHostUsed = null;\n\nif (subreddits.length > 0) {\n  try {\n    const probe = await probeRedditHost(subreddits[0]);\n    redditHostUsed = probe.host;\n    diag.results_per_keyword['reddit:' + subreddits[0]] = probe.posts.length + ' (via ' + probe.host + ')';\n    for (const post of probe.posts) {\n      if (post.id && !redditHits.has(post.id)) redditHits.set(post.id, { post });\n    }\n  } catch (e) {\n    diag.errors.push('reddit-probe: ' + e.message);\n    diag.results_per_keyword['reddit:probe'] = 'ALL HOSTS FAILED';\n  }\n\n  // Step 2: if we have a working host, fan out remaining subs in parallel.\n  if (redditHostUsed) {\n    const suffix = REDDIT_PROXIES.includes(redditHostUsed) ? '?limit=100' : '?limit=100&raw_json=1';\n    await Promise.all(subreddits.slice(1).map(async (sub) => {\n      const url = 'https://' + redditHostUsed + '/r/' + encodeURIComponent(sub) + '/new.json' + suffix;\n      try {\n        const data = await getJson(url, { 'User-Agent': REDDIT_UA, 'Accept': 'application/json,text/json,*/*' }, 7000);\n        const children = (data && data.data && data.data.children) || [];\n        diag.results_per_keyword['reddit:' + sub] = children.length + ' (via ' + redditHostUsed + ')';\n        for (const child of children) {\n          const post = normalizeRedditPost(child.data || child, redditHostUsed);\n          if (post.id && !redditHits.has(post.id)) redditHits.set(post.id, { post });\n        }\n      } catch (e) {\n        diag.results_per_keyword['reddit:' + sub] = 'ERROR: ' + e.message.slice(0, 120);\n        diag.errors.push('reddit:' + sub + ': ' + e.message);\n      }\n    }));\n  }\n}\ndiag.reddit_raw = redditHits.size;\ndiag.reddit_host_used = redditHostUsed;\n\nconst redditFilteredStart = allLeads.length;\nfor (const { post } of redditHits.values()) {\n  const text = (post.title + '\\n' + post.selftext).trim();\n  if (!passesQualityCheck(text)) continue;\n\n  const lower = text.toLowerCase();\n  const matchedQuery = keywords.find(k => keywordMatches(k, lower)) || 'unknown';\n\n  let companyUrl = null;\n  if (post.url && !post.url.includes('reddit.com')) {\n    try {\n      const u = new URL(post.url);\n      if (isUsefulHost(u.hostname.replace(/^www\\./, ''))) companyUrl = post.url;\n    } catch (e) {}\n  }\n\n  // Always link back to canonical reddit.com so the founder can verify\n  const sourceUrl = post.permalink\n    ? ('https://www.reddit.com' + (post.permalink.startsWith('/') ? '' : '/') + post.permalink)\n    : (post.url || 'https://www.reddit.com/');\n\n  allLeads.push({\n    person: 'u/' + post.author,\n    signal_type: deriveSignalType(text, false),\n    source_url: sourceUrl,\n    evidence_quote: text.slice(0, 250),\n    score: computeScore(text, post.score, !!companyUrl),\n    company: companyUrl ? (post.title || null) : null,\n    company_url: companyUrl,\n    matched_keyword: matchedQuery,\n    source: 'reddit',\n    post_type: 'reddit_post',\n    subreddit: post.subreddit\n  });\n}\ndiag.reddit_filtered = allLeads.length - redditFilteredStart;\n\n// \u2500\u2500\u2500 Lobsters: parallel /hottest + /newest \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\nconst lobstersStories = new Map();\nawait Promise.all(['/hottest.json', '/newest.json'].map(async (path) => {\n  try {\n    const data = await getJson('https://lobste.rs' + path, { 'User-Agent': 'n8n-discovery/1.0', 'Accept': 'application/json' }, 8000);\n    const stories = Array.isArray(data) ? data : (data && data.stories) || [];\n    diag.results_per_keyword['lobsters:' + path] = stories.length;\n    for (const story of stories) {\n      const id = story && (story.short_id || story.short_id_url || story.url);\n      if (!id) continue;\n      if (!lobstersStories.has(id)) lobstersStories.set(id, story);\n    }\n  } catch (e) {\n    diag.results_per_keyword['lobsters:' + path] = 'ERROR: ' + e.message.slice(0, 120);\n    diag.errors.push('lobsters' + path + ': ' + e.message);\n  }\n}));\ndiag.lobsters_raw = lobstersStories.size;\n\nconst lobstersFilteredStart = allLeads.length;\nfor (const story of lobstersStories.values()) {\n  const text = ((story.title || '') + '\\n' + (story.description_plain || story.description || '')).trim();\n  if (!passesQualityCheck(text)) continue;\n\n  const lower = text.toLowerCase();\n  const matchedQuery = keywords.find(k => keywordMatches(k, lower)) || 'unknown';\n\n  let companyUrl = null;\n  if (story.url) {\n    try {\n      const u = new URL(story.url);\n      if (isUsefulHost(u.hostname.replace(/^www\\./, ''))) companyUrl = story.url;\n    } catch (e) {}\n  }\n\n  const author = (story.submitter_user && (story.submitter_user.username || story.submitter_user)) || story.submitter || 'unknown';\n  allLeads.push({\n    person: '@' + author,\n    signal_type: deriveSignalType(text, false),\n    source_url: story.comments_url || story.url || 'https://lobste.rs/',\n    evidence_quote: text.slice(0, 250),\n    score: computeScore(text, story.score || 0, !!companyUrl),\n    company: companyUrl ? (story.title || null) : null,\n    company_url: companyUrl,\n    matched_keyword: matchedQuery,\n    source: 'lobsters',\n    post_type: 'lobsters_story'\n  });\n}\ndiag.lobsters_filtered = allLeads.length - lobstersFilteredStart;\n\nconsole.log('Discovery: HN ' + diag.hn_raw + '\u2192' + diag.hn_filtered\n  + ' | Reddit ' + diag.reddit_raw + '\u2192' + diag.reddit_filtered + ' (' + (diag.reddit_host_used || 'none') + ')'\n  + ' | Lobsters ' + diag.lobsters_raw + '\u2192' + diag.lobsters_filtered\n  + ' | total ' + allLeads.length);\nconsole.log('Per-source detail: ' + JSON.stringify(diag.results_per_keyword, null, 2));\nif (diag.errors.length) console.log('Errors: ' + JSON.stringify(diag.errors, null, 2));\n\nif (allLeads.length === 0) {\n  diag.step = 'no_quality_hits';\n  return [{ json: diag }];\n}\n\nallLeads.sort((a, b) => (b.score || 0) - (a.score || 0));\nreturn allLeads.map(lead => ({ json: lead }));\n"
      },
      "id": "build-discovery-body",
      "name": "Build Discovery Body",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        700,
        360
      ],
      "notes": "Pulls leads from HN (Algolia) + Reddit (with cloud-IP fallback) + Lobsters with quality filtering. Replaces the old Anthropic-driven web_search Discovery node. See n8n/README.md for full diagnostic guide."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "// REPLACES the body of \"Parse \u00b7 Extract qualified leads\" Code node.\n// Since Build Discovery Body now emits structured leads directly (via HN Algolia),\n// Parse becomes a thin score filter. Fallback kept as safety net.\n\nconst items = $input.all().map(i => i.json);\nconst qualified = items.filter(l => (l.score ?? 0) >= 4);\n\nconst fallbackLeads = [\n  {\n    person: '@FloorEgg',\n    signal_type: 'pain',\n    source_url: 'https://news.ycombinator.com/item?id=46346648',\n    evidence_quote: 'Outbound rarely works for a custom software dev studio unless you go extremely niche and have a way to target customers with relevant needs.',\n    score: 8,\n    company: 'HN thread - outbound sales resources',\n    company_url: 'https://news.ycombinator.com/item?id=46346648'\n  },\n  {\n    person: '@aleksam',\n    signal_type: 'tool_ask',\n    source_url: 'https://news.ycombinator.com/item?id=45973912',\n    evidence_quote: 'Are we even making money on outbound? No one ever knew and it was always a never-ending discussion.',\n    score: 7,\n    company: 'Dealmayker',\n    company_url: 'https://dealmayker.com'\n  },\n  {\n    person: '@Greateste',\n    signal_type: 'tool_ask',\n    source_url: 'https://news.ycombinator.com/item?id=46700164',\n    evidence_quote: 'SDRs spend hours researching. Then they send generic outreach that gets ignored.',\n    score: 7,\n    company: 'Prospecter',\n    company_url: 'https://www.prospecter.io'\n  }\n];\n\nconst finalLeads = qualified.length > 0 ? qualified : fallbackLeads;\nconsole.log('Parse: ' + qualified.length + ' qualified from HN, using ' + (finalLeads === qualified ? 'live' : 'fallback') + ' leads');\n\nreturn finalLeads.map(lead => ({ json: lead }));\n"
      },
      "id": "parse-leads",
      "name": "Parse \u00b7 Extract qualified leads",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        920,
        360
      ],
      "notes": "Thin score filter over the structured leads emitted by Build Discovery Body: keeps items with score >= 4. If none qualify, falls back to three hardcoded sample leads."
    },
    {
      "parameters": {
        "documentId": {
          "__rl": true,
          "value": "REPLACE_WITH_YOUR_SHEET_ID",
          "mode": "id"
        },
        "sheetName": {
          "__rl": true,
          "value": "Sent",
          "mode": "name"
        },
        "options": {}
      },
      "id": "read-sent-log",
      "name": "Read Sent log",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.5,
      "position": [
        1140,
        280
      ],
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "notes": "Idempotency: read the Sent log so we can dedup against already-contacted people."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "// Idempotent dedup \u2014 drop any lead already in the Sent log.\n// The Sent log is the second input branch; the leads come in as the primary stream.\n\nconst leads = $input.all().map(i => i.json);\nconst sentRows = $('Read Sent log').all().map(i => i.json);\n\n// Build a Set of identifiers we've already contacted.\n// We dedup on source_url AND on person handle, whichever is present.\nconst contactedUrls    = new Set(sentRows.map(r => r.source_url).filter(Boolean));\nconst contactedHandles = new Set(sentRows.map(r => r.person).filter(Boolean));\n\nconst fresh = leads.filter(lead => {\n  if (lead.source_url && contactedUrls.has(lead.source_url)) return false;\n  if (lead.person     && contactedHandles.has(lead.person))    return false;\n  return true;\n});\n\n// Keep top 5 by score \u2014 progressive enrichment gates the expensive step.\nfresh.sort((a, b) => (b.score ?? 0) - (a.score ?? 0));\nconst top5 = fresh.slice(0, 5);\n\nreturn top5.map(lead => ({ json: lead }));\n"
      },
      "id": "dedup",
      "name": "Dedup \u00b7 top 5 fresh leads",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1360,
        360
      ],
      "notes": "Idempotency primitive. Filters out already-contacted leads. Keeps top 5 by score for the expensive enrichment step."
    },
    {
      "parameters": {
        "url": "=https://api.firecrawl.dev/v1/scrape",
        "method": "POST",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={\n  \"url\": \"{{ $json.company_url || $json.source_url }}\",\n  \"formats\": [\"markdown\"],\n  \"onlyMainContent\": true\n}",
        "options": {
          "timeout": 30000
        }
      },
      "id": "firecrawl-enrich",
      "name": "Firecrawl \u00b7 enrich top 5",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        1580,
        360
      ],
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "notes": "Progressive enrichment: only the top 5 leads get a Firecrawl scrape. Firecrawl is open source (AGPL-3.0); self-host it or use Firecrawl Cloud."
    },
    {
      "parameters": {
        "url": "https://api.anthropic.com/v1/messages",
        "method": "POST",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "anthropic-version",
              "value": "2023-06-01"
            },
            {
              "name": "content-type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ $json.body }}",
        "options": {}
      },
      "id": "summarize-company",
      "name": "Summarize \u00b7 Haiku 4.5",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        1800,
        360
      ],
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "notes": "Cheap classifier extracts a 2-sentence company context. Cascade pattern \u2014 Haiku for summary, Sonnet for drafting."
    },
    {
      "parameters": {
        "url": "https://api.anthropic.com/v1/messages",
        "method": "POST",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "anthropic-version",
              "value": "2023-06-01"
            },
            {
              "name": "content-type",
              "value": "application/json"
            },
            {
              "name": "anthropic-beta",
              "value": "prompt-caching-2024-07-31"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ $json.body }}",
        "options": {}
      },
      "id": "draft-email",
      "name": "Draft \u00b7 Sonnet 4.6 + voice.md",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        2020,
        360
      ],
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "notes": "Sub-agent #2 \u2014 Sonnet 4.6 drafts the personalized email using voice.md as cached context. The only premium-token step in the workflow."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "// REPLACES Parse \u00b7 Subject + Body. Forwards sendTo + appends candidates footer to body.\n\nconst draftItems = $input.all();\n\nlet buildDraftItems = [];\ntry {\n  buildDraftItems = $('Build Draft Body').all();\n} catch (e) {\n  console.log('Build Draft Body lookup failed:', e.message);\n}\n\nconst outputs = draftItems.map((draftItem, idx) => {\n  const response = draftItem.json;\n  const text = ((response.content && response.content[0] && response.content[0].text) || '').trim();\n\n  const subjectMatch = text.match(/^SUBJECT:\\s*(.+?)\\s*\\n/);\n  const bodyMatch = text.match(/BODY:\\s*\\n([\\s\\S]+)$/);\n\n  const subject = subjectMatch ? subjectMatch[1].trim() : 're: a quick question';\n  let body = bodyMatch ? bodyMatch[1].trim() : text;\n\n  const upstream = (buildDraftItems[idx] && buildDraftItems[idx].json) || {};\n  const lead = upstream.lead || {};\n  const sendTo = upstream.sendTo || 'TODO-resolve-recipient@placeholder.example';\n  const sendToSource = upstream.sendToSource || 'unresolved';\n  const sendToCandidates = upstream.sendToCandidates || [];\n  const footerNote = upstream.footerNote || '';\n\n  // Append recipient hint footer so founder can swap before sending\n  if (footerNote) body = body + footerNote;\n\n  return { json: { subject, body, lead, sendTo, sendToSource, sendToCandidates } };\n});\n\nreturn outputs;\n"
      },
      "id": "parse-draft",
      "name": "Parse \u00b7 Subject + Body",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        2240,
        360
      ],
      "notes": "Splits the Claude SUBJECT/BODY response into structured fields for the Gmail node."
    },
    {
      "parameters": {
        "resource": "draft",
        "operation": "create",
        "subject": "={{ $json.subject }}",
        "message": "={{ $json.body }}",
        "options": {
          "sendTo": "={{ $json.sendTo }}"
        }
      },
      "id": "gmail-create-draft",
      "name": "Gmail \u00b7 createDraft (HITL gate)",
      "type": "n8n-nodes-base.gmail",
      "typeVersion": 2.1,
      "position": [
        2460,
        360
      ],
      "credentials": {
        "gmailOAuth2": {
          "name": "<your credential>"
        }
      },
      "notes": "HITL GATE \u2014 never sends. Drops draft in Drafts folder. The To field reads $json.sendTo which is resolved by Build Draft Body (real email if found in HN profile, otherwise <handle>@verify-on-hn.example placeholder)."
    },
    {
      "parameters": {
        "documentId": {
          "__rl": true,
          "value": "REPLACE_WITH_YOUR_SHEET_ID",
          "mode": "id"
        },
        "sheetName": {
          "__rl": true,
          "value": "Sent",
          "mode": "name"
        },
        "columns": {
          "mappingMode": "defineBelow",
          "value": {
            "date": "={{ new Date().toISOString().slice(0,10) }}",
            "person": "={{ $json.lead.person }}",
            "signal_type": "={{ $json.lead.signal_type }}",
            "source_url": "={{ $json.lead.source_url }}",
            "score": "={{ $json.lead.score }}",
            "draft_subject": "={{ $json.subject }}",
            "send_to": "={{ $json.sendTo }}",
            "status": "pending_review"
          }
        }
      },
      "id": "log-sent",
      "name": "Append \u00b7 Sent log",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.5,
      "position": [
        2680,
        360
      ],
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "notes": "Logs the draft to the Sent sheet with status=pending_review. Founder updates to 'sent' after approving and sending the Gmail draft."
    },
    {
      "parameters": {
        "url": "https://api.anthropic.com/v1/messages",
        "method": "POST",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "anthropic-version",
              "value": "2023-06-01"
            },
            {
              "name": "content-type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ $json.body }}",
        "options": {}
      },
      "id": "draft-digest",
      "name": "Digest \u00b7 Sonnet 4.6",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        2900,
        200
      ],
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "notes": "Sub-agent #3 \u2014 generates the morning digest summary. Could be moved to Anthropic batch API for 50% off since it's not real-time."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "const response = $input.first().json;\nconst text = (response.content?.[0]?.text || '').trim();\n\nconst subjectMatch = text.match(/^SUBJECT:\\s*(.+?)\\s*\\n/i) || text.match(/^Subject:\\s*(.+?)\\s*\\n/);\nconst bodyMatch    = text.match(/(?:BODY:|Body:)\\s*\\n([\\s\\S]+)$/i);\n\nconst subject = subjectMatch ? subjectMatch[1].trim() : 'Discovery Pulse \u2014 ' + new Date().toISOString().slice(0,10);\nconst body    = bodyMatch ? bodyMatch[1].trim() : text;\n\nreturn [{ json: { subject, body } }];\n"
      },
      "id": "parse-digest",
      "name": "Parse \u00b7 Digest fields",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        3120,
        200
      ]
    },
    {
      "parameters": {
        "resource": "message",
        "operation": "send",
        "subject": "={{ $json.subject }}",
        "message": "={{ $json.body }}",
        "options": {
          "sendTo": "REPLACE_WITH_YOUR_EMAIL@example.com"
        }
      },
      "id": "gmail-send-digest",
      "name": "Gmail \u00b7 Send digest to founder",
      "type": "n8n-nodes-base.gmail",
      "typeVersion": 2.1,
      "position": [
        3340,
        200
      ],
      "credentials": {
        "gmailOAuth2": {
          "name": "<your credential>"
        }
      },
      "notes": "Sends the digest email to YOU. The only place the agent actually sends \u2014 and it's only sending to you, not to prospects."
    },
    {
      "parameters": {
        "documentId": {
          "__rl": true,
          "value": "REPLACE_WITH_YOUR_SHEET_ID",
          "mode": "id"
        },
        "sheetName": {
          "__rl": true,
          "value": "Runs",
          "mode": "name"
        },
        "columns": {
          "mappingMode": "defineBelow",
          "value": {
            "date": "={{ new Date().toISOString().slice(0,10) }}",
            "leads_found": "={{ $('Build Discovery Body').all().length }}",
            "qualified": "={{ $('Parse \u00b7 Extract qualified leads').all().length }}",
            "drafts": "={{ $('Parse \u00b7 Subject + Body').all().length }}",
            "errors": "0",
            "notes": "auto"
          }
        }
      },
      "id": "log-run",
      "name": "Append \u00b7 Runs audit log",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 4.5,
      "position": [
        3560,
        200
      ],
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "notes": "Audit log \u2014 every run appends a row. This is your trust-building primitive. After 2 weeks of clean runs, promote the agent up the trust ladder."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "// REPLACES Build Summarize Body code. Characterizes the HN person, not a company.\n\nconst firecrawlItems = $input.all();\n\nlet dedupItems = [];\ntry {\n  dedupItems = $('Dedup \u00b7 top 5 fresh leads').all();\n} catch (e) {\n  console.log('Dedup node lookup failed:', e.message);\n}\n\nconst outputs = firecrawlItems.map((fcItem, idx) => {\n  const firecrawl = fcItem.json;\n  const markdown = (firecrawl.data && firecrawl.data.markdown) ? firecrawl.data.markdown : '';\n  const lead = (dedupItems[idx] && dedupItems[idx].json) || {};\n\n  const body = {\n    model: 'claude-haiku-4-5',\n    max_tokens: 300,\n    system: 'You analyze a single Hacker News post or comment to extract context about the person who wrote it. Output 2 plain-text sentences: (1) what they were discussing or working on, (2) any signal about their role, project, or pain. No JSON, no markdown, no preamble. If the post is too thin to characterize, say so in 1 sentence.',\n    messages: [\n      {\n        role: 'user',\n        content: 'HN post or comment by ' + (lead.person || 'unknown') + ':\\n\\n'\n          + markdown.slice(0, 3000)\n          + '\\n\\nWrite 2 sentences capturing what this person is working on and what their signal suggests about their potential pain or interest.'\n      }\n    ]\n  };\n\n  return { json: { body, lead } };\n});\n\nreturn outputs;\n"
      },
      "id": "build-summarize-body",
      "name": "Build Summarize Body",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1690,
        360
      ],
      "notes": "Builds the Anthropic Haiku body for per-lead 2-sentence summary. Carries lead through to Build Draft Body via paired output."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "// REPLACES Build Draft Body. Uses this.helpers.httpRequest (fetch unavailable in Code sandbox).\n\nconst summaryItems = $input.all();\n\nlet buildSummItems = [];\ntry {\n  buildSummItems = $('Build Summarize Body').all();\n} catch (e) {\n  console.log('Build Summarize Body lookup failed:', e.message);\n}\n\nconst DEFAULT_VOICE = 'Friendly, direct, founder-to-founder tone. Short sentences. No corporate speak. Reference what they posted specifically. Always end with a soft ask for a 15-min call.';\n\nfunction looksLikePlainText(s) {\n  if (!s || s.length < 20) return false;\n  // DOCX/zip signature\n  if (s.charCodeAt(0) === 0x50 && s.charCodeAt(1) === 0x4B) return false;\n  // Count high-bit-clobbering binary bytes outside normal text range\n  let bad = 0;\n  const sample = s.slice(0, 2000);\n  for (let i = 0; i < sample.length; i++) {\n    const c = sample.charCodeAt(i);\n    if (c === 9 || c === 10 || c === 13) continue;\n    if (c < 32 || c === 0xFFFD) bad++;\n  }\n  return (bad / sample.length) < 0.05;\n}\n\nlet voiceMd = DEFAULT_VOICE;\ntry {\n  const voiceItem = $('Read voice.md from Drive').first();\n  const voiceBuffer = voiceItem && voiceItem.binary && voiceItem.binary.voiceMd;\n  if (voiceBuffer) {\n    const decoded = Buffer.from(voiceBuffer.data, 'base64').toString('utf-8');\n    if (looksLikePlainText(decoded)) {\n      voiceMd = decoded;\n    } else {\n      console.log('voice.md is binary/DOCX (re-upload as plain .md). Using default voice.');\n    }\n  }\n} catch (e) {\n  console.log('voice.md not reachable, using default voice:', e.message);\n}\n\nconst SKIP_DOMAINS = [\n  'ycombinator.com', 'news.ycombinator.com', 'reddit.com', 'old.reddit.com',\n  'twitter.com', 'x.com', 'linkedin.com', 'github.com',\n  'medium.com', 'substack.com', 'youtube.com', 'youtu.be'\n];\n\nfunction isUsefulHost(host) {\n  if (!host) return false;\n  return !SKIP_DOMAINS.some(d => host === d || host.endsWith('.' 
+ d));\n}\n\nfunction candidatesFromDomain(host) {\n  return ['hello@' + host, 'info@' + host, 'contact@' + host, 'team@' + host];\n}\n\nasync function fetchHnProfile(handle) {\n  if (!handle) return null;\n  try {\n    return await this.helpers.httpRequest({\n      method: 'GET',\n      url: 'https://hacker-news.firebaseio.com/v0/user/' + encodeURIComponent(handle) + '.json',\n      json: true\n    });\n  } catch (e) {\n    console.log('HN profile fetch failed for', handle, ':', e.message);\n    return null;\n  }\n}\n\nasync function fetchRedditProfile(handle) {\n  if (!handle) return null;\n  try {\n    const data = await this.helpers.httpRequest({\n      method: 'GET',\n      url: 'https://www.reddit.com/user/' + encodeURIComponent(handle) + '/about.json',\n      headers: { 'User-Agent': 'discovery-engine/1.0' },\n      json: true\n    });\n    return data && data.data ? data.data : null;\n  } catch (e) {\n    console.log('Reddit profile fetch failed for', handle, ':', e.message);\n    return null;\n  }\n}\n\nfunction extractFromText(text) {\n  if (!text) return { email: null, url: null };\n  const stripped = text.replace(/<[^>]+>/g, ' ');\n  const emailMatch = stripped.match(/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}/);\n  const urlMatch = stripped.match(/https?:\\/\\/[^\\s<>\"')]+/);\n  return {\n    email: emailMatch ? emailMatch[0] : null,\n    url: urlMatch ? 
urlMatch[0] : null\n  };\n}\n\nconst self = this;\n\nasync function enrichSendTo(lead) {\n  const handle = (lead.person || '').replace(/^@/, '').replace(/^u\\//, '').replace(/[^a-zA-Z0-9_-]/g, '');\n  const sourceUrl = lead.source_url || '';\n  const isHN = sourceUrl.includes('ycombinator.com');\n  const isReddit = sourceUrl.includes('reddit.com');\n\n  const result = { sendTo: 'TODO-resolve-recipient@placeholder.example', source: 'unresolved', candidates: [], profileUrl: null };\n\n  // Build profile URL for visibility (always shown to founder in footer)\n  if (isHN && handle) result.profileUrl = 'https://news.ycombinator.com/user?id=' + handle;\n  else if (isReddit && handle) result.profileUrl = 'https://www.reddit.com/user/' + handle;\n\n  // Step 1: lead's company_url (Show HN, Reddit external link, etc.)\n  if (lead.company_url) {\n    try {\n      const u = new URL(lead.company_url);\n      const host = u.hostname.replace(/^www\\./, '');\n      if (isUsefulHost(host)) {\n        const cands = candidatesFromDomain(host);\n        result.sendTo = cands[0];\n        result.source = 'lead_company_url';\n        result.candidates = cands;\n        return result;\n      }\n    } catch (e) {}\n  }\n\n  // Step 2: scrape profile to find email or url\n  let profileText = '';\n  if (isHN && handle) {\n    const profile = await fetchHnProfile.call(self, handle);\n    if (profile && profile.about) profileText = profile.about;\n  } else if (isReddit && handle) {\n    const profile = await fetchRedditProfile.call(self, handle);\n    if (profile) {\n      profileText = (profile.subreddit && profile.subreddit.public_description) || profile.public_description || '';\n    }\n  }\n\n  const extracted = extractFromText(profileText);\n\n  if (extracted.email) {\n    result.sendTo = extracted.email;\n    result.source = 'profile_email';\n    result.candidates = [extracted.email];\n    return result;\n  }\n\n  if (extracted.url) {\n    try {\n      const u = new 
URL(extracted.url);\n      const host = u.hostname.replace(/^www\\./, '');\n      if (isUsefulHost(host)) {\n        const cands = candidatesFromDomain(host);\n        result.sendTo = cands[0];\n        result.source = 'profile_url';\n        result.candidates = cands;\n        return result;\n      }\n    } catch (e) {}\n  }\n\n  // Step 3: handle-based placeholder. Surface profile URL prominently so the\n  // founder can open it and verify before sending.\n  if (handle) {\n    result.sendTo = handle + '@verify-on-' + (isHN ? 'hn' : (isReddit ? 'reddit' : 'profile')) + '.example';\n    result.source = 'handle_placeholder_no_contact';\n    const cands = [result.sendTo];\n    if (result.profileUrl) cands.push(result.profileUrl);\n    result.candidates = cands;\n  }\n  return result;\n}\n\nconst outputs = await Promise.all(summaryItems.map(async (summItem, idx) => {\n  const summary = summItem.json;\n  const summaryText = (summary.content && summary.content[0] && summary.content[0].text) ? 
summary.content[0].text : '';\n  const lead = (buildSummItems[idx] && buildSummItems[idx].json && buildSummItems[idx].json.lead) || {};\n\n  const recipient = await enrichSendTo(lead);\n  console.log('Lead ' + idx + ' [' + (lead.person || 'unknown') + '] sendTo: ' + recipient.sendTo + ' (' + recipient.source + ')');\n\n  const footerLines = [];\n  footerLines.push('Source post: ' + (lead.source_url || 'n/a'));\n  if (recipient.profileUrl) footerLines.push('Profile (verify before sending): ' + recipient.profileUrl);\n  footerLines.push('Resolved via: ' + recipient.source);\n  if (recipient.candidates.length > 1) {\n    footerLines.push('Recipient candidates: ' + recipient.candidates.join(', '));\n  }\n  const footerNote = '\\n\\n---\\n' + footerLines.join('\\n');\n\n  const body = {\n    model: 'claude-sonnet-4-6',\n    max_tokens: 600,\n    system: [\n      {\n        type: 'text',\n        text: [\n          'You write customer-discovery emails for a lean startup founder.',\n          'Goal: a 15-min INTERVIEW ASK, not a pitch.',\n          'Length: 70-100 words. No more.',\n          'Tone: peer-to-peer, plainspoken, slightly skeptical. Like one builder talking to another over DM.',\n          '',\n          'STRICT RULES \u2014 DO NOT VIOLATE:',\n          '\u2022 NO opener phrases: \"I hope this finds you well\", \"I came across your post and...\", \"Just wanted to reach out\", \"Love what you\\'re doing\", \"I\\'m a huge fan\", \"Quick question\", \"I noticed\".',\n          '\u2022 NO buzzwords: \"synergy\", \"leverage\", \"circle back\", \"deep dive\", \"passionate\", \"exciting\", \"innovative\", \"game-changer\", \"level up\", \"unlock\".',\n          '\u2022 NO compliment-sandwiches. No flattery before the ask.',\n          '\u2022 NO \"let me know if interested\" or \"would love to chat\" \u2014 be specific about the ask.',\n          '\u2022 NO emojis. 
NO exclamation marks unless quoting the prospect.',\n          '\u2022 NO \"I\\'m building X to solve Y\" pitch structure.',\n          '',\n          'STRUCTURE:',\n          '1. First sentence: reference exactly what they posted, in their language. Quote a fragment if useful.',\n          '2. Second sentence: name the specific thing you\\'re trying to learn from them (not \"your thoughts\" \u2014 something concrete).',\n          '3. Third sentence: the ask. \"15 min next week?\" or \"happy to send 3 questions over email if a call is too much.\"',\n          '4. Sign-off: just first name. No title, no company line, no postscript.',\n          '',\n          'If voice.md content is provided below and looks like real writing, mirror its rhythm and word choice.'\n        ].join('\\n')\n      },\n      {\n        type: 'text',\n        text: 'voice.md contents:\\n\\n' + voiceMd,\n        cache_control: { type: 'ephemeral' }\n      }\n    ],\n    messages: [\n      {\n        role: 'user',\n        content: 'Draft the email.\\n\\n'\n          + 'Person: ' + (lead.person || 'unknown') + '\\n'\n          + 'Company (if applicable): ' + (lead.company || 'n/a') + '\\n'\n          + 'Signal type: ' + (lead.signal_type || '') + '\\n'\n          + 'What they posted (verbatim, may include HTML entities): ' + (lead.evidence_quote || '') + '\\n'\n          + 'Source URL: ' + (lead.source_url || '') + '\\n'\n          + 'Two-sentence context: ' + summaryText + '\\n\\n'\n          + 'Return ONLY this format, nothing else:\\n'\n          + 'SUBJECT: [subject \u2014 6 words max, lowercase if it fits the voice, no \"re:\" prefix unless quoting them]\\n'\n          + 'BODY:\\n'\n          + '[email body \u2014 open by referencing their post directly, then your specific question, then the soft 15-min ask]'\n      }\n    ]\n  };\n\n  return { json: { body, lead, sendTo: recipient.sendTo, sendToSource: recipient.source, sendToCandidates: recipient.candidates, footerNote } 
};\n}));\n\nreturn outputs;\n"
      },
      "id": "build-draft-body",
      "name": "Build Draft Body",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1910,
        360
      ],
      "notes": "Builds the Sonnet draft body. Detects binary voice.md, scrapes HN/Reddit profile for recipient email, falls back to <handle>@verify-on-hn.example placeholder."
    },
    {
      "parameters": {
        "language": "javaScript",
        "jsCode": "// REPLACES the Digest \u00b7 Sonnet 4.6 HTTP-node body construction.\n// Convert the existing \"Digest \u00b7 Sonnet 4.6\" HTTP node OR add a Code node\n// before it that builds the body, then point the HTTP node at $json.body.\n//\n// This pulls actual lead details (person, signal, evidence, sendTo) into the\n// digest prompt so the email tells the founder WHAT to review, not just counts.\n\nlet parseSubjectItems = [];\nlet buildDraftItems = [];\nlet buildDiscoveryItems = [];\nlet parseExtractItems = [];\n\ntry { parseSubjectItems   = $('Parse \u00b7 Subject + Body').all(); } catch (e) {}\ntry { buildDraftItems     = $('Build Draft Body').all(); } catch (e) {}\ntry { buildDiscoveryItems = $('Build Discovery Body').all(); } catch (e) {}\ntry { parseExtractItems   = $('Parse \u00b7 Extract qualified leads').all(); } catch (e) {}\n\nconst today = new Date().toISOString().slice(0, 10);\nconst totalDiscovered = buildDiscoveryItems.length;\nconst totalQualified = parseExtractItems.length;\nconst totalDrafted = parseSubjectItems.length;\n\n// Source breakdown across the drafts that actually got created (hn / reddit / lobsters)\nconst sourceCounts = {};\nfor (const item of parseSubjectItems) {\n  const src = (item.json && item.json.lead && item.json.lead.source) || 'unknown';\n  sourceCounts[src] = (sourceCounts[src] || 0) + 1;\n}\nconst sourceBreakdown = Object.entries(sourceCounts)\n  .map(([s, n]) => n + ' from ' + s)\n  .join(', ') || 'none';\n\n// Build a per-draft summary block.\nconst draftSummaries = parseSubjectItems.map((item, idx) => {\n  const j = item.json || {};\n  const lead = j.lead || {};\n  const upstream = (buildDraftItems[idx] && buildDraftItems[idx].json) || {};\n  const sendTo = j.sendTo || upstream.sendTo || 'unresolved';\n  const sendToSource = j.sendToSource || upstream.sendToSource || 'unresolved';\n  const profileUrl = (upstream.sendToCandidates || []).find(c => /^https?:\\/\\//.test(c)) || '';\n  return [\n    '\u2014 Lead ' 
+ (idx + 1) + ': ' + (lead.person || 'unknown'),\n    '  Signal: ' + (lead.signal_type || 'n/a') + ' (score ' + (lead.score || 'n/a') + ', via ' + (lead.source || 'n/a') + ')',\n    '  Evidence: \"' + ((lead.evidence_quote || '').slice(0, 180)).replace(/\\s+/g, ' ').trim() + '\"',\n    '  Source post: ' + (lead.source_url || 'n/a'),\n    '  Draft sendTo: ' + sendTo + ' (resolved via ' + sendToSource + ')',\n    profileUrl ? '  Profile to verify: ' + profileUrl : '',\n    '  Subject: ' + (j.subject || '(missing)')\n  ].filter(Boolean).join('\\n');\n}).join('\\n\\n');\n\nconst userContent = 'You are writing the morning digest email to the founder. Today is '\n  + today + '.\\n\\n'\n  + 'Pipeline run summary:\\n'\n  + '  \u2022 Discovered: ' + totalDiscovered + ' raw leads (HN + Reddit + Lobsters)\\n'\n  + '  \u2022 Qualified (score \u2265 4): ' + totalQualified + '\\n'\n  + '  \u2022 Drafts created in Gmail: ' + totalDrafted + ' (' + sourceBreakdown + ')\\n\\n'\n  + 'Each draft below is sitting in the founder\\'s Gmail Drafts folder, awaiting review:\\n\\n'\n  + (draftSummaries || '(no drafts created today)') + '\\n\\n'\n  + 'Write a tight, friendly digest email. Format:\\n'\n  + 'SUBJECT: Discovery pulse \u2014 ' + today + ' \u2014 N drafts ready\\n'\n  + 'BODY:\\n'\n  + 'Open with one sentence on what got drafted. Then list each draft with the person, '\n  + 'a one-line summary of why they\\'re a good lead (use the evidence quote), and the '\n  + 'verification step the founder should take (profile URL or recipient candidate). '\n  + 'Close with: \"Approve drafts before 5pm.\" No corporate fluff.';\n\nconst body = {\n  model: 'claude-sonnet-4-6',\n  max_tokens: 800,\n  system: 'You write a brief daily digest email for a founder running a customer-discovery agent. Friendly, specific, actionable. 
Always include the per-lead evidence and verification steps so the founder can review without leaving the email.',\n  messages: [\n    { role: 'user', content: userContent }\n  ]\n};\n\nreturn [{ json: { body, totalDiscovered, totalQualified, totalDrafted } }];\n"
      },
      "id": "build-digest-body",
      "name": "Build Digest Body",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        2680,
        200
      ],
      "notes": "Builds digest prompt with per-lead evidence + recipient verification info. Replaces the old jsonBody expression that referenced the deleted Discovery node."
    }
  ],
  "connections": {
    "Cron \u00b7 Daily 7am MDT": {
      "main": [
        [
          {
            "node": "Read ICP from Sheets",
            "type": "main",
            "index": 0
          },
          {
            "node": "Read voice.md from Drive",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Manual \u00b7 Webhook": {
      "main": [
        [
          {
            "node": "Read ICP from Sheets",
            "type": "main",
            "index": 0
          },
          {
            "node": "Read voice.md from Drive",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Read ICP from Sheets": {
      "main": [
        [
          {
            "node": "Build Discovery Body",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse \u00b7 Extract qualified leads": {
      "main": [
        [
          {
            "node": "Read Sent log",
            "type": "main",
            "index": 0
          },
          {
            "node": "Dedup \u00b7 top 5 fresh leads",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Read Sent log": {
      "main": [
        [
          {
            "node": "Dedup \u00b7 top 5 fresh leads",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Dedup \u00b7 top 5 fresh leads": {
      "main": [
        [
          {
            "node": "Firecrawl \u00b7 enrich top 5",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Firecrawl \u00b7 enrich top 5": {
      "main": [
        [
          {
            "node": "Build Summarize Body",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Summarize \u00b7 Haiku 4.5": {
      "main": [
        [
          {
            "node": "Build Draft Body",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Draft \u00b7 Sonnet 4.6 + voice.md": {
      "main": [
        [
          {
            "node": "Parse \u00b7 Subject + Body",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse \u00b7 Subject + Body": {
      "main": [
        [
          {
            "node": "Gmail \u00b7 createDraft (HITL gate)",
            "type": "main",
            "index": 0
          },
          {
            "node": "Append \u00b7 Sent log",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Gmail \u00b7 createDraft (HITL gate)": {
      "main": [
        [
          {
            "node": "Build Digest Body",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Digest \u00b7 Sonnet 4.6": {
      "main": [
        [
          {
            "node": "Parse \u00b7 Digest fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse \u00b7 Digest fields": {
      "main": [
        [
          {
            "node": "Gmail \u00b7 Send digest to founder",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Gmail \u00b7 Send digest to founder": {
      "main": [
        [
          {
            "node": "Append \u00b7 Runs audit log",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Build Discovery Body": {
      "main": [
        [
          {
            "node": "Parse \u00b7 Extract qualified leads",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Build Summarize Body": {
      "main": [
        [
          {
            "node": "Summarize \u00b7 Haiku 4.5",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Build Draft Body": {
      "main": [
        [
          {
            "node": "Draft \u00b7 Sonnet 4.6 + voice.md",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Build Digest Body": {
      "main": [
        [
          {
            "node": "Digest \u00b7 Sonnet 4.6",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1",
    "errorWorkflow": ""
  },
  "versionId": "bsw-growth-agent-v1.0.0",
  "id": "bsw-growth-agent",
  "meta": {
    "templateCredsSetupCompleted": false,
    "description": "The Founder's Discovery Engine \u2014 built live at Boulder Startup Week 2026 by Sophia Stein (AI Architect). Listens for ICP signals on HN/Reddit/Product Hunt, drafts personalized customer-discovery emails in your voice via voice.md, drops them in Gmail Drafts (never sends), follows up after 5 days. MIT licensed. Repo: github.com/sudosoph/bsw26-agentic-workflows. Newsletter: agenticarchitect.ai/blog"
  },
  "tags": [
    {
      "name": "agentic-workflow"
    },
    {
      "name": "customer-discovery"
    },
    {
      "name": "bsw-2026"
    },
    {
      "name": "open-source"
    }
  ]
}
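
The two Parse nodes in the JSON above (Parse · Subject + Body and Parse · Digest fields) both rely on the model replying in a `SUBJECT:` / `BODY:` layout. The snippet below is a simplified, standalone sketch of that parsing so you can test a response outside n8n. The function name `parseDraft` and its `fallbackSubject` argument are illustrative and do not appear in the workflow.

```javascript
// Sketch of the SUBJECT/BODY parsing used by the Parse nodes (simplified).
// parseDraft and fallbackSubject are illustrative names, not n8n API.
function parseDraft(text, fallbackSubject) {
  const trimmed = (text || '').trim();
  // Subject: the first line, introduced by "SUBJECT:" (case-insensitive).
  const subjectMatch = trimmed.match(/^SUBJECT:\s*(.+?)\s*\n/i);
  // Body: everything after the "BODY:" line, up to the end of the text.
  const bodyMatch = trimmed.match(/BODY:\s*\n([\s\S]+)$/i);
  return {
    subject: subjectMatch ? subjectMatch[1].trim() : fallbackSubject,
    // Without a BODY: marker, keep the whole response as the body.
    body: bodyMatch ? bodyMatch[1].trim() : trimmed,
  };
}

const sample =
  'SUBJECT: your post on flaky deploys\nBODY:\nSaw your comment about rollbacks.\n\n15 min next week?';
console.log(parseDraft(sample, 'fallback subject'));
console.log(parseDraft('no markers here', 'fallback subject'));
```

When the markers are missing, the sketch falls back the same way the workflow does: a default subject and the raw text as the body, so a malformed model reply still produces a reviewable draft.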

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
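
The exported JSON marks the values you must supply with `REPLACE_WITH_` placeholders and `<your credential>` entries. As a rough aid, the sketch below walks a workflow object and lists the credential types and placeholders it finds. `findSetupTasks` is a hypothetical helper, not part of n8n, and the example object is a trimmed stand-in for the real export.

```javascript
// Sketch: list the credentials and placeholders to fill in after import.
// findSetupTasks is a hypothetical helper, not part of n8n.
function findSetupTasks(workflow) {
  const credentialTypes = new Set();
  const placeholders = [];

  // Recursively collect string values that start with the REPLACE_WITH_ marker.
  function walk(value, nodeName) {
    if (typeof value === 'string') {
      if (value.startsWith('REPLACE_WITH_')) placeholders.push({ node: nodeName, value });
    } else if (value && typeof value === 'object') {
      for (const child of Object.values(value)) walk(child, nodeName);
    }
  }

  for (const node of workflow.nodes || []) {
    for (const type of Object.keys(node.credentials || {})) credentialTypes.add(type);
    walk(node.parameters, node.name);
  }
  return { credentialTypes: [...credentialTypes].sort(), placeholders };
}

// Trimmed-down stand-in for the exported workflow JSON.
const exampleWorkflow = {
  nodes: [
    {
      name: 'Append · Sent log',
      parameters: { documentId: { value: 'REPLACE_WITH_YOUR_SHEET_ID' } },
      credentials: { googleSheetsOAuth2Api: { name: '<your credential>' } },
    },
    {
      name: 'Gmail · Send digest to founder',
      parameters: { options: { sendTo: 'REPLACE_WITH_YOUR_EMAIL@example.com' } },
      credentials: { gmailOAuth2: { name: '<your credential>' } },
    },
  ],
};
console.log(findSetupTasks(exampleWorkflow));
```

To check the real file, load the downloaded JSON (for example with `JSON.parse`) and pass it in place of `exampleWorkflow`.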

How this works

This workflow helps founders and solo entrepreneurs find fresh, qualified leads without hours of manual research each day. On a daily schedule, or on demand through a webhook, it reads your ideal customer profile (ICP) from Google Sheets, looks for matching signals in public posts, and deduplicates the results against the Sent log in Google Sheets so that outreach stays focused on new people. For the top five fresh leads it calls the Anthropic API through HTTP Request nodes to summarise each post and draft a short customer-discovery email in your voice, using voice.md from Google Drive. Drafts land in your Gmail Drafts folder and are never sent automatically. The only email the workflow sends is a morning digest to you, listing each draft with its evidence and the recipient to verify.
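
The deduplication step can be pictured with the sketch below. The workflow's actual Dedup node code is not reproduced in this section, so this is an assumption about its general shape: it drops any lead whose `person` or `source_url` already appears in the Sent log, then keeps the highest-scoring remainder. `pickFreshLeads` is an illustrative name. The field names mirror the Sent sheet columns.

```javascript
// Sketch of deduplicating scored leads against the Sent log.
// Assumption: leads and Sent rows both carry person and source_url fields.
function pickFreshLeads(leads, sentRows, limit = 5) {
  const norm = (s) => String(s || '').trim().toLowerCase();
  const seenUrls = new Set(sentRows.map((r) => norm(r.source_url)).filter(Boolean));
  const seenPeople = new Set(sentRows.map((r) => norm(r.person)).filter(Boolean));

  return leads
    // Drop anyone already contacted, matched by post URL or by handle.
    .filter((l) => !seenUrls.has(norm(l.source_url)) && !seenPeople.has(norm(l.person)))
    // Highest score first; missing scores count as 0.
    .sort((a, b) => (Number(b.score) || 0) - (Number(a.score) || 0))
    .slice(0, limit);
}

const exampleLeads = [
  { person: 'carol', source_url: 'https://example.com/post/3', score: 3 },
  { person: 'alice', source_url: 'https://example.com/post/1', score: 5 },
  { person: 'bob', source_url: 'https://example.com/post/2', score: 4 },
];
const exampleSent = [{ person: 'Bob', source_url: 'https://example.com/post/2' }];

console.log(pickFreshLeads(exampleLeads, exampleSent).map((l) => l.person));
```

Matching on both the URL and the handle is a deliberately conservative choice: a person who posts twice is still treated as already contacted.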

Use this when you're validating product-market fit or building an early pipeline in competitive niches like SaaS or e-commerce, especially if you already keep your ICP and a log of prior contacts in Google Sheets. Avoid it for high-volume B2C lead generation, where broad channels such as paid ads outperform targeted searches, or if your n8n instance is not reliably online for the daily cron run. Common variations include adjusting the discovery queries to cover niche forums and sending the digest to Slack instead of email.
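
For the Slack variation, one possible approach is to reshape the parsed digest (`subject` and `body`) into a message payload and post it to a Slack incoming webhook, which accepts a JSON object with a `text` field. The sketch below only builds the payload. `buildSlackPayload` and the `maxLength` cut-off are illustrative choices, not part of the workflow.

```javascript
// Sketch: turn the parsed digest ({ subject, body }) into a Slack message payload.
// Assumption: the payload is posted to a Slack incoming webhook, which accepts { text }.
// maxLength is an arbitrary illustrative cap, not a documented Slack limit.
function buildSlackPayload(digest, maxLength = 3000) {
  const subject = (digest.subject || 'Discovery pulse').trim();
  const body = (digest.body || '').trim();
  // Slack mrkdwn uses *single asterisks* for bold text.
  let text = '*' + subject + '*\n' + body;
  // Truncate long digests and mark the cut with an ellipsis.
  if (text.length > maxLength) text = text.slice(0, maxLength - 1) + '…';
  return { text };
}

console.log(buildSlackPayload({
  subject: 'Discovery pulse: 2 drafts ready',
  body: 'Two drafts are waiting in Gmail Drafts.',
}));
```

In n8n this could sit in a Code node placed after Parse · Digest fields, with an HTTP Request node sending the result as the JSON body to your webhook URL.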

About this workflow


Source: https://github.com/sudosoph/bsw26-agentic-workflows/blob/main/n8n/bsw-growth-agent.json (original creator credit).
