AutomationFlowsAI & RAG › Wttj Job Scraper - Dev Junior - Workflow1

Wttj Job Scraper - Dev Junior - Workflow1

WTTJ Job Scraper - Dev Junior - Workflow1. Uses agent, lmChatVercelAiGateway, @crunchy-bytes/n8n-nodes-puppeteer, telegram. Scheduled trigger; 17 nodes.

Cron / scheduled trigger★★★★☆ complexityAI-powered17 nodesAgentLm Chat Vercel Ai Gateway@Crunchy Bytes/N8N Nodes PuppeteerTelegramPostgres
AI & RAG Trigger: Cron / scheduled Nodes: 17 Complexity: ★★★★☆ AI nodes: yes Added:

This workflow follows the Agent → Postgres recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "name": "WTTJ Job Scraper - Dev Junior - Workflow1",
  "nodes": [
    {
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "=cronExpression",
              "expression": "13 4 6,11,14,17,19,23 * * 1-5"
            }
          ]
        }
      },
      "type": "n8n-nodes-base.scheduleTrigger",
      "typeVersion": 1.3,
      "position": [
        -32,
        64
      ],
      "id": "bc6592b0-5907-4440-9ddb-95b1af94d9cb",
      "name": "Schedule Trigger"
    },
    {
      "parameters": {
        "promptType": "define",
        "text": "=Tu es un assistant d'automatisation web pour analyser des offres d'emploi sur Welcome to the Jungle.\n\nMISSION :\nAnalyser l'offre d'emploi suivante et cr\u00e9er un r\u00e9sum\u00e9 structur\u00e9.\n\nINFORMATIONS DE L'OFFRE :\n- Titre : {{ $json.title }}\n- Entreprise : {{ $json.company }}\n- Localisation : {{ $json.location }}\n- Type de contrat : {{ $json.contractType }}\n- Lien : {{ $json.link }}\n\nOUTILS DISPONIBLES :\nTu as acc\u00e8s \u00e0 un outil Puppeteer pour naviguer sur le web.\n\nPLAN D'ACTION :\n1. Utilise l'outil Puppeteer pour acc\u00e9der \u00e0 l'URL : {{ $json.link }}\n2. Le tool va te retourner le contenu de la page\n3. Analyse ce contenu pour extraire les informations cl\u00e9s\n4. Cr\u00e9e un r\u00e9sum\u00e9 structur\u00e9\n\nR\u00c9SUM\u00c9 \u00c0 PRODUIRE (format markdown) :\n\n\ud83d\udccb **Titre** : {{ $json.title }}\n\ud83c\udfe2 **Entreprise** : {{ $json.company }}\n\ud83d\udccd **Localisation** : [extraire de la page]\n\ud83d\udcdd **Type de contrat** : [extraire : CDI/CDD/Stage/etc.]\n\n**\ud83c\udfaf Missions principales** :\n- [mission 1 - extraite de la description]\n- [mission 2]\n- [mission 3]\n\n**\ud83d\udca1 Comp\u00e9tences requises** :\n- [comp\u00e9tence 1]\n- [comp\u00e9tence 2]\n- [comp\u00e9tence 3]\n\n**\u2728 Points forts de l'offre** :\n- [ce qui rend l'offre attractive]\n- [avantages mentionn\u00e9s]\n\n**\ud83d\udcb0 Salaire** : [si mentionn\u00e9, sinon \"Non sp\u00e9cifi\u00e9\"]\n\n\ud83d\udd17 [Voir l'offre]({{ $json.link }})\n\ud83c\udf10 [Lien externe a r\u00e9cuperer depuis l'outil puppeteer si celui ci est pr\u00e9sent. Si non vide] \n\nIMPORTANT :\n- Utilise l'outil Puppeteer pour r\u00e9cup\u00e9rer le contenu de la page\n- Extrais UNIQUEMENT les informations pr\u00e9sentes dans l'annonce\n- Sois concis et factuel\n- Si une information n'est pas trouv\u00e9e, indique \"Non sp\u00e9cifi\u00e9\"\n\n# R\u00c8GLES DE FORMATAGE OBLIGATOIRES POUR TELEGRAM\n\n\u26a0\ufe0f NETTOYAGE DU TEXTE \u26a0\ufe0f\n\nAvant de g\u00e9n\u00e9rer le r\u00e9sum\u00e9, nettoie TOUS les noms et textes :\n\n1. **Ast\u00e9risques** : \n   - Remplace *** par le mot complet (ex: \"F***\" \u2192 \"FUP\" ou \"F UP\")\n   - N'utilise JAMAIS d'ast\u00e9risques dans les noms d'entreprise\n   - Si censure n\u00e9cessaire, utilise des points : F...\n\n2. **Underscores** :\n   - Remplace _ par des espaces ou supprime-les\n\n3. **Caract\u00e8res sp\u00e9ciaux** :\n   - \u00c9vite les backticks ` dans les noms\n   - Remplace les pipes | par des tirets -\n\n4. **Exemples de nettoyage** :\n   - \"START THE F*** UP\" \u2192 \"START THE FUP\"\n   - \"Company_Name\" \u2192 \"Company Name\"\n   - \"Test`123\" \u2192 \"Test 123\"\n\nIMPORTANT : Ces nettoyages doivent \u00eatre faits AVANT de g\u00e9n\u00e9rer le r\u00e9sum\u00e9 final.\nLe JSON de sortie ne doit contenir AUCUN ast\u00e9risque isol\u00e9.",
        "options": {
          "maxIterations": 5
        }
      },
      "type": "@n8n/n8n-nodes-langchain.agent",
      "typeVersion": 3.1,
      "position": [
        848,
        0
      ],
      "id": "16dee02d-5b54-4b0e-86be-4a9720a23268",
      "name": "AI Agent"
    },
    {
      "parameters": {
        "model": "mistral/devstral-2",
        "options": {
          "maxTokens": 2000,
          "temperature": 0.3
        }
      },
      "type": "@n8n/n8n-nodes-langchain.lmChatVercelAiGateway",
      "typeVersion": 1,
      "position": [
        720,
        208
      ],
      "id": "0e181433-9ad4-44a0-a326-778c129f02ad",
      "name": "Vercel AI Gateway Chat Model",
      "credentials": {
        "vercelAiGatewayApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "url": "={{ $json.link }}",
        "options": {
          "browserWSEndpoint": "ws://browserless:3000",
          "timeout": 30000,
          "waitUntil": "networkidle2"
        }
      },
      "type": "@crunchy-bytes/n8n-nodes-puppeteer.puppeteerTool",
      "typeVersion": 1,
      "position": [
        976,
        768
      ],
      "id": "d1a41e51-621d-42ca-a8fd-7101adca670e",
      "name": "Puppeteer5",
      "disabled": true
    },
    {
      "parameters": {
        "chatId": "YOUR_CHAT_ID",
        "text": "={{ $('SET').item.json.job_description }}",
        "replyMarkup": "inlineKeyboard",
        "inlineKeyboard": {
          "rows": [
            {
              "row": {
                "buttons": [
                  {
                    "text": "=\u2705 Postuler",
                    "additionalFields": {
                      "callback_data": "=postuler_{{ $json.id }}"
                    }
                  },
                  {
                    "text": "=\u274c Passer",
                    "additionalFields": {
                      "callback_data": "=passer_{{ $json.id }}"
                    }
                  }
                ]
              }
            }
          ]
        },
        "additionalFields": {
          "parse_mode": "Markdown"
        }
      },
      "type": "n8n-nodes-base.telegram",
      "typeVersion": 1.2,
      "position": [
        1824,
        0
      ],
      "id": "b090b839-4011-486c-832a-a6ded8991076",
      "name": "Send a text message",
      "credentials": {
        "telegramApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "jsCode": "const scrapedJobs = $('Puppeteer_return_25_jobs').all(); // 25 offres\n\n// G\u00e9rer le cas o\u00f9 Postgres ne retourne rien\nlet seenJobs = [];\ntry {\n  seenJobs = $('SELECT jobs').all();\n} catch(e) {\n  console.log('Aucune offre en BDD (premier lancement)');\n  seenJobs = [];\n}\n\n// Si table vide, retourner toutes les offres\nif (!seenJobs || seenJobs.length === 0) {\n  console.log('Table vide, toutes les offres sont nouvelles');\n  return scrapedJobs;\n}\n\n// Cr\u00e9er un Set (recherche rapide)\nconst seenLinks = new Set(seenJobs.map(j => j.json?.link).filter(Boolean));\n\n// Garder seulement les nouvelles\nconst newJobs = scrapedJobs.filter(job => \n  !seenLinks.has(job.json.link)\n);\n\nconsole.log(`Total scrap\u00e9: ${scrapedJobs.length}`);\nconsole.log(`D\u00e9j\u00e0 vues: ${seenLinks.size}`);\nconsole.log(`Nouvelles: ${newJobs.length}`);\n\nreturn newJobs;"
      },
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        768,
        -192
      ],
      "id": "fc4ae179-afe2-4dbc-9251-2252cc2eaab7",
      "name": "Code in JavaScript"
    },
    {
      "parameters": {
        "operation": "runCustomScript",
        "scriptCode": "// const url = 'https://www.welcometothejungle.com/fr/jobs?query=backend&sortBy=mostRecent&page=1&aroundQuery=worldwide';\nconst url = $json.url;\nconsole.log(`Navigation vers : ${url}`);\n\nawait $page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36');\n\nawait $page.goto(url, {\n  waitUntil: 'networkidle2',\n  timeout: 30000\n});\n\nawait new Promise(resolve => setTimeout(resolve, 3000));\n\n// ===== FERMER POPUPS (VERSION AM\u00c9LIOR\u00c9E) =====\nawait $page.evaluate(() => {\n  // 1. Cookies\n  const cookieBtn = document.querySelector('#axeptio_btn_acceptAll');\n  if (cookieBtn) {\n    cookieBtn.click();\n    console.log('Cookies accept\u00e9s');\n  }\n  \n  // 2. R\u00e9gion (si pr\u00e9sent)\n  const buttons = Array.from(document.querySelectorAll('button'));\n  const regionBtn = buttons.find(b => \n    (b.innerText || '').toLowerCase().includes('rester sur le site fran\u00e7ais')\n  );\n  if (regionBtn) {\n    regionBtn.click();\n    console.log('Popup r\u00e9gion ferm\u00e9');\n  }\n});\n\nawait new Promise(resolve => setTimeout(resolve, 2000));\n\n// ===== EXTRACTION ROBUSTE =====\nconst jobs = await $page.evaluate(() => {\n  const results = [];\n  const jobCards = document.querySelectorAll('li[data-testid=\"search-results-list-item-wrapper\"]');\n  \n  jobCards.forEach((card) => {\n    try {\n      // 1. Lien et titre\n      const link = card.querySelector('a[href*=\"/jobs/\"]');\n      if (!link) return;\n      \n      const h2 = card.querySelector('h2');\n      const title = h2 ? h2.innerText.trim() : '';\n      if (!title || title.length < 3) return;\n      \n      // 2. ENTREPRISE (m\u00e9thode robuste)\n      let company = 'Non sp\u00e9cifi\u00e9';\n      const companySpan = card.querySelector('span.wui-text');\n      if (companySpan && companySpan.innerText) {\n        company = companySpan.innerText.trim();\n      }\n      \n      // Fallback : depuis l'URL\n      if (company === 'Non sp\u00e9cifi\u00e9' || company.includes('Community')) {\n        const urlMatch = link.href.match(/\\/companies\\/([^\\/]+)\\//);\n        if (urlMatch) {\n          company = urlMatch[1].replace(/-/g, ' ');\n        }\n      }\n      \n      // 3. LOCALISATION (m\u00e9thode robuste)\n      let location = 'Non sp\u00e9cifi\u00e9';\n      const locationIcon = card.querySelector('svg[alt=\"Location\"]');\n      if (locationIcon) {\n        // Remonter au parent puis chercher span > span\n        const parentSpan = locationIcon.parentElement;\n        const locationSpan = parentSpan?.querySelector('span > span');\n        if (locationSpan) {\n          location = locationSpan.innerText.trim();\n        }\n      }\n      \n      // 4. DATE (m\u00e9thode ultra-robuste)\n      const defaultDate = new Date();\n      defaultDate.setMonth(defaultDate.getMonth() - 1);\n      let publishedAt = defaultDate.toISOString();\n      \n      const timeElement = card.querySelector('time');\n      if (timeElement) {\n        const datetime = timeElement.getAttribute('datetime');\n        if (datetime) {\n          publishedAt = datetime;\n        } else {\n          const timeText = timeElement.innerText.trim();\n          publishedAt = parseRelativeTime(timeText);\n        }\n      }\n      \n      // 5. CDI ?\n      const cardText = card.innerText.toLowerCase();\n      const isCDI = cardText.includes('cdi');\n      \n      if (isCDI) {\n        results.push({\n          title: title,\n          company: company,\n          location: location,\n          contractType: 'CDI',\n          publishedAt: publishedAt,\n          link: link.href,\n          scrapedAt: new Date().toISOString()\n        });\n      }\n      \n    } catch (e) {\n      console.error('Erreur extraction:', e.message);\n    }\n  });\n  \n  // Helper : Parser \"il y a X heures/jours\"\n  function parseRelativeTime(timeText) {\n    const now = new Date();\n\n    const minuteMatch = timeText.match(/(\\d+)\\s*(minute|min)s?/i);\n    if (minuteMatch) {\n      now.setMinutes(now.getMinutes() - parseInt(minuteMatch[1]));\n      return now.toISOString();\n    }\n    \n    const hourMatch = timeText.match(/(\\d+)\\s*(heure|hour)s?/i);\n    if (hourMatch) {\n      now.setHours(now.getHours() - parseInt(hourMatch[1]));\n      return now.toISOString();\n    }\n    \n    const dayMatch = timeText.match(/(\\d+)\\s*(jour|day)s?/i);\n    if (dayMatch) {\n      now.setDate(now.getDate() - parseInt(dayMatch[1]));\n      return now.toISOString();\n    }\n    \n    const weekMatch = timeText.match(/(\\d+)\\s*(semaine|week)s?/i);\n    if (weekMatch) {\n      now.setDate(now.getDate() - (parseInt(weekMatch[1]) * 7));\n      return now.toISOString();\n    }\n    \n    const monthMatch = timeText.match(/(\\d+)\\s*(mois|month)s?/i);\n    if (monthMatch) {\n      now.setMonth(now.getMonth() - parseInt(monthMatch[1]));\n      return now.toISOString();\n    }\n    \n    return now.toISOString();\n  }\n  \n  return results;\n});\n\n// ===== TRI PAR DATE =====\njobs.sort((a, b) => {\n  // Convertir les dates en timestamps\n  const dateA = new Date(a.publishedAt).getTime();\n  const dateB = new Date(b.publishedAt).getTime();\n  \n  // Si les deux dates sont invalides, garder l'ordre\n  if (isNaN(dateA) && isNaN(dateB)) return 0;\n  \n  // Si dateA invalide, mettre \u00e0 la fin\n  if (isNaN(dateA)) return 1;\n  \n  // Si dateB invalide, mettre \u00e0 la fin\n  if (isNaN(dateB)) return -1;\n  \n  // Sinon, tri d\u00e9croissant (plus r\u00e9cent en premier)\n  return dateB - dateA;\n});\n\nconsole.log(`${jobs.length} offres CDI extraites`);\n\nreturn jobs.map(job => ({ json: job }));",
        "options": {
          "browserWSEndpoint": "ws://browserless:3000",
          "timeout": 30000,
          "waitUntil": "networkidle2"
        }
      },
      "type": "@crunchy-bytes/n8n-nodes-puppeteer.puppeteer",
      "typeVersion": 1,
      "position": [
        400,
        -48
      ],
      "id": "692d0743-49c2-4d49-8bd8-3609e2e34521",
      "name": "Puppeteer_return_25_jobs"
    },
    {
      "parameters": {
        "maxItems": 5
      },
      "type": "n8n-nodes-base.limit",
      "typeVersion": 1,
      "position": [
        720,
        0
      ],
      "id": "cfecc56f-1172-404b-8b82-72fa06b80505",
      "name": "Limit to 1 job"
    },
    {
      "parameters": {
        "operation": "executeQuery",
        "query": "SELECT link FROM wttj_jobs;",
        "options": {}
      },
      "type": "n8n-nodes-base.postgres",
      "typeVersion": 2.6,
      "position": [
        560,
        -192
      ],
      "id": "3face685-9731-42ff-8b05-2526e649ae7e",
      "name": "SELECT jobs",
      "alwaysOutputData": true,
      "credentials": {
        "postgres": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "operation": "runCustomScript",
        "scriptCode": "// R\u00e9cup\u00e9rer l'URL depuis l'item courant\nconst url = ($json?.link || $json?.url || '').trim();\nif (!url) {\n  throw new Error('URL manquante : attendu $json.link ou $json.url');\n}\n\nconsole.log(`Navigation vers : ${url}`);\n\n// Aller sur la page\nawait $page.goto(url, { \n  waitUntil: 'networkidle2', \n  timeout: 30000 \n});\n\nawait new Promise(resolve => setTimeout(resolve, 2000));\n\n// Accepter les cookies\ntry {\n  const cookieBtn = await $page.evaluateHandle(() => {\n    const btns = Array.from(document.querySelectorAll('button'));\n    return btns.find(b => {\n      const text = (b.innerText || '').toLowerCase();\n      return text.includes('accepter') || text.includes('accept');\n    });\n  });\n  \n  if (cookieBtn && cookieBtn.asElement()) {\n    await cookieBtn.click();\n    console.log('\u2705 Cookies accept\u00e9s');\n    await $page.waitForTimeout(1000);\n  }\n} catch(e) {\n  console.log('\u26a0\ufe0f Pas de popup cookies');\n}\n\n// Extraire description + v\u00e9rifier lien externe\nconst jobData = await $page.evaluate(() => {\n  // 1. Description du job\n  const descriptionSelectors = [\n    '[class*=\"description\"]',\n    '[class*=\"job-content\"]',\n    'article',\n    'main',\n    '[class*=\"offer\"]'\n  ];\n  \n  let description = '';\n  \n  for (const selector of descriptionSelectors) {\n    const element = document.querySelector(selector);\n    if (element && element.innerText.length > 100) {\n      description = element.innerText;\n      break;\n    }\n  }\n  \n  if (!description) {\n    description = document.body.innerText;\n  }\n  \n  description = description.substring(0, 5000);\n  \n  // 2. V\u00e9rifier si candidature externe\n  let externalUrl = null;\n  let applicationType = 'internal';\n  \n  const applyButton = document.querySelector('[data-testid=\"job_bottom-button-apply\"]');\n  \n  if (applyButton) {\n    // V\u00e9rifier si c'est un lien <a>\n    if (applyButton.tagName === 'A') {\n      const href = applyButton.href;\n      \n      // Si le lien ne pointe pas vers WTTJ, c'est externe\n      if (href && !href.includes('welcometothejungle.com')) {\n        externalUrl = href;\n        applicationType = 'external';\n      }\n    }\n    \n    // V\u00e9rifier aussi target=\"_blank\" (signe d'ouverture externe)\n    if (applyButton.getAttribute('target') === '_blank') {\n      externalUrl = applyButton.href || externalUrl;\n      applicationType = 'external';\n    }\n  }\n  \n  return {\n    description: description,\n    pageTitle: document.title,\n    url: window.location.href,\n    applicationType: applicationType,\n    externalUrl: externalUrl\n  };\n});\n\nconsole.log(`\u2705 Description extraite : ${jobData.description.length} caract\u00e8res`);\n\nif (jobData.applicationType === 'external') {\n  console.log(`\u26a0\ufe0f Candidature EXTERNE d\u00e9tect\u00e9e : ${jobData.externalUrl}`);\n} else {\n  console.log(`\u2705 Candidature INTERNE (formulaire WTTJ)`);\n}\n\n// Retourner les donn\u00e9es\nreturn [{\n  json: {\n    description: jobData.description,\n    url: jobData.url,\n    applicationType: jobData.applicationType,\n    externalUrl: jobData.externalUrl\n  }\n}];",
        "options": {
          "browserWSEndpoint": "ws://browserless:3000",
          "timeout": 30000,
          "waitUntil": "networkidle2"
        }
      },
      "type": "@crunchy-bytes/n8n-nodes-puppeteer.puppeteerTool",
      "typeVersion": 1,
      "position": [
        1008,
        240
      ],
      "id": "b8bb7eb3-0809-4579-867d-c3fc5fde892f",
      "name": "Puppeteer_summarise_selected_job"
    },
    {
      "parameters": {
        "operation": "executeQuery",
        "query": "INSERT INTO wttj_jobs (link, title, company, application_type, external_url, status, description)\nVALUES (\n  '{{ $json.link }}', \n  '{{ $json.title.replace(/'/g, \"''\") }}', \n  '{{ $json.company.replace(/'/g, \"''\") }}', '{{ $json.application_type }}', '{{ $json.external_url }}',\n  'pending', \n  '{{ $json.job_description.replace(/'/g, \"''\") }}'\n)\nON CONFLICT (link) DO UPDATE SET\n  description = EXCLUDED.description,\n  status = EXCLUDED.status\nRETURNING id, link, title, company, description;",
        "options": {}
      },
      "type": "n8n-nodes-base.postgres",
      "typeVersion": 2.6,
      "position": [
        1648,
        0
      ],
      "id": "f97a5063-6654-4a2b-80aa-580f601c7ade",
      "name": "INSERT jobs",
      "credentials": {
        "postgres": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "operation": "executeQuery",
        "query": "UPDATE wttj_jobs\nSET message_id = {{ $json.result.message_id }}\nWHERE id = {{ $('INSERT jobs').item.json.id }};",
        "options": {}
      },
      "type": "n8n-nodes-base.postgres",
      "typeVersion": 2.6,
      "position": [
        1968,
        0
      ],
      "id": "cf2fa470-3c0b-40f9-9bf1-bf784188759d",
      "name": "UPDATE jobs",
      "credentials": {
        "postgres": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "jsCode": "// D\u00e9lai al\u00e9atoire entre 3 et 8 secondes\nconst delay = 3000 + Math.random() * 5000;\nconsole.log(`Attente de ${Math.round(delay/1000)} secondes...`);\nawait new Promise(resolve => setTimeout(resolve, delay));\nreturn $input.all();"
      },
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1136,
        0
      ],
      "id": "0fbef9cd-ee16-4d16-8fa7-704a13df823b",
      "name": "Code in JavaScript1"
    },
    {
      "parameters": {
        "assignments": {
          "assignments": [
            {
              "id": "30b500d1-5716-450a-b9c8-7e9c05aefaac",
              "name": "link",
              "value": "={{ $(\"Limit to 1 job\").item.json.link }}",
              "type": "string"
            },
            {
              "id": "d935e3f7-257d-4de9-a8b2-fa2093616d43",
              "name": "title",
              "value": "={{ $('Limit to 1 job').item.json.title }}",
              "type": "string"
            },
            {
              "id": "267de28a-6cd8-4c19-91b6-a4da6da89995",
              "name": "company",
              "value": "={{ $('Limit to 1 job').item.json.company }}",
              "type": "string"
            },
            {
              "id": "5760ada9-91af-46a9-a855-d102134a9cda",
              "name": "application_type",
              "value": "={{ $json.applicationType }}",
              "type": "string"
            },
            {
              "id": "3a6145a8-3692-4e3b-9f51-a248fb89da5c",
              "name": "external_url",
              "value": "={{ $json.externalUrl }}",
              "type": "string"
            },
            {
              "id": "fb97a8e8-dcf2-4ce4-871f-69095dc9e48b",
              "name": "job_description",
              "value": "={{ $json.output }}",
              "type": "string"
            }
          ]
        },
        "options": {}
      },
      "type": "n8n-nodes-base.set",
      "typeVersion": 3.4,
      "position": [
        1488,
        0
      ],
      "id": "4d93f1b1-c976-4257-ae4b-6a39b9ec2fda",
      "name": "SET"
    },
    {
      "parameters": {
        "jsCode": "// R\u00e9cup\u00e9rer TOUS les items depuis le node pr\u00e9c\u00e9dent\nconst allItems = $input.all();\n\nconsole.log(`\ud83d\udce6 ${allItems.length} items re\u00e7us`);\n\n// Traiter chaque item\nconst results = allItems.map((item, index) => {\n  const aiOutput = item.json.output || item.json.summary || item.json.text || item.json.response || '';\n  \n  if (!aiOutput) {\n    console.log(`\u26a0\ufe0f Item ${index + 1} : Sortie IA vide`);\n    return {\n      json: {\n        ...item.json,\n        applicationType: 'internal',\n        externalUrl: null\n      }\n    };\n  }\n  \n  console.log(`\ud83d\udcc4 Item ${index + 1} : Analyse (${aiOutput.length} caract\u00e8res)`);\n  \n  // Extraire le lien externe\n  const externalLinkMatch = aiOutput.match(/\ud83c\udf10\\s*\\[.*?\\]\\((https?:\\/\\/(?!www\\.welcometothejungle\\.com)[^\\)]+)\\)/\n);\n  \n  let externalUrl = null;\n  let applicationType = 'internal';\n  \n  if (externalLinkMatch && externalLinkMatch[1]) {\n    externalUrl = externalLinkMatch[1].trim();\n    applicationType = 'external';\n    console.log(`\u26a0\ufe0f Item ${index + 1} : EXTERNE - ${externalUrl}`);\n  } else {\n    console.log(`\u2705 Item ${index + 1} : INTERNE`);\n  }\n  \n  return {\n    json: {\n      ...item.json,\n      applicationType: applicationType,\n      externalUrl: externalUrl\n    }\n  };\n});\n\nconsole.log(`\u2705 ${results.length} items trait\u00e9s`);\n\nreturn results;"
      },
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1312,
        0
      ],
      "id": "bfe25baa-f570-4a6c-80b8-61cb465c2db1",
      "name": "Code in JavaScript2"
    },
    {
      "parameters": {
        "jsCode": "// R\u00e9cup\u00e9rer les donn\u00e9es envoy\u00e9es par le Router\n// IMPORTANT : Le webhook met les donn\u00e9es dans $json.body\nconst telegramData = $json.body || $json.data || $json;\n\n// Par d\u00e9faut : recherche \"backend\"\nlet query = 'backend';\nlet source = 'cron';\n\n// V\u00e9rifier si c'est un message Telegram\nif (telegramData.message && telegramData.message.text) {\n  const text = telegramData.message.text.trim();\n  \n  console.log(`\ud83d\udce8 Message re\u00e7u : \"${text}\"`);\n  \n  if (text.startsWith('/scrape ')) {\n    query = text.substring(8).trim();\n    \n    if (!query || query.length < 2) {\n      return [{\n        json: {\n          chatId: telegramData.message.chat.id,\n          text: '\u26a0\ufe0f Query trop courte\\n\\nExemple : /scrape software engineer',\n          sendError: true\n        }\n      }];\n    }\n    \n    source = 'telegram';\n    console.log(`\ud83d\udd0d Recherche manuelle : \"${query}\"`);\n  } else if (text === '/help') {\n    return [{\n      json: {\n        chatId: telegramData.message.chat.id,\n        text: '\ud83e\udd16 **Commandes disponibles**\\n\\n/scrape [mots-cl\u00e9s] - Rechercher des offres\\nExemples :\\n  /scrape backend\\n  /scrape software engineer\\n  /scrape fullstack python',\n        sendHelp: true\n      }\n    }];\n  } else {\n    // Autre message -> ignorer\n    console.log('\u26a0\ufe0f Message ignor\u00e9 (pas une commande)');\n    return [];\n  }\n} else {\n  // Trigger Cron\n  console.log(`\u23f0 Recherche automatique : \"${query}\"`);\n}\n\nconst encodedQuery = encodeURIComponent(query);\nconst url = `https://www.welcometothejungle.com/fr/jobs?query=${encodedQuery}&sortBy=mostRecent&page=1&aroundQuery=worldwide`;\n\nconsole.log(`\ud83c\udf10 URL : ${url}`);\n\nreturn [{\n  json: {\n    url: url,\n    query: query,\n    source: source\n  }\n}];"
      },
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        192,
        -48
      ],
      "id": "5f768879-62ef-4f32-a3ab-e17a46fbdf1b",
      "name": "Code in JavaScript3"
    },
    {
      "parameters": {
        "httpMethod": "POST",
        "path": "workflow1-scrape",
        "options": {}
      },
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 2.1,
      "position": [
        -32,
        -288
      ],
      "id": "edcef974-942c-4148-b741-96b2dc6940fc",
      "name": "Webhook"
    }
  ],
  "connections": {
    "Schedule Trigger": {
      "main": [
        [
          {
            "node": "Code in JavaScript3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Vercel AI Gateway Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Puppeteer5": {
      "ai_tool": [
        []
      ]
    },
    "AI Agent": {
      "main": [
        [
          {
            "node": "Code in JavaScript1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Send a text message": {
      "main": [
        [
          {
            "node": "UPDATE jobs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Code in JavaScript": {
      "main": [
        [
          {
            "node": "Limit to 1 job",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Puppeteer_return_25_jobs": {
      "main": [
        [
          {
            "node": "SELECT jobs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Limit to 1 job": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "SELECT jobs": {
      "main": [
        [
          {
            "node": "Code in JavaScript",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Puppeteer_summarise_selected_job": {
      "ai_tool": [
        [
          {
            "node": "AI Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "INSERT jobs": {
      "main": [
        [
          {
            "node": "Send a text message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Code in JavaScript1": {
      "main": [
        [
          {
            "node": "Code in JavaScript2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "SET": {
      "main": [
        [
          {
            "node": "INSERT jobs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Code in JavaScript2": {
      "main": [
        [
          {
            "node": "SET",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Code in JavaScript3": {
      "main": [
        [
          {
            "node": "Puppeteer_return_25_jobs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Webhook": {
      "main": [
        [
          {
            "node": "Code in JavaScript3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": true,
  "settings": {
    "executionOrder": "v1",
    "binaryMode": "separate",
    "availableInMCP": false
  },
  "versionId": "1357c0b7-c03a-4846-828b-d78c140e301f",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "id": "dROsD7urpg23yN3ZZ76YY",
  "tags": []
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

WTTJ Job Scraper - Dev Junior - Workflow1. Uses agent, lmChatVercelAiGateway, @crunchy-bytes/n8n-nodes-puppeteer, telegram. Scheduled trigger; 17 nodes.

Source: https://github.com/ArthSogh/autoapply-n8n-bot/blob/d2ddb4efbe69bf539e061e7e86ad32ab0eea8372/workflows/W1_Scraper.json — original creator credit. Request a take-down →

More AI & RAG workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

AI & RAG

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

Mailgun, OpenAI, OpenAI Chat +8
AI & RAG

System Architecture Two integrated N8N workflows providing automated US stock portfolio management through Telegram:

Output Parser Autofixing, OpenAI Chat, Perplexity +10
AI & RAG

This workflow automatically generates stock market insights for selected tickers (e.g. GAZP, SBER, LKOH) using historical data, technical indicators, and an AI model. The results are then sent to Tele

Agent, OpenRouter Chat, Telegram Trigger +5
AI & RAG

Complete PostgreSQL-backed system: Keyword scoring → AI research → Multi-part content generation → fal.ai Nano Banana image generation → WordPress publishing

WordPress, OpenAI, Perplexity +8
AI & RAG

Chanchito_PROD. Uses googleGemini, postgres, telegram, httpRequest. Scheduled trigger; 94 nodes.

Google Gemini, Postgres, Telegram +4