JD Scraper

JD Scraper retrieves the text of a job description (JD) from a URL, or accepts pasted text directly. Uses httpRequest. Webhook trigger; 8 nodes.

Category: Web Scraping · Trigger: Webhook · Nodes: 8 · Complexity: ★★★★☆ · Integration: HTTP Request

The workflow JSON

Copy the full n8n JSON below and paste it into a new n8n workflow. Set the CF_ACCOUNT_ID and CF_API_TOKEN environment variables that the Cloudflare Scrape node reads, then activate the workflow.

{
  "name": "JD Scraper",
  "nodes": [
    {
      "parameters": {
        "httpMethod": "POST",
        "path": "jd-scraper",
        "responseMode": "responseNode",
        "options": {}
      },
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 2,
      "position": [
        240,
        300
      ]
    },
    {
      "parameters": {
        "mode": "runOnceForAllItems",
        "jsCode": "// Normalise input: support both direct JSON and webhook body wrapper\nconst raw = $input.first().json;\nconst body = raw.body || raw;\n\nconst url  = (body.url  || '').trim();\nconst text = (body.text || '').trim();\n\nif (!url && !text) {\n  throw new Error('Provide either url or text in the request body');\n}\n\nreturn [{ json: { url, text, hasText: text.length > 0 } }];\n"
      },
      "id": "jds-validate-01",
      "name": "Validate Input",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        460,
        300
      ]
    },
    {
      "parameters": {
        "conditions": {
          "options": {
            "caseSensitive": true,
            "leftValue": "",
            "typeValidation": "strict"
          },
          "conditions": [
            {
              "id": "cond-has-text",
              "leftValue": "={{ $json.hasText }}",
              "rightValue": true,
              "operator": {
                "type": "boolean",
                "operation": "equals"
              }
            }
          ],
          "combinator": "and"
        },
        "options": {}
      },
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "name": "Has Text?",
      "type": "n8n-nodes-base.if",
      "typeVersion": 2,
      "position": [
        700,
        300
      ]
    },
    {
      "parameters": {
        "method": "POST",
        "url": "={{ \"https://api.cloudflare.com/client/v4/accounts/\" + $env.CF_ACCOUNT_ID + \"/browser-rendering/content\" }}",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "Authorization",
              "value": "=Bearer {{ $env.CF_API_TOKEN }}"
            },
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ { url: $('Validate Input').item.json.url, rejectResourceTypes: ['image', 'font', 'media'] } }}",
        "options": {
          "timeout": 30000,
          "response": {
            "response": {
              "fullResponse": true
            }
          }
        }
      },
      "id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
      "name": "Cloudflare Scrape",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4,
      "position": [
        940,
        420
      ],
      "continueOnFail": true
    },
    {
      "parameters": {
        "conditions": {
          "options": {
            "caseSensitive": true,
            "leftValue": "",
            "typeValidation": "loose"
          },
          "conditions": [
            {
              "id": "cond-cf-success",
              "leftValue": "={{ $json.body && $json.body.result && $json.body.result.markdown && $json.body.result.markdown.trim().length > 100 }}",
              "rightValue": true,
              "operator": {
                "type": "boolean",
                "operation": "equals"
              }
            }
          ],
          "combinator": "and"
        },
        "options": {}
      },
      "id": "d4e5f6a7-b8c9-0123-defa-234567890123",
      "name": "CF Success?",
      "type": "n8n-nodes-base.if",
      "typeVersion": 2,
      "position": [
        1180,
        420
      ]
    },
    {
      "parameters": {
        "method": "GET",
        "url": "={{ \"https://r.jina.ai/\" + $('Validate Input').item.json.url }}",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "Accept",
              "value": "text/plain"
            },
            {
              "name": "X-Return-Format",
              "value": "text"
            }
          ]
        },
        "options": {
          "timeout": 25000
        }
      },
      "id": "e5f6a7b8-c9d0-1234-efab-345678901234",
      "name": "Jina AI Fallback",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4,
      "position": [
        1420,
        540
      ],
      "continueOnFail": true
    },
    {
      "parameters": {
        "mode": "runOnceForAllItems",
        "jsCode": "// Unify output from all three branches into { jd_raw, source }\nconst item = $input.first().json;\n\nlet jd_raw = '';\nlet source  = 'unknown';\n\n// Branch 1: manual text passed directly from Has Text? node\nif (item.hasText === true) {\n  jd_raw = item.text || '';\n  source  = 'manual';\n}\n// Branch 2: Cloudflare returned valid markdown\nelse if (item.body && item.body.result && item.body.result.markdown) {\n  jd_raw = item.body.result.markdown.trim();\n  source  = 'cloudflare';\n}\n// Branch 3: Jina AI returned plain text (various n8n response shapes)\nelse if (typeof item.data === 'string' && item.data.trim().length > 50) {\n  jd_raw = item.data.trim();\n  source  = 'jina';\n}\nelse if (typeof item === 'string' && item.trim().length > 50) {\n  jd_raw = item.trim();\n  source  = 'jina';\n}\n\n// Trim to a reasonable token budget (first ~12 000 chars \u2248 3 000 tokens)\njd_raw = jd_raw.slice(0, 12000);\n\nif (!jd_raw.trim()) {\n  throw new Error('Failed to retrieve job description from any source. Check CF credentials or paste JD text directly.');\n}\n\nreturn [{ json: { jd_raw, source } }];\n"
      },
      "id": "f6a7b8c9-d0e1-2345-fabc-456789012345",
      "name": "Normalize Output",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        1660,
        300
      ]
    },
    {
      "parameters": {
        "respondWith": "allIncomingItems",
        "options": {}
      },
      "id": "a7b8c9d0-e1f2-3456-abcd-567890123456",
      "name": "Respond to Webhook",
      "type": "n8n-nodes-base.respondToWebhook",
      "typeVersion": 1.1,
      "position": [
        1900,
        300
      ]
    }
  ],
  "connections": {
    "Webhook": {
      "main": [
        [
          {
            "node": "Validate Input",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Validate Input": {
      "main": [
        [
          {
            "node": "Has Text?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Has Text?": {
      "main": [
        [
          {
            "node": "Normalize Output",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Cloudflare Scrape",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Cloudflare Scrape": {
      "main": [
        [
          {
            "node": "CF Success?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "CF Success?": {
      "main": [
        [
          {
            "node": "Normalize Output",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Jina AI Fallback",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Jina AI Fallback": {
      "main": [
        [
          {
            "node": "Normalize Output",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Normalize Output": {
      "main": [
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "settings": {
    "executionOrder": "v1"
  },
  "staticData": null,
  "tags": [],
  "id": 1,
  "active": true
}
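
The two Code nodes carry their JavaScript as escaped single-line strings, which is hard to read inside the JSON. As a reading aid, here is a minimal sketch of the same logic as plain functions that run outside n8n. The function names validateInput and normalizeOutput are hypothetical, the raw and item parameters stand in for $input.first().json, and the last fallback branch of Normalize Output (which checks whether the item itself is a string) is left out for brevity.

```javascript
// Sketch of the "Validate Input" Code node (function name is hypothetical).
// `raw` stands in for $input.first().json inside n8n.
function validateInput(raw) {
  // The Webhook node wraps the payload in `body`; direct JSON input does not.
  const body = raw.body || raw;
  const url = (body.url || '').trim();
  const text = (body.text || '').trim();

  if (!url && !text) {
    throw new Error('Provide either url or text in the request body');
  }
  // `hasText` is what the "Has Text?" IF node branches on.
  return { url, text, hasText: text.length > 0 };
}

// Sketch of the "Normalize Output" Code node (function name is hypothetical).
// `item` is whatever reached the node from one of the three branches.
function normalizeOutput(item) {
  let jd_raw = '';
  let source = 'unknown';

  if (item.hasText === true) {
    // Branch 1: text supplied by the caller.
    jd_raw = item.text || '';
    source = 'manual';
  } else if (item.body && item.body.result && item.body.result.markdown) {
    // Branch 2: the Cloudflare response shape this workflow checks for.
    jd_raw = item.body.result.markdown.trim();
    source = 'cloudflare';
  } else if (typeof item.data === 'string' && item.data.trim().length > 50) {
    // Branch 3: plain text from the Jina AI fallback.
    jd_raw = item.data.trim();
    source = 'jina';
  }

  // Keep only the first 12 000 characters, as the node does.
  jd_raw = jd_raw.slice(0, 12000);

  if (!jd_raw.trim()) {
    throw new Error('Failed to retrieve job description from any source.');
  }
  return { jd_raw, source };
}
```

For example, normalizeOutput(validateInput({ text: 'Senior engineer role' })) takes the manual branch, because validateInput sets hasText to true whenever text is non-empty.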
About this workflow

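JD Scraper accepts a POST to the jd-scraper webhook path with either a url or a text field in the body. If text is present, it is passed through unchanged. Otherwise the workflow requests the page through the Cloudflare Browser Rendering API and, when that does not return more than 100 characters of markdown, falls back to Jina AI (r.jina.ai). A final Code node truncates the result to 12,000 characters and responds with jd_raw and source, where source is manual, cloudflare, or jina. Uses httpRequest. Webhook trigger; 8 nodes.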
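
Once the workflow is active, it can be called from any HTTP client. The following is a minimal sketch of a client, not part of the original workflow. It assumes a hypothetical n8n instance at https://n8n.example.com, n8n's production webhook prefix /webhook/, and a runtime with a global fetch (for example Node 18 or later). The helper names buildScrapeRequest and scrapeJobDescription are made up for illustration.

```javascript
// Hypothetical helper: builds the fetch() arguments for the JD Scraper webhook.
// `baseUrl` is an assumed n8n instance URL such as "https://n8n.example.com".
function buildScrapeRequest(baseUrl, input) {
  const payload = {};
  if (input.url) payload.url = input.url;
  if (input.text) payload.text = input.text;

  // Mirror the workflow's own validation so bad calls fail before the network.
  if (!payload.url && !payload.text) {
    throw new Error('Provide either url or text');
  }

  return {
    // "jd-scraper" is the path configured on the Webhook node.
    endpoint: baseUrl.replace(/\/+$/, '') + '/webhook/jd-scraper',
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    },
  };
}

// Hypothetical caller: sends the request and returns the parsed JSON response.
async function scrapeJobDescription(baseUrl, input) {
  const { endpoint, init } = buildScrapeRequest(baseUrl, input);
  const res = await fetch(endpoint, init);
  if (!res.ok) {
    throw new Error(`Webhook returned HTTP ${res.status}`);
  }
  return res.json();
}
```

Because the Respond to Webhook node is set to return all incoming items, the response is expected to carry the jd_raw and source fields produced by Normalize Output. The exact envelope may vary with the n8n version, so inspect a real response before depending on its shape.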

Source: https://github.com/1Mangesh1/resume-gen/blob/5e25f2cb1d4429abd1677088fec426394eef0c6b/workflows/jd-scraper.json (original creator credit).

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Web Scraping

This workflow enables the submission of business-critical URLs via the Google Indexing API and IndexNow.

Integrations: HTTP Request, XML

Web Scraping

Never miss important website updates again! This workflow automatically tracks changes on dynamic websites (think React apps, JavaScript-heavy sites) and sends you instant email notifications when som

Integrations: HTTP Request, Google Sheets, Gmail

Web Scraping

This template implements a recursive web crawler inside n8n. Starting from a given URL, it crawls linked pages up to a maximum depth (default: 3), extracts text and links, and returns the collected co

Integrations: HTTP Request

Web Scraping

Crawl Space & Foundation Repair Intake AI - Vapi MVP (Client Template). Uses httpRequest, googleSheets. Webhook trigger; 14 nodes.

Integrations: HTTP Request, Google Sheets

Web Scraping

G — Off-Market Lead Ingestor (Apify → Portal API). Uses httpRequest. Webhook trigger; 7 nodes.

Integrations: HTTP Request