
Crawl Webpage to GitHub Markdown

Original n8n title: 웹페이지 크롤링 → Github 마크다운 저장 (Webpage Crawling → Save as GitHub Markdown)

Crawls a webpage and saves it to GitHub as Markdown. Uses the httpRequest, lmChatOpenRouter, outputParserStructured, and github nodes. Webhook trigger; 8 nodes.

Category: Web Scraping · Trigger: Webhook · Nodes: 8 · Complexity: ★★★☆☆ · AI nodes: yes · Integrations: HTTP Request, OpenRouter Chat, Output Parser Structured, GitHub, Agent

This workflow follows the Agent → HTTP Request recipe pattern.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate the workflow.

{
  "name": "\uc6f9\ud398\uc774\uc9c0 \ud06c\ub864\ub9c1 \u2192 GitHub \ub9c8\ud06c\ub2e4\uc6b4 \uc800\uc7a5",
  "nodes": [
    {
      "parameters": {
        "httpMethod": "POST",
        "path": "crawl-to-github",
        "responseMode": "responseNode",
        "options": {}
      },
      "id": "webhook-1",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 2.1,
      "position": [
        0,
        0
      ],
      "onError": "continueRegularOutput"
    },
    {
      "parameters": {
        "url": "=https://r.jina.ai/{{ $('Webhook').item.json.body.url }}",
        "options": {}
      },
      "id": "http-request-1",
      "name": "Jina Reader",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.3,
      "position": [
        448,
        0
      ],
      "retryOnFail": true,
      "onError": "continueRegularOutput"
    },
    {
      "parameters": {
        "model": "google/gemini-2.0-flash-exp:free",
        "options": {
          "responseFormat": "json_object"
        }
      },
      "id": "openrouter-1",
      "name": "OpenRouter Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
      "typeVersion": 1,
      "position": [
        680,
        224
      ],
      "credentials": {
        "openRouterApi": {
          "name": "<your credential>"
        }
      }
    },
    {
      "parameters": {
        "jsonSchemaExample": "{\n  \"title\": \"example-article-title\",\n  \"cleanedContent\": \"# Article Title\\n\\nThis is the cleaned content...\"\n}"
      },
      "id": "output-parser-1",
      "name": "Structured Output Parser",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "typeVersion": 1.3,
      "position": [
        808,
        224
      ]
    },
    {
      "parameters": {
        "resource": "file",
        "owner": {
          "mode": "name",
          "value": "YOUR_GITHUB_USERNAME"
        },
        "repository": {
          "mode": "name",
          "value": "YOUR_GITHUB_REPO"
        },
        "filePath": "=inbox/{{ $json.output.title.replace(/[/\\\\:*?\"<>|]/g, '-') }}.md",
        "fileContent": "=---\nsource: {{ $('Webhook').item.json.body.url }}\ncreated: {{ $now.format('yyyy-MM-dd HH:mm:ss') }}\nmodified: {{ $now.format('yyyy-MM-dd HH:mm:ss') }}\n---\n{{ $json.output.cleanedContent }}",
        "commitMessage": "=Add: {{ $json.output.title }}"
      },
      "id": "github-1",
      "name": "GitHub",
      "type": "n8n-nodes-base.github",
      "typeVersion": 1.1,
      "position": [
        1024,
        0
      ],
      "credentials": {
        "githubApi": {
          "name": "<your credential>"
        }
      },
      "onError": "continueRegularOutput"
    },
    {
      "parameters": {
        "content": "### \ud83d\udce4 How to use\n\nAfter setting up credentials, activate the workflow and call it as follows:\n\n```shell\ncurl -X POST https://YOUR_N8N_DOMAIN/webhook/crawl-to-github \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"url\": \"https://example.com/article\"}'\n```\n\n### \u2699\ufe0f Setup checklist\n- [ ] Set up OpenRouter API credentials\n- [ ] Set up GitHub API credentials\n- [ ] Change the GitHub owner/repository values\n- [ ] Check the Webhook URL (YOUR_N8N_DOMAIN)\n\n### \ud83d\udd27 Improvements\n- Automatic handling of special characters in file names\n- ISO 8601 date format applied\n- Timestamps prevent duplicate files",
        "height": 400,
        "width": 608
      },
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -16,
        160
      ],
      "typeVersion": 1,
      "id": "sticky-note-1",
      "name": "Sticky Note"
    },
    {
      "parameters": {
        "promptType": "define",
        "text": "=Please clean up the following webpage content:\n\n<webpage contents>\n{{ $json.data }}\n</webpage contents>\n\n<tasks>\n1. Remove unnecessary navigation, ads, footers, and similar elements\n2. Tidy up the Markdown formatting\n3. Extract the page title, or generate one from the content if no suitable title exists\n4. Process the title so it can be used as a file name:\n   - English: convert to lowercase, replace spaces with hyphens (-)\n   - Korean: replace spaces with hyphens (-)\n   - Remove special characters (/, \\, :, *, ?, \", <, >, |)\n   - Example: \"How to Build a REST API\" \u2192 \"how-to-build-a-rest-api\"\n   - Example: \"\uae43\ud5c8\ube0c \uc0ac\uc6a9\ubc95 (\ucd08\ubcf4\uc790\uc6a9)\" \u2192 \"\uae43\ud5c8\ube0c-\uc0ac\uc6a9\ubc95-\ucd08\ubcf4\uc790\uc6a9\"\n5. To avoid duplicates, append a timestamp to the title: \"title-{{ new Date().toISOString().replace(/[:.]/g, '-').replace('T', '-').split('Z')[0] }}\"\n</tasks>\n\n<output format>\nReturn the result in JSON format.\n</output format>",
        "hasOutputParser": true,
        "options": {
          "systemMessage": "You are an expert at cleaning up webpage content. You remove unnecessary elements from crawled Markdown content and return it cleanly formatted. You must respond in JSON format."
        }
      },
      "type": "@n8n/n8n-nodes-langchain.agent",
      "typeVersion": 3,
      "position": [
        672,
        0
      ],
      "id": "ai-agent-1",
      "name": "AI Agent",
      "onError": "continueErrorOutput"
    },
    {
      "parameters": {
        "options": {}
      },
      "type": "n8n-nodes-base.respondToWebhook",
      "typeVersion": 1.5,
      "position": [
        1024,
        80
      ],
      "id": "afc3cf4d-5fdb-4082-bb91-40213139b343",
      "name": "Respond to Webhook"
    }
  ],
  "connections": {
    "Webhook": {
      "main": [
        [
          {
            "node": "Jina Reader",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Jina Reader": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenRouter Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser": {
      "ai_outputParser": [
        [
          {
            "node": "AI Agent",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "AI Agent": {
      "main": [
        [
          {
            "node": "GitHub",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "GitHub": {
      "main": [
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "active": false,
  "settings": {
    "executionOrder": "v1",
    "callerPolicy": "workflowsFromSameOwner",
    "availableInMCP": false,
    "timeSavedMode": "fixed",
    "executionTimeout": 300
  },
  "versionId": "1a4df79b-258b-49cb-8ac7-acc0a17c32de",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "id": "j2vq6oU8YndWEZIo",
  "tags": []
}
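
Once the workflow is active, the curl call from the workflow's sticky note can be mirrored in Python. This is a minimal sketch using only the standard library. The helper names `build_crawl_request` and `send_crawl_request` are hypothetical, the host is whatever replaces the `YOUR_N8N_DOMAIN` placeholder, and the 300-second timeout is an assumption borrowed from the workflow's `executionTimeout` setting.

```python
import json
import urllib.request


def build_crawl_request(n8n_domain: str, page_url: str) -> urllib.request.Request:
    """Build the POST request for the Webhook node (path: crawl-to-github)."""
    endpoint = f"https://{n8n_domain}/webhook/crawl-to-github"
    # The Jina Reader node reads the target page from body.url.
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_crawl_request(n8n_domain: str, page_url: str, timeout: float = 300.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_crawl_request(n8n_domain, page_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `send_crawl_request("YOUR_N8N_DOMAIN", "https://example.com/article")` with a real host in place of the placeholder should trigger one run of the workflow.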

Credentials you'll need

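Each integration node prompts for credentials when you import the workflow. In this workflow those are the OpenRouter Chat Model node (openRouterApi) and the GitHub node (githubApi). Credential IDs are stripped before publishing, so you add your own.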
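
To check which nodes need credentials before importing, the exported JSON can be scanned for `credentials` entries. The sketch below assumes the export layout shown above. `required_credentials` is a hypothetical helper, and `sample` is a trimmed stand-in for the full workflow that keeps only the fields the helper reads.

```python
import json


def required_credentials(workflow: dict) -> dict[str, list[str]]:
    """Map each node name to the credential types it declares."""
    needed = {}
    for node in workflow.get("nodes", []):
        credential_types = sorted(node.get("credentials", {}))
        if credential_types:
            needed[node["name"]] = credential_types
    return needed


# Trimmed stand-in for the exported workflow (illustration only).
sample = json.loads("""
{
  "nodes": [
    {"name": "Webhook", "type": "n8n-nodes-base.webhook"},
    {"name": "OpenRouter Chat Model",
     "credentials": {"openRouterApi": {"name": "<your credential>"}}},
    {"name": "GitHub",
     "credentials": {"githubApi": {"name": "<your credential>"}}}
  ]
}
""")

for node_name, types in required_credentials(sample).items():
    print(f"{node_name}: {', '.join(types)}")
```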


About this workflow

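Crawls a webpage and saves it to GitHub as Markdown. The Webhook node accepts a POST at crawl-to-github with a JSON body containing url. Jina Reader, an HTTP Request node, fetches https://r.jina.ai/ followed by that URL. The AI Agent, backed by the OpenRouter Chat Model (google/gemini-2.0-flash-exp:free) and the Structured Output Parser, cleans the content and returns title and cleanedContent. The GitHub node then commits inbox/<title>.md with source, created, and modified front matter, and Respond to Webhook replies to the caller. If the agent fails, its error output goes straight to Respond to Webhook. The eighth node is a sticky note with usage instructions.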
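
The GitHub node's `filePath` and `fileContent` expressions in the JSON above can be mirrored outside n8n to preview what will be committed. This is a rough Python equivalent, not the workflow's own code: `inbox_path` and `file_content` are hypothetical names, and the timestamp is passed in explicitly instead of being read from n8n's `$now`.

```python
import re
from datetime import datetime

# Same character class as the filePath expression: / \ : * ? " < > |
UNSAFE_CHARS = re.compile(r'[/\\:*?"<>|]')


def inbox_path(title: str) -> str:
    """Replace each unsafe character with '-' and place the file under inbox/."""
    return f"inbox/{UNSAFE_CHARS.sub('-', title)}.md"


def file_content(source_url: str, cleaned_content: str, now: datetime) -> str:
    """Prepend front matter (source, created, modified) to the cleaned Markdown."""
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")  # mirrors 'yyyy-MM-dd HH:mm:ss'
    return (
        "---\n"
        f"source: {source_url}\n"
        f"created: {stamp}\n"
        f"modified: {stamp}\n"
        "---\n"
        f"{cleaned_content}"
    )
```

Each unsafe character is replaced, not removed, so a title such as `a/b` becomes `inbox/a-b.md` and no nested directory is created.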

Source: https://github.com/jeongsk/obsidian-github-inbox-sync/blob/5ccbc0fa2890102a9138767924bfca7747b045eb/examples/n8n/n8n-crawl-to-github.json (original creator credit)

