This workflow corresponds to n8n.io template #15525 — we link there as the canonical source.
This workflow follows the Agent → OpenAI Chat recipe pattern.
The workflow JSON
Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.
{
"id": "zqaMsVBh9XGybqUC",
"meta": {
"templateCredsSetupCompleted": true
},
"name": "LLMs.txt Generator with ScrapeGraph AI",
"tags": [],
"nodes": [
{
"id": "0b7f74f3-036c-4e62-9f6a-411e322808f6",
"name": "When clicking \u2018Execute workflow\u2019",
"type": "n8n-nodes-base.manualTrigger",
"position": [
-224,
0
],
"parameters": {},
"typeVersion": 1
},
{
"id": "b46b464e-aa16-4a75-b9c7-e24a7debccfe",
"name": "Wait",
"type": "n8n-nodes-base.wait",
"position": [
528,
0
],
"parameters": {
"amount": 20
},
"typeVersion": 1.1
},
{
"id": "ad894092-0925-4abe-9566-bc4755919ba8",
"name": "Status crawler",
"type": "n8n-nodes-scrapegraphai.scrapegraphAi",
"position": [
752,
0
],
"parameters": {
"resource": "smartcrawler"
},
"credentials": {
"scrapegraphAIApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "a9158707-3c47-42ea-8b12-e7380ec025fe",
"name": "Scraper",
"type": "n8n-nodes-scrapegraphai.scrapegraphAiTool",
"position": [
1776,
160
],
"parameters": {
"resource": "smartscraper"
},
"credentials": {
"scrapegraphAIApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "7862687f-8102-4448-8c7d-73aea3b369b3",
"name": "OpenAI Chat Model",
"type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
"position": [
1536,
176
],
"parameters": {
"model": {
"__rl": true,
"mode": "list",
"value": "gpt-5.4-mini",
"cachedResultName": "gpt-5.4-mini"
},
"options": {},
"builtInTools": {}
},
"credentials": {
"openAiApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.3
},
{
"id": "a90364b5-671b-4f11-bbbb-fc724987d07d",
"name": "to Binary",
"type": "n8n-nodes-base.code",
"position": [
1920,
-16
],
"parameters": {
"jsCode": "return items.map(item => {\n\n\tconst content = item.json.output || '';\n\n\treturn {\n\t\tjson: {},\n\t\tbinary: {\n\t\t\tdata: {\n\t\t\t\tdata: Buffer.from(content).toString('base64'),\n\t\t\t\tmimeType: 'text/plain',\n\t\t\t\tfileName: 'llms.txt'\n\t\t\t}\n\t\t}\n\t};\n\n});"
},
"typeVersion": 2
},
{
"id": "6c0e809b-3aca-4794-88aa-7bc51a88e02b",
"name": "LLMS.txt Agent",
"type": "@n8n/n8n-nodes-langchain.agent",
"position": [
1568,
-16
],
"parameters": {
"text": "={{ JSON.stringify($json.internal_links) }}",
"options": {
"systemMessage": "# Role\nYou are an agent specialized in generating `llms.txt` files compliant with the official specification (llmstxt.org). Your task is to analyze a website starting from a list of internal URLs and produce a structured Markdown file that describes the site optimally for LLMs.\n\n# Input\nYou will receive a JSON with this structure:\n{\n \"internal_links\": [\"https://...\", \"https://...\", ...]\n}\n\n# Available tools\n- **Scraper**: takes a URL as input and returns the page content (title, meta description, headings, main text). You MUST use it for every URL before describing it. Never make up content.\n\n# Operating procedure\n\n## Step 1 \u2014 Homepage analysis\nIdentify the homepage (shortest URL, typically the domain root) and call `Scraper` on it to extract:\n- Site / company name (from title or H1)\n- Mission / brief description (from meta description or first paragraph)\n- Site language (keep it consistent throughout the file)\n\n## Step 2 \u2014 Internal pages analysis\nFor EVERY other URL in the list, call `Scraper` and extract:\n- Page title (H1 or title tag, cleaned of suffixes like \"| Site Name\")\n- Concise description (max 100-150 characters, based on meta description or first paragraph)\n\nIf a page returns an error, empty content, or duplicate, silently exclude it.\n\n## Step 3 \u2014 Categorization\nGroup URLs into logical sections based on URL patterns and content:\n- `/services/*`, `/servizi/*` \u2192 **Services** section\n- `/products/*`, `/shop/*` \u2192 **Products** section\n- `/portfolio/*`, `/case-study/*`, `/work/*` \u2192 **Portfolio** section\n- `/blog/*`, `/news/*`, `/articles/*` \u2192 **Blog** section\n- `/about`, `/about-us`, `/team` \u2192 **Company** section\n- `/contact`, `/contacts` \u2192 **Contact** section\n- Legal pages (`privacy`, `cookie`, `terms`, `gdpr`) \u2192 **Optional** section\n- Homepage and generic pages \u2192 **Main pages** section\n\n## Step 4 \u2014 Output generation\nCompose the file following EXACTLY this structure:\n\n# [Site name]\n\n> [Site summary in 1-2 sentences, from the homepage]\n\n[Optional paragraph with additional context, only if useful]\n\n## Main pages\n\n- [Title](URL): Concise description\n- [Title](URL): Concise description\n\n## Services\n\n- [Title](URL): Concise description\n\n## Portfolio\n\n- [Title](URL): Concise description\n\n## Contact\n\n- [Title](URL): Concise description\n\n## Optional\n\n- [Title](URL): Concise description\n\n# Strict rules\n1. **ALWAYS use the Scraper tool** for every URL before describing it. Never invent titles or descriptions.\n2. **Preserve the original language** of the site (if it's in Italian \u2192 descriptions in Italian).\n3. **Short and informative descriptions**: max 1-2 sentences, avoid generic promotional phrases (\"the best solution\", \"industry-leading\").\n4. **No external links**, only URLs present in the input list.\n5. **\"Optional\" section** always LAST, reserved for legal and secondary pages.\n6. **Skip empty sections**: do not include section headings without links.\n7. **Pure Markdown output**: return ONLY the content of the `llms.txt` file, without opening/closing backticks, without preambles, without final comments. The first character of your response must be `#`.\n8. **Section order** by importance: Main pages \u2192 Services/Products \u2192 Portfolio \u2192 Blog \u2192 Company \u2192 Contact \u2192 Optional."
},
"promptType": "define"
},
"typeVersion": 3.1
},
{
"id": "e22875c5-a5e2-4fa5-bf71-ca1fa8416069",
"name": "Internal Links",
"type": "n8n-nodes-base.set",
"position": [
1328,
-16
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "47a6ad14-cc77-4b1f-84a0-a8ef731cdc86",
"name": "internal_links",
"type": "array",
"value": "={{ $json.result.llm_result.internal_links }}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "3e3a3143-5426-4f5c-a415-051c4c276691",
"name": "Upload to FTP",
"type": "n8n-nodes-base.ftp",
"position": [
2160,
-16
],
"parameters": {
"path": "=/YOUR_PATH/{{$binary.data.fileName}}",
"options": {},
"operation": "upload"
},
"credentials": {
"ftp": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "b11ff464-37f4-4bee-b2b3-79a215f4ea9c",
"name": "If success",
"type": "n8n-nodes-base.if",
"position": [
1040,
0
],
"parameters": {
"options": {},
"conditions": {
"options": {
"version": 2,
"leftValue": "",
"caseSensitive": true,
"typeValidation": "strict"
},
"combinator": "and",
"conditions": [
{
"id": "ec0239ac-bffb-4187-b7dc-4219536e9f7e",
"operator": {
"type": "string",
"operation": "equals"
},
"leftValue": "={{ $json.status }}",
"rightValue": "success"
}
]
}
},
"typeVersion": 2.2
},
{
"id": "fa5b6da4-daac-446d-80f3-7f4331c521ee",
"name": "Crawler",
"type": "n8n-nodes-scrapegraphai.scrapegraphAi",
"position": [
288,
0
],
"parameters": {
"resource": "smartcrawler"
},
"credentials": {
"scrapegraphAIApi": {
"name": "<your credential>"
}
},
"typeVersion": 1
},
{
"id": "8fa6a432-31e4-4fa5-afa6-9ca350c7cffa",
"name": "Set domain",
"type": "n8n-nodes-base.set",
"position": [
48,
0
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "bf073095-07f9-493d-be86-8dcd9086aecf",
"name": "your_domain",
"type": "string",
"value": "n3w.it"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "6c658166-4348-4092-afb2-c1b670de17a9",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
-64,
-688
],
"parameters": {
"width": 656,
"height": 544,
"content": "## Auto LLMs.txt Generator for websites with ScrapeGraph AI\nThis workflow automatically generates an `llms.txt` file for any given website. It uses ScrapegraphAI to crawl and scrape pages, an OpenAI chat model to process content, and finally uploads the generated file via FTP.\n\n### How it works\n\nThis workflow starts manually, crawls the target domain with ScrapegraphAI, waits until crawling is complete, then extracts all discovered internal links. An OpenAI-powered AI agent uses ScrapegraphAI\u2019s Scraper tool to visit each URL, analyze real page content, identify the site title, description, language, and organize pages into logical `llms.txt` sections.\n\nThe workflow then generates a clean Markdown `llms.txt` file following the llmstxt.org structure, converts it into a binary `.txt` file, and uploads it to the configured FTP/CDN path. The agent must scrape every URL before writing descriptions and is not allowed to invent content.\n\n### Setup steps\n\nConfigure n8n credentials for ScrapegraphAI, OpenAI, and FTP, then update the target domain in the **Set domain** node without including `https://`. Adjust the Wait node if the website is large, and set the correct remote upload directory in the FTP node so the generated file is saved as `llms.txt`.\n\nOptionally customize the AI prompt for different sections, languages, or URL exclusions. Save and activate the workflow, execute it from the Manual Trigger node, then verify the uploaded `llms.txt` file on your FTP server.\n"
},
"typeVersion": 1
},
{
"id": "7a8c3c7a-6383-470f-973f-396c5f23d36f",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
-64,
-112
],
"parameters": {
"color": 7,
"width": 304,
"height": 288,
"content": "## STEP 1 - Target domain\nSet your target domain"
},
"typeVersion": 1
},
{
"id": "50a00e07-5441-4928-b916-c6c60a548bd9",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
256,
-112
],
"parameters": {
"color": 7,
"width": 992,
"height": 288,
"content": "## STEP 2 - Crawling\nStarts a crawl of the specified domain using ScrapegraphAI\u2019s smartcrawler. The crawler extracts all internal links from the domain (acting like a sitemap generator)"
},
"typeVersion": 1
},
{
"id": "a449eb8a-f5e3-4b48-83d0-49b577eab037",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
1456,
-112
],
"parameters": {
"color": 7,
"width": 400,
"height": 288,
"content": "## STEP 3 - LLMS.txt Agent\nGenerate a clean Markdown file (llms.txt) following the official spec."
},
"typeVersion": 1
},
{
"id": "9a514eaf-c5d5-473f-bbe0-c692e92224ec",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
1872,
-112
],
"parameters": {
"color": 7,
"width": 480,
"height": 288,
"content": "## STEP 4 - Upload to website\nConvert to binary file and Upload to an FTP server"
},
"typeVersion": 1
},
{
"id": "e61beda5-e672-48a0-9f8a-fb1dda591cf5",
"name": "Sticky Note8",
"type": "n8n-nodes-base.stickyNote",
"position": [
624,
-880
],
"parameters": {
"color": 7,
"width": 736,
"height": 736,
"content": "## MY NEW YOUTUBE CHANNEL\n\ud83d\udc49 [Subscribe to my new **YouTube channel**](https://youtube.com/@n3witalia). Here I\u2019ll share videos and Shorts with practical tutorials and **FREE templates for n8n**.\n\n[](https://youtube.com/@n3witalia)"
},
"typeVersion": 1
}
],
"active": false,
"settings": {
"binaryMode": "separate",
"executionOrder": "v1"
},
"versionId": "2ad22f44-bd11-4bab-8001-d4736f757256",
"connections": {
"Wait": {
"main": [
[
{
"node": "Status crawler",
"type": "main",
"index": 0
}
]
]
},
"Crawler": {
"main": [
[
{
"node": "Wait",
"type": "main",
"index": 0
}
]
]
},
"Scraper": {
"ai_tool": [
[
{
"node": "LLMS.txt Agent",
"type": "ai_tool",
"index": 0
}
]
]
},
"to Binary": {
"main": [
[
{
"node": "Upload to FTP",
"type": "main",
"index": 0
}
]
]
},
"If success": {
"main": [
[
{
"node": "Internal Links",
"type": "main",
"index": 0
}
],
[
{
"node": "Wait",
"type": "main",
"index": 0
}
]
]
},
"Set domain": {
"main": [
[
{
"node": "Crawler",
"type": "main",
"index": 0
}
]
]
},
"Internal Links": {
"main": [
[
{
"node": "LLMS.txt Agent",
"type": "main",
"index": 0
}
]
]
},
"LLMS.txt Agent": {
"main": [
[
{
"node": "to Binary",
"type": "main",
"index": 0
}
]
]
},
"Status crawler": {
"main": [
[
{
"node": "If success",
"type": "main",
"index": 0
}
]
]
},
"OpenAI Chat Model": {
"ai_languageModel": [
[
{
"node": "LLMS.txt Agent",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"When clicking \u2018Execute workflow\u2019": {
"main": [
[
{
"node": "Set domain",
"type": "main",
"index": 0
}
]
]
}
}
}
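The Crawler, Wait, Status crawler, and If success nodes in the JSON above form a polling loop: start the crawl, pause, check the status, and go back to the pause unless the status equals "success". A minimal sketch of that control flow outside n8n follows. The function name, the `checkStatus` callback, the injectable `sleep`, and the `maxAttempts` cap are all assumptions added for illustration; the exported workflow has no attempt limit, and the 20 in the Wait node is assumed here to mean seconds.

```javascript
// Hypothetical helper: poll a status function until it reports "success".
// `checkStatus`, `delayMs`, `maxAttempts`, and `sleep` are illustrative names,
// not part of n8n or the ScrapeGraphAI node.
async function pollUntilSuccess(checkStatus, options = {}) {
  const {
    delayMs = 20000, // mirrors the Wait node's amount of 20, assuming seconds
    maxAttempts = 30, // assumption: the workflow itself loops without a limit
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await sleep(delayMs);               // "Wait" node
    const result = await checkStatus(); // "Status crawler" node
    if (result.status === "success") {  // "If success" node, true branch
      return { attempts: attempt, result };
    }
    // false branch: loop back to the Wait node
  }
  throw new Error(`Crawl did not succeed after ${maxAttempts} attempts`);
}

// Usage with a fake status source that succeeds on the third check.
const statuses = ["pending", "pending", "success"];
let call = 0;
pollUntilSuccess(async () => ({ status: statuses[call++] }), {
  delayMs: 0,
  sleep: async () => {}, // skip real waiting in the demo
}).then(({ attempts }) => console.log(`succeeded after ${attempts} checks`));
// logs "succeeded after 3 checks"
```

Capping the attempts is a design choice worth considering in the workflow as well: as exported, a crawl that never reaches "success" would keep cycling through the Wait node.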
Credentials you'll need
Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
ftp, openAiApi, scrapegraphAIApi
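To confirm which credential types an export references before importing it, the `credentials` objects on each node can be collected. The sketch below assumes the export has already been parsed into an object; `credentialTypes` is a hypothetical helper name, and the sample is a trimmed stand-in that reuses node and credential names from the JSON above.

```javascript
// Collect the distinct credential types referenced by a workflow export.
// `credentialTypes` is a hypothetical helper; the input shape follows the
// `nodes[].credentials` objects in the workflow JSON.
function credentialTypes(workflow) {
  const types = new Set();
  for (const node of workflow.nodes ?? []) {
    for (const type of Object.keys(node.credentials ?? {})) {
      types.add(type);
    }
  }
  return [...types].sort();
}

// Trimmed-down stand-in for the exported workflow.
const sample = {
  nodes: [
    { name: "Set domain", parameters: {} },
    { name: "Crawler", credentials: { scrapegraphAIApi: { name: "<your credential>" } } },
    { name: "Scraper", credentials: { scrapegraphAIApi: { name: "<your credential>" } } },
    { name: "OpenAI Chat Model", credentials: { openAiApi: { name: "<your credential>" } } },
    { name: "Upload to FTP", credentials: { ftp: { name: "<your credential>" } } },
  ],
};

console.log(credentialTypes(sample)); // [ 'ftp', 'openAiApi', 'scrapegraphAIApi' ]
```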
About this workflow
This workflow automatically generates an llms.txt file (following the llmstxt.org specification) for any given website. It uses ScrapegraphAI to crawl and scrape pages, an OpenAI chat model to process content, and finally uploads the generated file via FTP.
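The file layout the agent is prompted to produce (an H1 site name, a blockquote summary, then H2 sections of link bullets, with empty sections skipped and Optional last) can be sketched as a plain rendering function. This is an illustration of the target format only, not how the workflow builds the file: in the workflow the language model writes the Markdown directly. `renderLlmsTxt`, `SECTION_ORDER`, and the input field names are assumptions, and the example.com URLs are placeholders.

```javascript
// Hypothetical renderer for the layout described in the agent's system prompt.
// Function and field names are illustrative, not part of the workflow.
const SECTION_ORDER = [
  "Main pages", "Services", "Products", "Portfolio",
  "Blog", "Company", "Contact", "Optional",
];

function renderLlmsTxt({ name, summary, sections }) {
  const lines = [`# ${name}`, "", `> ${summary}`];
  for (const title of SECTION_ORDER) {
    const links = sections[title] ?? [];
    if (links.length === 0) continue; // skip empty sections
    lines.push("", `## ${title}`, "");
    for (const link of links) {
      lines.push(`- [${link.title}](${link.url}): ${link.description}`);
    }
  }
  return lines.join("\n") + "\n";
}

// Sections are given out of order on purpose; SECTION_ORDER fixes the output.
const text = renderLlmsTxt({
  name: "Example Site",
  summary: "A placeholder summary taken from the homepage.",
  sections: {
    Optional: [
      { title: "Privacy", url: "https://example.com/privacy", description: "Privacy policy." },
    ],
    "Main pages": [
      { title: "Home", url: "https://example.com/", description: "Landing page." },
    ],
  },
});
console.log(text);
```

Passing the resulting string through `Buffer.from(text).toString('base64')` gives the same kind of payload the to Binary node builds from the agent's output.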
Source: https://n8n.io/workflows/15525/ (original creator credit).