{
  "id": "3zRqzetfksysBsHC",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Automated Web Scraping and AI Price Analysis \u2014 HTTP + HTML + GPT-4o-mini + Sheets + Gmail",
  "tags": [],
  "nodes": [
    {
      "id": "ad8433eb-0546-417f-b8f1-4766e1f565e5",
      "name": "Overview",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -4288,
        -352
      ],
      "parameters": {
        "color": 4,
        "width": 572,
        "height": 1284,
        "content": "## Automated Web Scraping and AI Price Analysis \u2014 HTTP + HTML + GPT-4o-mini + Google Sheets + Gmail\n\nFor competitor monitoring, price tracking, and product research \u2014 a daily automated scraper that turns a product listing page into a structured dataset with an AI market briefing. Every day at 9 AM the workflow fetches a target website, extracts every product's title and price, sorts them by price, saves them to Google Sheets, and builds a CSV. GPT-4o-mini reads the price statistics and writes a short plain-text market insight. An email is sent with the AI insight in the body and the full product CSV attached. Works on most static, server-rendered listing pages \u2014 change the URL and the CSS selectors to match your target.\n\n## How it works\n- **1. Schedule Trigger \u2014 Daily 9AM** fires once per day at 09:00\n- **2. HTTP \u2014 Fetch Website HTML** downloads the full page HTML\n- **3. HTML \u2014 Extract All Products** pulls every product container as an HTML array using a CSS selector\n- **4. Split Out \u2014 One Product at a Time** turns the array into individual items\n- **5. HTML \u2014 Extract Title and Price** extracts the title attribute and price text from each product\n- **6. Sort \u2014 By Price Descending** orders products from highest to lowest price\n- **7. Code \u2014 Add Date and Clean Price** adds the scrape date and a numeric price field\n- **8. Google Sheets \u2014 Save Products** appends every product row (parallel branch)\n- **9. Code \u2014 Compile for AI and Stats** computes min, max, average price and builds the AI context (parallel branch)\n- **10. AI Agent \u2014 Price Analysis** uses GPT-4o-mini to write a short market insight\n- **OpenAI \u2014 GPT-4o-mini Model** language model attached to the AI Agent\n- **11. Convert to CSV File** converts all product rows into a downloadable CSV (parallel branch)\n- **12. Merge \u2014 Combine AI and CSV** waits for both the AI insight and the CSV before continuing, so the email is sent exactly once\n- **13. Code \u2014 Merge CSV and AI for Email** combines the CSV binary with the AI insight and stats\n- **14. Gmail \u2014 Send Report and CSV** emails the HTML report with stats, the AI insight, and the CSV attached\n\n## Set up steps\n1. By default the workflow scrapes books.toscrape.com (a free practice site). To scrape a different site, change the URL in node 2 and update the CSS selectors in nodes 3 and 5 using your browser DevTools (F12 \u2192 Inspect)\n2. In **OpenAI \u2014 GPT-4o-mini Model** \u2014 connect your OpenAI API credential\n3. In **8. Google Sheets \u2014 Save Products** \u2014 connect Google Sheets OAuth2 and replace `YOUR_GOOGLE_SHEET_ID`. Create a tab named Products with columns: Title, Price, Scraped Date\n4. In **14. Gmail \u2014 Send Report and CSV** \u2014 connect Gmail OAuth2 and replace `YOUR_EMAIL_ADDRESS`"
      },
      "typeVersion": 1
    },
    {
      "id": "83db3b8f-6e59-45dc-91ca-21ff94d33c1c",
      "name": "Section \u2014 Daily Fetch and Product Extraction",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -3648,
        0
      ],
      "parameters": {
        "color": 5,
        "width": 676,
        "height": 404,
        "content": "## Daily Fetch and Product Extraction\nSchedule fires daily at 9 AM. HTTP downloads the full page HTML. HTML node extracts every product container as an array using a CSS selector."
      },
      "typeVersion": 1
    },
    {
      "id": "352ece6d-5dd4-4d08-8a7f-fc54d010d92c",
      "name": "Section \u2014 Split, Extract Fields, and Sort",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2928,
        -96
      ],
      "parameters": {
        "color": 6,
        "width": 772,
        "height": 580,
        "content": "## Split, Extract Fields, and Sort\nSplit Out turns the product array into individual items. HTML node extracts title and price per item. Sort orders products by price descending."
      },
      "typeVersion": 1
    },
    {
      "id": "d0813866-c034-418b-949a-3e8d9f5445f7",
      "name": "Section \u2014 Date Cleanup, Sheets Save, Stats, and AI Analysis",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2080,
        -304
      ],
      "parameters": {
        "color": 6,
        "width": 1236,
        "height": 1076,
        "content": "## Date Cleanup, Sheets Save, Stats, and AI Analysis\nCode adds the scrape date and numeric price. Three parallel branches: Sheets saves all rows, Code compiles price stats which GPT-4o-mini analyzes into a market insight, and CSV conversion runs alongside."
      },
      "typeVersion": 1
    },
    {
      "id": "50dffb3d-2089-49cf-a52a-dd53ccbfc57c",
      "name": "Section \u2014 CSV Merge and Email",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -800,
        -80
      ],
      "parameters": {
        "color": 4,
        "width": 900,
        "height": 528,
        "content": "## CSV Merge and Email\nMerge waits for both the AI insight and the CSV file so the email fires exactly once. Code combines the CSV binary with the AI insight and stats. Gmail sends the report with the CSV attached."
      },
      "typeVersion": 1
    },
    {
      "id": "7ddc7053-d3ec-4237-8ad9-11d0b990cdc5",
      "name": "1. Schedule Trigger \u2014 Daily 9AM",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        -3600,
        160
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "triggerAtHour": 9
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "931b2795-2dc2-47d2-a473-de7f789ddb34",
      "name": "2. HTTP \u2014 Fetch Website HTML",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -3360,
        160
      ],
      "parameters": {
        "url": "http://books.toscrape.com",
        "options": {
          "allowUnauthorizedCerts": true
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "679b4962-0efa-4e78-b857-4064344ac2aa",
      "name": "3. HTML \u2014 Extract All Products",
      "type": "n8n-nodes-base.html",
      "position": [
        -3120,
        160
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "extractionValues": {
          "values": [
            {
              "key": "products",
              "cssSelector": ".row > li",
              "returnArray": true,
              "returnValue": "html"
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "5c8eddf2-b5b9-4c87-b905-1330f9bbb92b",
      "name": "4. Split Out \u2014 One Product at a Time",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        -2880,
        160
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "products"
      },
      "typeVersion": 1
    },
    {
      "id": "aa5dd5e8-917e-46ee-9fd2-46e0dbea9c3c",
      "name": "5. HTML \u2014 Extract Title and Price",
      "type": "n8n-nodes-base.html",
      "position": [
        -2640,
        160
      ],
      "parameters": {
        "options": {},
        "operation": "extractHtmlContent",
        "dataPropertyName": "products",
        "extractionValues": {
          "values": [
            {
              "key": "title",
              "attribute": "title",
              "cssSelector": "h3 > a",
              "returnValue": "attribute"
            },
            {
              "key": "price",
              "cssSelector": ".price_color"
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "aed9a00d-5d17-43bd-97d0-24a2ad11de05",
      "name": "6. Sort \u2014 By Price Descending",
      "type": "n8n-nodes-base.sort",
      "position": [
        -2336,
        160
      ],
      "parameters": {
        "options": {},
        "sortFieldsUi": {
          "sortField": [
            {
              "order": "descending",
              "fieldName": "price"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "bb1d31dc-9905-4d03-9f0e-3b2763923b54",
      "name": "7. Code \u2014 Add Date and Clean Price",
      "type": "n8n-nodes-base.code",
      "position": [
        -2016,
        160
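      ],
      "parameters": {
        "jsCode": "const today = new Date().toISOString().split('T')[0];\n\n// Normalize each product, then sort numerically: the price text (e.g. a currency-prefixed string) does not sort reliably as a string.\nconst items = $input.all().map(item => {\n  const raw = item.json;\n  // Keep only digits and the decimal point so parseFloat sees a plain number.\n  const priceClean = (raw.price || '').replace(/[^0-9.]/g, '');\n\n  return {\n    json: {\n      title:       (raw.title || '').trim(),\n      price:       raw.price  || '',\n      priceNum:    parseFloat(priceClean) || 0,\n      scrapedDate: today\n    }\n  };\n});\n\n// Highest price first, matching the order the downstream stats node expects.\nreturn items.sort((a, b) => b.json.priceNum - a.json.priceNum);"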
      },
      "typeVersion": 2
    },
    {
      "id": "91e7bf4d-9ed3-4eed-96f6-9559228db86a",
      "name": "8. Google Sheets \u2014 Save Products",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        -1696,
        -144
      ],
      "parameters": {
        "columns": {
          "value": {
            "Price": "={{ $json.price }}",
            "Title": "={{ $json.title }}",
            "Scraped Date": "={{ $json.scrapedDate }}"
          },
          "schema": [
            {
              "id": "Title",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Title",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Price",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Price",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "Scraped Date",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "Scraped Date",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [],
          "attemptToConvertTypes": false,
          "convertFieldsToString": true
        },
        "options": {},
        "operation": "append",
        "sheetName": {
          "__rl": true,
          "mode": "name",
          "value": "Products"
        },
        "documentId": {
          "__rl": true,
          "mode": "id",
          "value": "YOUR_GOOGLE_SHEET_ID"
        }
      },
      "typeVersion": 4.5
    },
    {
      "id": "ad28aeda-7b42-4d2d-bf98-9595ff0e1853",
      "name": "9. Code \u2014 Compile for AI and Stats",
      "type": "n8n-nodes-base.code",
      "position": [
        -1776,
        336
      ],
      "parameters": {
        "jsCode": "const all = $('7. Code \u2014 Add Date and Clean Price').all();\nconst today = all[0]?.json.scrapedDate || new Date().toISOString().split('T')[0];\n\n// Ignore items whose price could not be parsed (priceNum is 0).\nconst prices = all.map(i => i.json.priceNum).filter(p => p > 0);\nconst maxPrice = prices.length > 0 ? Math.max(...prices) : 0;\nconst minPrice = prices.length > 0 ? Math.min(...prices) : 0;\n// toFixed returns a string, so the empty-data fallback is a string too.\nconst avgPrice = prices.length > 0 ? (prices.reduce((a, b) => a + b, 0) / prices.length).toFixed(2) : '0.00';\nconst total    = all.length;\n\n// Items arrive sorted by price descending: the first 10 are the most expensive, the last 5 the cheapest.\nconst top10 = all.slice(0, 10)\n  .map(i => i.json.title + ' \u2014 ' + i.json.price)\n  .join('\\n');\n\nconst bottom5 = all.slice(-5)\n  .map(i => i.json.title + ' \u2014 ' + i.json.price)\n  .join('\\n');\n\nconst analysisContext = [\n  'SCRAPE DATE: ' + today,\n  'WEBSITE: books.toscrape.com',\n  'TOTAL PRODUCTS FOUND: ' + total,\n  'HIGHEST PRICE: ' + maxPrice,\n  'LOWEST PRICE: ' + minPrice,\n  'AVERAGE PRICE: ' + avgPrice,\n  '',\n  'TOP 10 MOST EXPENSIVE:',\n  top10,\n  '',\n  'BOTTOM 5 CHEAPEST:',\n  bottom5\n].join('\\n');\n\nreturn [{ json: { analysisContext, total, maxPrice, minPrice, avgPrice, today } }];"
      },
      "typeVersion": 2
    },
    {
      "id": "9c53b85a-3688-45b1-8060-04b88b603c10",
      "name": "10. AI Agent \u2014 Price Analysis",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -1536,
        336
      ],
      "parameters": {
        "text": "={{ $json.analysisContext }}",
        "options": {
          "systemMessage": "You are a market research analyst. You have been given product scraping data from an online store. Write a short, useful business intelligence summary.\n\nYour summary must include:\n1. One sentence on overall price range and what it tells about the market segment\n2. One observation about the most expensive products \u2014 what category or pattern do they share?\n3. One observation about the cheapest products\n4. One actionable insight for a buyer, seller, or researcher using this data\n\nKeep it under 150 words. Write in plain text \u2014 no markdown, no bullet points, no headers. Write like a smart analyst giving a quick verbal briefing."
        },
        "promptType": "define"
      },
      "typeVersion": 1.9
    },
    {
      "id": "7823f0d7-f8e9-48e7-a2a4-71859b49ed78",
      "name": "OpenAI \u2014 GPT-4o-mini Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -1536,
        544
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini"
        },
        "options": {},
        "builtInTools": {}
      },
      "typeVersion": 1.3
    },
    {
      "id": "ab0a5e6b-9824-413b-af5f-21b2fd835b22",
      "name": "11. Convert to CSV File",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        -1536,
        160
      ],
      "parameters": {
        "options": {
          "fileName": "={{ 'scraped_products_' + $now.toFormat('yyyy-MM-dd') + '.csv' }}"
        }
      },
      "typeVersion": 1.1
    },
    {
      "id": "be0fe5c2-d73f-4c29-88e1-f4ea7ec3294f",
      "name": "12. Merge \u2014 Combine AI and CSV",
      "type": "n8n-nodes-base.merge",
      "position": [
        -672,
        192
      ],
      "parameters": {
        "mode": "combine",
        "options": {},
        "combineBy": "combineAll"
      },
      "typeVersion": 3
    },
    {
      "id": "fdf0809d-11e9-46b9-a986-b0485936c811",
      "name": "13. Code \u2014 Merge CSV and AI for Email",
      "type": "n8n-nodes-base.code",
      "position": [
        -432,
        192
      ],
      "parameters": {
        "jsCode": "const csvBinary = $('11. Convert to CSV File').first();\nconst aiOutput  = $('10. AI Agent \u2014 Price Analysis').first().json.output || 'AI analysis not available.';\nconst stats     = $('9. Code \u2014 Compile for AI and Stats').first().json;\n\nreturn [{\n  json: {\n    aiInsight: aiOutput,\n    total:     stats.total,\n    maxPrice:  stats.maxPrice,\n    minPrice:  stats.minPrice,\n    avgPrice:  stats.avgPrice,\n    today:     stats.today\n  },\n  binary: csvBinary.binary\n}];"
      },
      "typeVersion": 2
    },
    {
      "id": "4b76cc61-a6ba-4a7b-8352-5f31d1dffe88",
      "name": "14. Gmail \u2014 Send Report and CSV",
      "type": "n8n-nodes-base.gmail",
      "position": [
        -192,
        192
      ],
      "parameters": {
        "sendTo": "YOUR_EMAIL_ADDRESS",
        "message": "=<div style=\"font-family:Arial,sans-serif;max-width:640px;\">\n\n  <div style=\"background:#1a1a2e;padding:24px;border-radius:8px 8px 0 0;\">\n    <h1 style=\"color:#fff;margin:0;font-size:20px;\">Daily Web Scraping Report</h1>\n    <p style=\"color:#aaa;margin:6px 0 0;font-size:13px;\">{{ $json.today }} \u2014 books.toscrape.com</p>\n  </div>\n\n  <div style=\"background:#f0f4ff;padding:16px 24px;display:flex;gap:24px;flex-wrap:wrap;\">\n    <div><p style=\"margin:0;font-size:11px;color:#666;text-transform:uppercase;\">Products Found</p><p style=\"margin:4px 0 0;font-size:20px;font-weight:700;color:#1a1a2e;\">{{ $json.total }}</p></div>\n    <div><p style=\"margin:0;font-size:11px;color:#666;text-transform:uppercase;\">Highest Price</p><p style=\"margin:4px 0 0;font-size:20px;font-weight:700;color:#c62828;\">{{ $json.maxPrice }}</p></div>\n    <div><p style=\"margin:0;font-size:11px;color:#666;text-transform:uppercase;\">Lowest Price</p><p style=\"margin:4px 0 0;font-size:20px;font-weight:700;color:#2e7d32;\">{{ $json.minPrice }}</p></div>\n    <div><p style=\"margin:0;font-size:11px;color:#666;text-transform:uppercase;\">Average Price</p><p style=\"margin:4px 0 0;font-size:20px;font-weight:700;color:#1a1a2e;\">{{ $json.avgPrice }}</p></div>\n  </div>\n\n  <div style=\"background:#fff;padding:20px 24px;border-top:1px solid #eee;\">\n    <h2 style=\"font-size:14px;color:#1a1a2e;margin:0 0 10px;\">AI Market Insight</h2>\n    <p style=\"font-size:14px;color:#444;line-height:1.7;margin:0;\">{{ $json.aiInsight }}</p>\n  </div>\n\n  <div style=\"background:#1a1a2e;padding:12px 24px;text-align:center;border-radius:0 0 8px 8px;\">\n    <p style=\"color:#888;font-size:11px;margin:0;\">Full data attached as CSV \u2014 Powered by n8n + GPT-4o-mini</p>\n  </div>\n\n</div>",
        "options": {
          "senderName": "Web Scraping Bot",
          "attachmentsUi": {
            "attachmentsBinary": [
              {}
            ]
          }
        },
        "subject": "=Daily Scrape Report \u2014 {{ $json.total }} Products | {{ $json.today }}"
      },
      "typeVersion": 2.1
    }
  ],
  "active": false,
  "settings": {
    "binaryMode": "separate",
    "executionOrder": "v1"
  },
  "versionId": "2f8ddb22-c7d7-4c9c-a511-2844816a9ea5",
  "nodeGroups": [],
  "connections": {
    "11. Convert to CSV File": {
      "main": [
        [
          {
            "node": "12. Merge \u2014 Combine AI and CSV",
            "type": "main",
            "index": 1
          }
        ]
      ]
    },
    "OpenAI \u2014 GPT-4o-mini Model": {
      "ai_languageModel": [
        [
          {
            "node": "10. AI Agent \u2014 Price Analysis",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "2. HTTP \u2014 Fetch Website HTML": {
      "main": [
        [
          {
            "node": "3. HTML \u2014 Extract All Products",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "10. AI Agent \u2014 Price Analysis": {
      "main": [
        [
          {
            "node": "12. Merge \u2014 Combine AI and CSV",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "6. Sort \u2014 By Price Descending": {
      "main": [
        [
          {
            "node": "7. Code \u2014 Add Date and Clean Price",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "12. Merge \u2014 Combine AI and CSV": {
      "main": [
        [
          {
            "node": "13. Code \u2014 Merge CSV and AI for Email",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "3. HTML \u2014 Extract All Products": {
      "main": [
        [
          {
            "node": "4. Split Out \u2014 One Product at a Time",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "1. Schedule Trigger \u2014 Daily 9AM": {
      "main": [
        [
          {
            "node": "2. HTTP \u2014 Fetch Website HTML",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "5. HTML \u2014 Extract Title and Price": {
      "main": [
        [
          {
            "node": "6. Sort \u2014 By Price Descending",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "7. Code \u2014 Add Date and Clean Price": {
      "main": [
        [
          {
            "node": "8. Google Sheets \u2014 Save Products",
            "type": "main",
            "index": 0
          },
          {
            "node": "9. Code \u2014 Compile for AI and Stats",
            "type": "main",
            "index": 0
          },
          {
            "node": "11. Convert to CSV File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "9. Code \u2014 Compile for AI and Stats": {
      "main": [
        [
          {
            "node": "10. AI Agent \u2014 Price Analysis",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "4. Split Out \u2014 One Product at a Time": {
      "main": [
        [
          {
            "node": "5. HTML \u2014 Extract Title and Price",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "13. Code \u2014 Merge CSV and AI for Email": {
      "main": [
        [
          {
            "node": "14. Gmail \u2014 Send Report and CSV",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}