AutomationFlowsWeb Scraping › Daily Insight Email From Structured Web Data with Firecrawl

Daily Insight Email From Structured Web Data with Firecrawl

ByAutomate With Marc @marconi on n8n.io

🔥 Daily Web Scraper & AI Summary with Firecrawl + Email Automation Need to extract and summarize web content from a site that doesn’t have an API? This workflow runs daily to scrape a web page using Firecrawl, summarize the content with OpenAI, and send it directly to your email…

Cron / scheduled trigger★★★★☆ complexityAI-powered16 nodesHTTP RequestGmailOpenAI
Web Scraping Trigger: Cron / scheduled Nodes: 16 Complexity: ★★★★☆ AI nodes: yes Added:

This workflow corresponds to n8n.io template #4654 — we link there as the canonical source.

This workflow follows the Gmail → HTTP Request recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "id": "ovPDz1CoTIPih54D",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Daily Insight Email from Structured Web Data with Firecrawl",
  "tags": [],
  "nodes": [
    {
      "id": "a1c5c1a4-c61c-48d7-b803-27ffc6484781",
      "name": "Schedule Trigger",
      "type": "n8n-nodes-base.scheduleTrigger",
      "position": [
        -240,
        -20
      ],
      "parameters": {
        "rule": {
          "interval": [
            {
              "triggerAtHour": 10
            }
          ]
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "654dee5e-2a9a-4164-8e8b-df226b88f065",
      "name": "POST Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        0,
        -20
      ],
      "parameters": {
        "url": "https://api.firecrawl.dev/v1/extract",
        "method": "POST",
        "options": {},
        "jsonBody": "={\n  \"urls\": [\n    \"https://www.quiverquant.com/congresstrading/*\"\n  ],\n  \"prompt\": \"Extract all notable congress member trades over $50,000 USD in the past 1 month. Include the Congress Member's name, party, the Stock/Asset purchased or sold, the Amount of the transaction, and the date of the transaction.\",\n  \"schema\": {\n    \"type\": \"object\",\n    \"properties\": {\n      \"notable_trades\": {\n        \"type\": \"array\",\n        \"items\": {\n          \"type\": \"object\",\n          \"properties\": {\n            \"congress_member_name\": {\n              \"type\": \"string\"\n            },\n            \"party\": {\n              \"type\": \"string\"\n            },\n            \"stock_or_asset\": {\n              \"type\": \"string\"\n            },\n            \"amount\": {\n              \"type\": \"number\"\n            },\n            \"transaction_date\": {\n              \"type\": \"string\"\n            }\n          },\n          \"required\": [\n            \"congress_member_name\",\n            \"stock_or_asset\",\n            \"amount\",\n            \"transaction_date\"\n          ]\n        }\n      }\n    },\n    \"required\": [\n      \"notable_trades\"\n    ]\n  }\n}\n",
        "sendBody": true,
        "specifyBody": "json",
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth"
      },
      "credentials": "[REDACTED]",
      "typeVersion": 4.2
    },
    {
      "id": "6e053bbe-21f0-4da8-97b1-47d7432b26e0",
      "name": "Wait 60 Seconds",
      "type": "n8n-nodes-base.wait",
      "position": [
        240,
        -20
      ],
      "parameters": {
        "amount": 60
      },
      "typeVersion": 1.1
    },
    {
      "id": "354549f4-ff4f-4ca7-a534-ed92fb4a7de7",
      "name": "Gmail",
      "type": "n8n-nodes-base.gmail",
      "position": [
        1480,
        -20
      ],
      "parameters": {
        "sendTo": "[REDACTED]",
        "message": "={{ $json.message.content }}",
        "options": {},
        "subject": "Congress Stock Update",
        "emailType": "text"
      },
      "credentials": "[REDACTED]",
      "typeVersion": 2.1
    },
    {
      "id": "008836a9-6fdb-49db-ab6a-13a3cce0200f",
      "name": "Edit Fields",
      "type": "n8n-nodes-base.set",
      "position": [
        980,
        -20
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "05eb0683-67de-4df0-9110-377931d8c7c9",
              "name": "data.notable_trades",
              "type": "string",
              "value": "={{ $json.data.notable_trades }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "f6331366-b135-44b3-b541-7b2cc46ab281",
      "name": "If",
      "type": "n8n-nodes-base.if",
      "position": [
        680,
        -20
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "e58f46d1-80c6-48ec-8838-22dad87cda75",
              "operator": {
                "type": "array",
                "operation": "empty",
                "singleValue": true
              },
              "leftValue": "={{ $json.data.notable_trades }}",
              "rightValue": ""
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "d5b84c8f-a8e3-42f9-b39e-91fdb54ecf72",
      "name": "Wait 30 seconds",
      "type": "n8n-nodes-base.wait",
      "position": [
        760,
        240
      ],
      "parameters": {
        "amount": 30
      },
      "typeVersion": 1.1
    },
    {
      "id": "c5e41918-7fc2-481a-8d35-a3a835501558",
      "name": "HTTP GET Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        480,
        -20
      ],
      "parameters": {
        "url": "=https://api.firecrawl.dev/v1/extract/{{ $('POST Request').item.json.id }}",
        "options": {},
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth"
      },
      "credentials": "[REDACTED]",
      "typeVersion": 4.2
    },
    {
      "id": "a0d457fb-d9e7-4090-9a66-cb002a4d6c17",
      "name": "Formatting AI",
      "type": "@n8n/n8n-nodes-langchain.openAi",
      "position": [
        1120,
        -20
      ],
      "parameters": {
        "modelId": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o",
          "cachedResultName": "GPT-4O"
        },
        "options": {},
        "messages": {
          "values": [
            {
              "content": "=Please format these trade date  {{ $json.data.notable_trades }} into an easily readable email format."
            }
          ]
        }
      },
      "credentials": "[REDACTED]",
      "typeVersion": 1.8
    },
    {
      "id": "0c820795-eea8-4e43-a2fe-5f43241d459f",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -300,
        -100
      ],
      "parameters": {
        "width": 200,
        "height": 500,
        "content": "Scheduled Trigger"
      },
      "typeVersion": 1
    },
    {
      "id": "a8b80cd2-ac94-465c-9fa5-ab90f4903229",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -80,
        -100
      ],
      "parameters": {
        "color": 7,
        "height": 500,
        "content": "POST Request to Firecrawl"
      },
      "typeVersion": 1
    },
    {
      "id": "99cf6106-6de3-4b81-8e2f-795927192a06",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        180,
        -100
      ],
      "parameters": {
        "color": 2,
        "width": 220,
        "height": 500,
        "content": "Wait for Extract Request to Complete"
      },
      "typeVersion": 1
    },
    {
      "id": "6eb2ce69-6db3-471f-9532-10007b233d61",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        420,
        -100
      ],
      "parameters": {
        "color": 3,
        "width": 520,
        "height": 580,
        "content": "GET Request Loop"
      },
      "typeVersion": 1
    },
    {
      "id": "4f373c54-6d6e-4300-a51d-2f7c30d7f362",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        960,
        -100
      ],
      "parameters": {
        "color": 6,
        "width": 460,
        "height": 580,
        "content": "Formatting Agent"
      },
      "typeVersion": 1
    },
    {
      "id": "48c2b4b0-9dc6-4b10-949d-dc76371c51a6",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1440,
        -100
      ],
      "parameters": {
        "color": 4,
        "width": 220,
        "height": 580,
        "content": "Send to Inbox"
      },
      "typeVersion": 1
    },
    {
      "id": "204d42ff-03f4-4d33-9483-391abfc91858",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -840,
        -140
      ],
      "parameters": {
        "color": 4,
        "width": 500,
        "height": 920,
        "content": "\ud83d\udd25 Daily Web Scraper & AI Summary with Firecrawl + Email Automation\nNeed to extract and summarize web content from a site that doesn\u2019t have an API? This workflow runs daily to scrape a web page using Firecrawl, summarize the content with OpenAI, and send it directly to your email \u2014 fully automated.\n\nWatch Full Video Step-by-step Tutorial Here:\nhttps://www.youtube.com/@Automatewithmarc\n\n\ud83d\udd27 How It Works\nDaily Trigger \u2013 Starts the workflow every 24 hours.\n\nFirecrawl Node \u2013 Crawls and extracts structured data from any web page you specify.\n\nOpenAI Node (Optional) \u2013 Processes and summarizes the raw content using a prompt you control.\n\nGmail Node \u2013 Sends the final summary or content snapshot to your email inbox.\n\n\u2705 Perfect For\nBusiness analysts tracking daily market or industry news\n\nResearchers and founders automating competitive intelligence\n\nAnyone who wants web data delivered without coding or scraping scripts\n\n\ud83e\ude9c Setup Instructions\nFirecrawl API Key \u2013 Sign up and insert your key in the credentials.\n\nUpdate Target URL \u2013 Edit the URL in the Firecrawl node to your desired site.\n\nCustomize the Prompt \u2013 Tailor the OpenAI prompt to extract the insights you want.\n\nConnect Gmail \u2013 Add your Gmail credentials and set your recipient email.\n\n\ud83e\uddf0 Built With\nFirecrawl (Web scraping without code)\n\nOpenAI (For summarizing and insight extraction)\n\nGmail (Automated notifications)\n\nn8n (Workflow automation engine)\n\n"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "cb4a0aff-f40c-4db7-a7d1-8ed157c53f25",
  "connections": {
    "If": {
      "main": [
        [
          {
            "node": "Wait 30 seconds",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Edit Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Edit Fields": {
      "main": [
        [
          {
            "node": "Formatting AI",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "POST Request": {
      "main": [
        [
          {
            "node": "Wait 60 Seconds",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Formatting AI": {
      "main": [
        [
          {
            "node": "Gmail",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait 30 seconds": {
      "main": [
        [
          {
            "node": "HTTP GET Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait 60 Seconds": {
      "main": [
        [
          {
            "node": "HTTP GET Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "HTTP GET Request": {
      "main": [
        [
          {
            "node": "If",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Schedule Trigger": {
      "main": [
        [
          {
            "node": "POST Request",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

🔥 Daily Web Scraper & AI Summary with Firecrawl + Email Automation Need to extract and summarize web content from a site that doesn’t have an API? This workflow runs daily to scrape a web page using Firecrawl, summarize the content with OpenAI, and send it directly to your email…

Source: https://n8n.io/workflows/4654/ — original creator credit. Request a take-down →

More Web Scraping workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Web Scraping

Manually checking websites for updates or competitor changes can be tedious. This workflow automates the process by scraping target pages, capturing screenshots, and analyzing content changes using Fi

@Mendable/N8N Nodes Firecrawl, Notion, OpenAI +2
Web Scraping

📬 What This Workflow Does This workflow automatically scrapes recent high-value congressional stock trades from Quiver Quantitative, summarizes the key transactions, and delivers a neatly formatted re

HTTP Request, Gmail, OpenAI
Web Scraping

Schedule Trigger runs every 6 hours (customizable) Apify Scraper fetches Upwork jobs matching your criteria Deduplication filters out jobs you've already seen AI Scoring (GPT-4) evaluates fit, client

HTTP Request, Google Sheets, OpenAI +2
Web Scraping

This n8n workflow automates the process of collecting job and decision-maker data, crafting AI-generated referral messages, and drafting them in Gmail—all using a combination of Apify, Google Sheets,

Google Sheets, Chain Llm, Google Gemini Chat +3
Web Scraping

This workflow creates an end-to-end Instagram content pipeline that automatically discovers trending content from competitor channels, extracts valuable insights, and generates new high-quality script

HTTP Request, OpenAI, Google Sheets