AI Web Research in Google Sheets with GPT

Original n8n title: AI-Powered Web Research in Google Sheets with GPT and Bright Data

By Elay Guez (@elay96) on n8n.io

Turn any Google Sheets cell into an AI-assisted web research query: type a prompt and get an AI-filtered result from the web in about 20 seconds.

Category: AI & RAG · Trigger: Webhook · Nodes: 22 · Complexity: ★★★★☆ · AI nodes: yes
Integrations: Agent, MCP Client Tool, HTTP Request, Chain LLM, Data Table, Output Parser Structured, OpenAI Chat

This workflow corresponds to n8n.io template #10119 — we link there as the canonical source.

This workflow follows the Agent → Chain LLM recipe pattern.

The workflow JSON

Copy or download the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.

{
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "nodes": [
    {
      "id": "14cd7efc-ef41-418a-af6d-bedef22b8085",
      "name": "Respond to Webhook",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        -480,
        304
      ],
      "parameters": {
        "options": {
          "responseHeaders": {
            "entries": [
              {
                "name": "Content-Type",
                "value": "text/plain; charset=utf-8"
              }
            ]
          }
        },
        "respondWith": "text",
        "responseBody": "={{ $json.output.summary }}"
      },
      "typeVersion": 1.4
    },
    {
      "id": "b33b25ad-aff9-4192-946b-becd709dfe6c",
      "name": "Bright Data Search Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -592,
        -96
      ],
      "parameters": {
        "text": "=You are an AI agent specialized in web search and information retrieval.\n\nQuery: \"{{ $json.output }}\"\n\n## Objective:  \nPerform a focused search and return the most relevant link in **JSON format only**:  \n```json\n{\n  \"link\": \"\"\n}\n```\n\n## Execution Process:\n\n### 1. Run Search\nsearch_engine(\n  query: [the query you constructed],\n  engine: \"google\",\n  limit: 10,\n  country: \"us\" or \"il\" depending on relevance\n)\n\n### 2. Filter Results\n\nPrioritize reliable sources:\nMajor news websites (Reuters, Bloomberg, WSJ, Haaretz, Globes)\nOfficial websites (investor relations, about pages)\nOfficial financial reports (SEC filings, quarterly reports)\nTrusted data sources (Wikipedia, Crunchbase, LinkedIn)\n\nSkip:\nAds and irrelevant marketing content\nUnverified forums\nOutdated content (unless historical info is requested)\nBroken or invalid links\n\nFind a suitable link based on:\nRelevance to the query\nRecency of the source\nSource credibility\n\n### 3. Return Result\n\nRequired format \u2013 JSON only, no additional text:\n{\n  \"link\": \"https://...\"\n}\n\nIf no high-quality result is found, search another source until you find one or return \"\".\n\n## Error Handling:\n\nIf search fails \u2192 try an alternative query\nIf no results found \u2192 return JSON with empty fields\nAlways return valid JSON, with no explanations or extra text",
        "options": {},
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 2.2
    },
    {
      "id": "d018693a-0fdf-4ae8-ae22-1b4f79bed2a1",
      "name": "Bright Data MCP",
      "type": "@n8n/n8n-nodes-langchain.mcpClientTool",
      "position": [
        -528,
        64
      ],
      "parameters": {
        "include": "selected",
        "options": {
          "timeout": 120000
        },
        "endpointUrl": "https://mcp.brightdata.com/mcp?token=YOUR_TOKEN_HERE&pro=1",
        "includeTools": [
          "search_engine"
        ],
        "serverTransport": "httpStreamable"
      },
      "typeVersion": 1.1
    },
    {
      "id": "ddf8a398-38ae-4da4-a2aa-0fe6a4b3e62b",
      "name": "Webhook Call",
      "type": "n8n-nodes-base.webhook",
      "position": [
        -1664,
        -32
      ],
      "parameters": {
        "path": "brightdata-search",
        "options": {},
        "httpMethod": "POST",
        "responseMode": "responseNode",
        "authentication": "headerAuth"
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 2.1
    },
    {
      "id": "8dfdd46c-ad8f-458a-a403-45910dcc13e5",
      "name": "Adjust Query Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -1120,
        -96
      ],
      "parameters": {
        "text": "=User prompt: {{ $json.userPrompt }}\nPrompt's referral: {{ $json.cellReference }}\n",
        "options": {
          "systemMessage": "=### 1. Request Analysis  \n\nIdentify the query type:  \n- **News/Updates**: \"News about X\", \"What\u2019s happening with X\"  \n- **Financial Data**: \"Revenue\", \"Earnings\", \"Financial Reports\" \n- **Factual Information**: \"Who is the CEO\", \"How many employees\", \"Where is the HQ\"  \n- **Analysis/Comparison**: \"Competitors\", \"Compare to Y\", \"Alternatives\"  \n- **General Research**: Open-ended questions about a topic/field  \n\n### 2. Search Query Construction  \nBuild an optimal query based on the type:\n\n**News**: `\"{{ entity }}\" news [year/month if relevant]`  \n**Financial Data**: `\"{{ entity }}\" revenue earnings \"Q[X] [year]\"` or `financial results`  \n**Factual Information**: `\"{{ entity }}\" [specific fact] official`  \n**Analysis**: `\"{{ entity }}\" analysis competitors market share`  \n\n**Principles:**  \n- Always translate to English to maximize results  \n- Keep double quotes around specific names  \n- Add relevant time \u2014 current time is {{ $now.format('dd/LL/yyyy') }}  \n- Match keywords to the requested information type  \n\n## Examples:\n\n**Input**: Prompt's referral=\"Apple\", user prompt=\"Search for news about the company\"  \noutput example: \"Apple Inc\" news 2025 technology  \n\n**Input**: Prompt's referral=\"Tesla\", user prompt=\"Monthly revenue April\"  \noutput example: \"Tesla\" revenue \"April 2024\" OR \"Q2 2024\" financial results  \n\n**Input**: Prompt's referral=\"United States\", user prompt=\"Who is the president\"  \noutput example: \"United States\" president 19/10/2025 current  \n"
        },
        "promptType": "define"
      },
      "typeVersion": 2.2
    },
    {
      "id": "c12218ae-674c-4bf9-b35b-6a8d6b360c9a",
      "name": "Bright Data - Data Extraction",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -192,
        -48
      ],
      "parameters": {
        "url": "https://api.brightdata.com/request",
        "method": "POST",
        "options": {
          "batching": {
            "batch": {
              "batchSize": 1,
              "batchInterval": 2000
            }
          }
        },
        "sendBody": true,
        "sendHeaders": true,
        "bodyParameters": {
          "parameters": [
            {
              "name": "zone",
              "value": "mcp_unlocker"
            },
            {
              "name": "url",
              "value": "={{ $json.output.link }}"
            },
            {
              "name": "format",
              "value": "json"
            },
            {
              "name": "method",
              "value": "GET"
            },
            {
              "name": "country",
              "value": "il"
            },
            {
              "name": "data_format",
              "value": "markdown"
            }
          ]
        },
        "headerParameters": {
          "parameters": [
            {
              "name": "Authorization",
              "value": "Bearer YOUR_TOKEN_HERE"
            }
          ]
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "9f8b11b7-f997-4906-a929-2656cdee4204",
      "name": "Extract Data",
      "type": "@n8n/n8n-nodes-langchain.chainLlm",
      "onError": "continueRegularOutput",
      "position": [
        -1600,
        336
      ],
      "parameters": {
        "text": "=## Input\n### The user's original request:\n{{ $('Set Variables').item.json.cellReference }} - {{ $('Set Variables').item.json.userPrompt }}\n### Full content scanned from a website:\n{{ $json.body }}",
        "batching": {},
        "messages": {
          "messageValues": [
            {
              "message": "=# You are an AI agent specialized in extracting relevant information. Your role is to receive:\n\n1. **The user's original request** \u2013 the question or topic requested  \n2. **Full content scanned from a website** \u2013 raw information from a source  \n\n## Objective\n\nExtract and summarize the most relevant information from the scanned content based on the user's query.\n\n## Workflow\n\n1. Carefully read the scanned content  \n2. Identify information directly relevant to the user's query  \n3. Create a focused and concise summary of only the relevant information  \n4. Completely ignore unrelated content  \n\n## Output Rules\n\n- **Return only valid JSON** according to the schema below  \n- Do not include any text outside the JSON  \n- The summary should include **only** information directly relevant to the user's query  \n- Maximum preferred length: **400 characters**  \n- If no relevant information is found, leave the `summary` field empty (`\"\"`)  \n\n## Edge Case Handling\n\n- **Partial or corrupted content:** Extract only the existing and relevant information  \n- **Multiple topics in content:** Focus only on the topic relevant to the user's query  \n- **Different language:** Translate relevant information to {{ $('Set Variables').item.json.ouputLanguage }}  \n- **No relevant information:** Return JSON with an empty `summary`  \n- **Always return valid JSON**, no extra text  \n\n## Key Principles\n\n** Extract factual, accurate, and focused information **  \n** Translate to {{ $('Set Variables').item.json.ouputLanguage }} when needed **  \n** Summarize concisely within 400 characters **  \n\n** Do not add information that is not in the original content **  \n** Do not include information unrelated to the query **  \n** Do not deviate from the JSON format **  \n"
            }
          ]
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 1.7
    },
    {
      "id": "d7a349a0-0158-4ec2-bb94-beebebcbe2bd",
      "name": "Summarize Information",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -1120,
        336
      ],
      "parameters": {
        "text": "=scraping summary information: {{ $json.output.summary }}\nthe actual user request/question: {{ $('Set Variables').item.json.cellReference }} - {{ $('Set Variables').item.json.userPrompt }}\n",
        "options": {
          "systemMessage": "=Generate a conclusion in {{ $('Set Variables').item.json.ouputLanguage }} based on all the information above. Filter and summarize the most relevant information according to the user's question into a maximum of 400 characters.  \nAlways return valid JSON, with no explanations or additional text.  \n"
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 2.2
    },
    {
      "id": "4ca74428-5c4a-4c3f-8359-95a1df48a4b9",
      "name": "Update Logs",
      "type": "n8n-nodes-base.dataTable",
      "position": [
        -480,
        496
      ],
      "parameters": {
        "columns": {
          "value": {
            "output": "={{ $json.output.summary }}",
            "input_prompt": "={{ $('Set Variables').item.json.userPrompt }} - {{ $('Set Variables').item.json.cellReference }}"
          },
          "schema": [
            {
              "id": "input_prompt",
              "type": "string",
              "display": true,
              "removed": false,
              "readOnly": false,
              "required": false,
              "displayName": "input_prompt",
              "defaultMatch": false
            },
            {
              "id": "output",
              "type": "string",
              "display": true,
              "removed": false,
              "readOnly": false,
              "required": false,
              "displayName": "output",
              "defaultMatch": false
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "logs"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "dataTableId": {
          "__rl": true,
          "mode": "list",
          "value": "vAMxTuckOJDPlMN3",
          "cachedResultUrl": "/projects/qoZIbTB1W6TCihY8/datatables/vAMxTuckOJDPlMN3",
          "cachedResultName": "BrightData Test"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "1abd9a03-bbb3-44cf-b23c-40363d54458e",
      "name": "Structured Output Parser - 1",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        -352,
        64
      ],
      "parameters": {
        "jsonSchemaExample": "{\n  \"link\": \"\"\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "0b99dc9d-bc59-4054-b97a-7bc708439a22",
      "name": "Structured Output Parser - 2",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        -1392,
        496
      ],
      "parameters": {
        "jsonSchemaExample": "{\n  \"summary\": \"\"\n}"
      },
      "typeVersion": 1.3
    },
    {
      "id": "1987c731-3ef5-4977-8029-8415726355ea",
      "name": "GPT 4o Mini - 1",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -1664,
        496
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini",
          "cachedResultName": "gpt-4o-mini"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "2c9c2b61-56f7-40ef-912c-1a1bf34688a2",
      "name": "GPT 4o Mini - 2",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -1184,
        496
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini",
          "cachedResultName": "gpt-4o-mini"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "7fd1cabe-0429-4ca4-81ba-8b2e6344279b",
      "name": "GPT 4o - 1",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -704,
        64
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o",
          "cachedResultName": "gpt-4o"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "95e48871-e554-442c-83fb-b028a020b342",
      "name": "GPT 4.1 Mini - 1",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -1184,
        80
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4.1-mini"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "759b9362-6511-47a5-b96c-a7cb340bae16",
      "name": "Structured Output Parser - 3",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        -912,
        496
      ],
      "parameters": {
        "jsonSchemaExample": "{\n\t\"summary\": \"Intel was founded in 1968.\"\n}\n"
      },
      "typeVersion": 1.3
    },
    {
      "id": "3b9a3797-81bd-4120-9c36-4d5d2c21ac01",
      "name": "AI-Powered Web Scraping to Google Sheets with Bright Data",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2160,
        -208
      ],
      "parameters": {
        "color": 6,
        "width": 396,
        "height": 860,
        "content": "# AI-Powered Web Scraping to Google Sheets with Bright Data\n\n## Created by: [Elay Guez](https://www.linkedin.com/in/elay-g)\n\n## Features\n* \ud83d\udcca **Custom Function** in Google Sheets: `=BRIGHTDATA(cell,\"prompt\")`\n* \ud83e\udd16 **AI-optimized** search queries for better results\n* \ud83d\udd0d **Bright Data** enterprise scraping (bypasses bot detection)\n* \ud83c\udfaf **AI-filtered** results return only the best match\n* \u26a1 **<20 second** response time\n\n## Quick Setup\n1. Connect OpenAI API key (GPT-4.1 Mini + GPT-4o Mini + GPT-4o)\n2. Add Bright Data API credentials\n3. Deploy Google Apps Script custom function\n4. Configure webhook authentication\n5. Set the preferred output language in the \"Set Variables\" node\n6. Test & activate! \ud83d\ude80\n\n## Cost per search: ~$0.02-0.05\n## Time saved: 3-5 minutes per search"
      },
      "typeVersion": 1
    },
    {
      "id": "14a7088b-82e9-41b4-acbe-899b5eee195b",
      "name": "Sticky Note22",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1744,
        -208
      ],
      "parameters": {
        "color": 5,
        "width": 500,
        "height": 420,
        "content": "# 1) Get Query & Organize Data"
      },
      "typeVersion": 1
    },
    {
      "id": "97646605-6f57-49f1-9bd6-0afcf176bbe8",
      "name": "Sticky Note20",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1232,
        -208
      ],
      "parameters": {
        "color": 2,
        "width": 468,
        "height": 420,
        "content": "# 2) Adapt Query for Bright Data"
      },
      "typeVersion": 1
    },
    {
      "id": "41e0d6fc-77ce-4541-8522-47f72d9d7782",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -752,
        -208
      ],
      "parameters": {
        "color": 4,
        "width": 724,
        "height": 420,
        "content": "# 3) Bright Data Scraping"
      },
      "typeVersion": 1
    },
    {
      "id": "40feaec4-3243-4c39-be4a-dbd9430707c8",
      "name": "Sticky Note15",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1744,
        224
      ],
      "parameters": {
        "color": 3,
        "width": 1716,
        "height": 424,
        "content": "# 4) Information Extraction + Retrieving Summary & Updating Logs"
      },
      "typeVersion": 1
    },
    {
      "id": "f6fd6c4c-24b2-4e9f-a45d-9e84af36b0dd",
      "name": "Set Variables",
      "type": "n8n-nodes-base.set",
      "position": [
        -1424,
        -32
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "a4357b33-04b4-4246-ad30-f0d91b6687d2",
              "name": "userPrompt",
              "type": "string",
              "value": "={{ $json.body.source }}"
            },
            {
              "id": "977e4bde-3030-4d02-9c2f-3d46975901be",
              "name": "cellReference",
              "type": "string",
              "value": "={{ $json.body.prompt }}"
            },
            {
              "id": "c63d7766-11b9-4edb-92db-90e25535721b",
              "name": "ouputLanguage",
              "type": "string",
              "value": "Hebrew"
            }
          ]
        }
      },
      "typeVersion": 3.4
    }
  ],
  "connections": {
    "GPT 4o - 1": {
      "ai_languageModel": [
        [
          {
            "node": "Bright Data Search Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Extract Data": {
      "main": [
        [
          {
            "node": "Summarize Information",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Webhook Call": {
      "main": [
        [
          {
            "node": "Set Variables",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Variables": {
      "main": [
        [
          {
            "node": "Adjust Query Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Bright Data MCP": {
      "ai_tool": [
        [
          {
            "node": "Bright Data Search Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "GPT 4o Mini - 1": {
      "ai_languageModel": [
        [
          {
            "node": "Extract Data",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "GPT 4o Mini - 2": {
      "ai_languageModel": [
        [
          {
            "node": "Summarize Information",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "GPT 4.1 Mini - 1": {
      "ai_languageModel": [
        [
          {
            "node": "Adjust Query Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Adjust Query Agent": {
      "main": [
        [
          {
            "node": "Bright Data Search Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Summarize Information": {
      "main": [
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          },
          {
            "node": "Update Logs",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Bright Data Search Agent": {
      "main": [
        [
          {
            "node": "Bright Data - Data Extraction",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser - 1": {
      "ai_outputParser": [
        [
          {
            "node": "Bright Data Search Agent",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser - 2": {
      "ai_outputParser": [
        [
          {
            "node": "Extract Data",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser - 3": {
      "ai_outputParser": [
        [
          {
            "node": "Summarize Information",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "Bright Data - Data Extraction": {
      "main": [
        [
          {
            "node": "Extract Data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
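
The Webhook Call node above accepts a POST on the path brightdata-search with header authentication, Set Variables reads body.source and body.prompt, and Respond to Webhook returns the summary as plain text. A minimal sketch of a client for that contract follows. The host, the "/webhook/" URL prefix, the header name, and the secret are assumptions for illustration; the header name and value must match whatever header auth credential you configure in n8n.

```python
import json
import urllib.request


def build_research_request(base_url, header_name, header_value, user_prompt, cell_value):
    """Build (without sending) the POST request that the Webhook Call node accepts."""
    # "brightdata-search" is the path set on the Webhook Call node. The "/webhook/"
    # prefix is an assumption about how the n8n instance exposes production webhooks.
    url = base_url.rstrip("/") + "/webhook/brightdata-search"
    # Set Variables maps body.source -> userPrompt and body.prompt -> cellReference.
    body = json.dumps({"source": user_prompt, "prompt": cell_value}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        header_name: header_value,  # must match the header auth credential in n8n
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_research(request, timeout=60):
    """Send the request and return the plain-text summary from Respond to Webhook."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical host, header name, and secret, for illustration only.
request = build_research_request(
    "https://n8n.example.com", "X-Api-Key", "change-me",
    user_prompt="Who is the CEO", cell_value="Intel",
)
print(request.get_method(), request.full_url)
```

Calling run_research(request) would perform the actual round trip. It is left out of the module-level code so the sketch can be inspected without a running n8n instance.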

Credentials you'll need

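Each integration node prompts for credentials when you import the workflow. Credential IDs are stripped before publishing, so you add your own: an HTTP header auth credential for the Webhook Call node, an OpenAI API credential for the four chat model nodes, and a Bright Data token in place of each YOUR_TOKEN_HERE placeholder (the Bright Data MCP endpoint URL and the Authorization header of the Bright Data - Data Extraction node).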
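
To see which nodes still need credentials after an import, one option is to scan the workflow JSON for credentials blocks and for leftover YOUR_TOKEN_HERE placeholders. The sketch below does that on a trimmed excerpt of the workflow. The helper names are hypothetical and the sample covers only two of the 22 nodes.

```python
def find_credential_slots(workflow):
    """Return (node name, credential type) pairs for nodes with a credentials block."""
    slots = []
    for node in workflow.get("nodes", []):
        for cred_type in node.get("credentials", {}):
            slots.append((node["name"], cred_type))
    return slots


def find_placeholders(value, marker="YOUR_TOKEN_HERE", path=""):
    """Recursively collect the paths of string values that contain the marker."""
    hits = []
    if isinstance(value, dict):
        for key, child in value.items():
            hits += find_placeholders(child, marker, f"{path}/{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits += find_placeholders(child, marker, f"{path}/{index}")
    elif isinstance(value, str) and marker in value:
        hits.append(path)
    return hits


# Trimmed excerpt of the workflow JSON, enough to exercise both helpers.
sample = {
    "nodes": [
        {
            "name": "Webhook Call",
            "parameters": {"authentication": "headerAuth"},
            "credentials": {"httpHeaderAuth": {"name": "<your credential>"}},
        },
        {
            "name": "Bright Data MCP",
            "parameters": {
                "endpointUrl": "https://mcp.brightdata.com/mcp?token=YOUR_TOKEN_HERE&pro=1"
            },
        },
    ]
}

slots = find_credential_slots(sample)
placeholders = find_placeholders(sample)
print(slots)
print(placeholders)
```

On the full workflow, the same helpers can be applied to the result of json.load on the downloaded file.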

About this workflow

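Turn any Google Sheets cell into an AI-assisted web research query. A custom function in the sheet, =BRIGHTDATA(cell,"prompt"), sends the request to the webhook. The workflow then rewrites the request as a search query (GPT-4.1 Mini), picks the most relevant link through Bright Data's search_engine tool (GPT-4o), fetches that page as markdown through the Bright Data API, extracts the relevant facts (GPT-4o Mini), and returns a summary of at most 400 characters as plain text, in about 20 seconds. Each result is also logged to an n8n data table.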
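
The order in which the nodes run can be read from the "main" entries of the connections map in the workflow JSON. The sketch below follows the first main output of each node from the webhook onward. The function name is hypothetical, and the connections dict is a hand-trimmed copy that keeps only the main edges (the ai_languageModel, ai_tool, and ai_outputParser edges are left out).

```python
def main_path(connections, start):
    """Follow the first 'main' target of each node, starting from `start`."""
    order, seen, current = [], set(), start
    while current and current not in seen:
        order.append(current)
        seen.add(current)
        outputs = connections.get(current, {}).get("main", [])
        targets = outputs[0] if outputs else []
        current = targets[0]["node"] if targets else None
    return order


def edge(node):
    """Shape of one connection target as it appears in the workflow JSON."""
    return {"node": node, "type": "main", "index": 0}


# Main edges copied from the workflow's "connections" section.
connections = {
    "Webhook Call": {"main": [[edge("Set Variables")]]},
    "Set Variables": {"main": [[edge("Adjust Query Agent")]]},
    "Adjust Query Agent": {"main": [[edge("Bright Data Search Agent")]]},
    "Bright Data Search Agent": {"main": [[edge("Bright Data - Data Extraction")]]},
    "Bright Data - Data Extraction": {"main": [[edge("Extract Data")]]},
    "Extract Data": {"main": [[edge("Summarize Information")]]},
    "Summarize Information": {
        "main": [[edge("Respond to Webhook"), edge("Update Logs")]]
    },
}

path = main_path(connections, "Webhook Call")
for step, name in enumerate(path, start=1):
    print(step, name)
```

Summarize Information fans out to both Respond to Webhook and Update Logs. The walk reports only the first target, so Update Logs runs in parallel but does not appear in the printed path.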

Source: https://n8n.io/workflows/10119/ (original creator credit).
