AutomationFlowsGeneral › Extract Entities from Web Pages

Extract Entities from Web Pages

Original n8n title: Google Page Entity Extraction Template

Google Page Entity Extraction Template. Uses respondToWebhook, httpRequest, stickyNote. Webhook trigger; 6 nodes.

Webhook trigger★★★★☆ complexity6 nodesHTTP Request
General Trigger: Webhook Nodes: 6 Complexity: ★★★★☆ Added:

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "id": "4wPgPbxtojrUO7Dx",
  "name": "Google Page Entity Extraction Template",
  "tags": [
    {
      "id": "hBkrfz3jN0GbUgJa",
      "name": "Google Page Entity Extraction Template",
      "createdAt": "2025-05-08T23:29:39.011Z",
      "updatedAt": "2025-05-08T23:29:39.011Z"
    }
  ],
  "nodes": [
    {
      "id": "8719f1de-2a3e-4c34-9edc-e4b8f993b525",
      "name": "Respond to Webhook",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        1240,
        -420
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.1
    },
    {
      "id": "01420fd5-3483-4e74-b9fc-971199898449",
      "name": "Google Entities",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1020,
        -420
      ],
      "parameters": {
        "url": "https://language.googleapis.com/v1/documents:analyzeEntities",
        "method": "POST",
        "options": {},
        "jsonBody": "={{ $json.apiRequest }}",
        "sendBody": true,
        "sendQuery": true,
        "sendHeaders": true,
        "specifyBody": "json",
        "queryParameters": {
          "parameters": [
            {
              "name": "key",
              "value": "YOUR-GOOGLE-API-KEY"
            }
          ]
        },
        "headerParameters": {
          "parameters": [
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "5c1c258a-44ed-4d5a-a22d-cddb4df09018",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -300,
        -700
      ],
      "parameters": {
        "color": 4,
        "width": 620,
        "height": 880,
        "content": "# Google Page Entity Extraction Template\n\n## What this workflow does\nThis workflow allows you to extract named entities (people, organizations, locations, etc.) from any web page using Google's Natural Language API. Simply send a URL to the webhook endpoint, and the workflow will fetch the page content, process it through Google's entity recognition service, and return the structured entity data.\n\n### How to use\n1. Replace \"YOUR-GOOGLE-API-KEY\" with your actual Google Cloud API key (Natural Language API must be enabled)\n2. Activate the workflow and use the webhook URL as your endpoint\n3. Send a POST request to the webhook with a JSON body containing the URL you want to analyze: {\"url\": \"https://example.com/page\"}\n4. Review the returned entity analysis with categories, salience scores, and metadata\n\n## Webhook Input Format\nThe webhook expects a POST request with a JSON body in this format:\n```json\n{\n  \"url\": \"https://website-to-analyze.com/page\"\n}\n```\n### Response Format\nThe webhook returns a JSON response containing the full entity analysis from Google's Natural Language API, including:\n\nEntity names and types (PERSON, LOCATION, ORGANIZATION, etc.)\nSalience scores indicating entity importance\nMetadata and mentions within the text\nEntity sentiment (if available)"
      },
      "typeVersion": 1
    },
    {
      "id": "79add9a7-adca-4ce5-8a6a-5fcb75288846",
      "name": "Get Url",
      "type": "n8n-nodes-base.webhook",
      "position": [
        360,
        -420
      ],
      "parameters": {
        "path": "2944c8f6-03cd-4ab8-8b8e-cb033edf877a",
        "options": {},
        "httpMethod": "POST",
        "responseMode": "responseNode"
      },
      "typeVersion": 2
    },
    {
      "id": "081a52bc-2da7-44fb-bdc3-4cb73cbf8dd3",
      "name": "Get URL Page Contents",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        580,
        -420
      ],
      "parameters": {
        "url": "={{ $json.body.url }}",
        "options": {}
      },
      "typeVersion": 4.2
    },
    {
      "id": "dda5ef3d-f031-4dd6-b117-c1f69aa66b63",
      "name": "Respond with detected entities",
      "type": "n8n-nodes-base.code",
      "position": [
        800,
        -420
      ],
      "parameters": {
        "jsCode": "// Clean and prepare HTML for API request\nconst html = $input.item.json.data;\n// Trim if too large (optional)\nconst trimmedHtml = html.length > 100000 ? html.substring(0, 100000) : html;\n\nreturn {\n  json: {\n    apiRequest: {\n      document: {\n        type: \"HTML\",\n        content: trimmedHtml\n      },\n      encodingType: \"UTF8\"\n    }\n  }\n}"
      },
      "typeVersion": 2
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "432203af-190a-4a89-81d8-f86682a0b63f",
  "connections": {
    "Get Url": {
      "main": [
        [
          {
            "node": "Get URL Page Contents",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Google Entities": {
      "main": [
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get URL Page Contents": {
      "main": [
        [
          {
            "node": "Respond with detected entities",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Respond with detected entities": {
      "main": [
        [
          {
            "node": "Google Entities",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

Google Page Entity Extraction Template. Uses respondToWebhook, httpRequest, stickyNote. Webhook trigger; 6 nodes.

Source: https://github.com/Zie619/n8n-workflows — original creator credit. Request a take-down →

More General workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

General

PUQ Docker NextCloud deploy. Uses respondToWebhook, stickyNote, httpRequest, ssh. Webhook trigger; 44 nodes.

HTTP Request, Ssh
General

Analyze_email_headers_for_IPs_and_spoofing__3. Uses stickyNote, respondToWebhook, itemLists, httpRequest. Webhook trigger; 35 nodes.

Item Lists, HTTP Request
General

Unique QRcode coupon assignment and validation for Lead Generation system. Uses httpRequest, formTrigger, googleSheets, stickyNote. Webhook trigger; 29 nodes.

HTTP Request, Form Trigger, Google Sheets +1
General

Line Save File to Google Drive and Log File's URL. Uses googleSheets, googleDrive, httpRequest, stickyNote. Webhook trigger; 27 nodes.

Google Sheets, Google Drive, HTTP Request
General

Webhook Respondtowebhook. Uses executeWorkflow, httpRequest, stickyNote, respondToWebhook. Webhook trigger; 23 nodes.

HTTP Request