AutomationFlowsAI & RAG › Evaluation Metric Example: Categorization

Evaluation Metric Example: Categorization

ByDavid Roberts @davidn8n on n8n.io

This is a template for n8n's evaluation feature.

Webhook trigger★★★★☆ complexityAI-powered13 nodesAgentOpenAI ChatOutput Parser StructuredEvaluation TriggerEvaluation
AI & RAG Trigger: Webhook Nodes: 13 Complexity: ★★★★☆ AI nodes: yes Added:

This workflow corresponds to n8n.io template #4269 — we link there as the canonical source.

This workflow follows the Agent → OpenAI Chat recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "nodes": [
    {
      "id": "88b3ca67-d7d6-4e6d-9529-d81377315c05",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        180,
        -20
      ],
      "parameters": {
        "color": 7,
        "width": 180,
        "height": 260,
        "content": "Check whether the category/priority matches the expected one in the dataset"
      },
      "typeVersion": 1
    },
    {
      "id": "25d9ced2-d91c-486a-bbc9-481258c716e5",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1200,
        40
      ],
      "parameters": {
        "width": 200,
        "height": 500,
        "content": "## How it works\nThis template shows how to calculate a workflow evaluation metric: **whether a category matches the expected one**.\n\nThe workflow takes support tickets and generates a category and priority, which is then compared with the correct answers in the dataset.\n\nYou can find more information on workflow evaluation [here](https://docs.n8n.io/advanced-ai/evaluations/overview), and other metric examples [here](https://docs.n8n.io/advanced-ai/evaluations/metric-based-evaluations/#2-calculate-metrics)."
      },
      "typeVersion": 1
    },
    {
      "id": "c1f452ac-31e6-4a8c-9558-92ba83a34464",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -960,
        280
      ],
      "parameters": {
        "color": 7,
        "width": 220,
        "height": 220,
        "content": "Read in [this test dataset](https://docs.google.com/spreadsheets/d/1uuPS5cHtSNZ6HNLOi75A2m8nVWZrdBZ_Ivf58osDAS8/edit?gid=294497137#gid=294497137) of support tickets"
      },
      "typeVersion": 1
    },
    {
      "id": "79375ec3-713a-4d07-8701-2c8d3392d6c5",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        -420,
        200
      ],
      "parameters": {
        "text": "=Subject: {{ $json.subject }}\nBody: {{ $json.body }}",
        "options": {
          "systemMessage": "=You are a support triage assistant.\nGiven the subject and body of a customer support email, return the category and priority based on the following classification:\n\n### Categories:\nBug Report\nFeature Request\nUsage Question\nAccount/Billing\nOutage/Emergency\nIntegration Issue\nFeedback/Compliment\nOther/Uncategorized\n\n### Priorities:\nLow, Medium, High, Urgent\n\nReturn your output as JSON in this format:\n{\n  \"category\": \"<category>\",\n  \"priority\": \"<priority>\"\n}\n\n### Example \nExample input:\nSubject: OAuth not working with Salesforce node\nBody: I'm trying to connect to Salesforce using OAuth2, but I keep getting a token error. I followed the docs exactly.\n\nExpected output:\n{\n  \"category\": \"Integration Issue\",\n  \"priority\": \"High\"\n}",
          "returnIntermediateSteps": true
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 1.9
    },
    {
      "id": "93932809-9b1e-47a2-a50d-0b0978d0b881",
      "name": "OpenAI Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        -440,
        420
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o-mini"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "5d8383f2-e0ea-4fd9-bdcb-0f04e1872896",
      "name": "Structured Output Parser",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        -240,
        420
      ],
      "parameters": {
        "jsonSchemaExample": "{\n  \"category\": \"<category>\",\n  \"priority\": \"<priority>\"\n}"
      },
      "typeVersion": 1.2
    },
    {
      "id": "e0d259ea-a9b7-45db-a4fb-fa2a2e5769b3",
      "name": "Check categorization",
      "type": "n8n-nodes-base.set",
      "position": [
        220,
        80
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "58c48e6f-4a12-4bf6-94ae-705244273a84",
              "name": "category_match",
              "type": "boolean",
              "value": "={{ $json.output.category == $('When fetching a dataset row').item.json.expected_category }}"
            },
            {
              "id": "23959e14-6026-4bd5-b28c-12ab529f21de",
              "name": "priority_match",
              "type": "boolean",
              "value": "={{ $json.output.priority == $('When fetching a dataset row').item.json.expected_priority }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "1ffc9841-a146-4127-a957-69cb73f7e6f8",
      "name": "Match webhook format",
      "type": "n8n-nodes-base.set",
      "position": [
        -680,
        340
      ],
      "parameters": {
        "mode": "raw",
        "options": {},
        "jsonOutput": "=  {\n    \"headers\": {\n    },\n    \"params\": {},\n    \"query\": {\n      \"subject\": {{ $json.subject.toJsonString() }},\n      \"body\": {{ $json.body.toJsonString() }}\n    },\n    \"body\": {},\n    \"executionMode\": \"test\"\n  }"
      },
      "typeVersion": 3.4
    },
    {
      "id": "7505b06e-0179-4e8a-a830-8631be7a9ca9",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "position": [
        -900,
        40
      ],
      "parameters": {
        "path": "fbe73ea5-bb42-4e3b-9eea-1b6535248eef",
        "options": {}
      },
      "typeVersion": 2
    },
    {
      "id": "5090592c-b883-4644-a8d5-74ada2c4863e",
      "name": "When fetching a dataset row",
      "type": "n8n-nodes-base.evaluationTrigger",
      "position": [
        -900,
        340
      ],
      "parameters": {
        "sheetName": {
          "__rl": true,
          "mode": "url",
          "value": "https://docs.google.com/spreadsheets/d/1uuPS5cHtSNZ6HNLOi75A2m8nVWZrdBZ_Ivf58osDAS8/edit?gid=294497137#gid=294497137"
        },
        "documentId": {
          "__rl": true,
          "mode": "url",
          "value": "https://docs.google.com/spreadsheets/d/1uuPS5cHtSNZ6HNLOi75A2m8nVWZrdBZ_Ivf58osDAS8/edit?gid=294497137#gid=294497137"
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.6
    },
    {
      "id": "30ecf24a-2012-4751-a7fd-712fa498ac46",
      "name": "Respond to Webhook",
      "type": "n8n-nodes-base.respondToWebhook",
      "position": [
        220,
        320
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1.3
    },
    {
      "id": "a6293ef9-fd50-439e-a906-c9e03e97d0bc",
      "name": "Evaluating?",
      "type": "n8n-nodes-base.evaluation",
      "position": [
        -60,
        200
      ],
      "parameters": {
        "operation": "checkIfEvaluating"
      },
      "typeVersion": 4.6
    },
    {
      "id": "db1b0fac-0268-44c3-87ce-af552bb6c42d",
      "name": "Set metrics",
      "type": "n8n-nodes-base.evaluation",
      "position": [
        440,
        80
      ],
      "parameters": {
        "metrics": {
          "assignments": [
            {
              "id": "0e507b06-e6d5-4ace-aa22-f06c6db5b883",
              "name": "category_match",
              "type": "number",
              "value": "={{ $json.category_match.toNumber() }}"
            },
            {
              "id": "44918a17-d8b9-41e1-b336-3ee67b86b528",
              "name": "priority_match",
              "type": "number",
              "value": "={{ $json.priority_match.toNumber() }}"
            }
          ]
        },
        "operation": "setMetrics"
      },
      "typeVersion": 4.6
    }
  ],
  "connections": {
    "Webhook": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "AI Agent": {
      "main": [
        [
          {
            "node": "Evaluating?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Evaluating?": {
      "main": [
        [
          {
            "node": "Check categorization",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Respond to Webhook",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Check categorization": {
      "main": [
        [
          {
            "node": "Set metrics",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Match webhook format": {
      "main": [
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser": {
      "ai_outputParser": [
        [
          {
            "node": "AI Agent",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "When fetching a dataset row": {
      "main": [
        [
          {
            "node": "Match webhook format",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

This is a template for n8n's evaluation feature.

Source: https://n8n.io/workflows/4269/ — original creator credit. Request a take-down →

More AI & RAG workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

AI & RAG

Automate how you reply to Reddit posts using AI-generated, first-person comments that sound human, follow subreddit rules, and (optionally) promote your own links or products.

Gmail Trigger, Reddit, Agent +5
AI & RAG

The scoring approach is adapted from the open-source evaluations project RAGAS and you can see the source here https://github.com/explodinggradients/ragas/blob/main/ragas/src/ragas/metrics/answercorre

Chain Llm, Output Parser Structured, OpenAI Chat +5
AI & RAG

Developers building AI-powered workflows who want to ensure their agents work reliably. If you need to validate AI outputs, test agent behavior systematically, or build maintainable automation, this t

Execute Workflow Trigger, Evaluation Trigger, Output Parser Structured +4
AI & RAG

The scoring approach is adapted from the open-source evaluations project RAGAS and you can see the source here https://github.com/explodinggradients/ragas/blob/main/ragas/src/ragas/metrics/answerrelev

OpenAI Chat, Evaluation Trigger, Evaluation +5
AI & RAG

Measure how well your AI classifier actually performs. This template shows how to evaluate a support ticket classifier using n8n's built-in evaluation system, comparing AI predictions against expected

Evaluation Trigger, Agent, OpenAI Chat +1