This workflow corresponds to n8n.io template #15133, which we link to as the canonical source.
This workflow follows the Agent → OpenAI Chat recipe pattern.
The workflow JSON
Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.
{
"meta": {
"templateCredsSetupCompleted": true
},
"nodes": [
{
"id": "1304913e-c8bb-4f3d-bcc8-37293280f12b",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
1152,
-256
],
"parameters": {
"width": 620,
"height": 800,
"content": "# Ticket Classifier Evaluation\n\n### How it works\n1. **Production path:** A webhook receives a support ticket, the AI Agent classifies it by category and urgency, and the structured result is returned via Respond to Webhook.\n2. **Evaluation path:** The Evaluation Trigger reads test cases from a Data Table and feeds each one through the same AI classification step.\n3. **Routing:** The Evaluating? node uses the Check if Evaluating operation to send production traffic downstream and evaluation traffic to the scoring branch.\n4. **Scoring:** Each predicted category and urgency is compared against the expected labels in the Data Table using exact match.\n5. **Recording:** Per-test-case scores and aggregate metrics are written to the Evaluations tab so you can compare runs over time.\n\n### Setup\n1. Add credentials for the OpenAI Chat Model (or swap in your preferred provider).\n2. Create the Data Table referenced by the Evaluation Trigger and seed it with real ticket inputs and expected labels (category, urgency).\n3. Open the Evaluations tab in this workflow and click Run Test to execute the evaluation suite.\n\n### Customization\n- Swap the classification taxonomy (category and urgency values) to match your support workflow.\n- Replace exact match scoring with similarity metrics if your labels are free-form.\n- Wire the production webhook to your real ticketing system (Zendesk, Intercom, HubSpot).\n- Seed the Data Table from your execution history so the test set reflects real usage patterns.\n\n---\nThis template is a learning companion to the **Production AI Playbook**, a series that explores strategies, shares best practices, and provides practical examples for building reliable AI systems in n8n."
},
"typeVersion": 1
},
{
"id": "133cfa05-e9c6-4c0e-b273-5074edd39f29",
"name": "Webhook - Receive Ticket",
"type": "n8n-nodes-base.webhook",
"position": [
2064,
80
],
"parameters": {
"path": "classify-ticket",
"options": {},
"httpMethod": "POST",
"responseMode": "responseNode"
},
"typeVersion": 2
},
{
"id": "415836bf-a899-47c0-a0cd-a86230cb7d34",
"name": "When fetching a dataset row",
"type": "n8n-nodes-base.evaluationTrigger",
"position": [
1840,
-112
],
"parameters": {
"source": "dataTable",
"dataTableId": {
"__rl": true,
"mode": "list",
"value": "ZD1uhLOIXKp3vte2",
"cachedResultUrl": "/projects/5xhYaLjYeyMka6t9/datatables/ZD1uhLOIXKp3vte2",
"cachedResultName": "Ticket Classifier Test Cases"
}
},
"typeVersion": 4.6
},
{
"id": "53e6d8e9-4ce1-4fca-8e5d-1c291f638661",
"name": "Format ticket input",
"type": "n8n-nodes-base.code",
"position": [
2064,
-112
],
"parameters": {
"jsCode": "const ticket = $input.first().json;\nreturn [{ json: { body: { ticket_text: ticket.input || ticket.ticket_text } } }];"
},
"typeVersion": 2
},
{
"id": "f76278c2-1835-4345-a214-3e19ab4f00b2",
"name": "AI Agent - Classify Ticket",
"type": "@n8n/n8n-nodes-langchain.agent",
"position": [
2288,
-16
],
"parameters": {
"text": "={{ $json.body.ticket_text }}",
"options": {
"systemMessage": "You are a support ticket classifier. Analyze the given support ticket and classify it.\n\nYou MUST respond with ONLY a valid JSON object in this exact format, with no additional text before or after:\n\n{\"category\": \"<category>\", \"urgency\": \"<urgency>\"}\n\nCategories (use exactly one):\n- billing\n- technical\n- sales\n- general\n\nUrgency levels (use exactly one):\n- low\n- normal\n- urgent\n\nRules:\n- Output ONLY the JSON object, nothing else\n- Use lowercase values exactly as shown above\n- Do not include any explanation or extra text"
},
"promptType": "define"
},
"typeVersion": 1.9
},
{
"id": "fa46a514-ea04-42ad-86bf-d611b182c92d",
"name": "OpenAI Chat Model",
"type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
"position": [
2368,
208
],
"parameters": {
"model": {
"__rl": true,
"mode": "list",
"value": "gpt-4o-mini",
"cachedResultName": "GPT-4O-MINI"
},
"options": {}
},
"credentials": {
"openAiApi": {
"name": "<your credential>"
}
},
"typeVersion": 1.2
},
{
"id": "078939ea-e98d-421c-99b0-8fd4c28b1fae",
"name": "Evaluating?",
"type": "n8n-nodes-base.evaluation",
"position": [
2640,
-16
],
"parameters": {
"operation": "checkIfEvaluating"
},
"typeVersion": 4.6
},
{
"id": "6d8c5efe-30ae-422d-81a8-e8da3543146c",
"name": "Respond to Webhook",
"type": "n8n-nodes-base.respondToWebhook",
"position": [
2864,
80
],
"parameters": {
"options": {},
"respondWith": "json",
"responseBody": "={{ $json }}"
},
"typeVersion": 1.1
},
{
"id": "6a594720-c311-4bb6-8907-a9846dfc6626",
"name": "Compare Classification",
"type": "n8n-nodes-base.code",
"position": [
2864,
-112
],
"parameters": {
"jsCode": "const aiOutput = $input.first().json;\nconst output = aiOutput.output || aiOutput;\nlet parsed = output;\nif (typeof output === 'string') { try { parsed = JSON.parse(output); } catch(e) { parsed = {}; } }\nconst expected = $(\"When fetching a dataset row\").first().json;\nconst categoryMatch = (parsed.category || '').toLowerCase() === (expected.expected_category || '').toLowerCase() ? 1 : 0;\nconst urgencyMatch = (parsed.urgency || '').toLowerCase() === (expected.expected_urgency || '').toLowerCase() ? 1 : 0;\nconst overallAccuracy = (categoryMatch + urgencyMatch) / 2;\nreturn [{ json: { category_match: categoryMatch, urgency_match: urgencyMatch, overall_accuracy: overallAccuracy, ai_category: parsed.category, ai_urgency: parsed.urgency, expected_category: expected.expected_category, expected_urgency: expected.expected_urgency } }];"
},
"typeVersion": 2
},
{
"id": "05cb862b-9ac2-46f2-84eb-2d6a90b227eb",
"name": "Evaluation - Set Outputs",
"type": "n8n-nodes-base.evaluation",
"position": [
3088,
-112
],
"parameters": {
"source": "dataTable",
"outputs": {
"values": [
{
"outputName": "category_match",
"outputValue": "={{ $json.category_match }}"
},
{
"outputName": "urgency_match",
"outputValue": "={{ $json.urgency_match }}"
},
{
"outputName": "overall_accuracy",
"outputValue": "={{ $json.overall_accuracy }}"
},
{
"outputName": "ai_category",
"outputValue": "={{ $json.ai_category }}"
},
{
"outputName": "ai_urgency",
"outputValue": "={{ $json.ai_urgency }}"
}
]
},
"dataTableId": {
"__rl": true,
"mode": "list",
"value": "ZD1uhLOIXKp3vte2",
"cachedResultUrl": "/projects/5xhYaLjYeyMka6t9/datatables/ZD1uhLOIXKp3vte2",
"cachedResultName": "Ticket Classifier Test Cases"
}
},
"typeVersion": 4.6
},
{
"id": "69e75f12-2f89-4f64-b6e1-651515e9fb0b",
"name": "Set Metrics",
"type": "n8n-nodes-base.evaluation",
"position": [
3312,
-112
],
"parameters": {
"metrics": {
"assignments": [
{
"id": "a1",
"name": "category_accuracy",
"type": "number",
"value": "={{ $json.category_match }}"
},
{
"id": "a2",
"name": "urgency_accuracy",
"type": "number",
"value": "={{ $json.urgency_match }}"
},
{
"id": "a3",
"name": "overall_accuracy",
"type": "number",
"value": "={{ $json.overall_accuracy }}"
}
]
},
"operation": "setMetrics"
},
"typeVersion": 4.6
},
{
"id": "757e4cd1-cf41-4e30-9dba-4199570c95f3",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
1792,
-256
],
"parameters": {
"color": 7,
"width": 768,
"height": 800,
"content": "## Classify Ticket"
},
"typeVersion": 1
},
{
"id": "95e19a38-7e10-4a60-98a0-88ab05536460",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
2592,
-256
],
"parameters": {
"color": 7,
"width": 896,
"height": 800,
"content": "## Evaluate Ticket Classification"
},
"typeVersion": 1
}
],
"connections": {
"Evaluating?": {
"main": [
[
{
"node": "Compare Classification",
"type": "main",
"index": 0
}
],
[
{
"node": "Respond to Webhook",
"type": "main",
"index": 0
}
]
]
},
"OpenAI Chat Model": {
"ai_languageModel": [
[
{
"node": "AI Agent - Classify Ticket",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Format ticket input": {
"main": [
[
{
"node": "AI Agent - Classify Ticket",
"type": "main",
"index": 0
}
]
]
},
"Compare Classification": {
"main": [
[
{
"node": "Evaluation - Set Outputs",
"type": "main",
"index": 0
}
]
]
},
"Evaluation - Set Outputs": {
"main": [
[
{
"node": "Set Metrics",
"type": "main",
"index": 0
}
]
]
},
"Webhook - Receive Ticket": {
"main": [
[
{
"node": "AI Agent - Classify Ticket",
"type": "main",
"index": 0
}
]
]
},
"AI Agent - Classify Ticket": {
"main": [
[
{
"node": "Evaluating?",
"type": "main",
"index": 0
}
]
]
},
"When fetching a dataset row": {
"main": [
[
{
"node": "Format ticket input",
"type": "main",
"index": 0
}
]
]
}
}
}
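The scoring logic in the "Compare Classification" node is stored as a single escaped string, which makes it hard to read. The sketch below restates it as plain JavaScript that runs outside n8n. The helper names (parseAgentOutput, normalize, scoreClassification) are hypothetical, and it assumes the Agent node returns its reply under an `output` field, as the node's own code suggests. It also differs from the node in two small ways that are assumptions of this sketch: labels are trimmed before comparison, and an empty or unparseable prediction is never counted as a match.

```javascript
// Sketch of the "Compare Classification" step as a standalone function.
// Helper names are hypothetical; field names come from the workflow JSON.

function parseAgentOutput(aiOutput) {
  // Assumption: the Agent node puts the model reply under `output`.
  const raw = aiOutput && aiOutput.output !== undefined ? aiOutput.output : aiOutput;
  if (typeof raw !== "string") {
    return raw && typeof raw === "object" ? raw : {};
  }
  try {
    const parsed = JSON.parse(raw);
    return parsed && typeof parsed === "object" ? parsed : {};
  } catch (err) {
    return {}; // unparseable reply: scored as a miss on both labels
  }
}

function normalize(label) {
  // Case-insensitive comparison, as in the node; trimming is an addition.
  return typeof label === "string" ? label.trim().toLowerCase() : "";
}

function scoreClassification(aiOutput, expected) {
  const parsed = parseAgentOutput(aiOutput);
  const match = (predicted, wanted) => {
    const p = normalize(predicted);
    // Unlike the original node, an empty prediction never counts as a match.
    return p !== "" && p === normalize(wanted) ? 1 : 0;
  };
  const categoryMatch = match(parsed.category, expected.expected_category);
  const urgencyMatch = match(parsed.urgency, expected.expected_urgency);
  return {
    category_match: categoryMatch,
    urgency_match: urgencyMatch,
    overall_accuracy: (categoryMatch + urgencyMatch) / 2,
  };
}

// Hypothetical test case: category is right, urgency is wrong.
const example = scoreClassification(
  { output: '{"category": "billing", "urgency": "urgent"}' },
  { expected_category: "billing", expected_urgency: "normal" }
);
console.log(example);
// prints { category_match: 1, urgency_match: 0, overall_accuracy: 0.5 }
```

Inside n8n, the same comparison reads the expected labels from the "When fetching a dataset row" node rather than taking them as a function argument.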
Credentials you'll need
Each integration node will prompt for credentials when you import. We strip credential IDs before publishing, so you will need to add your own.
openAiApi
About this workflow
Measure how well your AI classifier actually performs. This template shows how to evaluate a support ticket classifier using n8n's built-in evaluation system, comparing AI predictions against expected labels with exact match scoring.
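To make the idea of measuring performance concrete, the sketch below rolls per-test-case scores up into run-level figures. The sample rows are hypothetical, and the function names (mean, summarize) are illustrative only. It assumes each run-level metric is the simple mean of the per-row values. The source does not say how n8n aggregates metrics in the Evaluations tab, so treat this as one plausible reading rather than a description of n8n's behavior. The metric names match those in the "Set Metrics" node.

```javascript
// Hypothetical per-test-case results, shaped like the scoring step's output.
const results = [
  { category_match: 1, urgency_match: 1, overall_accuracy: 1 },
  { category_match: 1, urgency_match: 0, overall_accuracy: 0.5 },
  { category_match: 0, urgency_match: 1, overall_accuracy: 0.5 },
  { category_match: 1, urgency_match: 0, overall_accuracy: 0.5 },
];

// Arithmetic mean of one numeric field; an empty run is reported as 0.
function mean(rows, key) {
  if (rows.length === 0) return 0;
  const total = rows.reduce((sum, row) => sum + row[key], 0);
  return total / rows.length;
}

// Assumption: run-level metrics are plain averages of the per-row scores.
function summarize(rows) {
  return {
    category_accuracy: mean(rows, "category_match"),
    urgency_accuracy: mean(rows, "urgency_match"),
    overall_accuracy: mean(rows, "overall_accuracy"),
  };
}

console.log(summarize(results));
// prints { category_accuracy: 0.75, urgency_accuracy: 0.5, overall_accuracy: 0.625 }
```

Tracking category and urgency separately shows which label the classifier struggles with, which a single combined score would hide.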
Source: https://n8n.io/workflows/15133/ (original creator credit).