Evaluation n8n workflows (29)

Most-used Evaluation workflows

Evaluate AI Workflows Using Google Sheets, Gemini, Claude, GPT, and Perplexity (64 nodes)
My Solution for the "Agentic Arena Community Contest" (RAG, Qdrant, Mistral OCR) (41 nodes)
FAQs Embeddings (35 nodes)
Route and Qualify Email Leads with Gmail, Gemini, Slack, Sheets and Salesforce (35 nodes)
Route Event Sales Leads with Gmail, Google Gemini, Sheets and Salesforce (35 nodes)
FAQs Embeddings (Google Docs) (35 nodes)
Automate Reddit Replies with F5bot Alerts & GPT-5 Personalized Comments (31 nodes)
Custom Discord Notifications for Radarr, Sonarr, Bazarr Etc. (28 nodes)
Evaluate AI Agent Response Correctness with OpenAI and Ragas Methodology (27 nodes)
Evaluation Metric Example: RAG Document Relevance (26 nodes)

AI & RAG

Evaluate AI Workflows Using Google Sheets, Gemini, Claude, GPT, and Perplexity

This template and its accompanying YouTube video go over 5 different implementations of evaluations within n8n: Categorization, Correctness, Tools used, String similarity, and Helpfulness.

Evaluation, Evaluation Trigger, Google Gemini Chat +8

AI & RAG

My Solution for the "Agentic Arena Community Contest" (RAG, Qdrant, Mistral OCR)

🤖📈 This workflow is my personal solution for the Agentic Arena Community Contest, where the goal is to build a Retrieval-Augmented Generation (RAG) AI agent capable of answering questions based on a p

Evaluation, Evaluation Trigger, Chat +11

AI & RAG

FAQs Embeddings

FAQs Embeddings. Uses googleDocs, openAi, supabase, httpRequest. Event-driven trigger; 35 nodes.

Google Docs, OpenAI, Supabase +8

CRM & Sales

Route and Qualify Email Leads with Gmail, Gemini, Slack, Sheets and Salesforce

Who is this for? Event sales teams & conference organizers processing 100+ sponsor/partner emails weekly who need instant lead qualification, Salesforce automation, & pipeline analytics.

Sentiment Analysis, Gmail, Gmail Trigger +8

CRM & Sales

Route Event Sales Leads with Gmail, Google Gemini, Sheets and Salesforce

Email Sentiment Router for Event Sales Leads

Sentiment Analysis, Gmail, Gmail Trigger +8

AI & RAG

FAQs Embeddings (Google Docs)

FAQs Embeddings. Uses googleDocs, openAi, supabase, httpRequest. Event-driven trigger; 35 nodes.

Google Docs, OpenAI, Supabase +8

AI & RAG

Automate Reddit Replies with F5bot Alerts & GPT-5 Personalized Comments

Automate how you reply to Reddit posts using AI-generated, first-person comments that sound human, follow subreddit rules, and (optionally) promote your own links or products.

Gmail Trigger, Reddit, Agent +5

Slack & Telegram

Custom Discord Notifications for Radarr, Sonarr, Bazarr Etc.

This is a simple template that will allow you to customise the notifications in Radarr, Sonarr, Bazarr and similar. By default the notifications are configured to be sent to Discord and look similar to

HTTP Request, Evaluation, Evaluation Trigger

AI & RAG

Evaluate AI Agent Response Correctness with OpenAI and Ragas Methodology

The scoring approach is adapted from the open-source evaluations project RAGAS and you can see the source here https://github.com/explodinggradients/ragas/blob/main/ragas/src/ragas/metrics/answercorre

Chain Llm, Output Parser Structured, OpenAI Chat +5

AI & RAG

Evaluation Metric Example: RAG Document Relevance

This is a template for n8n's evaluation feature.

Evaluation Trigger, Evaluation, Chat Trigger +8

AI & RAG

Evaluate RAG Response Accuracy with OpenAI: Document Groundedness Metric

The scoring approach is adapted from https://cloud.google.com/vertex-ai/generative-ai/docs/models/metrics-templates#pointwise_groundedness This evaluation works best for an agent that requires documen

HTTP Request, In-Memory Vector Store, OpenAI Embeddings +9

AI & RAG

Extract Meeting Details with GPT-4.1-mini and Evaluate Accuracy in Google Sheets

Developers building AI-powered workflows who want to ensure their agents work reliably. If you need to validate AI outputs, test agent behavior systematically, or build maintainable automation, this t

Execute Workflow Trigger, Evaluation Trigger, Output Parser Structured +4

AI & RAG

Evaluations Metric: Answer Similarity

OpenAI Chat, Evaluation Trigger, Evaluation +3

AI & RAG

Self-Learning FAQs RAG

Self-Learning FAQs RAG. Uses googleDocs, openAi, supabase, httpRequest. Event-driven trigger; 21 nodes.

Google Docs, OpenAI, Supabase +6

AI & RAG

Sales Lead Routing with Gemini Sentiment Analysis & Model Evaluation Framework

This n8n template demonstrates how to deploy an AI workflow in production while simultaneously running a robust, data-driven Evaluation Framework to ensure quality and optimize costs.

Sentiment Analysis, Gmail, Gmail Trigger +3

AI & RAG

Evaluate AI Agent Response Relevance Using OpenAI and Cosine Similarity

OpenAI Chat, Evaluation Trigger, Evaluation +5

AI & RAG

Monitor AI Quality Drift with GPT-4o-mini Evaluations and Slack Alerts

Catch AI quality drift before your users do. This template ties scheduled evaluation, LLM-as-a-Judge scoring, and threshold-based alerts into a continuous monitoring loop that fires a Slack alert the

Evaluation Trigger, Agent, OpenAI Chat +3

AI & RAG

Evaluation Metric: Summarization

The scoring approach is adapted from https://cloud.google.com/vertex-ai/generative-ai/docs/models/metrics-templates#pointwisesummarizationquality This evaluation works best for an AI summarization wor

Evaluation, OpenAI Chat, Evaluation Trigger +4

AI & RAG

Evaluate Tool Usage Accuracy in Multi-agent AI Workflows Using Evaluation Nodes

Who's it for

Tool Calculator, Chat Trigger, Evaluation Trigger +7

AI & RAG

Evaluation Metric Example: Check If Tool Was Called

This is a template for n8n's evaluation feature.

Agent, OpenAI Chat, Tool Calculator +4

AI & RAG

🎓 Learn Evaluate Tool. Tutorial for Beginners with Gemini and Google Sheets

This workflow is a beginner-friendly tutorial demonstrating how to use the Evaluation tool to automatically score the AI’s output against a known correct answer (“ground truth”) stored in a Google She

Evaluation Trigger, Agent, Tool Calculator +2

AI & RAG

Eval_guardrails_mode_check_text_for_violations

eval_Guardrails_mode_check_text_for_violations. Uses chatTrigger, agent, lmChatOpenAi, googleSheetsTool. Chat trigger; 15 nodes.

Chat Trigger, Agent, OpenAI Chat +7

AI & RAG

Score Customer Support AI Responses with GPT-4 Judge Metrics

Score open-ended AI responses with a judge model. This template shows how to evaluate a customer support agent using a separate LLM that rates each response on correctness and helpfulness, going beyon

Chat Trigger, Evaluation Trigger, Agent +3

AI & RAG

Evaluation Metric Example: Correctness (Judged by AI)

This is a template for n8n's evaluation feature.

OpenAI Chat, Evaluation Trigger, Evaluation +3

AI & RAG

Evaluation Metric Example: Categorization

This is a template for n8n's evaluation feature.

Agent, OpenAI Chat, Output Parser Structured +2

AI & RAG

Evaluate a Support Ticket Classifier with OpenAI GPT-4o-mini and n8n Evaluations

Measure how well your AI classifier actually performs. This template shows how to evaluate a support ticket classifier using n8n's built-in evaluation system, comparing AI predictions against expected

Evaluation Trigger, Agent, OpenAI Chat +1

AI & RAG

Evaluation_guardrails_mode_check_text_for_violations

evaluation_Guardrails_mode_check_text_for_violations. Uses chatTrigger, agent, lmChatOpenAi, googleSheetsTool. Chat trigger; 13 nodes.

Chat Trigger, Agent, OpenAI Chat +7

AI & RAG

Evaluation Metric Example: String Similarity

This is a template for n8n's evaluation feature.

Evaluation Trigger, Evaluation, OpenAI +1

AI & RAG

Evaluation

evaluation. Uses formTrigger, lmChatOpenAi, agent, form. Event-driven trigger; 10 nodes.

Form Trigger, OpenAI Chat, Agent +3


FAQ

How many n8n Evaluation workflows are in the catalog?

29 n8n workflows in AutomationFlows currently use the Evaluation integration — triggers, actions, or both.

How do I connect Evaluation in n8n?

After importing the workflow JSON, n8n will prompt for Evaluation credentials on the relevant nodes. AutomationFlows strips credential IDs before publishing — you'll add your own.

Can I combine these with other integrations?

Yes — most Evaluation workflows pair with adjacent tools (Slack alerts, Google Sheets logging, OpenAI summarisation). Browse the integration tags on each workflow page to discover pairings.
