Extract and Structure Thai Documents to Google Sheets Using Typhoon OCR and…

Original n8n title: Extract and Structure Thai Documents to Google Sheets Using Typhoon OCR and Llama 3.1

By Jaruphat J. (@jaruphatj) on n8n.io

⚠️ Note: This template requires a community node and works only on self-hosted n8n installations. It uses the Typhoon OCR Python package and custom command execution. Make sure to install required dependencies locally.

Category: Data & Sheets · Trigger: Event · Nodes: 8 · Complexity: ★★★★☆ · AI nodes: yes
Integrations: OpenRouter Chat, Execute Command, Chain Llm, Google Sheets, Read Write File

This workflow corresponds to n8n.io template #4300 — we link there as the canonical source.

This workflow follows the Chain Llm → Google Sheets recipe pattern.

The workflow JSON

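Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it. Before running, replace the <YourTyphoonKey> placeholder in the Extract Text with Typhoon OCR node with your own Typhoon OCR API key.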

{
  "id": "iPCOP0dstJZlKFQS",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "Typhoon_Submit",
  "tags": [],
  "nodes": [
    {
      "id": "7d7df2fd-bc12-4850-aa9a-3e318bfed747",
      "name": "When clicking \u2018Test workflow\u2019",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        40,
        0
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "58290920-9b18-47dd-82b9-62c340b7ed53",
      "name": "OpenRouter Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
      "position": [
        600,
        120
      ],
      "parameters": {
        "model": "scb10x/llama3.1-typhoon2-70b-instruct",
        "options": {}
      },
      "credentials": {
        "openRouterApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "d42f2a8d-7bd5-462d-932c-34ee5de16592",
      "name": "Extract Text with Typhoon OCR",
      "type": "n8n-nodes-base.executeCommand",
      "position": [
        420,
        0
      ],
      "parameters": {
        "command": "=python -c \"import sys, os; os.environ['TYPHOON_OCR_API_KEY'] = '<YourTyphoonKey>'; from typhoon_ocr import ocr_document; sys.stdout.reconfigure(encoding='utf-8'); input_path = sys.argv[1]; text = ocr_document(input_path); print(text)\" \"doc/{{$json[\"fileName\"]}}\"",
        "executeOnce": false
      },
      "typeVersion": 1
    },
    {
      "id": "d6d13619-f704-45d0-85fc-3e5b9dc4bd7a",
      "name": "Structure Text to JSON with LLM",
      "type": "@n8n/n8n-nodes-langchain.chainLlm",
      "position": [
        600,
        0
      ],
      "parameters": {
        "text": "=\u0e02\u0e49\u0e2d\u0e04\u0e27\u0e32\u0e21\u0e14\u0e49\u0e32\u0e19\u0e25\u0e48\u0e32\u0e07\u0e19\u0e35\u0e49\u0e40\u0e1b\u0e47\u0e19\u0e40\u0e19\u0e37\u0e49\u0e2d\u0e2b\u0e32 OCR \u0e08\u0e32\u0e01\u0e2b\u0e19\u0e31\u0e07\u0e2a\u0e37\u0e2d\u0e23\u0e32\u0e0a\u0e01\u0e32\u0e23 \u0e01\u0e23\u0e38\u0e13\u0e32\u0e41\u0e22\u0e01\u0e2b\u0e31\u0e27\u0e02\u0e49\u0e2d\u0e2a\u0e33\u0e04\u0e31\u0e0d\u0e2d\u0e2d\u0e01\u0e21\u0e32\u0e43\u0e19\u0e23\u0e39\u0e1b\u0e41\u0e1a\u0e1a JSON:\n\n1. book_id: \u0e40\u0e25\u0e02\u0e17\u0e35\u0e48\u0e2b\u0e19\u0e31\u0e07\u0e2a\u0e37\u0e2d\n2. date: \u0e27\u0e31\u0e19\u0e17\u0e35\u0e48\u0e43\u0e19\u0e40\u0e2d\u0e01\u0e2a\u0e32\u0e23\n3. subject: \u0e2b\u0e31\u0e27\u0e40\u0e23\u0e37\u0e48\u0e2d\u0e07\n4. to: \u0e40\u0e23\u0e35\u0e22\u0e19\n5. attach: \u0e2a\u0e34\u0e48\u0e07\u0e17\u0e35\u0e48\u0e2a\u0e48\u0e07\u0e21\u0e32\u0e14\u0e49\u0e27\u0e22\n6. detail: \u0e40\u0e19\u0e37\u0e49\u0e2d\u0e04\u0e27\u0e32\u0e21\u0e43\u0e19\u0e2b\u0e19\u0e31\u0e07\u0e2a\u0e37\u0e2d\n7. signed_by: \u0e1c\u0e39\u0e49\u0e25\u0e07\u0e19\u0e32\u0e21\n8. signed_by2: \u0e15\u0e33\u0e41\u0e2b\u0e19\u0e48\u0e07\u0e1c\u0e39\u0e49\u0e25\u0e07\u0e19\u0e32\u0e21\n9. contact: \u0e0a\u0e48\u0e2d\u0e07\u0e17\u0e32\u0e07\u0e15\u0e34\u0e14\u0e15\u0e48\u0e2d (\u0e40\u0e0a\u0e48\u0e19 \u0e40\u0e1a\u0e2d\u0e23\u0e4c\u0e42\u0e17\u0e23 \u0e2d\u0e35\u0e40\u0e21\u0e25)\n10. download_url: \u0e25\u0e34\u0e07\u0e01\u0e4c\u0e2a\u0e33\u0e2b\u0e23\u0e31\u0e1a\u0e14\u0e32\u0e27\u0e19\u0e4c\u0e42\u0e2b\u0e25\u0e14 (\u0e16\u0e49\u0e32\u0e21\u0e35)\n\nOCR_TEXT:\n\"\"\"\n{{ $json[\"stdout\"] }}\n\"\"\"",
        "promptType": "define"
      },
      "typeVersion": 1.6
    },
    {
      "id": "eaa9580c-9b98-42dc-a164-38430df41459",
      "name": "Parse JSON to Sheet Format",
      "type": "n8n-nodes-base.code",
      "position": [
        940,
        0
      ],
      "parameters": {
        "mode": "runOnceForEachItem",
        "jsCode": "const raw = $json[\"text\"];\n\n// 1. \u0e25\u0e1a ```json \u0e41\u0e25\u0e30 ``` \u0e17\u0e35\u0e48 LLM \u0e2d\u0e32\u0e08\u0e43\u0e2a\u0e48\u0e21\u0e32\nconst cleaned = raw.replace(/```json\\n?|```/g, \"\").trim();\n\nlet parsed;\ntry {\n  // 2. \u0e41\u0e1b\u0e25\u0e07\u0e40\u0e1b\u0e47\u0e19 object\n  parsed = JSON.parse(cleaned);\n} catch (err) {\n  throw new Error(\"JSON parsing failed: \" + err.message + \"\\n\\nRaw text:\\n\" + cleaned);\n}\n\n// 3. \u0e2b\u0e32\u0e01 contact \u0e40\u0e1b\u0e47\u0e19 object \u0e41\u0e22\u0e01 field \u0e2d\u0e2d\u0e01\u0e21\u0e32\nconst contact = parsed.contact || {};\n\nreturn {\n  book_id: parsed.book_id || \"\",\n  date: parsed.date || \"\",\n  subject: parsed.subject || \"\",\n  to: parsed.to || \"\",\n  attach: parsed.attach || \"\",\n  detail: parsed.detail || \"\",\n  signed_by: parsed.signed_by || \"\",\n  signed_by2: parsed.signed_by2 || \"\",\n  contact_phone: contact.phone || \"\",\n  contact_email: contact.email || \"\",\n  contact_fax: contact.fax || \"\",\n  download_url: parsed.download_url || \"\"\n};\n"
      },
      "typeVersion": 2
    },
    {
      "id": "8e3e5e7e-d329-459e-bd85-31fc2f76144b",
      "name": "Save to Google Sheet",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1120,
        0
      ],
      "parameters": {
        "columns": {
          "value": {
            "to": "={{ $json.to }}",
            "date": "={{ $json.date }}",
            "attach": "={{ $json.attach }}",
            "detail": "={{ $json.detail }}",
            "book_id": "={{ $json.book_id }}",
            "subject": "={{ $json.subject }}",
            "signed_by": "={{ $json.signed_by }}",
            "signed_by2": "={{ $json.signed_by2 }}",
            "contact_fax": "={{ $json.contact_fax }}",
            "download_url": "={{ $json.download_url }}",
            "contact_email": "={{ $json.contact_email }}",
            "contact_phone": "={{ $json.contact_phone }}"
          },
          "schema": [
            {
              "id": "book_id",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "book_id",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "date",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "date",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "subject",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "subject",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "to",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "to",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "attach",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "attach",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "detail",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "detail",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "signed_by",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "signed_by",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "signed_by2",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "signed_by2",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "contact_phone",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "contact_phone",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "contact_email",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "contact_email",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "contact_fax",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "contact_fax",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "download_url",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "download_url",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "book_id"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "append",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1h70cJyLj5i2j0Ag5kqp93ccZjjhJnqpLmz-ee5r4brU/edit#gid=0",
          "cachedResultName": "Sheet1"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1h70cJyLj5i2j0Ag5kqp93ccZjjhJnqpLmz-ee5r4brU",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1h70cJyLj5i2j0Ag5kqp93ccZjjhJnqpLmz-ee5r4brU/edit?usp=drivesdk",
          "cachedResultName": "TyphoonOCR_Extracted_Data"
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.5
    },
    {
      "id": "7b20ef0a-5d1f-4efe-bac3-53ace280cac2",
      "name": "Load PDFs from doc Folder",
      "type": "n8n-nodes-base.readWriteFile",
      "position": [
        220,
        0
      ],
      "parameters": {
        "options": {},
        "fileSelector": "doc/*"
      },
      "typeVersion": 1,
      "alwaysOutputData": true
    },
    {
      "id": "944977eb-2db6-430f-957e-345541ba8d39",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        0,
        -100
      ],
      "parameters": {
        "width": 1320,
        "height": 360,
        "content": "## Thai OCR to Sheet\nThis workflow extracts Thai PDF text using typhoon-ocr, converts it to structured JSON using LLM, and saves the output to Google Sheets. Works with self-hosted n8n only."
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "dc1bd760-0abe-4125-b2f6-2eeb4d9b02eb",
  "connections": {
    "OpenRouter Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "Structure Text to JSON with LLM",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Load PDFs from doc Folder": {
      "main": [
        [
          {
            "node": "Extract Text with Typhoon OCR",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Parse JSON to Sheet Format": {
      "main": [
        [
          {
            "node": "Save to Google Sheet",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Text with Typhoon OCR": {
      "main": [
        [
          {
            "node": "Structure Text to JSON with LLM",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Structure Text to JSON with LLM": {
      "main": [
        [
          {
            "node": "Parse JSON to Sheet Format",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \u2018Test workflow\u2019": {
      "main": [
        [
          {
            "node": "Load PDFs from doc Folder",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
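
The Parse JSON to Sheet Format node in the JSON above is the step most likely to need adjusting, because it depends on how the model formats its reply. As a rough illustration only, the same logic can be sketched in Python instead of the node's JavaScript. The names `parse_llm_reply` and `SHEET_COLUMNS` are hypothetical and exist only in this sketch, and the explicit check that the reply is a JSON object is an addition that the node itself does not perform.

```python
import json
import re

# Column names taken from the "Save to Google Sheet" node's schema.
SHEET_COLUMNS = [
    "book_id", "date", "subject", "to", "attach", "detail",
    "signed_by", "signed_by2",
    "contact_phone", "contact_email", "contact_fax", "download_url",
]


def parse_llm_reply(raw: str) -> dict:
    """Turn the LLM's reply text into one flat row for the sheet."""
    # Strip the ```json and ``` fences the model may wrap around its answer.
    cleaned = re.sub(r"```json\n?|```", "", raw).strip()
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"JSON parsing failed: {err}\n\nRaw text:\n{cleaned}"
        ) from err

    # Addition in this sketch: reject replies that are valid JSON but not an object.
    if not isinstance(parsed, dict):
        raise ValueError("Expected a JSON object, got " + type(parsed).__name__)

    # If contact is an object, split out its fields; anything else yields blanks.
    contact = parsed.get("contact")
    if not isinstance(contact, dict):
        contact = {}

    row = {}
    for column in SHEET_COLUMNS:
        if column.startswith("contact_"):
            # contact_phone -> contact["phone"], and so on.
            row[column] = contact.get(column[len("contact_"):]) or ""
        else:
            row[column] = parsed.get(column) or ""
    return row
```

Missing or empty fields become empty strings, as in the node, so every row has all twelve columns even when the model omits some of them.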

Credentials you'll need

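Each integration node will prompt for credentials when you import. Credential IDs are stripped before publishing, so you add your own: an OpenRouter API credential for the OpenRouter Chat Model node and a Google Sheets OAuth2 credential for the Save to Google Sheet node. The Typhoon OCR API key is not an n8n credential; it is set inline in the command of the Extract Text with Typhoon OCR node.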

About this workflow

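This workflow extracts text from Thai PDFs with typhoon-ocr, converts it to structured JSON with an LLM, and saves the result to Google Sheets. It works with self-hosted n8n only.

A manual trigger starts the run. Load PDFs from doc Folder reads every file matching doc/*. Extract Text with Typhoon OCR runs a Python one-liner on each file through Execute Command and prints the recognized text to stdout. Structure Text to JSON with LLM sends that text to scb10x/llama3.1-typhoon2-70b-instruct via OpenRouter, with a Thai prompt asking for ten fields from an official letter: book_id (document number), date, subject, to (addressee), attach (enclosures), detail (body text), signed_by (signatory), signed_by2 (signatory's position), contact (phone, email, and similar), and download_url (if present). Parse JSON to Sheet Format strips any Markdown code fences, parses the reply, and splits contact into phone, email, and fax. Save to Google Sheet appends one row per document.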
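
The OCR step can also be tried outside n8n before wiring it into the workflow. The following is a minimal sketch, assuming the typhoon_ocr package is installed and that the key is exported as the TYPHOON_OCR_API_KEY environment variable instead of being written into the command string as the template does. The helper names `list_documents` and `ocr_file` are hypothetical; only `ocr_document` and the environment variable name come from the workflow's own command.

```python
import os
from pathlib import Path


def list_documents(folder: str = "doc") -> list:
    """Return the files in the folder, sorted by name (mirrors the doc/* selector)."""
    return sorted(p for p in Path(folder).glob("*") if p.is_file())


def ocr_file(path: Path) -> str:
    """Run Typhoon OCR on one file and return the extracted text."""
    # Assumption: the key is set in the environment rather than hard-coded.
    if not os.environ.get("TYPHOON_OCR_API_KEY"):
        raise RuntimeError("Set TYPHOON_OCR_API_KEY before running OCR")
    # Imported lazily so the helpers above load even without the package installed.
    from typhoon_ocr import ocr_document  # the same call the workflow's command uses
    return ocr_document(str(path))
```

Calling `ocr_file(p)` for each `p` in `list_documents()` reproduces what the Execute Command node does per item. Keeping the key in the environment avoids storing it in the exported workflow JSON.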

Source: https://n8n.io/workflows/4300/ (original creator credit)
