AutomationFlowsWeb Scraping › Extract and Convert PDF Documents to Markdown with Llamaindex Cloud API

Extract and Convert PDF Documents to Markdown with Llamaindex Cloud API

Byfranck fambou @franck-f on n8n.io

This workflow automatically converts PDF documents to Markdown format using the LlamaIndex Cloud API. LlamaIndex is a powerful data framework that specializes in connecting large language models with external data sources, offering advanced document processing capabilities with…

Event trigger★★★★☆ complexity7 nodesHTTP RequestForm Trigger
Web Scraping Trigger: Event Nodes: 7 Complexity: ★★★★☆ Added:

This workflow corresponds to n8n.io template #7164 — we link there as the canonical source.

This workflow follows the Form Trigger → HTTP Request recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, activate. Full import guide →

Download .json
{
  "id": "rnBhMZpP6vRnoQ8Z",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "transform pdf tot markdown",
  "tags": [],
  "nodes": [
    {
      "id": "5653ad40-2f56-4e18-be4b-04b4dd10b107",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        816,
        -256
      ],
      "parameters": {
        "amount": 30
      },
      "typeVersion": 1.1
    },
    {
      "id": "ef584ef5-1aeb-434a-9ed2-f837a329486f",
      "name": "Upload_doc",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        624,
        -256
      ],
      "parameters": {
        "url": "https://api.cloud.llamaindex.ai/api/v1/parsing/upload",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "contentType": "multipart-form-data",
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "file",
              "parameterType": "formBinaryData",
              "inputDataFieldName": "file"
            }
          ]
        },
        "genericAuthType": "httpBearerAuth",
        "headerParameters": {
          "parameters": [
            {
              "name": "accept",
              "value": "application/json"
            },
            {
              "name": "Content-Type",
              "value": "multipart/form-data"
            }
          ]
        }
      },
      "credentials": {
        "httpBearerAuth": {
          "name": "<your credential>"
        },
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "63c299af-dae8-4dd5-a4a9-159302c9a18a",
      "name": "If",
      "type": "n8n-nodes-base.if",
      "position": [
        1232,
        -256
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "7a07aec1-fc5f-4b76-94d9-6fa8f509ac8e",
              "operator": {
                "name": "filter.operator.equals",
                "type": "string",
                "operation": "equals"
              },
              "leftValue": "={{ $json.status }}",
              "rightValue": "SUCCESS"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "3bf86ec3-5cbe-4581-a615-528c0e6b783b",
      "name": "On form submission",
      "type": "n8n-nodes-base.formTrigger",
      "position": [
        416,
        -256
      ],
      "parameters": {
        "options": {},
        "formTitle": "Upload file",
        "formFields": {
          "values": [
            {
              "fieldType": "file",
              "fieldLabel": "File",
              "acceptFileTypes": "pdf"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "f451ade2-fccd-4636-b11c-7971b9d32b5e",
      "name": "Wait2",
      "type": "n8n-nodes-base.wait",
      "position": [
        1504,
        -112
      ],
      "parameters": {
        "amount": 60
      },
      "typeVersion": 1.1
    },
    {
      "id": "8f7d2640-7fd4-425b-bb03-bb71b73a7dc3",
      "name": "Status Verification",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1008,
        -256
      ],
      "parameters": {
        "url": "=https://api.cloud.llamaindex.ai/api/parsing/job/{{ $('Upload_doc').item.json.id }}",
        "options": {},
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "genericAuthType": "httpBearerAuth",
        "headerParameters": {
          "parameters": [
            {
              "name": "accept",
              "value": "application/json"
            }
          ]
        }
      },
      "credentials": {
        "httpBearerAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "0c488546-7ec6-4c95-bc18-255d792705d4",
      "name": "Content extraction",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1504,
        -304
      ],
      "parameters": {
        "url": "=https://api.cloud.llamaindex.ai/api/v1/parsing/job/{{ $json.id }}/result/markdown",
        "options": {},
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "genericAuthType": "httpBearerAuth",
        "headerParameters": {
          "parameters": [
            {
              "name": "accept",
              "value": "application/json"
            }
          ]
        }
      },
      "credentials": {
        "httpBearerAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "a0fb030f-88ae-4c19-8662-b83f55dbeced",
  "connections": {
    "If": {
      "main": [
        [
          {
            "node": "Content extraction",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Wait2",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait": {
      "main": [
        [
          {
            "node": "Status Verification",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait2": {
      "main": [
        [
          {
            "node": "Status Verification",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Upload_doc": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "On form submission": {
      "main": [
        [
          {
            "node": "Upload_doc",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Status Verification": {
      "main": [
        [
          {
            "node": "If",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

Pro

For the full experience including quality scoring and batch install features for each workflow upgrade to Pro

About this workflow

This workflow automatically converts PDF documents to Markdown format using the LlamaIndex Cloud API. LlamaIndex is a powerful data framework that specializes in connecting large language models with external data sources, offering advanced document processing capabilities with…

Source: https://n8n.io/workflows/7164/ — original creator credit. Request a take-down →

More Web Scraping workflows → · Browse all categories →

Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Web Scraping

This workflow allows you to import any workflow from a file or another n8n instance and map the credentials easily. A multi-form setup guides you through the entire process At the beginning you have t

Execute Command, Read Write File, HTTP Request +3
Web Scraping

[n8n] Advanced URL Parsing and Shortening Workflow - Switchy.io Integration. Uses splitInBatches, stickyNote, httpRequest, html. Event-driven trigger; 56 nodes.

HTTP Request, GitHub, Stop And Error +1
Web Scraping

[](https://youtu.be/c7yCZhmMjtI)

HTTP Request, GitHub, Stop And Error +1
Web Scraping

N8n recently introduced folders and it has been a big improvement on workflow management on top of the tags.

HTTP Request, n8n, Form Trigger +1
Web Scraping

This workflow automates the creation of press releases for music artists releasing a new single. Upload your MP3, fill in basic info, and receive a publication-ready press release saved as a Google Do

Form Trigger, HTTP Request, Google Docs