{
  "id": "vbQQcRqfFKBOs4ug",
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "name": "PDF Parse Using LlamaCloud",
  "tags": [],
  "nodes": [
    {
      "id": "5bd2a596-f055-46aa-ae21-742d9e57ca79",
      "name": "Workflow Info & Setup",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -3472,
        192
      ],
      "parameters": {
        "color": 4,
        "width": 560,
        "height": 1328,
        "content": "## \ud83d\udcc4 PDF Parse with LlamaCloud\n\n### How it works\n\nThis workflow extracts PDF content and converts it into clean markdown using LlamaCloud's parsing API:\n\n1. **Download PDF** \u2013 Retrieves a PDF file from Google Drive\n2. **Upload to LlamaCloud** \u2013 Sends the PDF to LlamaCloud's parsing service and receives a job ID\n3. **Wait & Poll Status** \u2013 Waits 1 second, then checks whether parsing is complete\n4. **Loop Until Complete** \u2013 If the job is not yet finished, waits 30 seconds and checks again\n5. **Retrieve Markdown** \u2013 Once complete, fetches the parsed content in markdown format\n\n### Key Features\n- Handles complex PDFs with tables, images, and multi-column layouts\n- Returns clean, structured markdown output\n- Polls the job status automatically for long-running parses\n- Ready for AI processing or content transformation\n\n---\n\n### Set Up Steps (~5 minutes)\n\n**1. Get LlamaCloud API Key**\n- Go to [cloud.llamaindex.ai](https://cloud.llamaindex.ai)\n- Sign up or log in to your account\n- Navigate to the **API Keys** section\n- Create a new API key and copy it\n\n**2. Configure LlamaCloud Credentials in n8n**\n- In n8n, create a **Header Auth** credential\n- Set the credential name (e.g., \"LlamaCloud API\")\n- Configure:\n  - **Name**: `Authorization`\n  - **Value**: `Bearer YOUR_TOKEN_HERE`\n- Apply this credential to all HTTP Request nodes that call LlamaCloud\n\n**3. Set Up Google Drive (Optional)**\n- If using the Google Drive node:\n  - Create OAuth2 credentials in n8n\n  - Connect your Google account\n  - Update the File ID to point to your PDF\n- **Alternative**: Replace the Google Drive node with any file input method\n\n**4. Test the Workflow**\n- Click \"Execute Workflow\" to test\n- The parsed markdown will appear in the final \"Get Data1\" node\n- Processing time varies with PDF complexity (typically 30-60 seconds)\n\n---\n\n### Notes\n- Large or complex PDFs may take 1-2 minutes to process\n- The workflow re-checks the job status every 30 seconds until parsing completes\n- Output is in markdown format, well suited to AI processing\n- You can connect AI nodes after \"Get Data1\" to analyze or transform the content"
      },
      "typeVersion": 1
    },
    {
      "id": "8697a2ba-d399-464c-8542-695dacf74aba",
      "name": "Step 1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2896,
        336
      ],
      "parameters": {
        "color": 5,
        "width": 320,
        "height": 280,
        "content": "## Step 1: Source PDF\n\nDownload your PDF from Google Drive.\n\n**Alternative sources:**\n- HTTP Request node (download from URL)\n- Read/Write Files from Disk node (local file)\n- Webhook (receive via API)\n\nThe PDF is passed as binary data to the next step."
      },
      "typeVersion": 1
    },
    {
      "id": "f70b7b43-8b78-435f-bba4-688fd1b099d2",
      "name": "Step 2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2560,
        304
      ],
      "parameters": {
        "color": 6,
        "width": 320,
        "height": 280,
        "content": "## Step 2: Upload to LlamaCloud\n\nSends the PDF binary data to LlamaCloud's parsing API.\n\n**Returns:** Job ID for tracking\n\n**Credential needed:** Header Auth credential with `Authorization` set to `Bearer YOUR_TOKEN_HERE` (replace the placeholder with your LlamaCloud API key)"
      },
      "typeVersion": 1
    },
    {
      "id": "b7c3a9eb-7e0d-4bfb-a2ea-9b6388c742a6",
      "name": "Step 3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -2192,
        288
      ],
      "parameters": {
        "color": 7,
        "width": 480,
        "height": 280,
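        "content": "## Step 3: Wait & Check\n\nWaits 1 second, then checks the parsing job status.\n\nIf **SUCCESS** \u2192 retrieves markdown\nAny other status (e.g. **PENDING**) \u2192 waits 30s and checks again\n\nThe loop continues until the status is SUCCESS, so a job that never succeeds keeps being polled. Add a check for failure statuses if you need the workflow to stop."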
      },
      "typeVersion": 1
    },
    {
      "id": "4e9a7d2c-16b0-4d80-b824-b57c6a2a9caa",
      "name": "Step 4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1648,
        176
      ],
      "parameters": {
        "color": 3,
        "width": 320,
        "height": 344,
        "content": "## Step 4: Get Markdown\n\nOnce parsing is complete, this retrieves the final markdown output.\n\n**Output:** Clean markdown text with:\n- Extracted text content\n- Table structures preserved\n- Image references\n- Proper formatting\n\nReady for AI analysis or further processing!"
      },
      "typeVersion": 1
    },
    {
      "id": "311c4821-1754-4bca-907a-90af8bdfdc31",
      "name": "Wait1",
      "type": "n8n-nodes-base.wait",
      "position": [
        -2288,
        672
      ],
      "parameters": {
        "amount": 1
      },
      "typeVersion": 1.1
    },
    {
      "id": "3323aec4-86f8-4d83-8a15-01cbecf65824",
      "name": "Check Status1",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -2064,
        672
      ],
      "parameters": {
        "url": "=https://api.cloud.llamaindex.ai/api/v1/parsing/job/{{ $json.id }}",
        "options": {},
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "headerParameters": {
          "parameters": [
            {
              "name": "accept",
              "value": "application/json"
            }
          ]
        }
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "b104278d-c061-4bbb-a787-8f10cbeecac8",
      "name": "Send Data To Llama Cloud1",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -2512,
        672
      ],
      "parameters": {
        "url": "https://api.cloud.llamaindex.ai/api/v1/parsing/upload",
        "method": "POST",
        "options": {},
        "sendBody": true,
        "contentType": "multipart-form-data",
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "bodyParameters": {
          "parameters": [
            {
              "name": "file",
              "parameterType": "formBinaryData",
              "inputDataFieldName": "data"
            }
          ]
        },
        "genericAuthType": "httpHeaderAuth",
        "headerParameters": {
          "parameters": [
            {
              "name": "accept",
              "value": "application/json"
            }
          ]
        }
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    },
    {
      "id": "e7daf635-0e3c-4003-9c67-d193e89a10f3",
      "name": "Download File From Drive1",
      "type": "n8n-nodes-base.googleDrive",
      "position": [
        -2736,
        672
      ],
      "parameters": {
        "fileId": {
          "__rl": true,
          "mode": "list",
          "value": "1TPAxQq0fXVrgr7VbP7_RVV33v6r0sYk9",
          "cachedResultUrl": "https://drive.google.com/file/d/1TPAxQq0fXVrgr7VbP7_RVV33v6r0sYk9/view?usp=drivesdk",
          "cachedResultName": "lama_parse_example.pdf"
        },
        "options": {},
        "operation": "download"
      },
      "credentials": {
        "googleDriveOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 3
    },
    {
      "id": "f4b50e47-7c77-4b65-8365-5449e3997552",
      "name": "Check Job Status1",
      "type": "n8n-nodes-base.if",
      "position": [
        -1840,
        608
      ],
      "parameters": {
        "options": {},
        "conditions": {
          "options": {
            "version": 2,
            "leftValue": "",
            "caseSensitive": true,
            "typeValidation": "strict"
          },
          "combinator": "and",
          "conditions": [
            {
              "id": "fd51e991-57cc-490b-b0cb-cc4d3d9de54d",
              "operator": {
                "name": "filter.operator.equals",
                "type": "string",
                "operation": "equals"
              },
              "leftValue": "={{ $json.status }}",
              "rightValue": "SUCCESS"
            }
          ]
        }
      },
      "typeVersion": 2.2
    },
    {
      "id": "f2a852cc-36d7-4c9a-bad8-67d776b16cb3",
      "name": "Wait3",
      "type": "n8n-nodes-base.wait",
      "position": [
        -1616,
        768
      ],
      "parameters": {
        "amount": 30
      },
      "typeVersion": 1.1
    },
    {
      "id": "a4ef35f2-c1db-4c7e-8146-d89fd026f3c8",
      "name": "Get Data1",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -1616,
        560
      ],
      "parameters": {
        "url": "=https://api.cloud.llamaindex.ai/api/v1/parsing/job/{{ $json.id }}/result/markdown",
        "options": {},
        "sendHeaders": true,
        "authentication": "genericCredentialType",
        "genericAuthType": "httpHeaderAuth",
        "headerParameters": {
          "parameters": [
            {
              "name": "accept",
              "value": "application/json"
            }
          ]
        }
      },
      "credentials": {
        "httpHeaderAuth": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.2
    }
  ],
  "active": false,
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "4598139a-5320-413d-981c-22b93d9982cc",
  "connections": {
    "Wait1": {
      "main": [
        [
          {
            "node": "Check Status1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait3": {
      "main": [
        [
          {
            "node": "Check Status1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Status1": {
      "main": [
        [
          {
            "node": "Check Job Status1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Job Status1": {
      "main": [
        [
          {
            "node": "Get Data1",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Wait3",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Download File From Drive1": {
      "main": [
        [
          {
            "node": "Send Data To Llama Cloud1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Send Data To Llama Cloud1": {
      "main": [
        [
          {
            "node": "Wait1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}