Extract Structured Data from D&B Company Reports with GPT-4o

By Robert Breen (@rbreen) on n8n.io

Pull a Dun & Bradstreet Business Information Report (PDF) by DUNS, convert the response into a binary PDF file, extract readable text, and use OpenAI to return a clean, flat JSON with only the key fields you care about (e.g., report date, Paydex, viability score, credit limit).…

Category: AI & RAG · Trigger: Manual · Nodes: 11 · Complexity: ★★★☆☆ · AI nodes: yes · Integrations: OpenAI Chat, HTTP Request, Agent, Output Parser Structured

This workflow corresponds to n8n.io template #8868 — we link there as the canonical source.

This workflow follows the Agent → HTTP Request recipe pattern — see all workflows that pair these two integrations.

The workflow JSON

Copy or download the full n8n JSON below. Paste it into a new n8n workflow, add your credentials, and activate it.

{
  "meta": {
    "templateCredsSetupCompleted": true
  },
  "nodes": [
    {
      "id": "c0768748-9099-4bb4-8d23-d4ceb1c404b7",
      "name": "Sticky Note10",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1872,
        8464
      ],
      "parameters": {
        "width": 400,
        "height": 1312,
        "content": "## Setup Instructions\n\n\n### \ud83c\udfe2 Fetch Company Data from D&B (Data Blocks)\n\n1. Add a new **HTTP Request** node and name it `Data Blocks`  \n2. Configure it as follows:  \n   - **Authentication:** None (token is passed in headers)  \n   - **Method:** `GET`  \n   - **URL:**  \n     ```\n     https://plus.dnb.com/v1/data/duns/{{ $json.duns }}?blockIDs=paymentinsight_L4_v1&tradeUp=hq&customerReference=customer%20reference%20text&orderReason=6332\n     ```  \n     > This dynamically uses the `duns` value from input JSON.  \n3. Under **Headers**, add:  \n   - `Accept = application/json`  \n   - `Authorization = Bearer {{$json[\"access_token\"]}}`  \n     > Replace the hardcoded token with the dynamic token output from your **Get Bearer YOUR_TOKEN_HERE** node.  \n4. Execute the node \u2014 you\u2019ll receive structured company data from D&B\u2019s **Data Blocks API**.  \n\n\u2705 You can now pass the response to other nodes (e.g., Google Sheets, databases, or CRMs).  \n\n\n---\n\n### \ud83d\udd11 Set up D&B Auth HTTP Request node\n\n1. Add a new **HTTP Request** node in your workflow  \n2. Configure it as follows:  \n   - **Authentication:** Basic Auth (use your D&B **username** and **password**)  \n   - **Method:** `POST`  \n   - **URL:** `https://plus.dnb.com/v3/token`  \n3. Under **Body Parameters**, add:  \n   - `grant_type = client_credentials`  \n4. Under **Headers**, add:  \n   - `Accept = application/json`  \n5. Execute the node \u2014 the response will include an **access_token**  \n6. Use this token in downstream requests with:  \n   - `Authorization: Bearer {{$json[\"access_token\"]}}`  \n\n\n## \ud83d\udcec Contact\n\nNeed help customizing this (e.g., routing the PDF to Drive, mapping JSON to your CRM, or expanding the schema)?\n\n\ud83d\udce7 robert@ynteractive.com  \n\ud83d\udd17 https://www.linkedin.com/in/robert-breen-29429625/  \n\ud83c\udf10 https://ynteractive.com"
      },
      "typeVersion": 1
    },
    {
      "id": "0bb07b06-7c44-47d7-b634-0b671bbd4f05",
      "name": "OpenAI Chat Model6",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        128,
        9216
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o",
          "cachedResultName": "gpt-4o"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "0dc8ad32-c434-4c56-9f97-8d6caf59a26f",
      "name": "OpenAI Chat Model7",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenAi",
      "position": [
        384,
        9312
      ],
      "parameters": {
        "model": {
          "__rl": true,
          "mode": "list",
          "value": "gpt-4o",
          "cachedResultName": "gpt-4o"
        },
        "options": {}
      },
      "credentials": {
        "openAiApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "2a704e6a-4996-488b-831e-b757270b6518",
      "name": "Sticky Note65",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1120,
        8864
      ],
      "parameters": {
        "color": 3,
        "width": 352,
        "height": 768,
        "content": "### \ud83c\udfe2 Fetch Company Data from D&B (Data Blocks)\n\n1. Add a new **HTTP Request** node and name it `Data Blocks`  \n2. Configure it as follows:  \n   - **Authentication:** None (token is passed in headers)  \n   - **Method:** `GET`  \n   - **URL:**  \n     ```\n     https://plus.dnb.com/v1/data/duns/{{ $json.duns }}?blockIDs=paymentinsight_L4_v1&tradeUp=hq&customerReference=customer%20reference%20text&orderReason=6332\n     ```  \n     > This dynamically uses the `duns` value from input JSON.  \n3. Under **Headers**, add:  \n   - `Accept = application/json`  \n   - `Authorization = Bearer {{$json[\"access_token\"]}}`  \n     > Replace the hardcoded token with the dynamic token output from your **Get Bearer YOUR_TOKEN_HERE** node.  \n4. Execute the node \u2014 you\u2019ll receive structured company data from D&B\u2019s **Data Blocks API**.  \n\n\u2705 You can now pass the response to other nodes (e.g., Google Sheets, databases, or CRMs).  \n"
      },
      "typeVersion": 1
    },
    {
      "id": "a08d013f-f02c-4d97-b8a8-b39892f8f23d",
      "name": "Sticky Note66",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -544,
        8640
      ],
      "parameters": {
        "color": 3,
        "width": 368,
        "height": 560,
        "content": "### \ud83d\udd11 Set up D&B Auth HTTP Request node\n\n1. Add a new **HTTP Request** node in your workflow  \n2. Configure it as follows:  \n   - **Authentication:** Basic Auth (use your D&B **username** and **password**)  \n   - **Method:** `POST`  \n   - **URL:** `https://plus.dnb.com/v3/token`  \n3. Under **Body Parameters**, add:  \n   - `grant_type = client_credentials`  \n4. Under **Headers**, add:  \n   - `Accept = application/json`  \n5. Execute the node \u2014 the response will include an **access_token**  \n6. Use this token in downstream requests with:  \n   - `Authorization: Bearer {{$json[\"access_token\"]}}`  \n"
      },
      "typeVersion": 1
    },
    {
      "id": "565c3c60-61e4-4b82-81ae-d532377a2c80",
      "name": "D&B Report",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -976,
        9520
      ],
      "parameters": {
        "url": "https://plus.dnb.com/v1/reports/duns/804735132?productId=birstd&inLanguage=en-US&reportFormat=PDF&orderReason=6332&tradeUp=hq&customerReference=customer%20reference%20text",
        "options": {},
        "authentication": "headerAuth",
        "headerParametersUi": {
          "parameter": [
            {
              "name": "Accept",
              "value": "application/json"
            }
          ]
        }
      },
      "credentials": {},
      "typeVersion": 1
    },
    {
      "id": "e6d764bd-3b2d-40d3-949e-8aa39c220668",
      "name": "Convert to PDF File",
      "type": "n8n-nodes-base.convertToFile",
      "position": [
        -544,
        9392
      ],
      "parameters": {
        "options": {},
        "operation": "toBinary",
        "sourceProperty": "contents[0].contentObject"
      },
      "typeVersion": 1.1
    },
    {
      "id": "eaf944cd-eda8-4b39-80c5-66378e92dddc",
      "name": "Extract Binary",
      "type": "n8n-nodes-base.extractFromFile",
      "position": [
        -208,
        9392
      ],
      "parameters": {
        "options": {},
        "operation": "pdf"
      },
      "typeVersion": 1
    },
    {
      "id": "7c82577c-cb5f-47df-9f29-7646107276e8",
      "name": "Analyze PDF",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        224,
        8848
      ],
      "parameters": {
        "text": "={{ $json.text }}",
        "options": {
          "systemMessage": "You are a precision extractor. Read the provided business report PDF and return only a single flat JSON object with the fields below. Keep it minimal and focused on overall scores.\n\nNo arrays/lists.\n\nNo prose.\n\nIf a value is missing, output null.\n\nDates must be YYYY-MM-DD.\n\nNumbers must be plain numerics (no commas or $).\n\nOutput Format\n\nReturn only in JSON object:\n\n\nRules\n\nPrefer the most recent or highest-level \u201coverall\u201d values if multiple are shown.\n\nNever include arrays, nested structures, or text outside of the JSON object.",
          "passthroughBinaryImages": true
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 2.2
    },
    {
      "id": "0bf7ddc4-910a-4c15-8265-c959ddf16e7a",
      "name": "Structured Output",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        368,
        9120
      ],
      "parameters": {
        "autoFix": true,
        "jsonSchemaExample": "{\n  \"report_date\": \"\",\n  \"company_name\": \"\",\n  \"duns\": \"\",\n  \"dnb_rating_overall\": \"\",\n  \"composite_credit_appraisal\": \"\",\n  \"viability_score\": \"\",\n  \"portfolio_comparison_score\": \"\",\n  \"paydex_3mo\": \"\",\n  \"paydex_24mo\": \"\",\n  \"credit_limit_conservative\": \"\"\n}\n"
      },
      "typeVersion": 1.3
    },
    {
      "id": "f8126ea2-1265-45fc-adf8-2e507a23df14",
      "name": "Get Token",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -400,
        9072
      ],
      "parameters": {
        "url": "https://plus.dnb.com/v3/token",
        "options": {},
        "requestMethod": "POST",
        "authentication": "basicAuth",
        "bodyParametersUi": {
          "parameter": [
            {
              "name": "grant_type",
              "value": "client_credentials"
            }
          ]
        },
        "headerParametersUi": {
          "parameter": [
            {
              "name": "Content-Type",
              "value": "application/x-www-form-urlencoded"
            }
          ]
        }
      },
      "credentials": {},
      "typeVersion": 1
    }
  ],
  "connections": {
    "D&B Report": {
      "main": [
        [
          {
            "node": "Convert to PDF File",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract Binary": {
      "main": [
        [
          {
            "node": "Analyze PDF",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output": {
      "ai_outputParser": [
        [
          {
            "node": "Analyze PDF",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model6": {
      "ai_languageModel": [
        [
          {
            "node": "Analyze PDF",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "OpenAI Chat Model7": {
      "ai_languageModel": [
        [
          {
            "node": "Structured Output",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Convert to PDF File": {
      "main": [
        [
          {
            "node": "Extract Binary",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
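
The `Analyze PDF` system message in the JSON above asks for a single flat object, `null` for missing values, `YYYY-MM-DD` dates, and numbers without commas or `$`. A minimal sketch of how those rules could be checked downstream, outside n8n, is shown here. The field names come from the `Structured Output` node; the function name `check_flat_report` and the choice of which fields count as numeric are assumptions made for illustration, not part of the workflow.

```python
import json
import re
from datetime import date

# Field names copied from the "Structured Output" node's jsonSchemaExample.
EXPECTED_FIELDS = (
    "report_date", "company_name", "duns", "dnb_rating_overall",
    "composite_credit_appraisal", "viability_score",
    "portfolio_comparison_score", "paydex_3mo", "paydex_24mo",
    "credit_limit_conservative",
)

# Assumption: these fields are meant to hold numbers; the workflow does not say so.
NUMERIC_FIELDS = (
    "viability_score", "portfolio_comparison_score",
    "paydex_3mo", "paydex_24mo", "credit_limit_conservative",
)


def _is_iso_date(value):
    """True only for strings shaped like YYYY-MM-DD that name a real calendar date."""
    if not isinstance(value, str) or not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def check_flat_report(raw):
    """Return a list of rule violations; an empty list means the object passes."""
    obj = json.loads(raw)
    if not isinstance(obj, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    for name in EXPECTED_FIELDS:
        if name not in obj:
            problems.append(f"{name}: key missing (use null when unknown)")
    for name, value in obj.items():
        if isinstance(value, (list, dict)):
            problems.append(f"{name}: arrays and nested objects are not allowed")

    report_date = obj.get("report_date")
    if report_date is not None and not _is_iso_date(report_date):
        problems.append("report_date: expected YYYY-MM-DD")

    for name in NUMERIC_FIELDS:
        value = obj.get(name)
        if isinstance(value, str) and re.search(r"[$,]", value):
            problems.append(f"{name}: remove currency symbols and thousands separators")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log everything wrong with a single extraction before deciding whether to retry.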

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.
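
For reference, the request that the `Get Token` node describes (Basic Auth, `POST` to `https://plus.dnb.com/v3/token`, form field `grant_type = client_credentials`) can be sketched outside n8n as follows. This is an illustration under assumptions: the helper names `build_token_request` and `bearer_header` are hypothetical, and the request is only constructed, never sent.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://plus.dnb.com/v3/token"  # URL taken from the "Get Token" node


def build_token_request(username, password):
    """Build (but do not send) the token POST described by the "Get Token" node."""
    # HTTP Basic Auth: base64 of "username:password".
    basic = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    # Form-encoded body, matching the node's Content-Type header.
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def bearer_header(token_response):
    """Turn a parsed token response into the header used by downstream requests."""
    return {"Authorization": f"Bearer {token_response['access_token']}"}
```

Passing the returned request to `urllib.request.urlopen` and parsing the JSON body would yield the `access_token` that the setup notes feed into `Authorization: Bearer ...`.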

About this workflow

Pull a Dun & Bradstreet Business Information Report (PDF) by DUNS, convert the response into a binary PDF file, extract readable text, and use OpenAI to return a clean, flat JSON with only the key fields you care about (e.g., report date, Paydex, viability score, credit limit).…
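
The "convert the response into a binary PDF file" step corresponds to the `Convert to PDF File` node, which reads `contents[0].contentObject` and treats it as a base64 string. A rough Python equivalent is sketched below. It assumes the report response has exactly that shape, which is inferred from the node's `sourceProperty` and not confirmed against D&B's documentation; the function name `pdf_bytes_from_report` is hypothetical.

```python
import base64


def pdf_bytes_from_report(response):
    """Decode the base64 PDF found at contents[0].contentObject.

    Assumption: the response is a dict shaped like
    {"contents": [{"contentObject": "<base64 string>"}]}.
    """
    encoded = response["contents"][0]["contentObject"]
    pdf = base64.b64decode(encoded)
    # Every PDF file starts with the "%PDF-" magic bytes; fail early otherwise.
    if not pdf.startswith(b"%PDF-"):
        raise ValueError("decoded payload does not look like a PDF")
    return pdf
```

Text extraction is left out of the sketch; in the workflow it is handled by the `Extract Binary` node before the text reaches the agent.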

Source: https://n8n.io/workflows/8868/ (original creator credit).
