Scrape Idealista Real Estate Property Listings with Scrapegraph AI

Original n8n title: Scrape Idealista 🏠 Real Estate Property Listings with Scrapegraph AI 🕷️

By Davide Boizza (@n3witalia) on n8n.io

This workflow automates the process of scraping real estate listings from Idealista (or similar property portals), extracting structured property data using AI, and storing the results directly in Google Sheets.

Category: Data & Sheets · Trigger: Event · Nodes: 20 · Complexity: ★★★★☆ · Integrations: ScrapegraphAI, Google Sheets

This workflow corresponds to n8n.io template #15510 — we link there as the canonical source.

The workflow JSON

Copy the full n8n JSON below, paste it into a new n8n workflow, add your credentials, and activate it.

{
  "id": "TaomO9wuriacMSOO",
  "name": "Idealista Real Estate Listing Scraper",
  "tags": [],
  "nodes": [
    {
      "id": "cf17e9ac-aa94-45e0-8ce0-44b13cc6931a",
      "name": "When clicking \u2018Execute workflow\u2019",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -1104,
        1344
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "baa3c744-cecd-439b-894b-f34539faaf75",
      "name": "Loop Over Items",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        -64,
        1344
      ],
      "parameters": {
        "options": {
          "reset": false
        }
      },
      "typeVersion": 3
    },
    {
      "id": "74377b43-5de0-419e-a0de-4168e11078de",
      "name": "Split Out",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        -432,
        1344
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "generated_urls"
      },
      "typeVersion": 1
    },
    {
      "id": "447e8948-d0bc-4168-9d32-3c94c6f86a15",
      "name": "Aggregate",
      "type": "n8n-nodes-base.aggregate",
      "position": [
        336,
        1040
      ],
      "parameters": {
        "options": {},
        "fieldsToAggregate": {
          "fieldToAggregate": [
            {
              "fieldToAggregate": "urls"
            }
          ]
        }
      },
      "typeVersion": 1
    },
    {
      "id": "fe22e2e4-7148-4149-a30d-b8017d9b77ef",
      "name": "Split Out1",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        1008,
        1040
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "unified_urls"
      },
      "typeVersion": 1
    },
    {
      "id": "8f2e2f93-3fcc-467d-b267-67b10c8768ea",
      "name": "Limit",
      "type": "n8n-nodes-base.limit",
      "disabled": true,
      "position": [
        1264,
        1040
      ],
      "parameters": {
        "maxItems": 3
      },
      "typeVersion": 1
    },
    {
      "id": "366ac608-9b4e-49fa-81d4-8914bba4cecf",
      "name": "Loop Over Items1",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        1552,
        1024
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 3
    },
    {
      "id": "2686ab7c-2522-4d12-8fb1-9e7509042011",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        992,
        928
      ],
      "parameters": {
        "color": 7,
        "width": 1488,
        "height": 384,
        "content": "## STEP 4 - Extract detailed property data\nClone [this Sheet](https://docs.google.com/spreadsheets/d/1jtMyMglBbekD9Z407q8-0vn-cDDXhM81Uj1oAZIJGX8/edit?usp=sharing). Each listing URL is then processed by another **ScrapegraphAI** node, which extracts detailed property data (title, description, price, area, bedrooms, bathrooms, floor, rooms, balcony, terrace, cellar, heating, air conditioning, image URLs) based on a JSON schema."
      },
      "typeVersion": 1
    },
    {
      "id": "d1cefccf-8d85-40ca-a950-8fb5b80ce9e3",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -976,
        1152
      ],
      "parameters": {
        "color": 7,
        "width": 496,
        "height": 464,
        "content": "## STEP 1 - Config Params\nEnter the GET pagination parameters with their filters. For example, for idealista_it, if the paginated URL is:\n\n`https://www.idealista.it/vendita-case/verona-verona/`\n\n"
      },
      "typeVersion": 1
    },
    {
      "id": "2dea7a8b-94fb-429d-b501-debd117bd43e",
      "name": "Set params",
      "type": "n8n-nodes-base.set",
      "position": [
        -880,
        1344
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "d8af8b6b-d121-4887-a739-f6dbcd871802",
              "name": "url",
              "type": "string",
              "value": "https://www.idealista.it/vendita-case/verona-verona/con-prezzo_500000,prezzo-min_200000,dimensione_80,quadrilocali-4,5-locali-o-piu,giardino,nuova-costruzione,buono-stato/"
            },
            {
              "id": "1890a580-64ce-4530-96a1-72c5a7142672",
              "name": "max_pages",
              "type": "string",
              "value": "2"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "36c4ebf7-f86d-4ce1-b86a-eae8d326eb18",
      "name": "Generate Urls",
      "type": "n8n-nodes-base.code",
      "position": [
        -656,
        1344
      ],
      "parameters": {
        "jsCode": "// Build one paginated listing URL per page for every input item.\nfor (const item of $input.all()) {\n  const baseUrl = item.json.url;\n  const maxPages = parseInt(item.json.max_pages, 10);\n\n  const urls = [];\n\n  // Append the 'lista-N.htm' pagination suffix to the base URL.\n  for (let i = 1; i <= maxPages; i++) {\n    urls.push(`${baseUrl}lista-${i}.htm`);\n  }\n\n  // Stored as an array so the Split Out node can emit one item per URL.\n  item.json.generated_urls = urls;\n}\n\nreturn $input.all();\n"
      },
      "typeVersion": 2
    },
    {
      "id": "002bd0e0-842a-4c81-8955-c1de907cf48f",
      "name": "Scrape listings",
      "type": "n8n-nodes-scrapegraphai.scrapegraphAi",
      "position": [
        272,
        1648
      ],
      "parameters": {
        "formats": {
          "format": [
            {}
          ]
        },
        "fetchConfig": {}
      },
      "credentials": {
        "scrapegraphAIApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "025c0e03-68af-474a-8160-c5b4b3dc60b2",
      "name": "Unified",
      "type": "n8n-nodes-base.code",
      "position": [
        752,
        1040
      ],
      "parameters": {
        "jsCode": "const items = $input.all();\n\nconst unified = items\n  .flatMap(item => item.json.urls || [])\n  .flat();\n\nreturn [\n  {\n    json: {\n      unified_urls: unified\n    }\n  }\n];\n"
      },
      "typeVersion": 2
    },
    {
      "id": "e6c41c49-8ed5-4100-891f-0017ddbcdcf2",
      "name": "Extract data",
      "type": "n8n-nodes-scrapegraphai.scrapegraphAi",
      "position": [
        1904,
        1040
      ],
      "parameters": {
        "formats": {
          "format": [
            {}
          ]
        },
        "fetchConfig": {}
      },
      "credentials": {
        "scrapegraphAIApi": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "ea0ad648-fc85-4e44-b61b-1f14bf2a1042",
      "name": "Update real estate listings",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        2256,
        1104
      ],
      "parameters": {
        "columns": {
          "value": {
            "URL": "={{ $json.website_url }}",
            "AREA": "={{ $json.result.items[0].area }}",
            "FLOOR": "={{ $json.result.items[0].floor }}",
            "PROCE": "={{ $json.result.items[0].price }}",
            "ROOMS": "={{ $json.result.items[0].rooms }}",
            "TITLE": "={{ $json.result.items[0].title }}",
            "CELLAR": "={{ $json.result.items[0].cellar }}",
            "BALCONY": "={{ $json.result.items[0].balcony }}",
            "HEATING": "={{ $json.result.items[0].heating }}",
            "BEDROOMS": "={{ $json.result.items[0].bedrooms }}",
            "TERRANCE": "={{ $json.result.items[0].terrace }}",
            "BATHROOMS": "={{ $json.result.items[0].bathrooms }}",
            "REFERENCE": "={{ $json.result.items[0].reference }}",
            "IMAGE URLS": "={{ JSON.stringify($json.result.items[0].image_urls) }}",
            "DESCRIPTION": "={{ $json.result.items[0].description }}",
            "AIR_CONDITIONING": "={{ $json.result.items[0].air_conditioning }}"
          },
          "schema": [
            {
              "id": "URL",
              "type": "string",
              "display": true,
              "removed": false,
              "required": false,
              "displayName": "URL",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "TITLE",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "TITLE",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "DESCRIPTION",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "DESCRIPTION",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "REFERENCE",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "REFERENCE",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "PROCE",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "PROCE",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "AREA",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "AREA",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "BEDROOMS",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "BEDROOMS",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "BATHROOMS",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "BATHROOMS",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "FLOOR",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "FLOOR",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "ROOMS",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "ROOMS",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "BALCONY",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "BALCONY",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "TERRANCE",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "TERRANCE",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "CELLAR",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "CELLAR",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "HEATING",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "HEATING",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "AIR_CONDITIONING",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "AIR_CONDITIONING",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            },
            {
              "id": "IMAGE URLS",
              "type": "string",
              "display": true,
              "required": false,
              "displayName": "IMAGE URLS",
              "defaultMatch": false,
              "canBeUsedToMatch": true
            }
          ],
          "mappingMode": "defineBelow",
          "matchingColumns": [
            "URL"
          ],
          "attemptToConvertTypes": false,
          "convertFieldsToString": false
        },
        "options": {},
        "operation": "appendOrUpdate",
        "sheetName": {
          "__rl": true,
          "mode": "list",
          "value": "gid=0",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1jtMyMglBbekD9Z407q8-0vn-cDDXhM81Uj1oAZIJGX8/edit#gid=0",
          "cachedResultName": "Foglio1"
        },
        "documentId": {
          "__rl": true,
          "mode": "list",
          "value": "1jtMyMglBbekD9Z407q8-0vn-cDDXhM81Uj1oAZIJGX8",
          "cachedResultUrl": "https://docs.google.com/spreadsheets/d/1jtMyMglBbekD9Z407q8-0vn-cDDXhM81Uj1oAZIJGX8/edit?usp=drivesdk",
          "cachedResultName": "Real Estate listing"
        }
      },
      "credentials": {
        "googleSheetsOAuth2Api": {
          "name": "<your credential>"
        }
      },
      "typeVersion": 4.7
    },
    {
      "id": "a7a0cf53-eee4-43ec-ac95-7d311956af05",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        16,
        256
      ],
      "parameters": {
        "width": 944,
        "height": 624,
        "content": "# Automate Idealista Real Estate Listing Property Scraper with ScrapeGraph AI and Google Sheets\nThis workflow automates the process of **scraping real estate property listings** from websites using **ScrapeGraph AI**, extracting structured data, and saving it to a **Google Sheet**. It is designed to handle paginated listing pages and can be adapted to any real estate site that uses URL parameters for pagination.\n\nNOTE:\nThis workflow has been tested with the Idealista real estate website in Italy. However, it is designed to be adaptable: by modifying the pagination parameter and the listing URL pattern, you can use it with **any real estate website** that structures its listings with URL-based pagination.\n\n### **How it works:**\n\nThe workflow operates in two structured phases: **listing URL discovery** and **data extraction & storage**. First, a Code node generates paginated listing URLs using a base URL, maximum page count, and pagination parameter. Each page is processed by **ScrapeGraphAI** to extract individual property URLs, which are validated and structured using a Google Gemini-powered Information Extractor. A Wait node controls request pacing, and looping ensures all pages are processed safely.\n\nIn the second phase, collected listing URLs are aggregated and iterated individually. ScrapeGraphAI extracts structured property data (price, area, rooms, features, images, etc.) according to a defined JSON schema. The results are written to **Google Sheets**, where records are deduplicated based on listing URL. The modular design enables scalability, schema customization, and storage replacement.\n\n### **Setup steps:**\n\nStart by importing the workflow into n8n and configuring required credentials: **ScrapeGraphAI API**, **Google Gemini API**, and **Google Sheets OAuth2**. Prepare a Google Sheet (or clone the template), then note the **Document ID** and **Sheet Name** for configuration.\n\nIn the **Set params** node, define the base listing URL and number of pages to scrape. If needed, update the listing URL pattern in the **Extract individual URL** node and adjust the JSON schema in the **Extract data** node to match your target fields. Finally, configure the Google Sheets node with correct column mappings, activate the workflow, and execute it to begin automated scraping and structured data collection.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "731db16d-3d66-4da4-96d1-4a72fecb85dd",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        208,
        1536
      ],
      "parameters": {
        "color": 7,
        "width": 704,
        "height": 384,
        "content": "## STEP 2 - Extract Urls\n\nEach paginated listing page is scraped with **ScrapegraphAI** to extract the individual listing URLs.\n"
      },
      "typeVersion": 1
    },
    {
      "id": "fcfe4d66-b1f1-4f3f-8d03-e33d3f507a04",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        16,
        928
      ],
      "parameters": {
        "color": 7,
        "width": 944,
        "height": 384,
        "content": "## STEP 3 - Extract Urls\n\nAll collected listing URLs are aggregated and split into individual items\n"
      },
      "typeVersion": 1
    },
    {
      "id": "27acd568-2fbe-4173-ab65-3fd1a31c3b9c",
      "name": "Edit Fields",
      "type": "n8n-nodes-base.set",
      "position": [
        688,
        1648
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "6a231f21-9dfd-495b-8f35-72ec6ce63839",
              "name": "urls",
              "type": "array",
              "value": "={{ $json.result.listing_urls }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "909a1fe3-63d1-44d6-9325-233e7290575b",
      "name": "Sticky Note8",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1008,
        144
      ],
      "parameters": {
        "color": 7,
        "width": 736,
        "height": 736,
        "content": "## MY NEW YOUTUBE CHANNEL\n\ud83d\udc49 [Subscribe to my new **YouTube channel**](https://youtube.com/@n3witalia). Here I\u2019ll share videos and Shorts with practical tutorials and **FREE templates for n8n**.\n\n[![image](https://n3wstorage.b-cdn.net/n3witalia/youtube-n8n-cover.jpg)](https://youtube.com/@n3witalia)"
      },
      "typeVersion": 1
    }
  ],
  "active": false,
  "settings": {
    "binaryMode": "separate",
    "executionOrder": "v1"
  },
  "versionId": "b5aba440-74f3-4d04-9432-e79a1d8e81a3",
  "connections": {
    "Limit": {
      "main": [
        [
          {
            "node": "Loop Over Items1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Unified": {
      "main": [
        [
          {
            "node": "Split Out1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Aggregate": {
      "main": [
        [
          {
            "node": "Unified",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Out": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set params": {
      "main": [
        [
          {
            "node": "Generate Urls",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Out1": {
      "main": [
        [
          {
            "node": "Limit",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Edit Fields": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract data": {
      "main": [
        [
          {
            "node": "Update real estate listings",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Generate Urls": {
      "main": [
        [
          {
            "node": "Split Out",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Loop Over Items": {
      "main": [
        [
          {
            "node": "Aggregate",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Scrape listings",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Scrape listings": {
      "main": [
        [
          {
            "node": "Edit Fields",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Loop Over Items1": {
      "main": [
        [],
        [
          {
            "node": "Extract data",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Update real estate listings": {
      "main": [
        [
          {
            "node": "Loop Over Items1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking \u2018Execute workflow\u2019": {
      "main": [
        [
          {
            "node": "Set params",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
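
For readers who want to check the pagination step outside n8n, the logic of the Generate Urls Code node can be sketched as a plain JavaScript function. This is a minimal sketch, not the node itself: `buildPageUrls` is a hypothetical helper name, and the trailing-slash normalisation and the empty-array fallback for an invalid page count are assumptions added here (the original node concatenates the base URL and suffix without either check).

```javascript
// Standalone sketch of the "Generate Urls" Code node logic.
// buildPageUrls is a hypothetical name, not part of the workflow.
function buildPageUrls(baseUrl, maxPages) {
  // max_pages is stored as a string in the "Set params" node.
  const pages = parseInt(maxPages, 10);

  // Assumption: return no URLs when the page count is missing or invalid.
  if (!Number.isInteger(pages) || pages < 1) {
    return [];
  }

  // Assumption: make sure the base URL ends with exactly one slash.
  const base = baseUrl.endsWith('/') ? baseUrl : `${baseUrl}/`;

  const urls = [];
  for (let i = 1; i <= pages; i++) {
    // Same 'lista-N.htm' suffix the workflow appends.
    urls.push(`${base}lista-${i}.htm`);
  }
  return urls;
}

// Example call with the page count used in the workflow's "Set params" node.
const pageUrls = buildPageUrls('https://www.idealista.it/vendita-case/verona-verona/', '2');
console.log(pageUrls);
```

To adapt the workflow to another portal, the suffix template inside the loop is the part that would most likely need to change.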

Credentials you'll need

Each integration node will prompt for credentials when you import. We strip credential IDs before publishing — you'll add your own.

About this workflow

This workflow automates the process of scraping real estate listings from Idealista (or similar property portals), extracting structured property data using AI, and storing the results directly in Google Sheets.
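
The storage step can be illustrated with a short JavaScript sketch of how one extraction result becomes one sheet row. It assumes the input shape implied by the workflow's expressions (`website_url` plus `result.items[0]`); `toSheetRow` and `upsertByUrl` are hypothetical helper names, and the `Map` is only an in-memory stand-in for the Google Sheets "append or update" operation matched on the URL column, not the real API.

```javascript
// Sketch of the column mapping in the "Update real estate listings" node.
// toSheetRow and upsertByUrl are hypothetical names used for illustration.
function toSheetRow(output) {
  // The workflow expressions read the first element of result.items.
  const listing = (output.result && output.result.items && output.result.items[0]) || {};
  return {
    URL: output.website_url,
    TITLE: listing.title,
    DESCRIPTION: listing.description,
    REFERENCE: listing.reference,
    PROCE: listing.price, // header spelled as in the workflow's column mapping
    AREA: listing.area,
    BEDROOMS: listing.bedrooms,
    BATHROOMS: listing.bathrooms,
    FLOOR: listing.floor,
    ROOMS: listing.rooms,
    BALCONY: listing.balcony,
    TERRANCE: listing.terrace, // header spelled as in the workflow's column mapping
    CELLAR: listing.cellar,
    HEATING: listing.heating,
    AIR_CONDITIONING: listing.air_conditioning,
    'IMAGE URLS': JSON.stringify(listing.image_urls),
  };
}

// In-memory stand-in for "appendOrUpdate" matched on URL: a row whose URL is
// already stored replaces the old row, otherwise it is added as a new one.
function upsertByUrl(sheet, row) {
  sheet.set(row.URL, row);
  return sheet;
}
```

Because rows are keyed by URL, running the workflow again on the same listings updates the existing rows instead of appending duplicates. Note that the column headers must match the target sheet exactly, including the spellings used in the template.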

Source: https://n8n.io/workflows/15510/ (original creator credit).
