
Scrape Listings to CSV Every 6 Hours

Original n8n title: 6ixo - Scrape Listings to CSV (No Code Editing)

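Scrapes listing pages into a CSV file stored in a GitHub repository, with every setting kept in a single Set node so the code never needs editing. Runs manually or every 6 hours; 4 nodes.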

Data & Sheets · Trigger: Event · Nodes: 4 · Complexity: ★★★★☆

The workflow JSON

Copy the full n8n JSON below and paste it into a new n8n workflow. Set your GitHub token, owner, repo, and branch in the Set Config Here node, then activate the workflow.

{
  "name": "6ixo - Scrape Listings to CSV (No Code Editing)",
  "nodes": [
    {
      "parameters": {},
      "id": "manual-trigger",
      "name": "Manual Trigger",
      "type": "n8n-nodes-base.manualTrigger",
      "typeVersion": 1,
      "position": [
        160,
        220
      ]
    },
    {
      "parameters": {
        "rule": {
          "interval": [
            {
              "field": "hours",
              "hoursInterval": 6
            }
          ]
        }
      },
      "id": "schedule-trigger",
      "name": "Every 6 Hours",
      "type": "n8n-nodes-base.scheduleTrigger",
      "typeVersion": 1.2,
      "position": [
        160,
        420
      ]
    },
    {
      "parameters": {
        "assignments": {
          "assignments": [
            {
              "id": "githubToken",
              "name": "githubToken",
              "value": "PASTE_GITHUB_TOKEN_HERE",
              "type": "string"
            },
            {
              "id": "githubOwner",
              "name": "githubOwner",
              "value": "bisco401",
              "type": "string"
            },
            {
              "id": "githubRepo",
              "name": "githubRepo",
              "value": "6ixo",
              "type": "string"
            },
            {
              "id": "githubBranch",
              "name": "githubBranch",
              "value": "main",
              "type": "string"
            },
            {
              "id": "csvPath",
              "name": "csvPath",
              "value": "data/scraped-listings.csv",
              "type": "string"
            },
            {
              "id": "defaultImportStatus",
              "name": "defaultImportStatus",
              "value": "pending",
              "type": "string"
            },
            {
              "id": "sourcesJson",
              "name": "sourcesJson",
              "value": "[\n  {\n    \"name\": \"Tonaton Vehicles\",\n    \"enabled\": true,\n    \"list_url\": \"https://tonaton.com/c_cars\",\n    \"base_url\": \"https://tonaton.com\",\n    \"target_surface\": \"vehicles\",\n    \"app_category\": \"vehicles\",\n    \"app_subcategory\": \"cars\",\n    \"rate_limit_seconds\": 3,\n    \"extractor_config\": {\n      \"cardPattern\": \"<a[\\\\s\\\\S]*?href=\\\"[^\\\"]*car[\\\\s\\\\S]*?<\\\\/a>\",\n      \"titlePattern\": \"<h[12][^>]*>([\\\\s\\\\S]*?)<\\\\/h[12]>\",\n      \"pricePattern\": \"(GH\u00a2|GHS)\\\\s?[0-9,.]+\",\n      \"urlPattern\": \"href=\\\"([^\\\"]+)\\\"\",\n      \"imagePattern\": \"<img[^>]+src=\\\"([^\\\"]+)\\\"\",\n      \"locationPattern\": \"Greater Accra[^<]*\",\n      \"country\": \"Ghana\"\n    }\n  }\n]",
              "type": "string"
            }
          ]
        },
        "options": {}
      },
      "id": "set-config",
      "name": "Set Config Here",
      "type": "n8n-nodes-base.set",
      "typeVersion": 3.4,
      "position": [
        460,
        320
      ]
    },
    {
      "parameters": {
        "jsCode": "const input = $input.first().json;\nconst OWNER = input.githubOwner || 'bisco401';\nconst REPO = input.githubRepo || '6ixo';\nconst BRANCH = input.githubBranch || 'main';\nconst CSV_PATH = input.csvPath || 'data/scraped-listings.csv';\nconst TOKEN = input.githubToken;\nconst DEFAULT_STATUS = input.defaultImportStatus || 'pending';\nlet SOURCES = [];\ntry {\n  SOURCES = typeof input.sourcesJson === 'string' ? JSON.parse(input.sourcesJson || '[]') : input.sourcesJson;\n} catch (error) {\n  throw new Error('sourcesJson is not valid JSON. Do not edit it unless you need to add more sources.');\n}\n\nif (!TOKEN || TOKEN === 'PASTE_GITHUB_TOKEN_HERE') throw new Error('Paste your GitHub token into the Set Config Here node.');\nif (!Array.isArray(SOURCES) || !SOURCES.length) throw new Error('The Set Config Here node needs at least one source in sourcesJson.');\n\nconst httpRequest = async (options) => {\n  const request = {\n    method: options.method || 'GET',\n    uri: options.url,\n    url: options.url,\n    headers: options.headers || {},\n    body: options.body,\n    json: options.json === true,\n    resolveWithFullResponse: true,\n    simple: false\n  };\n  try {\n    const response = await this.helpers.httpRequest(request);\n    const status = response.statusCode || response.status || 200;\n    const body = response.body ?? 
response;\n    return { status, ok: status >= 200 && status < 300, body };\n  } catch (error) {\n    const status = error.statusCode || error.status || error.response?.status || error.response?.statusCode || 500;\n    const body = error.response?.body || error.response?.data || error.message || '';\n    return { status, ok: status >= 200 && status < 300, body };\n  }\n};\n\nconst csvHeaders = ['id','status','target_surface','app_category','app_subcategory','title','price_text','price_value','currency','city','country','seller','phone','description','image_urls','source_site','source_url','scraped_at','make','model','trim','year','condition','transmission','color','mileage_km','attributes'];\nconst ghHeaders = { authorization: `Bearer ${TOKEN}`, accept: 'application/vnd.github+json', 'x-github-api-version': '2022-11-28' };\nconst sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));\nconst strip = (value = '') => String(value || '').replace(/<script[\\s\\S]*?<\\/script>/gi, ' ').replace(/<style[\\s\\S]*?<\\/style>/gi, ' ').replace(/<[^>]+>/g, ' ').replace(/&nbsp;/g, ' ').replace(/&amp;/g, '&').replace(/&#8373;|&#x20b5;/gi, 'GH\u00a2').replace(/\\s+/g, ' ').trim();\nconst rx = (pattern, flags = 'i') => pattern ? new RegExp(pattern, flags) : null;\nconst firstMatch = (text, pattern, flags = 'i') => {\n  const re = rx(pattern, flags);\n  if (!re) return '';\n  const match = String(text || '').match(re);\n  return strip(match?.[1] || match?.[0] || '');\n};\nconst absUrl = (value, base) => {\n  try { return new URL(String(value || ''), base || undefined).toString(); }\n  catch { return String(value || '').trim(); }\n};\nconst priceValue = (priceText = '') => {\n  const parsed = Number(String(priceText || '').replace(/[^0-9.]/g, ''));\n  return Number.isFinite(parsed) ? 
String(parsed) : '';\n};\nconst currency = (priceText = '') => {\n  const text = String(priceText || '');\n  if (/GH\u00a2|GHS/i.test(text)) return 'GHS';\n  if (/\\$/.test(text)) return 'USD';\n  if (/\u00a3/.test(text)) return 'GBP';\n  if (/\u20ac/.test(text)) return 'EUR';\n  return '';\n};\nconst guessYear = (text = '') => String(String(text || '').match(/\\b(19|20)\\d{2}\\b/)?.[0] || '');\nconst normalizeCondition = (value = '') => {\n  const text = String(value || '').toLowerCase();\n  if (/like new/.test(text)) return 'like_new';\n  if (/excellent/.test(text)) return 'excellent';\n  if (/good/.test(text)) return 'good';\n  if (/brand new|\\bnew\\b/.test(text)) return 'new';\n  if (/foreign used|local used|used|pre-owned|preowned/.test(text)) return 'used';\n  return '';\n};\nconst csvEscape = (value = '') => {\n  const text = String(value ?? '');\n  return /[\",\\n\\r]/.test(text) ? `\"${text.replace(/\"/g, '\"\"')}\"` : text;\n};\nconst parseCsv = (text = '') => {\n  const rows = [];\n  let row = [];\n  let cell = '';\n  let quoted = false;\n  const pushCell = () => { row.push(cell); cell = ''; };\n  const pushRow = () => { pushCell(); if (row.some((v) => String(v || '').trim())) rows.push(row); row = []; };\n  const input = String(text || '').replace(/^\\uFEFF/, '');\n  for (let i = 0; i < input.length; i += 1) {\n    const char = input[i];\n    const next = input[i + 1];\n    if (quoted) {\n      if (char === '\"' && next === '\"') { cell += '\"'; i += 1; }\n      else if (char === '\"') quoted = false;\n      else cell += char;\n    } else if (char === '\"') quoted = true;\n    else if (char === ',') pushCell();\n    else if (char === '\\n') pushRow();\n    else if (char !== '\\r') cell += char;\n  }\n  if (cell || row.length) pushRow();\n  const headers = (rows.shift() || csvHeaders).map((h) => String(h || '').trim());\n  return rows.map((values) => headers.reduce((acc, header, index) => { if (header) acc[header] = values[index] || ''; return acc; }, 
{}));\n};\nconst toCsv = (rows = []) => [csvHeaders.join(','), ...rows.map((row) => csvHeaders.map((header) => csvEscape(row[header] || '')).join(','))].join('\\n') + '\\n';\nconst getGithubCsv = async () => {\n  const url = `https://api.github.com/repos/${OWNER}/${REPO}/contents/${encodeURIComponent(CSV_PATH).replace(/%2F/g, '/')}?ref=${BRANCH}`;\n  const res = await httpRequest({ url, headers: ghHeaders, json: true });\n  if (res.status === 404) return { sha: null, text: csvHeaders.join(',') + '\\n' };\n  if (!res.ok) throw new Error(`Could not read ${CSV_PATH}: ${res.status} ${typeof res.body === 'string' ? res.body : JSON.stringify(res.body)}`);\n  const json = typeof res.body === 'string' ? JSON.parse(res.body) : res.body;\n  return { sha: json.sha, text: Buffer.from(json.content || '', 'base64').toString('utf8') };\n};\nconst putGithubCsv = async (text, sha) => {\n  const url = `https://api.github.com/repos/${OWNER}/${REPO}/contents/${encodeURIComponent(CSV_PATH).replace(/%2F/g, '/')}`;\n  const body = {\n    message: 'Update scraped listings CSV',\n    branch: BRANCH,\n    content: Buffer.from(text, 'utf8').toString('base64'),\n    ...(sha ? { sha } : {})\n  };\n  const res = await httpRequest({ url, method: 'PUT', headers: { ...ghHeaders, 'content-type': 'application/json' }, body, json: true });\n  if (!res.ok) throw new Error(`Could not update ${CSV_PATH}: ${res.status} ${typeof res.body === 'string' ? res.body : JSON.stringify(res.body)}`);\n  return typeof res.body === 'string' ? JSON.parse(res.body) : res.body;\n};\nconst extractListings = (source, html) => {\n  const config = source.extractor_config || {};\n  const base = source.base_url || source.list_url;\n  const cardRe = rx(config.cardPattern, 'gi');\n  const cards = cardRe ? 
Array.from(String(html || '').matchAll(cardRe)).map((m) => m[0]) : [html];\n  return cards.map((card) => {\n    const title = firstMatch(card, config.titlePattern || '<h[12][^>]*>([\\\\s\\\\S]*?)<\\\\/h[12]>', 'i') || firstMatch(card, config.fallbackTitlePattern || 'title=\"([^\"]+)\"', 'i');\n    const href = firstMatch(card, config.urlPattern || 'href=\"([^\"]+)\"', 'i');\n    const sourceUrl = absUrl(href || source.list_url, base);\n    const priceText = firstMatch(card, config.pricePattern || '(GH\u00a2|GHS|\\\\$|\u00a3|\u20ac)\\\\s?[0-9,.]+', 'i');\n    const image = firstMatch(card, config.imagePattern || '<img[^>]+src=\"([^\"]+)\"', 'i');\n    const location = firstMatch(card, config.locationPattern || '', 'i');\n    const description = firstMatch(card, config.descriptionPattern || '', 'i') || strip(card).slice(0, 280);\n    const attributes = {\n      employmentType: config.employmentType || '',\n      experienceLevel: config.experienceLevel || '',\n      remote: Boolean(config.remote),\n      tags: Array.isArray(config.tags) ? config.tags : []\n    };\n    const row = {\n      id: sourceUrl ? `csv-${Math.abs([...sourceUrl].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0))}` : `csv-${Date.now()}`,\n      status: source.default_status || DEFAULT_STATUS,\n      target_surface: source.target_surface || (source.app_category === 'vehicles' ? 
'vehicles' : 'marketplace'),\n      app_category: source.app_category || 'electronics',\n      app_subcategory: source.app_subcategory || '',\n      title,\n      price_text: priceText,\n      price_value: priceValue(priceText),\n      currency: currency(priceText),\n      city: config.city || location.split(',')[0]?.trim() || '',\n      country: config.country || '',\n      seller: firstMatch(card, config.sellerPattern || '', 'i') || config.seller || source.name,\n      phone: firstMatch(card, config.phonePattern || '(\\\\+?\\\\d[\\\\d\\\\s().-]{7,}\\\\d)', 'i') || config.phone || '',\n      description,\n      image_urls: image ? absUrl(image, base) : '',\n      source_site: source.name || '',\n      source_url: sourceUrl,\n      scraped_at: new Date().toISOString(),\n      make: firstMatch(card, config.makePattern || '', 'i') || config.make || '',\n      model: firstMatch(card, config.modelPattern || '', 'i') || config.model || '',\n      trim: firstMatch(card, config.trimPattern || '', 'i') || config.trim || '',\n      year: firstMatch(card, config.yearPattern || '', 'i') || guessYear(`${title} ${description}`),\n      condition: firstMatch(card, config.conditionPattern || '', 'i') || normalizeCondition(`${title} ${description}`),\n      transmission: firstMatch(card, config.transmissionPattern || '(Automatic|Manual)', 'i'),\n      color: firstMatch(card, config.colorPattern || '', 'i') || config.color || '',\n      mileage_km: firstMatch(card, config.mileagePattern || '', 'i'),\n      attributes: JSON.stringify(attributes)\n    };\n    return row;\n  }).filter((row) => row.title && row.source_url);\n};\n\nconst existing = await getGithubCsv();\nconst rowsByUrl = new Map(parseCsv(existing.text).map((row) => [String(row.source_url || '').trim(), row]));\nconst summary = [];\nfor (const source of SOURCES.filter((item) => item && item.enabled !== false)) {\n  let fetched = 0;\n  let inserted = 0;\n  try {\n    const delay = Math.max(0, 
Number(source.rate_limit_seconds || 0)) * 1000;\n    if (delay) await sleep(delay);\n    const response = await httpRequest({ url: source.list_url, headers: { 'user-agent': '6ixo-listing-ingest/1.0 (+https://6ixo.com)' } });\n    const html = typeof response.body === 'string' ? response.body : JSON.stringify(response.body || '');\n    if (!response.ok) throw new Error(`Fetch failed: ${response.status}`);\n    if (/Just a moment|cf-mitigated|challenges.cloudflare.com/i.test(html)) throw new Error('Blocked by anti-bot challenge. Use API, export, saved public HTML, or screenshot/OCR for this source.');\n    const scraped = extractListings(source, html);\n    fetched = scraped.length;\n    for (const row of scraped) {\n      const key = String(row.source_url || '').trim();\n      if (!key) continue;\n      rowsByUrl.set(key, { ...(rowsByUrl.get(key) || {}), ...row });\n      inserted += 1;\n    }\n    summary.push({ source: source.name, status: 'success', fetched, inserted });\n  } catch (error) {\n    summary.push({ source: source.name, status: 'failed', fetched, inserted, error: error.message });\n  }\n}\nconst nextRows = Array.from(rowsByUrl.values()).sort((a, b) => String(b.scraped_at || '').localeCompare(String(a.scraped_at || '')));\nawait putGithubCsv(toCsv(nextRows), existing.sha);\nreturn summary.map((row) => ({ json: row }));"
      },
      "id": "scrape-to-csv",
      "name": "Scrape + Update CSV",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "position": [
        760,
        320
      ]
    }
  ],
  "connections": {
    "Manual Trigger": {
      "main": [
        [
          {
            "node": "Set Config Here",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Every 6 Hours": {
      "main": [
        [
          {
            "node": "Set Config Here",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set Config Here": {
      "main": [
        [
          {
            "node": "Scrape + Update CSV",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  },
  "settings": {
    "executionOrder": "v1"
  },
  "staticData": null,
  "tags": [
    {
      "name": "6ixo"
    }
  ],
  "triggerCount": 0,
  "updatedAt": "2026-05-05T17:03:07.197Z",
  "versionId": "6ixo-csv-no-code-editing"
}
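Before adding a source to sourcesJson, it can help to try its extractor_config patterns against a saved copy of the listing page. The sketch below mirrors the card and field matching done in the Scrape + Update CSV node, run in plain Node.js outside n8n. The sample HTML, the simplified `strip`, and the `previewSource` helper are illustrative assumptions, not part of the workflow. Note that `firstMatch` returns the first capture group when a pattern has one, so the price pattern here uses a non-capturing group `(?:...)` to keep the full price text rather than only the currency symbol.

```javascript
// Patterns modeled on the Tonaton entry in sourcesJson (price group made non-capturing).
const extractorConfig = {
  cardPattern: '<a[\\s\\S]*?href="[^"]*car[\\s\\S]*?<\\/a>',
  titlePattern: '<h[12][^>]*>([\\s\\S]*?)<\\/h[12]>',
  pricePattern: '(?:GH\u00a2|GHS)\\s?[0-9,.]+',
  urlPattern: 'href="([^"]+)"',
};

// Hypothetical sample markup; replace with HTML saved from the real listing page.
const sampleHtml = `
<a class="card" href="/cars/toyota-corolla-2015"><h2>Toyota Corolla 2015</h2><span>GHS 85,000</span></a>
<a class="card" href="/cars/honda-civic-2018"><h2>Honda Civic 2018</h2><span>GH¢ 120,000</span></a>
`;

// Simplified version of the node's strip helper: drop tags, collapse whitespace.
const strip = (value = '') =>
  String(value || '').replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();

// Same rule as the node: capture group 1 if present, otherwise the whole match.
const firstMatch = (text, pattern, flags = 'i') => {
  if (!pattern) return '';
  const match = String(text || '').match(new RegExp(pattern, flags));
  return strip(match?.[1] || match?.[0] || '');
};

// Hypothetical helper: list the fields each matched card would produce.
const previewSource = (config, html, baseUrl) => {
  const cardRe = new RegExp(config.cardPattern, 'gi');
  return Array.from(html.matchAll(cardRe), (m) => m[0]).map((card) => ({
    title: firstMatch(card, config.titlePattern),
    price_text: firstMatch(card, config.pricePattern),
    source_url: new URL(firstMatch(card, config.urlPattern), baseUrl).toString(),
  }));
};

const rows = previewSource(extractorConfig, sampleHtml, 'https://tonaton.com');
console.table(rows);
```

If a card comes back with an empty title or URL, the node's final filter would drop it, so an empty preview row usually means the pattern needs adjusting.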

About this workflow

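The Set Config Here node holds the GitHub token, owner, repo, branch, CSV path, default import status, and a sourcesJson array describing each site to scrape. The Scrape + Update CSV Code node fetches each enabled source's list_url, extracts listing cards with the regex patterns in extractor_config, merges the rows into the existing CSV keyed by source_url, and commits the file back through the GitHub contents API. A source that fails to fetch or returns an anti-bot challenge is reported as failed in the node's summary output while the remaining sources still run.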
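The merge step in the Scrape + Update CSV node can be sketched on its own. The example below reuses the node's URL hash for row ids and its Map keyed by source_url; the `rowId` and `mergeRows` names and the sample rows are hypothetical, introduced only for illustration. Because the freshly scraped row is spread last, its columns, including status, replace the stored ones.

```javascript
// Same 32-bit string hash the node uses to derive a stable row id from the URL.
const rowId = (sourceUrl) =>
  `csv-${Math.abs([...sourceUrl].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0))}`;

// Hypothetical helper wrapping the node's Map-based merge and newest-first sort.
const mergeRows = (existingRows, scrapedRows) => {
  const rowsByUrl = new Map(
    existingRows.map((row) => [String(row.source_url || '').trim(), row])
  );
  for (const row of scrapedRows) {
    const key = String(row.source_url || '').trim();
    if (!key) continue; // rows without a URL cannot be deduplicated
    rowsByUrl.set(key, { ...(rowsByUrl.get(key) || {}), ...row });
  }
  return Array.from(rowsByUrl.values()).sort((a, b) =>
    String(b.scraped_at || '').localeCompare(String(a.scraped_at || ''))
  );
};

// Illustrative data only.
const existingRows = [
  { source_url: 'https://example.com/a', title: 'Old title', status: 'imported', scraped_at: '2026-01-01T00:00:00.000Z' },
];
const scrapedRows = [
  { source_url: 'https://example.com/a', title: 'New title', status: 'pending', scraped_at: '2026-01-02T00:00:00.000Z' },
  { source_url: 'https://example.com/b', title: 'Second', status: 'pending', scraped_at: '2026-01-02T00:00:01.000Z' },
];

const merged = mergeRows(existingRows, scrapedRows);
console.log(merged.map((row) => `${rowId(row.source_url)} ${row.status} ${row.title}`));
```

If a status edited by hand in the CSV should survive later runs, one option would be to spread the stored status back in after the scraped row; that is a design change, not something the workflow currently does.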

Source: https://github.com/bisco401/6ixo/blob/bb19bf2ec1a7fb3df35667a2d28f7278169403bf/automations/n8n/6ixo-scrape-to-csv-github-no-code-edit.json (original creator credit).


Related workflows

Workflows that share integrations, category, or trigger type with this one. All free to copy and import.

Data & Sheets

Workflow 01.01. Uses notion, executeWorkflowTrigger, httpRequest. Event-driven trigger; 60 nodes.

Notion, Execute Workflow Trigger, HTTP Request
Data & Sheets

Lmchatopenai Workflow. Uses noOp, stickyNote, executeWorkflowTrigger, airtable. Event-driven trigger; 41 nodes.

Execute Workflow Trigger, Airtable, HTTP Request
Data & Sheets

This n8n workflow retrieves an Airtable record along with its related child records in a hierarchical structure. It can fetch up to 3 levels of linked records and assembles them into a comprehensive J

Execute Workflow Trigger, Airtable, HTTP Request
Data & Sheets

Automate sales call analysis and store structured insights in Notion with AI-powered intelligence.

Execute Workflow Trigger, Notion, HTTP Request
Data & Sheets

This workflow allows you to batch update/insert Airtable rows in groups of 10, significantly reducing the number of API calls and increasing performance. Copy the 3 Nodes Copy the three nodes inside t

HTTP Request, Debug Helper, Execute Workflow Trigger