How to load data from CSV into a table that has array types in its columns?

From: Siddharth Jain <siddhsql(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: How to load data from CSV into a table that has array types in its columns?
Date: 2022-10-27 00:30:00
Message-ID: CAPqV3pRCQk=JRSZAGwgo-Uxs=5MD_7ZCViYdqTn0NyD_ZKrT6A@mail.gmail.com
Lists: pgsql-general

Hello,

Given a long list like this:

1,2
2,4 --> 2 appears once
7,9
8,9
5,3
2,5 --> note 2 appears twice

I want to load it into this table:

create table tbl (
  id integer primary key,
  fids integer[]
);

so we will have 2 -> [4,5], where 2 is the id and [4,5] is the fids array.

My actual dataset is very large, about 100M rows. How can it be loaded into
Postgres efficiently?
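
One idea I had was to skip the application code entirely: COPY the raw pairs
into a staging table and aggregate in SQL. A rough, untested sketch
(/tmp/pairs.csv is a made-up path):

create unlogged table staging (id integer, fid integer);

-- server-side COPY reads a file on the database server;
-- from psql one could use \copy with a client-side file instead
copy staging from '/tmp/pairs.csv' with (format csv);

-- collapse the pairs: one row per id, with all of its fids in an array
insert into tbl (id, fids)
select id, array_agg(fid)
from staging
group by id;

But I don't know if one giant COPY plus a GROUP BY over 100M rows is
reasonable on my hardware.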

I think I have to use a program for this, and am trying the pg
(node-postgres) library. I am reading the data in batches of, say, 1M or
100k rows, and building the id -> fids mapping in a Node.js dictionary
(hashmap). The part where I am stuck is how to generate the SQL command.
The requirements are:

1. I want to make one call to the server for each batch of 1M rows, not 1M
calls
2. Some keys in the hashmap might already exist in the database; for these
keys we want to append to the existing array
3. Other keys will not exist yet, and for these we want to insert new rows
(see my rough sketch below)
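
What I imagine is a single parameterized statement per batch: the Node.js
code would flatten the hashmap into two parallel arrays and send them as the
only two parameters. An untested sketch of the SQL, with $1 and $2 as the
parallel int[] parameters:

-- $1 = ids, $2 = fids, flattened so that $1[i] pairs with $2[i]
insert into tbl (id, fids)
select k, array_agg(f)
from unnest($1::int[], $2::int[]) as t(k, f)
group by k  -- one row per key, so ON CONFLICT never sees the same id twice
on conflict (id) do update
  set fids = tbl.fids || excluded.fids;  -- append the batch's fids to the existing array

Is something like this the right direction, or is there a better way?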

Can someone help me please? I am using Postgres for the first time.
