From: | Alban Hertroys <haramrae(at)gmail(dot)com> |
---|---|
To: | Job <Job(at)colliniconsulting(dot)it> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Insert large number of records |
Date: | 2017-09-19 16:09:33 |
Message-ID: | D653223D-1492-4F6C-AB54-689F4EF7E4C8@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
> On 19 Sep 2017, at 15:47, Job <Job(at)colliniconsulting(dot)it> wrote:
>
> Hi guys,
>
> we need to insert from a table to another (Postgresql 9.6.1) a large amount of data (about 10/20 millions of rows) without locking destination table.
> Pg_bulkload is the fastest way but it locks the table.
>
> Are there other ways?
> Classic "COPY" from?
We do something like that using a staging table to load to initially (although not bulk; data arrives in our staging table with batches of 5k to 100k rows) and then we transfer the data using insert/select and "on conflict do".
That data-transfer within PG takes a couple of minutes on our rather limited VM for a wide 37M rows table (~37GB on disk). That only locks the staging table (during initial bulkload) and the rows in the master table that are currently being altered (during the insert/select).
If your data-source is a file in a format supported by COPY, then you can use COPY to do the initial bulk load into the staging table.
Some benefits of this 2-stage approach are that it leaves room to manipulate the data (type conversions, for example) and that it can handle the scenario where a matching target record in the master table already exists. In our case, we convert character fields to varchar (which saves a lot of space(s)).
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
From | Date | Subject | |
---|---|---|---|
Next Message | chiru r | 2017-09-19 16:57:04 | USER Profiles for PostgreSQL |
Previous Message | Yason TR | 2017-09-19 14:53:28 | Re: JDBC: logical replication and LSN feedback |