Re: Bulk Insert into PostgreSQL

From: Srinivas Karthik V <skarthikv(dot)iitb(at)gmail(dot)com>
To: Don Seiler <don(at)seiler(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: Bulk Insert into PostgreSQL
Date: 2018-06-29 22:47:42
Message-ID: CAEfuzeSHg9d3C6F5SyJRSQDFXyh-xhqjp91=n+2B-rN_oomSOQ@mail.gmail.com
Lists: pgsql-hackers

I was using the COPY command to load. Removing the primary key constraint on
the table before loading and re-adding it afterwards helps a lot. In fact, a
400GB table was loaded and the primary key constraint added in around 15
hours. Thanks for the wonderful suggestions.
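For anyone following the same approach, a minimal sketch of the workflow
looks roughly like this (table, column, and file names below are just
placeholders, not the actual schema from this thread):

    -- 1. Create the table without the primary key constraint.
    CREATE TABLE big_table (
        id      bigint,
        payload text
    );

    -- 2. Bulk-load with COPY (server-side file shown; \copy in psql
    --    reads from a client-side file instead).
    COPY big_table FROM '/path/to/big_table.csv' WITH (FORMAT csv);

    -- 3. Add the primary key afterwards, so the unique index is built
    --    once over the loaded data rather than maintained row by row.
    ALTER TABLE big_table ADD PRIMARY KEY (id);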

Regards,
Srinivas Karthik

On 28 Jun 2018 2:07 a.m., "Don Seiler" <don(at)seiler(dot)us> wrote:

> On Wed, Jun 27, 2018 at 6:25 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>>
>>
>>> Other parameters are set to default value. Moreover, I have specified
>>> the primary key constraint during table creation. This is the only possible
>>> index being created before data loading and I am sure there are no other
>>> indexes apart from the primary key column(s).
>>>
>>
> When doing initial bulk data loads, I would suggest not applying ANY
> constraints or indexes on the table until after the data is loaded.
> Especially unique constraints/indexes, those will slow things down A LOT.
>
>
>>
>> The main factor is using COPY instead of INSERTs.
>>
>>
> +1 to COPY.
>
>
> --
> Don Seiler
> www.seiler.us
>
