Re: Bulk Insert into PostgreSQL

From: Srinivas Karthik V <skarthikv(dot)iitb(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Don Seiler <don(at)seiler(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: Bulk Insert into PostgreSQL
Date: 2018-07-03 23:34:31
Message-ID: CAEfuzeRWJufow_pn7bPJ8MB9jzUNpHn0piY6xiv+NBhTbRtFuA@mail.gmail.com
Lists: pgsql-hackers

@Peter: I was indexing the primary key of all the tables in TPC-DS. Some of
the fact tables have multiple columns as part of the primary key. Also, most
of the key columns are of numeric type.

On Mon, Jul 2, 2018 at 7:09 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Sun, Jul 1, 2018 at 5:19 PM, Tsunakawa, Takayuki
> <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
> > 400 GB / 15 hours = 7.6 MB/s
> >
> > That looks too slow. I experienced a similar slowness. When our user
> > tried to INSERT (not COPY) a billion records, they reported INSERTs slowed
> > down by 10 times or so after inserting about 500 million records. Periodic
> > pstack runs on Linux showed that the backend was busy in btree operations.
> > I didn't pursue the cause due to other work, but there might be
> > something to be improved.
>
> What kind of data was indexed? Was it a bigserial primary key, or
> something else?
>
> --
> Peter Geoghegan
>
