Re: very very slow inserts into very large table

From: Mark Thornton <mthornton(at)optrak(dot)com>
To: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: very very slow inserts into very large table
Date: 2012-07-16 18:59:07
Message-ID: 5004647B.3090308@optrak.com
Lists: pgsql-performance

On 16/07/12 18:56, Jon Nelson wrote:
> It is not my intent to
> insult or even disparage my favorite software, but it took less time
> to *build* the indices for 550GB of data than it would have to insert
> 1/20th as much. That doesn't seem right.
My explanation would apply to many databases, not just Postgres.

To speed up the insert, there are a number of possible approaches:

1. Partition the data and then sort the temporary table into groups
based on the partitioning. Best of all is when all the new data goes
into a single partition (first sketch below).

2. Drop the indexes before the insert and rebuild them afterwards
(second sketch below).

3. Reduce the number of indexes. If you only have one index, you can
sort the data to be inserted in the natural order of the index. If you
must have more than one index, you could still sort the new data in the
order of one of them to obtain a modest improvement in locality (third
sketch below).

4. The most efficient way for the database itself to do the updates
would be to first insert all the data into the table, and then update
each index in turn, having first sorted the inserted keys in the
appropriate order for that index.
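
A rough sketch of the first approach, with invented names: suppose a
hypothetical parent table "events" is partitioned by a "created" date
column (inheritance-style, with a child "events_2012_07" for July), and
the new rows sit in a temporary table "staging".

    -- Route the staged rows straight into the matching child table,
    -- sorted on the partitioning column, so the partition's (much
    -- smaller) indexes are updated in one contiguous pass.
    INSERT INTO events_2012_07
    SELECT *
    FROM staging
    WHERE created >= '2012-07-01' AND created < '2012-08-01'
    ORDER BY created;

If all the new data lands in one partition, only that partition's
indexes need to fit in cache during the load.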
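
The second approach, with the same invented names; the trade-off is
that the table is unindexed (and slow to query) for the duration of
the load:

    -- Drop the secondary index, bulk-load, then rebuild it.
    -- CREATE INDEX sorts all the keys itself, which is essentially
    -- what item 4 describes and why a fresh build is so much faster
    -- than maintaining the btree one row at a time.
    DROP INDEX events_created_idx;
    COPY events FROM '/tmp/new_rows.csv' WITH (FORMAT csv);
    CREATE INDEX events_created_idx ON events (created);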
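
The third approach, assuming for illustration that the only index is
on (customer_id, created): inserting in that order means consecutive
index entries land in nearby btree pages instead of scattering across
the whole index.

    -- Insert in the index's natural key order for better locality.
    INSERT INTO events
    SELECT *
    FROM staging
    ORDER BY customer_id, created;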

Mark
