Re: Insert performance vs Table size

From: Jacques Caron <jc(at)directinfos(dot)com>
To: "Praveen Raja" <praveen(dot)raja(at)netlight(dot)se>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Insert performance vs Table size
Date: 2005-06-28 10:43:47
Message-ID: 6.2.0.14.0.20050628123516.03ac59e0@wheresmymailserver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

At 11:50 28/06/2005, Praveen Raja wrote:
>I assume you took size to mean the row size?

Nope, the size of the table.

> What I really meant was
>does the number of rows a table has affect the performance of new
>inserts into the table (just INSERTs) all other things remaining
>constant. Sorry for the confusion.

As I said previously, in most cases it does. One of the few cases where it
doesn't would be an append-only table, no holes, no indexes, no foreign keys...

>I know that having indexes on the table adds an overhead but again does
>this overhead increase (for an INSERT operation) with the number of rows
>the table contains?

It depends on what you are indexing. If the index key is something that
grows monotonically (e.g. a unique ID or a timestamp), then the size of the
table (and hence of the indexes) should have a very limited influence on
the INSERTs. If the index key is anything else (and that must definitely be
the case if you have 7 or 8 indexes!), then that means updates will happen
all over the indexes, which means a lot of read and write activity, and
once the total size of your indexes exceeds what can be cached in RAM,
performance will decrease quite a bit. Of course if your keys are
concentrated in a few limited areas of the key ranges it might help.

>My instinct says no to both. If I'm wrong can someone explain why the
>number of rows in a table affects INSERT performance?

As described above, maintaining indexes when you "hit" anywhere in said
indexes is very costly. The larger the table, the larger the indexes, the
higher the number of levels in the trees, etc. As long as it fits in RAM,
it shouldn't be a problem. Once you exceed that threshold, you start
getting a lot of random I/O, and that's expensive.

Again, it depends a lot on your exact schema, the nature of the data, the
spread of the different values, etc, but I would believe it's more often
the case than not.

Jacques.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Andrew Dunstan 2005-06-28 12:53:50 Re: COPY FROM performance improvements
Previous Message Praveen Raja 2005-06-28 09:50:08 Re: Insert performance vs Table size