Re: 8.x index insert performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kelly Burkhart <kelly(at)tradebotsystems(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 8.x index insert performance
Date: 2005-11-10 23:01:38
Message-ID: 12582.1131663698@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Kelly Burkhart <kelly(at)tradebotsystems(dot)com> writes:
> On Thu, 2005-11-10 at 17:18 -0500, Tom Lane wrote:
>> Could you send me (off list) the complete gprof output files?

> Sure,

Thanks. Right offhand I can see no smoking gun here. The
pg_detoast_datum entry I was worried about seems to be just measurement
noise --- the gprof trace shows that it's called a proportional number
of times in both cases, and it falls through without actually doing
anything in all cases.

The later trace involves a slightly larger amount of time spent
inserting into the indexes, which is what you'd expect as the indexes
get bigger, but it doesn't seem that CPU time per se is the issue.
The just-before-the-cliff trace shows total CPU of 5.38 sec and the
after-the-cliff one shows 6.61 sec.

What I now suspect is happening is that you "hit the wall" at the point
where the indexes no longer fit into main memory and it starts taking
significant I/O to search and update them. Have you tried watching
iostat or vmstat output to see if there's a noticeable increase in I/O
at the point where things slow down? Can you check the physical size of
the indexes at that point, and see if it seems related to your available
RAM?

If that is the correct explanation, then the only solutions I can see
are (1) buy more RAM or (2) avoid doing incremental index updates;
that is, drop the indexes before bulk load and rebuild them afterwards.

One point to consider is that an index will be randomly accessed only
if its data is being loaded in random order. If you're loading keys in
sequential order then only the "right-hand edge" of the index would get
touched, and it wouldn't need much RAM. So, depending on what order
you're loading data in, the primary key index may not be contributing
to the problem.

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Charlie Savage 2005-11-11 00:11:48 Re: Index Scan Costs versus Sort
Previous Message Tom Lane 2005-11-10 22:18:14 Re: 8.x index insert performance