Re: slow update of index during insert/copy

From: Thomas Finneid <tfinneid(at)student(dot)matnat(dot)uio(dot)no>
To: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: slow update of index during insert/copy
Date: 2008-09-01 11:29:50
Message-ID: 48BBD22E.4090308@ifi.uio.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Craig Ringer wrote:
> Just on a side note, your system is pretty strangely heavy on CPU
> compared to its RAM and disk configuration. Unless your workload in Pg
> is computationally intensive or you have something else hosted on the
> same machine, those CPUs will probably sit mostly idle.

Its a devel machine for experimenting with pg and the disk performance
and for experimenting with multithreaded java programs. Its not going to
be particularily demanding on memory, but 8GB is good enough, I think.

> The first thing you need to do is determine where, during your bulk
> loads, the system is bottlenecked. I'd guess it's stuck waiting for disk
> writes, personally, but I'd want to investigate anyway.

Will investigate.

> If you're not satisfied with the results from pg_bulkload you can look
> into doing things like moving your indexes to separate tablespaces (so
> they don't fight for I/O on the same disk sets as your tables),
> separating your bulk load tables from other online/transactional tables,
> etc.

(Btw, its jdbc copy, not commandline.)
I dont think its the bulkload thats the problem, in it self, because
loading it without an index is quite fast (and 5 times faster than
ordinary insert). But of course, the bulkload process affects other
parts of the system which can cause a bottleneck.

> Also, to relay common advice from this list:
>
> If you land up considering hardware as a performance answer, getting a
> decent SAS RAID controller with a battery backed cache (so you can
> enable its write cache) and a set of fast SAS disks might be worth it.
> For that matter, a good SATA RAID controller and some 10kRPM SATA disks
> could help too. It all appears to depend a lot on the particular
> workload and the characteristics of the controller, though.

It does have a sata raid controller, but not have the battery pack,
because its a develmachine and not a production machine, I thought it
was not needed. But if you are saying the battery pack enables a cache
which enables faster disk writes I will consider it.
Its the first time I have worked with a raid controller, so I suspect I
have to read up on the features to understand how to utilise it best.

regards

thomas

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David West 2008-09-01 12:18:33 limit clause breaks query planner?
Previous Message Duan Ligong 2008-09-01 11:22:48 too many clog files