Re: slow table updates

From: Richard Huxton <dev(at)archonet(dot)com>
To: Reece Hart <rkh(at)gene(dot)COM>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>, SF PostgreSQL <sfpug(at)postgresql(dot)org>
Subject: Re: slow table updates
Date: 2003-07-23 08:49:09
Message-ID: 200307230949.10028.dev@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-performance sfpug

On Wednesday 23 July 2003 01:40, Reece Hart wrote:
> I'm trying to update a table but it's taking a very long time. I would
> appreciate any tips folks may have about ways to speed it up.
[snip]
> paprospect2 contains ~40M rows. The goal now is to migrate the data to
> the supertable-inherited column with
>
> update paprospect2 set run_id_new=run_id;
>
>
> The update's been running for 5 hours (unloaded dual 2.4 GHz Xeon w/2GB
> RAM, SCSI160 10K drive). There are no other jobs running. Load is ~1.2
> and the update's using ~3-5% of the CPU.
[snip]
> This suggests that the update is I/O bound (duh) and vmstat supports
> this:
[snip]
> Presumably the large number of blocks written (bo) versus blocks read
> (bi) reflects an enormous amount of bookkeeping that has to be done for
> MVCC, logging, perhaps rewriting a row for the new definition (a guess
> -- I don't know how this is handled), indicies, etc. There's no swapping
> and no processes are waiting. In short, it seems that this is ENTIRELY
> an I/O issue. Obviously, faster drives will help (but probably only by
> small factor).
>
> Any ideas how I might speed this up? Presumably this is all getting
> wrapped in a transaction -- does that hurt me for such a large update?

Well, it needs to keep enought bookkeeping to be able to rollback the whole
transaction if it encounters a problem, or 40M rows in your case. Looks like
you're right and it's an I/O issue. I must admit, I'm a bit puzzled that your
CPU is quite so low, but I suppose you've got two fast CPUs so it shouldn't
be high.

[note the following is more speculation than experience]
What might be happening is that the drive is spending all its time seeking
between the WAL, index and table as it updates. I would also tend to be
suspicious of the foreign keys - PG might be re-checking these, and obviously
that would take time too.

What you might want to try in future:
1. begin transaction
2. drop indexes, foreign keys
3. update table
4. vacuum it
5. recreate indexes, foreign keys etc
6. commit

Now that's just moving the index updating/fk stuff to the end of the task, but
it does seem to help sometimes.

HTH
--
Richard Huxton
Archonet Ltd

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Marvin 2003-07-23 14:28:02 OID's
Previous Message Renney Thomas 2003-07-23 08:23:19 compile error

Browse pgsql-performance by date

  From Date Subject
Next Message Shridhar Daithankar 2003-07-23 09:54:01 Re: Performance hit of foreign key constraints?
Previous Message Jean-Christian Imbeault 2003-07-23 07:05:00 Performance hit of foreign key constraints?

Browse sfpug by date

  From Date Subject
Next Message Reece Hart 2003-07-23 17:44:36 Re: [PERFORM] slow table updates
Previous Message Reece Hart 2003-07-23 00:40:01 slow table updates