Re: Massive table (500M rows) update nightmare

From: Leo Mannhart <leo(dot)mannhart(at)beecom(dot)ch>
To: Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Massive table (500M rows) update nightmare
Date: 2010-01-07 13:14:52
Message-ID: 4B45DE4C.4050606@beecom.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Carlo Stonebanks wrote:
> Our DB has an audit table which is 500M rows and growing. (FYI the
> objects being audited are grouped semantically, not individual field
> values).
>
> Recently we wanted to add a new feature and we altered the table to add
> a new column. We are backfilling this varchar(255) column by writing a
> TCL script to page through the rows (where every update is a UPDATE ...
> WHERE id >= x AND id < x+10 and a commit is performed after every 1000
> updates statement, i.e. every 10000 rows.)
>
> We have 10 columns, six of which are indexed. Rough calculations suggest
> that this will take two to three weeks to complete on an 8-core CPU with
> more than enough memory.
>
> As a ballpark estimate - is this sort of performance for an 500M updates
> what one would expect of PG given the table structure (detailed below)
> or should I dig deeper to look for performance issues?
>
> As always, thanks!
>
> Carlo
>

If it is possible to lock this audit table exclusively (may be during
off peak hours) I would look into
- create new_audit_table as select col1, col2, col3 ... col9,
'new_col_value' from old_audit_table;
- create all indexes
- drop old_audit_table
- rename new_audit_table to old_audit_table

That is probably the fasted method you can do, even if you have to join
the "new_col_value" from an extra helper-table with the correspondig id.
Remeber, databases are born to join.

You could also try to just update the whole table in one go, it is
probably faster than you expect.

just a thought
Leo

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Gurgel, Flavio 2010-01-07 13:24:24 Re: Air-traffic benchmark
Previous Message Lefteris 2010-01-07 12:38:41 Air-traffic benchmark