Re: Massive table (500M rows) update nightmare

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Massive table (500M rows) update nightmare
Date: 2010-01-07 07:49:30
Message-ID: dcc563d11001062349y54bdbdcbm5d95fc5f08a332a8@mail.gmail.com
Lists: pgsql-performance

On Thu, Jan 7, 2010 at 12:17 AM, Carlo Stonebanks
<stonec(dot)register(at)sympatico(dot)ca> wrote:
> Our DB has an audit table which is 500M rows and growing. (FYI the objects
> being audited are grouped semantically, not individual field values).
>
> Recently we wanted to add a new feature and we altered the table to add a
> new column. We are backfilling this varchar(255) column by writing a TCL
> script to page through the rows (where every update is an UPDATE ... WHERE
> id >= x AND id < x+10, and a commit is performed after every 1000 update
> statements, i.e. every 10000 rows).
>
> We have 10 columns, six of which are indexed. Rough calculations suggest
> that this will take two to three weeks to complete on an 8-core CPU with
> more than enough memory.
>
> As a ballpark estimate - is this sort of performance for 500M updates
> what one would expect of PG given the table structure (detailed below) or
> should I dig deeper to look for performance issues?
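
For scale, the quoted figures can be turned into rough throughput numbers. The sketch below is only back-of-the-envelope arithmetic; it assumes the ids are dense (no gaps), so that 10 ids per UPDATE really means 10 rows, and it takes "two to three weeks" as 14 and 21 days. The function name is made up for illustration.

```python
# Figures quoted in the original message.
TOTAL_ROWS = 500_000_000
ROWS_PER_UPDATE = 10          # WHERE id >= x AND id < x+10
UPDATES_PER_COMMIT = 1_000    # commit after every 1000 UPDATE statements
SECONDS_PER_DAY = 86_400


def backfill_stats(days):
    """Return (UPDATE statements, commits, rows per second) for a run of `days`.

    Assumes dense ids, so every UPDATE touches exactly ROWS_PER_UPDATE rows.
    """
    statements = TOTAL_ROWS // ROWS_PER_UPDATE
    commits = statements // UPDATES_PER_COMMIT
    rows_per_second = TOTAL_ROWS / (days * SECONDS_PER_DAY)
    return statements, commits, rows_per_second


for days in (14, 21):
    statements, commits, rate = backfill_stats(days)
    print(f"{days} days: {statements:,} updates, {commits:,} commits, ~{rate:.0f} rows/s")
```

Under those assumptions the estimate works out to roughly 280 to 410 rows per second across 50 million UPDATE statements, which is the number to compare against what a single batch actually costs.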

Got an EXPLAIN ANALYZE of the update query?
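
One possible way to capture that plan for a single batch is sketched below. EXPLAIN ANALYZE really executes the statement, so the script wraps it in BEGIN ... ROLLBACK to discard the changes. The table name, column name, and value expression here (audit_table, new_col, 'backfill') are placeholders, since the real definitions are not shown in this message, and the helper itself is hypothetical.

```python
def explain_update_batch(table, column, value_expr, start_id, batch_size=10):
    """Build a psql script that times one batch UPDATE without keeping it.

    The identifiers are interpolated as-is, so this is only meant for
    hand-written, trusted values, not for user input.
    """
    end_id = start_id + batch_size
    return "\n".join([
        "BEGIN;",
        f"EXPLAIN ANALYZE UPDATE {table} SET {column} = {value_expr}",
        f" WHERE id >= {start_id} AND id < {end_id};",
        "ROLLBACK;",  # EXPLAIN ANALYZE executed the UPDATE; undo it
    ])


# Hypothetical names; substitute the real table, column, and expression.
print(explain_update_batch("audit_table", "new_col", "'backfill'", 1000))
```

Feeding the printed script to psql should show the time spent on one 10-row batch, which makes it easier to tell whether the cost is in finding the rows or in maintaining the six indexes.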
