Re: Massive table (500M rows) update nightmare

From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Massive table (500M rows) update nightmare
Date: 2010-01-07 21:05:14
Message-ID: b1b9fac61001071305vf182f3ajff6827f92c943c68@mail.gmail.com
Lists: pgsql-performance

> every update is a UPDATE ... WHERE id >= x AND id < x+10 and a
> commit is performed after every 1000 update statements, i.e. every
> 10000 rows.

What is the rationale behind this? How about doing all 10k rows in one
UPDATE statement, and committing after each statement?
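
Something along these lines is what I have in mind (an untested
sketch; big_table, new_col, the SET expression and the 10000-id slice
are just placeholders for your actual names and numbers):

    BEGIN;
    UPDATE big_table
       SET new_col = 'some value'   -- whatever expression fills the new column
     WHERE id >= 0 AND id < 10000;
    COMMIT;

    BEGIN;
    UPDATE big_table
       SET new_col = 'some value'
     WHERE id >= 10000 AND id < 20000;
    COMMIT;

    -- ...and so on, one statement and one commit per 10000-id slice

Each commit still covers 10k rows, same as now, but with one statement
instead of a thousand.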

You could try putting the condition on the ctid column, so you don't
have to use the index on id and can process the rows in physical
order. First make sure that newly inserted production data has the
correct value in the new column, and add 'WHERE new_column IS NULL' to
the conditions. But I have never tried this, so use at your own risk.
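
Just to illustrate the idea (again untested, and big_table / new_col
are placeholders): one way to grab a batch of not-yet-updated rows
without going through the id index is to collect their ctids with a
subquery and update only those:

    BEGIN;
    UPDATE big_table
       SET new_col = 'some value'
     WHERE ctid = ANY (ARRAY(
               SELECT ctid
                 FROM big_table
                WHERE new_col IS NULL
                LIMIT 10000));
    COMMIT;
    -- repeat until the UPDATE reports 0 rows affected

The 'new_col IS NULL' test also makes it safe to re-run a batch that
was interrupted.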

Greetings
Marcin Mank
