Re: Updating a very large table

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-admin(at)postgresql(dot)org
Subject: Re: Updating a very large table
Date: 2009-04-25 05:26:29
Message-ID: 49F29F05.30805@cheapcomplexdevices.com
Lists: pgsql-admin

Kevin Grittner wrote:
> Chris Browne <cbbrowne(at)acm(dot)org> wrote:
>
>> I'd suggest adding an index
>
> The OP said the table had 15 indexes already. I would guess one of
> those could be used. Perhaps it has a primary key....
>
>> update table1 set new_column = [whatever calculation]
>> where new_column is null and
>> quasi_unique_column in
>> (select quasi_unique_column from table1
>> where new_column is null limit 1000);
>
> Or, if the primary key (or other unique or quasi-unique existing
> index) has multiple columns, this could still be done with:
>
> update table1 set new_column = [whatever calculation]
> where new_column is null and
> (col1, col2) in
> (select col1, col2 from table1
> where new_column is null limit 1000);
>

Would doing something with ctid be even better?
Or does it have some risks I'm missing? I'm thinking
something like:

fli=# select max(ctid) from table1;
     max
-------------
 (183000,42)
(1 row)

Then

update table1 set new_column=[whatever] where ctid<'(10000,1)';
vacuum;
update table1 set new_column=[whatever] where ctid>='(10000,1)' and ctid<'(20000,1)';
vacuum;
...
update table1 set new_column=[whatever] where ctid>='(180000,1)';
vacuum;

and perhaps a final

update table1 set new_column=[whatever] where new_column is null;

to catch any rows this might have missed?

Seems this makes it easier to control how much the table will
bloat too -- if I only want it to bloat 5% I divide the page number
from max(ctid) by 20 to get each group's size....
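
To make the arithmetic concrete, here is a rough Python sketch that writes out the statements for a given max(ctid) page number and number of groups. The function name and its parameters are made up for illustration, "[whatever]" is still just a placeholder for the real calculation, and I've used "vacuum table1" rather than a bare "vacuum" so only the one table is touched. It assumes tuple offsets within a page start at 1, so '(N,1)' is the lowest ctid on page N.

```python
def ctid_batch_statements(max_page, groups, table="table1",
                          column="new_column", expr="[whatever]"):
    """Return UPDATE/VACUUM statements that walk the table in page ranges.

    max_page is the page number from max(ctid), e.g. 183000 for (183000,42).
    groups is how many batches to split the table into (20 gives ~5% each).
    All names here are illustrative, not part of any PostgreSQL API.
    """
    if max_page < 0 or groups < 1:
        raise ValueError("max_page must be >= 0 and groups must be >= 1")

    total_pages = max_page + 1            # pages are numbered from 0
    step = -(-total_pages // groups)      # ceiling division: pages per batch

    statements = []
    for start in range(0, total_pages, step):
        end = start + step
        # Using >= on the lower bound and < on the upper bound means
        # consecutive batches neither overlap nor leave a gap.
        where = f"ctid >= '({start},1)'"
        if end < total_pages:
            where += f" and ctid < '({end},1)'"
        # The last batch has no upper bound, so it also covers any pages
        # added after max(ctid) was read.
        statements.append(
            f"update {table} set {column} = {expr} where {where};")
        statements.append(f"vacuum {table};")

    # Final catch-all for anything the ranges missed.
    statements.append(
        f"update {table} set {column} = {expr} where {column} is null;")
    return statements
```

Calling ctid_batch_statements(183000, 20) gives 20 update/vacuum pairs of 9151 pages each, followed by the final "where new_column is null" update.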
