Re: Need to update all my 60 million rows at once without transactional integrity

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: christian_behrens(at)gmx(dot)net
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need to update all my 60 million rows at once without transactional integrity
Date: 2008-04-23 16:32:02
Message-ID: 480F6482.3020005@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

christian_behrens(at)gmx(dot)net wrote:
> Hi!
>
> How can I make a Update of a column in a very large table for all rows without using the double amount of disc space and without any need for atomic operation?
>
> I have a very large table with about 60 million rows. I sometimes need to do a simple update to ALL rows that resets a status-flag to zero.
>
Without knowing details of your app, here are some random thoughts:

1. Use a where clause. If the number of non-zero status flags is small
portion of the table, then the table will only grow by the number of
flags that need to be reset, not the whole table.

2. Split the flag into a separate table. You will have index overhead, but:

2a. You can reset by a simple truncate followed by an INSERT into
flagtable (id,flag) SELECT rowid,0 from yourmaintable.

2b. Even if (not-recommended) you did a full update of the flagtable,
you would only be growing the usage by the size of the flagtable.

2c. You may be able to have the flagtable only store non-zero flags in
which case you could use a coalesce((SELECT flag from flagtable where
flagtable.id=manitable.id),0) to fetch the flag. Then a reset is just a
near-instantaneous truncate.

3. Partition your table - use inheritance to create a main table
consisting of many children containing the data. Depending on your app,
there may be other benefits to partitioning. But in any case, you can
update one child-table at a time. Follow the update of each sub-table
with a CLUSTER which is far faster than VACUUM FULL.

Cheers,
Steve

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Devrim GÜNDÜZ 2008-04-23 17:37:46 Re: PG Yum Repo - can't Find Slony1
Previous Message Tim Tassonis 2008-04-23 16:22:10 Re: initdb in 8.3