Re: Need to update all my 60 million rows at once without transactional integrity

From: Tomasz Ostrowski <tometzky(at)batory(dot)org(dot)pl>
To: christian_behrens(at)gmx(dot)net
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need to update all my 60 million rows at once without transactional integrity
Date: 2008-04-23 11:49:30
Message-ID: 480F224A.50706@batory.org.pl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2008-04-21 00:19, christian_behrens(at)gmx(dot)net wrote:

> I have a very large table with about 60 million rows. I sometimes
> need to do a simple update to ALL rows that resets a status-flag to
> zero.
> UPDATE table SET flag=0;

First optimization:
UPDATE table SET flag=0 where flag!=0;

Second optimization:

> If I do a batched loop like this:
> UPDATE table SET flag=0 where id>=0 and id <200;
> UPDATE table SET flag=0 where id>=200 and id <400;
> UPDATE table SET flag=0 where id>=400 and id <600;
> then PG will seek all over my harddrive I think.

Loop like this (in pseudo-code):

non0 = select count(*) from table where flag!=0;
batchsize = 1000000;
for ( i=0; i<non0; i+=batchsize) {
update table set flag=0 where id in
(select id from table where flag!=0 limit batchsize);
commit;
vacuum table;
analyze table;
}

You'll use only 1/60 of space. Will need about 3*60 table scans. But it
should not seek too much.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
Winnie the Pooh

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Gregory Stark 2008-04-23 12:12:02 Re: Bitmap Heap Scan takes a lot of time
Previous Message Pascal Cohen 2008-04-23 11:39:47 Re: Deny creation of tables for a user