Skip site navigation (1) Skip section navigation (2)

Re: Long Running Update - My Solution

From: Harry Mantheakis <harry(dot)mantheakis(at)riskcontrollimited(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Long Running Update - My Solution
Date: 2011-06-27 15:02:02
Message-ID: 4E089B6A.4000208@riskcontrollimited.com (view raw or flat)
Thread:
Lists: pgsql-performance
I am glad to report that the 'salami-slice' approach worked nicely - all 
done in about 2.5 hours.

Instead of using an all-in-one-go statement, we executed 800 statements, 
each updating 100,000 records. On average it tool about 10-seconds for 
each statement to return.

This is "thinking out of the box" solution, which others might not be 
able to emulate.

The mystery remains, for me: why updating 100,000 records could complete 
in as quickly as 5 seconds, whereas an attempt to update a million 
records was still running after 25 minutes before we killed it?

One thing remains crystal clear: I love Postgresql :-)

Kind regards

Harry Mantheakis
London, UK


On 23/06/2011 16:05, Harry Mantheakis wrote:
> Hello
>
> I am attempting to run an update statement that copies two fields from 
> one table to another:
>
>
> UPDATE
>   table_A
> SET
> (
>   field_1
> , field_2
> ) = (
> table_B.field_1
> , table_B.field_2
> )
> FROM
> table_B
> WHERE
> table_B.id = table_A.id
> ;
>
>
> Table "table_B" contains almost 75 million records, with IDs that 
> match those in "table_A".
>
> Both "field_1" and "field_2" are DOUBLE PRECISION. The ID fields are 
> SERIAL primary-key integers in both tables.
>
> I tested (the logic of) this statement with a very small sample, and 
> it worked correctly.
>
> The database runs on a dedicated Debian server in our office.
>
> I called both VACUUM and ANALYZE on the databased before invoking this 
> statement.
>
> The statement has been running for 18+ hours so far.
>
> TOP, FREE and VMSTAT utilities indicate that only about half of the 
> 6GB of memory is being used, so I have no reason to believe that the 
> server is struggling.
>
> My question is: can I reasonably expect a statement like this to 
> complete with such a large data-set, even if it takes several days?
>
> We do not mind waiting, but obviously we do not want to wait 
> unnecessarily.
>
> Many thanks.
>
> Harry Mantheakis
> London, UK
>
>
>

In response to

Responses

pgsql-performance by date

Next:From: Kevin GrittnerDate: 2011-06-27 15:12:25
Subject: Re: Long Running Update - My Solution
Previous:From: JenishDate: 2011-06-27 15:01:37
Subject: Re: Performance issue with Insert

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group