Re: Slow update statement

From: Patrick Hatcher <pathat(at)comcast(dot)net>
To: John A Meinel <john(at)arbash-meinel(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow update statement
Date: 2005-08-08 02:00:51
Message-ID: 42F6BCD3.4070600@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sorry went out of town for the weekend. The update did occur, but I
have no idea when it finished.

Here's the actual query and the explain Update:
cdm.bcp_ddw_ck_cus = 12.7 M
cdm.cdm_ddw_customer = 12.8M

explain
update cdm.cdm_ddw_customer
set indiv_fkey = b.indiv_fkey
from cdm.bcp_ddw_ck_cus b
where
cdm.cdm_ddw_customer.cus_nbr = b.cus_num;

Hash Join (cost=1246688.42..4127248.31 rows=12702676 width=200)
Hash Cond: ("outer".cus_num = "inner".cus_nbr)
-> Seq Scan on bcp_ddw_ck_cus b (cost=0.00..195690.76 rows=12702676
width=16)
-> Hash (cost=874854.34..874854.34 rows=12880834 width=192)
-> Seq Scan on cdm_ddw_customer (cost=0.00..874854.34
rows=12880834 width=192)

John A Meinel wrote:

>Patrick Hatcher wrote:
>
>
>>[Reposted from General section with updated information]
>>Pg 7.4.5
>>
>>I'm running an update statement on about 12 million records using the
>>following query:
>>
>>Update table_A
>>set F1 = b.new_data
>>from table_B b
>>where b.keyfield = table_A.keyfield
>>
>>both keyfields are indexed, all other keys in table_A were dropped, yet
>>this job has been running over 15 hours. Is
>>this normal?
>>
>>
>
>Can you do an EXPLAIN UPDATE so that we can have an idea what the
>planner is trying to do?
>
>My personal concern is if it doing something like pulling in all rows
>from b, and then one by one updating table_A, but as it is going, it
>can't retire any dead rows, because you are still in a transaction. So
>you are getting a lot of old rows, which it has to pull in to realize it
>was old.
>
>How many rows are in table_B?
>
>I can see that possibly doing it in smaller chunks might be faster, as
>would inserting into another table. But I would do more of a test and
>see what happens.
>
>John
>=:->
>
>
>
>>I stopped the process the first time after 3 hours of running due to
>>excessive log rotation and reset the conf file to these settings:
>>
>>
>>wal_buffers = 64 # min 4, 8KB each
>>
>># - Checkpoints -
>>
>>checkpoint_segments = 128 # in logfile segments, min 1, 16MB each
>>checkpoint_timeout = 1800 # range 30-3600, in seconds
>>#checkpoint_warning = 30 # 0 is off, in seconds
>>#commit_delay = 0 # range 0-100000, in microseconds
>>#commit_siblings = 5 # range 1-1000
>>
>>
>>Would it just be quicker to run a JOIN statement to a temp file and then
>>reinsert?
>>TIA Patrick
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 9: In versions below 8.0, the planner will ignore your desire to
>> choose an index scan if your joining column's datatypes do not
>> match
>>
>>
>>
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Patrick Hatcher 2005-08-08 02:09:04 Re: Slow update statement
Previous Message Luke Lonergan 2005-08-07 05:21:03 Re: COPY FROM performance improvements