Re: UPDATE runs slow in a transaction

From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Viktor Rosenfeld" <rosenfel(at)informatik(dot)hu-berlin(dot)de>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE runs slow in a transaction
Date: 2008-07-16 14:58:29
Message-ID: 162867790807160758p45645403k2e3b92e474aa3e8e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

this is strange.

what means "run under transaction"?

you did exactly statements in psql console:
begin;
explain analyze select ...
commit?

regards
Pavel Stehule

2008/7/16 Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>:
> Hi Pavel,
>
> thanks for the advice on how to uncorrelate the query. I must admit I
> didn't know about the UPDATE ... SET ... FROM ... syntax.
>
> Now the UPDATE runs in an acceptable time inside a transaction, however the
> query plan still differs when I run it outside.
>
> Outside a transaction:
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------
> Merge Join (cost=0.00..11481.84 rows=65756 width=1087) (actual
> time=0.151..323.856 rows=65756 loops=1)
> Merge Cond: (_struct.id = tmp.id)
> -> Index Scan using "_PK_struct" on _struct (cost=0.00..7084.50
> rows=98149 width=1083) (actual time=0.028..137.463 rows=32300 loops=1)
> -> Index Scan using idx_tmp__id on tmp (cost=0.00..3330.02 rows=65756
> width=12) (actual time=0.115..58.601 rows=65756 loops=1)
> Total runtime: 2905.580 ms
>
> This looks like an optimal plan and average run time over 5 runs is 2660 ms.
>
> Inside a transaction:
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------
> Merge Join (cost=7427.63..16159.84 rows=65756 width=1087) (actual
> time=315.570..574.075 rows=65756 loops=1)
> Merge Cond: (_struct.id = tmp.id)
> -> Index Scan using "_PK_struct" on _struct (cost=0.00..7500.50
> rows=98149 width=1083) (actual time=0.020..129.915 rows=32300 loops=1)
> -> Sort (cost=7427.63..7592.02 rows=65756 width=12) (actual
> time=315.538..333.359 rows=65756 loops=1)
> Sort Key: tmp.id
> Sort Method: quicksort Memory: 4617kB
> -> Seq Scan on tmp (cost=0.00..2165.56 rows=65756 width=12)
> (actual time=10.070..37.411 rows=65756 loops=1)
> Trigger for constraint _FK_struct_2_collection: time=1105.892 calls=32300
> Trigger for constraint _FK_struct_2_text: time=1468.009 calls=32300
> Total runtime: 4955.784 ms
>
> Again, the planner does not use the index on tmp (id) although I put an
> "ANALYZE tmp" right before the UPDATE. Average run time over 5 runs is 4610
> ms.
>
> Thanks,
> Viktor
>
> Am 16.07.2008 um 15:33 schrieb Pavel Stehule:
>
>> Hello
>>
>> my advice is little bit offtopic, I am sorry. Why you use correlated
>> subquery? Your update statement should be
>>
>> update _struct set left_token = tmp.left_token from tmp where
>> _struct.id = tmp.id;
>>
>> send output of explain analyze statement, please. etc
>> explain analyze UPDATE _struct SET left_token = (SELECT DISTINCT
>> left_token FROM tmp WHERE _struct.id = tmp.id)
>>
>> regards
>> Pavel Stehule
>>
>>
>>
>> 2008/7/16 Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>:
>>>
>>> Hi Tom,
>>> Postgres is indeed selecting a bad plan. Turns out that the index I
>>> created
>>> to speed up the UPDATE isn't used inside a transaction block.
>>> Here's the plan for "UPDATE _struct SET left_token = (SELECT DISTINCT
>>> left_token FROM tmp WHERE _struct.id = tmp.id)" outside of a transaction:
>>> QUERY PLAN
>>>
>>>
>>> -------------------------------------------------------------------------------------------
>>> Seq Scan on _struct (cost=0.00..826643.13 rows=98149 width=1083)
>>> SubPlan
>>> -> Unique (cost=8.38..8.40 rows=1 width=4)
>>> -> Sort (cost=8.38..8.39 rows=4 width=4)
>>> Sort Key: tmp.left_token
>>> -> Index Scan using idx_tmp__id on tmp (cost=0.00..8.34
>>> rows=4 width=4)
>>> Index Cond: ($0 = id)
>>> And inside a transaction:
>>> QUERY PLAN
>>>
>>>
>>> ---------------------------------------------------------------------------------------
>>> Seq Scan on _struct (cost=100000000.00..3230175260746.00 rows=32300
>>> width=70)
>>> SubPlan
>>> -> Unique (cost=100002329.99..100002330.01 rows=1 width=4)
>>> -> Sort (cost=100002329.99..100002330.00 rows=4 width=4)
>>> Sort Key: tmp.left_token
>>> -> Seq Scan on tmp (cost=100000000.00..100002329.95
>>> rows=4 width=4)
>>> Filter: ($0 = id)
>>> The high cost of the seqscan on tmp are because I tried disabling
>>> sequential
>>> scans inside the transaction to force an index scan, which Postgres
>>> decided
>>> to ignore in this case.
>>> Putting an ANALYZE tmp and ANALYZE _struct right before the UPDATE didn't
>>> help either. (Also shouldn't the creation of an index on tmp (id) take
>>> care
>>> of analyzing that column?)
>>> Thanks,
>>> Viktor
>>> Am 14.07.2008 um 20:52 schrieb Tom Lane:
>>>
>>> Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de> writes:
>>>
>>> the script below runs very fast when executed alone. But when I call
>>>
>>> it from within a transaction block it's so slow that I have to abort
>>>
>>> it after a while. Specifically the second-to-last UPDATE seems to
>>>
>>> take forever within a transaction while it completes in about 3
>>>
>>> seconds outside a transaction.
>>>
>>> Since the table you're working on was just created in the same
>>> transaction, there's been no opportunity for autovacuum to run an
>>> ANALYZE on it; that's probably preventing selection of a good plan.
>>> Try throwing in an "ANALYZE tmp" after you load the table.
>>>
>>> regards, tom lane
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Pavel Stehule 2008-07-16 15:01:18 Re: UPDATE runs slow in a transaction
Previous Message gabrielle 2008-07-16 14:54:54 Re: [GENERAL] [pdxpug] Pg booth staffing at OSCON