Re: UPDATE runs slow in a transaction

From: Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE runs slow in a transaction
Date: 2008-07-16 15:30:56
Message-ID: A7DBBEA9-4E49-4290-B63D-549D0FA7E986@informatik.hu-berlin.de
Lists: pgsql-general

Yes, I either run

begin;
... lots of other stuff in a script
explain analyze update ... (called from the same script)
rollback;

or

... lots of other stuff in a script (same as above)
explain analyze update ...
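
Spelled out, the first variant amounts to roughly this (a simplified sketch of the relevant part; the real script does a lot more work before the UPDATE, including loading tmp and creating the idx_tmp__id index):

  begin;
  -- ... load tmp, create index idx_tmp__id on tmp (id), lots of other work ...
  analyze tmp;
  explain analyze update _struct set left_token = tmp.left_token
    from tmp where _struct.id = tmp.id;
  rollback;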

Cheers,
Viktor

On 16.07.2008, at 16:58, Pavel Stehule wrote:

> This is strange.
>
> What does "run under transaction" mean?
>
> Did you run exactly these statements in the psql console:
> begin;
> explain analyze select ...
> commit?
>
> regards
> Pavel Stehule
>
> 2008/7/16 Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>:
>> Hi Pavel,
>>
>> thanks for the advice on how to uncorrelate the query. I must admit I
>> didn't know about the UPDATE ... SET ... FROM ... syntax.
>>
>> Now the UPDATE runs in an acceptable time inside a transaction, but the
>> query plan still differs from the one I get outside a transaction.
>>
>> Outside a transaction:
>>
>>                                                              QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------------------------
>>  Merge Join  (cost=0.00..11481.84 rows=65756 width=1087) (actual time=0.151..323.856 rows=65756 loops=1)
>>    Merge Cond: (_struct.id = tmp.id)
>>    ->  Index Scan using "_PK_struct" on _struct  (cost=0.00..7084.50 rows=98149 width=1083) (actual time=0.028..137.463 rows=32300 loops=1)
>>    ->  Index Scan using idx_tmp__id on tmp  (cost=0.00..3330.02 rows=65756 width=12) (actual time=0.115..58.601 rows=65756 loops=1)
>>  Total runtime: 2905.580 ms
>>
>> This looks like an optimal plan; the average run time over 5 runs is
>> 2660 ms.
>>
>> Inside a transaction:
>>
>>                                                              QUERY PLAN
>> --------------------------------------------------------------------------------------------------------------------------------------------
>>  Merge Join  (cost=7427.63..16159.84 rows=65756 width=1087) (actual time=315.570..574.075 rows=65756 loops=1)
>>    Merge Cond: (_struct.id = tmp.id)
>>    ->  Index Scan using "_PK_struct" on _struct  (cost=0.00..7500.50 rows=98149 width=1083) (actual time=0.020..129.915 rows=32300 loops=1)
>>    ->  Sort  (cost=7427.63..7592.02 rows=65756 width=12) (actual time=315.538..333.359 rows=65756 loops=1)
>>          Sort Key: tmp.id
>>          Sort Method:  quicksort  Memory: 4617kB
>>          ->  Seq Scan on tmp  (cost=0.00..2165.56 rows=65756 width=12) (actual time=10.070..37.411 rows=65756 loops=1)
>>  Trigger for constraint _FK_struct_2_collection: time=1105.892 calls=32300
>>  Trigger for constraint _FK_struct_2_text: time=1468.009 calls=32300
>>  Total runtime: 4955.784 ms
>>
>> Again, the planner does not use the index on tmp (id), although I put an
>> "ANALYZE tmp" right before the UPDATE. The average run time over 5 runs
>> is 4610 ms.
>>
>> Thanks,
>> Viktor
>>
>> On 16.07.2008, at 15:33, Pavel Stehule wrote:
>>
>>> Hello
>>>
>>> My advice is a little bit off-topic, I am sorry. Why do you use a
>>> correlated subquery? Your update statement should be
>>>
>>> update _struct set left_token = tmp.left_token from tmp where
>>> _struct.id = tmp.id;
>>>
>>> Please also send the output of an explain analyze statement, e.g.
>>>
>>> explain analyze UPDATE _struct SET left_token = (SELECT DISTINCT
>>> left_token FROM tmp WHERE _struct.id = tmp.id)
>>>
>>> regards
>>> Pavel Stehule
>>>
>>>
>>>
>>> 2008/7/16 Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>:
>>>>
>>>> Hi Tom,
>>>>
>>>> Postgres is indeed selecting a bad plan. Turns out that the index I
>>>> created to speed up the UPDATE isn't used inside a transaction block.
>>>>
>>>> Here's the plan for "UPDATE _struct SET left_token = (SELECT DISTINCT
>>>> left_token FROM tmp WHERE _struct.id = tmp.id)" outside of a
>>>> transaction:
>>>>
>>>>                                  QUERY PLAN
>>>> -------------------------------------------------------------------------------------------
>>>>  Seq Scan on _struct  (cost=0.00..826643.13 rows=98149 width=1083)
>>>>    SubPlan
>>>>      ->  Unique  (cost=8.38..8.40 rows=1 width=4)
>>>>            ->  Sort  (cost=8.38..8.39 rows=4 width=4)
>>>>                  Sort Key: tmp.left_token
>>>>                  ->  Index Scan using idx_tmp__id on tmp  (cost=0.00..8.34 rows=4 width=4)
>>>>                        Index Cond: ($0 = id)
>>>>
>>>> And inside a transaction:
>>>>
>>>>                                  QUERY PLAN
>>>> ---------------------------------------------------------------------------------------
>>>>  Seq Scan on _struct  (cost=100000000.00..3230175260746.00 rows=32300 width=70)
>>>>    SubPlan
>>>>      ->  Unique  (cost=100002329.99..100002330.01 rows=1 width=4)
>>>>            ->  Sort  (cost=100002329.99..100002330.00 rows=4 width=4)
>>>>                  Sort Key: tmp.left_token
>>>>                  ->  Seq Scan on tmp  (cost=100000000.00..100002329.95 rows=4 width=4)
>>>>                        Filter: ($0 = id)
>>>>
>>>> The high cost of the seq scan on tmp is because I tried disabling
>>>> sequential scans inside the transaction to force an index scan, which
>>>> Postgres decided to ignore in this case.
>>>>
>>>> Putting an ANALYZE tmp and ANALYZE _struct right before the UPDATE
>>>> didn't help either. (Also, shouldn't the creation of an index on
>>>> tmp (id) take care of analyzing that column?)
>>>>
>>>> Thanks,
>>>> Viktor
>>>> On 14.07.2008, at 20:52, Tom Lane wrote:
>>>>
>>>> Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de> writes:
>>>>> the script below runs very fast when executed alone. But when I call
>>>>> it from within a transaction block it's so slow that I have to abort
>>>>> it after a while. Specifically the second-to-last UPDATE seems to
>>>>> take forever within a transaction while it completes in about 3
>>>>> seconds outside a transaction.
>>>>
>>>> Since the table you're working on was just created in the same
>>>> transaction, there's been no opportunity for autovacuum to run an
>>>> ANALYZE on it; that's probably preventing selection of a good plan.
>>>> Try throwing in an "ANALYZE tmp" after you load the table.
>>>>
>>>>                         regards, tom lane
>>>>
>>>> --
>>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-general
>>>>
>>>>
>>
>>
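
P.S. To recap the two forms of the UPDATE discussed above (table and column names as in the plans): the original, correlated version runs its subplan once per _struct row, while Pavel's rewrite joins tmp a single time:

  -- correlated subquery (subplan executed for every _struct row)
  UPDATE _struct SET left_token = (SELECT DISTINCT left_token FROM tmp
                                   WHERE _struct.id = tmp.id);

  -- uncorrelated rewrite using UPDATE ... SET ... FROM ...
  UPDATE _struct SET left_token = tmp.left_token FROM tmp
   WHERE _struct.id = tmp.id;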
