Re: UPDATE runs slow in a transaction

From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: "Viktor Rosenfeld" <rosenfel(at)informatik(dot)hu-berlin(dot)de>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: UPDATE runs slow in a transaction
Date: 2008-07-16 13:33:56
Message-ID: 162867790807160633u53b6119fj17402a2a6f2deb8a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello

my advice is little bit offtopic, I am sorry. Why you use correlated
subquery? Your update statement should be

update _struct set left_token = tmp.left_token from tmp where
_struct.id = tmp.id;

send output of explain analyze statement, please. etc
explain analyze UPDATE _struct SET left_token = (SELECT DISTINCT
left_token FROM tmp WHERE _struct.id = tmp.id)

regards
Pavel Stehule

2008/7/16 Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de>:
> Hi Tom,
> Postgres is indeed selecting a bad plan. Turns out that the index I created
> to speed up the UPDATE isn't used inside a transaction block.
> Here's the plan for "UPDATE _struct SET left_token = (SELECT DISTINCT
> left_token FROM tmp WHERE _struct.id = tmp.id)" outside of a transaction:
> QUERY PLAN
>
> -------------------------------------------------------------------------------------------
> Seq Scan on _struct (cost=0.00..826643.13 rows=98149 width=1083)
> SubPlan
> -> Unique (cost=8.38..8.40 rows=1 width=4)
> -> Sort (cost=8.38..8.39 rows=4 width=4)
> Sort Key: tmp.left_token
> -> Index Scan using idx_tmp__id on tmp (cost=0.00..8.34
> rows=4 width=4)
> Index Cond: ($0 = id)
> And inside a transaction:
> QUERY PLAN
>
> ---------------------------------------------------------------------------------------
> Seq Scan on _struct (cost=100000000.00..3230175260746.00 rows=32300
> width=70)
> SubPlan
> -> Unique (cost=100002329.99..100002330.01 rows=1 width=4)
> -> Sort (cost=100002329.99..100002330.00 rows=4 width=4)
> Sort Key: tmp.left_token
> -> Seq Scan on tmp (cost=100000000.00..100002329.95
> rows=4 width=4)
> Filter: ($0 = id)
> The high cost of the seqscan on tmp are because I tried disabling sequential
> scans inside the transaction to force an index scan, which Postgres decided
> to ignore in this case.
> Putting an ANALYZE tmp and ANALYZE _struct right before the UPDATE didn't
> help either. (Also shouldn't the creation of an index on tmp (id) take care
> of analyzing that column?)
> Thanks,
> Viktor
> Am 14.07.2008 um 20:52 schrieb Tom Lane:
>
> Viktor Rosenfeld <rosenfel(at)informatik(dot)hu-berlin(dot)de> writes:
>
> the script below runs very fast when executed alone. But when I call
>
> it from within a transaction block it's so slow that I have to abort
>
> it after a while. Specifically the second-to-last UPDATE seems to
>
> take forever within a transaction while it completes in about 3
>
> seconds outside a transaction.
>
> Since the table you're working on was just created in the same
> transaction, there's been no opportunity for autovacuum to run an
> ANALYZE on it; that's probably preventing selection of a good plan.
> Try throwing in an "ANALYZE tmp" after you load the table.
>
> regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2008-07-16 14:14:40 Re: 8.3.3 regression test on SCO 5.0.7
Previous Message Roberts, Jon 2008-07-16 13:30:57 Re: Default fill factor for tables?