Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?

From: Karl Denninger <karl(at)denninger(dot)net>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Bryce Nesbitt <bryce2(at)obviously(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Date: 2010-02-12 16:43:10
Message-ID: 4B75851E.6070909@denninger.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Yes:

select forum, * from post where
marked is not true
and toppost = 1
and (select login from ignore_thread where login='xxx' and
number=post.number) is null
and (replied > now() - '30 days'::interval)
and (replied > (select lastview from forumlog where login='xxx' and
forum=post.forum and number is null)) is not false
and (replied > (select lastview from forumlog where login='xxx' and
forum=post.forum and number=post.number)) is not false
and ((post.forum = (select who from excludenew where who='xxx' and
forum_name = post.forum)) or (select who from excludenew where who='xxx'
and forum_name = post.forum) is null)
order by pinned desc, replied desc offset 0 limit 100

Returns the following query plan:
QUERY
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=5301575.63..5301575.63 rows=1 width=433) (actual
time=771.000..771.455 rows=100 loops=1)
-> Sort (cost=5301575.63..5301575.63 rows=1 width=433) (actual
time=770.996..771.141 rows=100 loops=1)
Sort Key: post.pinned, post.replied
Sort Method: top-N heapsort Memory: 120kB
-> Index Scan using post_top on post (cost=0.00..5301575.62
rows=1 width=433) (actual time=0.664..752.994 rows=3905 loops=1)
Index Cond: (toppost = 1)
Filter: ((marked IS NOT TRUE) AND (replied > (now() - '30
days'::interval)) AND ((SubPlan 1) IS NULL) AND ((replied > (SubPlan 2))
IS NOT FALSE) AND ((replied > (SubPlan 3)) IS NOT FALSE) AND ((forum =
(SubPlan 4)) OR ((SubPlan 5) IS NULL)))
SubPlan 1
-> Seq Scan on ignore_thread (cost=0.00..5.45 rows=1
width=7) (actual time=0.037..0.037 rows=0 loops=3905)
Filter: ((login = 'xxx'::text) AND (number = $0))
SubPlan 2
-> Index Scan using forumlog_composite on forumlog
(cost=0.00..9.50 rows=1 width=8) (actual time=0.008..0.008 rows=0
loops=3905)
Index Cond: ((login = 'xxx'::text) AND (forum =
$1) AND (number IS NULL))
SubPlan 3
-> Index Scan using forumlog_composite on forumlog
(cost=0.00..9.50 rows=1 width=8) (actual time=0.006..0.006 rows=0
loops=3905)
Index Cond: ((login = 'xxx'::text) AND (forum =
$1) AND (number = $0))
SubPlan 4
-> Index Scan using excludenew_pkey on excludenew
(cost=0.00..8.27 rows=1 width=9) (actual time=0.004..0.004 rows=0
loops=3905)
Index Cond: ((who = 'xxx'::text) AND (forum_name
= $1))
SubPlan 5
-> Index Scan using excludenew_pkey on excludenew
(cost=0.00..8.27 rows=1 width=9) (actual time=0.004..0.004 rows=0
loops=3905)
Index Cond: ((who = 'xxx'::text) AND (forum_name
= $1))
Total runtime: 771.907 ms
(23 rows)

The alternative:

select forum, * from post where
marked is not true
and toppost = 1
and not exists (select login from ignore_thread where login='xxx'
and number=post.number)
and (replied > now() - '30 days'::interval)
and (replied > (select lastview from forumlog where login='xxx' and
forum=post.forum and number is null)) is not false
and (replied > (select lastview from forumlog where login='xxx' and
forum=post.forum and number=post.number)) is not false
and ((post.forum = (select who from excludenew where who='xxx' and
forum_name = post.forum)) or (select who from excludenew where who='xxx'
and forum_name = post.forum) is null)
order by pinned desc, replied desc offset 0 limit 100

goes nuts.

(Yes, I know, most of those others which are "not false" could be
"Exists" too)

Explain Analyze on the alternative CLAIMS the same query planner time
(within a few milliseconds) with explain analyze. But if I replace the
executing code with one that has the alternative ("not exists") syntax
in it, the system load goes to crap instantly and the execution times
"in the wild" go bananas.

I don't know why it does - I just know THAT it does. When I first added
that top clause in there (to allow people an individual "ignore thread"
list) the system load went bananas immediately and forced me to back it
out. When I re-wrote the query as above the performance was (and
remains) fine.

I'm running 8.4.2.

I agree (in advance) it shouldn't trash performance - all I know is that
it does and forced me to re-write the query.

Kevin Grittner wrote:
> Karl Denninger <karl(at)denninger(dot)net> wrote:
> Kevin Grittner wrote:
>
>
>>> I suspect that the above might do pretty well in 8.4.
>>>
>
>
>> "Exists" can be quite slow. So can "not exists"
>>
>> See if you can re-write it using a sub-select - just replace the
>> "exists ...." with "(select ...) is not null"
>>
>> Surprisingly this often results in a MUCH better query plan under
>> Postgresql. Why the planner evaluates it "better" eludes me (it
>> shouldn't) but the differences are often STRIKING - I've seen
>> factor-of-10 differences in execution performance.
>>
>
> Have you seen such a difference under 8.4? Can you provide a
> self-contained example?
>
> -Kevin
>
>

Attachment Content-Type Size
karl.vcf text/x-vcard 124 bytes

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2010-02-12 16:54:27 Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Previous Message Kevin Grittner 2010-02-12 16:11:00 Re: 512,600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?