Re: SQL Query Performance - what gives?

From: Karl Denninger <karl(at)denninger(dot)net>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-sql(at)postgresql(dot)org
Subject: Re: SQL Query Performance - what gives?
Date: 2009-08-18 23:03:44
Message-ID: 4A8B334F.1010801@denninger.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance pgsql-sql

Kevin Grittner wrote:
> Karl Denninger <karl(at)denninger(dot)net> wrote:
>
>> -> Index Scan using forum_name on forum
>> (cost=0.00..250.63 rows=1 width=271) (actual time=0.013..0.408
>> rows=63 loops=1)
>> Filter: (((contrib IS NULL) OR (contrib = '
>> '::text) OR (contrib ~~ '%b%'::text)) AND ((permission & 127) =
>> permission))
>>
>
> The biggest issue, as far as I can see, is that it thinks that the
> selection criteria on forum will limit to one row, while it really
> matches 63 rows.
>
> You might be able to coerce it into a faster plan with something like
> this (untested):
>
> select *
> from (select * from post
> where invisible <> 1
> and to_tsvector('english', message)
> @@ to_tsquery('violence')
> ) p,
> forum
> where forum.name = p.forum
> and (permission & '127') = permission
> and (contrib is null or contrib = ' ' or contrib like '%b%')
> order by modified desc
> limit 100
> ;
>
> -Kevin
>

That didn't help.

The FTS alone returns 2,000 records on that table, and does so VERY quickly:

ticker=# explain analyze select count(ordinal) from post, forum where
post.forum=forum.name and invisible <> 1
and to_tsvector('english', message)
@@ to_tsquery('violence');
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=5901.57..5901.58 rows=1 width=4) (actual
time=17.492..17.494 rows=1 loops=1)
-> Hash Join (cost=613.80..5898.04 rows=1408 width=4) (actual
time=1.436..14.620 rows=2056 loops=1)
Hash Cond: (post.forum = forum.name)
-> Bitmap Heap Scan on post (cost=370.93..5635.71 rows=1435
width=14) (actual time=1.123..7.944 rows=2056 loops=1)
Recheck Cond: (to_tsvector('english'::text, message) @@
to_tsquery('violence'::text))
Filter: (invisible <> 1)
-> Bitmap Index Scan on idx_message (cost=0.00..370.57
rows=1435 width=0) (actual time=0.738..0.738 rows=2099 loops=1)
Index Cond: (to_tsvector('english'::text, message)
@@ to_tsquery('violence'::text))
-> Hash (cost=242.07..242.07 rows=64 width=9) (actual
time=0.300..0.300 rows=64 loops=1)
-> Index Scan using forum_name on forum
(cost=0.00..242.07 rows=64 width=9) (actual time=0.011..0.182 rows=64
loops=1)
Total runtime: 17.559 ms
(11 rows)

ticker=#

Ok, but now when we check the permission mask....

ticker=# explain analyze select count(ordinal) from post, forum where
post.forum=forum.name and invisible <> 1
and to_tsvector('english', message)
@@ to_tsquery('violence') and (permission & 4 = permission);

QUERY
PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1329.07..1329.08 rows=1 width=4) (actual
time=29819.293..29819.295 rows=1 loops=1)
-> Nested Loop (cost=978.97..1329.01 rows=22 width=4) (actual
time=2.575..29815.530 rows=2056 loops=1)
-> Index Scan using forum_name on forum (cost=0.00..242.39
rows=1 width=13) (actual time=0.016..0.355 rows=62 loops=1)
Filter: ((permission & 4) = permission)
-> Bitmap Heap Scan on post (cost=978.97..1086.28 rows=27
width=14) (actual time=97.997..480.746 rows=33 loops=62)
Recheck Cond: ((to_tsvector('english'::text,
post.message) @@ to_tsquery('violence'::text)) AND (post.forum =
forum.name))
Filter: (post.invisible <> 1)
-> BitmapAnd (cost=978.97..978.97 rows=27 width=0)
(actual time=91.106..91.106 rows=0 loops=62)
-> Bitmap Index Scan on idx_message
(cost=0.00..370.57 rows=1435 width=0) (actual time=0.680..0.680
rows=2099 loops=62)
Index Cond: (to_tsvector('english'::text,
post.message) @@ to_tsquery('violence'::text))
-> Bitmap Index Scan on post_forum
(cost=0.00..607.78 rows=26575 width=0) (actual time=89.927..89.927
rows=22980 loops=62)
Index Cond: (post.forum = forum.name)
Total runtime: 29819.376 ms
(13 rows)

ticker=#

The problem appearsa to lie in the "nested loop", and I don't understand
why that's happening. Isn't a **LINEAR** check on each returned value
(since we do the aggregate first?) sufficient? Why is the query planner
creating a nested loop - the aggregate contains the tested field and it
is not subject to change once aggregated?!

Attachment Content-Type Size
karl.vcf text/x-vcard 265 bytes

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2009-08-19 02:02:18 Re: [SQL] SQL Query Performance - what gives?
Previous Message Scott Marlowe 2009-08-18 22:34:20 Re: number of rows estimation for bit-AND operation

Browse pgsql-sql by date

  From Date Subject
Next Message Tim Landscheidt 2009-08-18 23:39:14 Re: FW: simple? query
Previous Message Kevin Grittner 2009-08-18 21:59:22 Re: SQL Query Performance - what gives?