Skip site navigation (1) Skip section navigation (2)

Re: SQL Query Performance - what gives?

From: Karl Denninger <karl(at)denninger(dot)net>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-sql(at)postgresql(dot)org
Subject: Re: SQL Query Performance - what gives?
Date: 2009-08-18 23:03:44
Message-ID: 4A8B334F.1010801@denninger.net (view raw or flat)
Thread:
Lists: pgsql-performancepgsql-sql
Kevin Grittner wrote:
> Karl Denninger <karl(at)denninger(dot)net> wrote:
>   
>>                ->  Index Scan using forum_name on forum 
>> (cost=0.00..250.63 rows=1 width=271) (actual time=0.013..0.408
>> rows=63 loops=1)
>>                      Filter: (((contrib IS NULL) OR (contrib = '
>> '::text) OR (contrib ~~ '%b%'::text)) AND ((permission & 127) =
>> permission))
>>     
>  
> The biggest issue, as far as I can see, is that it thinks that the
> selection criteria on forum will limit to one row, while it really
> matches 63 rows.
>  
> You might be able to coerce it into a faster plan with something like
> this (untested):
>  
> select *
>   from (select * from post
>           where invisible <> 1
>             and to_tsvector('english', message)
>              @@ to_tsquery('violence')
>        ) p,
>   forum
>   where forum.name = p.forum
>     and (permission & '127') = permission
>     and (contrib is null or contrib = ' ' or contrib like '%b%')
>   order by modified desc
>   limit 100
> ;
>  
> -Kevin
>   

That didn't help.

The FTS alone returns 2,000 records on that table, and does so VERY quickly:

ticker=# explain analyze select count(ordinal) from post, forum where
post.forum=forum.name and invisible <> 1
            and to_tsvector('english', message)
             @@ to_tsquery('violence');
                                                               QUERY
PLAN                                                              
----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=5901.57..5901.58 rows=1 width=4) (actual
time=17.492..17.494 rows=1 loops=1)
   ->  Hash Join  (cost=613.80..5898.04 rows=1408 width=4) (actual
time=1.436..14.620 rows=2056 loops=1)
         Hash Cond: (post.forum = forum.name)
         ->  Bitmap Heap Scan on post  (cost=370.93..5635.71 rows=1435
width=14) (actual time=1.123..7.944 rows=2056 loops=1)
               Recheck Cond: (to_tsvector('english'::text, message) @@
to_tsquery('violence'::text))
               Filter: (invisible <> 1)
               ->  Bitmap Index Scan on idx_message  (cost=0.00..370.57
rows=1435 width=0) (actual time=0.738..0.738 rows=2099 loops=1)
                     Index Cond: (to_tsvector('english'::text, message)
@@ to_tsquery('violence'::text))
         ->  Hash  (cost=242.07..242.07 rows=64 width=9) (actual
time=0.300..0.300 rows=64 loops=1)
               ->  Index Scan using forum_name on forum 
(cost=0.00..242.07 rows=64 width=9) (actual time=0.011..0.182 rows=64
loops=1)
 Total runtime: 17.559 ms
(11 rows)

ticker=#

Ok, but now when we check the permission mask....


ticker=# explain analyze select count(ordinal) from post, forum where
post.forum=forum.name and invisible <> 1
            and to_tsvector('english', message)
             @@ to_tsquery('violence') and (permission & 4 = permission);
                                                                   
QUERY
PLAN                                                                   
--------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1329.07..1329.08 rows=1 width=4) (actual
time=29819.293..29819.295 rows=1 loops=1)
   ->  Nested Loop  (cost=978.97..1329.01 rows=22 width=4) (actual
time=2.575..29815.530 rows=2056 loops=1)
         ->  Index Scan using forum_name on forum  (cost=0.00..242.39
rows=1 width=13) (actual time=0.016..0.355 rows=62 loops=1)
               Filter: ((permission & 4) = permission)
         ->  Bitmap Heap Scan on post  (cost=978.97..1086.28 rows=27
width=14) (actual time=97.997..480.746 rows=33 loops=62)
               Recheck Cond: ((to_tsvector('english'::text,
post.message) @@ to_tsquery('violence'::text)) AND (post.forum =
forum.name))
               Filter: (post.invisible <> 1)
               ->  BitmapAnd  (cost=978.97..978.97 rows=27 width=0)
(actual time=91.106..91.106 rows=0 loops=62)
                     ->  Bitmap Index Scan on idx_message 
(cost=0.00..370.57 rows=1435 width=0) (actual time=0.680..0.680
rows=2099 loops=62)
                           Index Cond: (to_tsvector('english'::text,
post.message) @@ to_tsquery('violence'::text))
                     ->  Bitmap Index Scan on post_forum 
(cost=0.00..607.78 rows=26575 width=0) (actual time=89.927..89.927
rows=22980 loops=62)
                           Index Cond: (post.forum = forum.name)
 Total runtime: 29819.376 ms
(13 rows)

ticker=#

The problem appearsa to lie in the "nested loop", and I don't understand
why that's happening.  Isn't a **LINEAR** check on each returned value
(since we do the aggregate first?) sufficient?  Why is the query planner
creating a nested loop - the aggregate contains the tested field and it
is not subject to change once aggregated?!



Attachment: karl.vcf
Description: text/x-vcard (265 bytes)

In response to

Responses

pgsql-performance by date

Next:From: Tom LaneDate: 2009-08-19 02:02:18
Subject: Re: [SQL] SQL Query Performance - what gives?
Previous:From: Scott MarloweDate: 2009-08-18 22:34:20
Subject: Re: number of rows estimation for bit-AND operation

pgsql-sql by date

Next:From: Tim LandscheidtDate: 2009-08-18 23:39:14
Subject: Re: FW: simple? query
Previous:From: Kevin GrittnerDate: 2009-08-18 21:59:22
Subject: Re: SQL Query Performance - what gives?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group