Skip site navigation (1) Skip section navigation (2)

Re: Performs WAY better with enable_seqscan = off

From: Brendan Duddridge <brendan(at)clickspace(dot)com>
To: Ragnar <gnari(at)hive(dot)is>
Cc: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performs WAY better with enable_seqscan = off
Date: 2006-05-21 20:01:14
Message-ID: BA21CBE9-E89E-4DB6-BA04-FEED87E9B4A0@clickspace.com (view raw or flat)
Thread:
Lists: pgsql-performance
> is there some reason for the complicated form of the
> join conditions in the subselect?


Yes, the simpler form query definitely works, but it's not always as  
fast as the index version with the complicated join syntax. Although  
even that query varies significantly with different category_id  
values. Not sure why. Sometimes it finishes in 150 ms, other times it  
takes over a second.

Here's the explain plan from your query:

explain analyze select ac.attribute_id, la.name, ac.sort_order from  
attribute_category ac, localized_attribute la where ac.category_id =  
1001402 and la.locale_id = 1000001 and ac.is_browsable = 'true' and  
la.attribute_id = ac.attribute_id and exists (select 'x' from  
product_attribute_value pav, category_product cp where pav.product_id  
= cp.product_id and pav.attribute_id = ac.attribute_id and  
pav.status_code is null and cp.category_id= '1001402' and  
cp.is_visible = 'true') order by ac.sort_order, la.name asc;
                                                                         
                            QUERY PLAN
------------------------------------------------------------------------ 
------------------------------------------------------------------------ 
---------------------------------------------------------------
Sort  (cost=6343.18..6343.20 rows=7 width=34) (actual  
time=2244.241..2244.242 rows=2 loops=1)
    Sort Key: ac.sort_order, la.name
    ->  Nested Loop  (cost=2.00..6343.08 rows=7 width=34) (actual  
time=1831.970..2244.209 rows=2 loops=1)
          ->  Index Scan using attribute_category__category_id_fk_idx  
on attribute_category ac  (cost=0.00..6321.95 rows=7 width=8) (actual  
time=1831.938..2244.142 rows=2 loops=1)
                Index Cond: (category_id = 1001402)
                Filter: (((is_browsable)::text = 'true'::text) AND  
(subplan))
                SubPlan
                  ->  Nested Loop  (cost=2.00..10458.04 rows=30  
width=0) (actual time=320.572..320.572 rows=0 loops=7)
                        ->  Index Scan using  
product_attribute_value__attribute_id_fk_idx on  
product_attribute_value pav  (cost=0.00..2661.39 rows=2572 width=4)  
(actual time=0.020..33.589 rows=18468 loops=7)
                              Index Cond: (attribute_id = $0)
                              Filter: (status_code IS NULL)
                        ->  Bitmap Heap Scan on category_product cp   
(cost=2.00..3.02 rows=1 width=4) (actual time=0.011..0.011 rows=0  
loops=129274)
                              Recheck Cond: ("outer".product_id =  
cp.product_id)
                              Filter: ((category_id = 1001402) AND  
((is_visible)::text = 'true'::text))
                              ->  Bitmap Index Scan on  
x_category_product__product_id_fk_idx  (cost=0.00..2.00 rows=1  
width=0) (actual time=0.008..0.008 rows=1 loops=129274)
                                    Index Cond: ("outer".product_id =  
cp.product_id)
          ->  Bitmap Heap Scan on localized_attribute la   
(cost=2.00..3.01 rows=1 width=30) (actual time=0.019..0.019 rows=1  
loops=2)
                Recheck Cond: (la.attribute_id = "outer".attribute_id)
                Filter: (locale_id = 1000001)
                ->  Bitmap Index Scan on  
localized_attribute__attribute_id_fk_idx  (cost=0.00..2.00 rows=1  
width=0) (actual time=0.015..0.015 rows=1 loops=2)
                      Index Cond: (la.attribute_id =  
"outer".attribute_id)
Total runtime: 2244.542 ms


Here's the schema for the two tables involved with the sub-select:

  \d category_product;
              Table "public.category_product"
        Column        |          Type          | Modifiers
---------------------+------------------------+-----------
category_id         | integer                | not null
product_id          | integer                | not null
en_name_sort_order  | integer                |
fr_name_sort_order  | integer                |
merchant_sort_order | integer                |
price_sort_order    | integer                |
merchant_count      | integer                |
is_active           | character varying(5)   |
product_is_active   | character varying(5)   |
product_status_code | character varying(32)  |
product_name_en     | character varying(512) |
product_name_fr     | character varying(512) |
product_click_count | integer                |
is_visible          | character varying(5)   |
is_pending_visible  | character varying(5)   |
min_price_cad       | numeric(12,4)          |
max_price_cad       | numeric(12,4)          |
Indexes:
     "x_category_product_pk" PRIMARY KEY, btree (category_id,  
product_id)
     "category_product__cat_id_is_visible_idx" btree  
(((category_id::text || '.'::text) || is_visible::text))
     "category_product__cat_id_prod_is_act_status_idx" btree  
(category_id, product_is_active, product_status_code)
     "category_product__category_id_is_active_and_status_idx" btree  
(category_id, product_is_active, product_status_code)
     "category_product__is_active_idx" btree (is_active)
     "category_product__lower_product_name_en_idx" btree (lower 
(product_name_en::text))
     "category_product__lower_product_name_fr_idx" btree (lower 
(product_name_fr::text))
     "category_product__merchant_sort_order_idx" btree  
(merchant_sort_order)
     "category_product__min_price_cad_idx" btree (min_price_cad)
     "category_product__product_id_category_id_status_idx" btree  
(product_id, category_id, product_is_active, product_status_code)
     "x_category_product__category_id_fk_idx" btree (category_id)  
CLUSTER
     "x_category_product__product_id_fk_idx" btree (product_id)
Foreign-key constraints:
     "x_category_product_category_fk" FOREIGN KEY (category_id)  
REFERENCES category(category_id) DEFERRABLE INITIALLY DEFERRED
     "x_category_product_product_fk" FOREIGN KEY (product_id)  
REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED



and


\d product_attribute_value
              Table "public.product_attribute_value"
            Column           |         Type          | Modifiers
----------------------------+-----------------------+-----------
attribute_id               | integer               | not null
attribute_unit_id          | integer               |
attribute_value_id         | integer               |
boolean_value              | character varying(5)  |
decimal_value              | numeric(30,10)        |
product_attribute_value_id | integer               | not null
product_id                 | integer               | not null
product_reference_id       | integer               |
status_code                | character varying(32) |
Indexes:
     "product_attribute_value_pk" PRIMARY KEY, btree  
(product_attribute_value_id)
     "product_attribute_value__attribute_id_fk_idx" btree (attribute_id)
     "product_attribute_value__attribute_unit_id_fk_idx" btree  
(attribute_unit_id)
     "product_attribute_value__attribute_value_id_fk_idx" btree  
(attribute_value_id)
     "product_attribute_value__normalized_value_idx" btree  
(normalized_value(decimal_value, attribute_unit_id))
     "product_attribute_value__prod_id_att_id_status_is_null_ids"  
btree (((product_id::text || '.'::text) || attribute_id::text)) WHERE  
status_code IS NULL
     "product_attribute_value__prod_id_att_val_id_status_is_null_idx"  
btree (((product_id::text || '.'::text) || attribute_value_id::text))  
WHERE status_code IS NULL
     "product_attribute_value__product_id_fk_idx" btree (product_id)  
CLUSTER
     "product_attribute_value__product_reference_id_fk_idx" btree  
(product_reference_id)
Foreign-key constraints:
     "product_attribute_value_attribute_fk" FOREIGN KEY  
(attribute_id) REFERENCES attribute(attribute_id) DEFERRABLE  
INITIALLY DEFERRED
     "product_attribute_value_attributeunit_fk" FOREIGN KEY  
(attribute_unit_id) REFERENCES attribute_unit(attribute_unit_id)  
DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_attributevalue_fk" FOREIGN KEY  
(attribute_value_id) REFERENCES attribute_value(attribute_value_id)  
DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_product_fk" FOREIGN KEY (product_id)  
REFERENCES product(product_id) DEFERRABLE INITIALLY DEFERRED
     "product_attribute_value_productreference_fk" FOREIGN KEY  
(product_reference_id) REFERENCES product(product_id) DEFERRABLE  
INITIALLY DEFERRED



When the query planner uses the indexes with the concatenated values  
and the where clause, the query can be sub-second response times (but  
not always depending on the category_id value). By just doing a  
regular join as you suggested, it's always slower. The trick is  
getting Postgres to use the proper index all the time. And so far the  
only way I can do that is by turning off sequential scans, but that's  
something I didn't want to do because I don't know how it would  
affect the performance of the rest of my application.

Just a note, I have random_page_cost set to 1 to try and get it to  
favour index scans. The database machine has 8GB of RAM and I have  
effective_cache_size set to 2/3 of that.

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan(at)clickspace(dot)com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On May 21, 2006, at 4:50 AM, Ragnar wrote:

> On sun, 2006-05-21 at 02:21 -0600, Brendan Duddridge wrote:
>> Hi,
>>
>>
>> I have a query that performs WAY better when I have enable_seqscan =
>> off:
>>
>>
>> explain analyze select ac.attribute_id, la.name, ac.sort_order from
>> attribute_category ac, localized_attribute la where ac.category_id =
>> 1001402 and la.locale_id = 1000001 and ac.is_browsable = 'true' and
>> la.attribute_id = ac.attribute_id and exists ( select 'x' from
>> product_attribute_value pav, category_product cp where  
>> (pav.product_id
>> || '.' || pav.attribute_id) = (cp.product_id || '.' ||
>> ac.attribute_id) and pav.status_code is null and (cp.category_id ||
>> '.' || cp.is_visible) = '1001402.true') order by (ac.sort_order is
>> null), ac.sort_order, la.name asc;
>
> is there some reason for the complicated form of the
> join conditions in the subselect?
>
> would this not be clearer:
>
> explain analyze
>   select ac.attribute_id,
>          la.name,
>          ac.sort_order
>   from attribute_category ac,
>        localized_attribute la
>   where ac.category_id = 1001402
>         and la.locale_id = 1000001
>         and ac.is_browsable = 'true'
>         and la.attribute_id = ac.attribute_id
>         and exists
>           (select 'x' from product_attribute_value pav,
>                            category_product cp
>            where pav.product_id = cp.product_id
>                  and pav.attribute_id = ac.attribute_id
>                  and pav.status_code is null
>                  and cp.category_id= '1001402'
>                  and cp.is_visible = 'true'
>           )
>    order by (ac.sort_order is null),
>              ac.sort_order,
>              la.name asc;
>
>
> possibly the planner would have a better time
> figuring out if any indexes are usable or estimating
> the subselect rowcount
>
> gnari
>
>
>
> ---------------------------(end of  
> broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>



In response to

Responses

pgsql-performance by date

Next:From: David WheelerDate: 2006-05-21 22:45:14
Subject: Re: Benchmarking Function
Previous:From: Josh BerkusDate: 2006-05-21 19:23:49
Subject: Re: Benchmarking Function

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group