Re: PG using index+filter instead only use index

From: Alexandre de Arruda Paes <adaldeia(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: PG using index+filter instead only use index
Date: 2010-03-19 21:04:46
Message-ID: fe86db181003191404i68a5521cs5976887abd51faa1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Tom,

2010/3/19 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Alexandre de Arruda Paes <adaldeia(at)gmail(dot)com> writes:
>> My question: if the cost is exactly the same, why PG choose the index
>> ict13t2 on ct13t and apply a filter instead use the primary key ?
>
> Why shouldn't it, if the estimated costs are the same?  You didn't
> actually demonstrate they're the same though.
>
> The cost estimates look a bit unusual to me; are you using nondefault
> cost parameters, and if so what are they?
>
>                        regards, tom lane
>

The non default value in cost parameters is different only in
random_page_cost that are set to 2.5 and default_statistics_target set
to 300.
I set this parameters to defaults (4 and 100) and re-analyze the
tables but results are the same.

Some more info on another table with the same behavior (ANALYZE ok in
all tables):

client=# \d ct14t
Table "public.ct14t"
Column | Type | Modifiers
------------+---------------+-----------
ct14emp04 | integer | not null
ct03emp01 | integer | not null
ct03tradut | integer | not null
ct07emp01 | integer | not null
ct07c_cust | integer | not null
ct14ano | integer | not null
ct14mes | integer | not null
ct14debito | numeric(14,2) |
ct14credit | numeric(14,2) |
ct14orcado | numeric(14,2) |
Indexes:
"ct14t_pkey" PRIMARY KEY, btree (ct14emp04, ct03emp01, ct03tradut,
ct07emp01, ct07c_cust, ct14ano, ct14mes) CLUSTER
"ad_ict14t" btree (ct14emp04, ct03emp01, ct03tradut, ct07emp01,
ct07c_cust, ct14ano, ct14mes) WHERE ct14emp04 = 2 AND ct03emp01 = 2
AND ct07emp01 = 2
"ict14t1" btree (ct07emp01, ct07c_cust)
"ict14t2" btree (ct03emp01, ct03tradut)

client=# select ct07c_cust,count(*) from ct14t group by ct07c_cust
order by count(*) DESC;
ct07c_cust | count
------------+-------
0 | 55536
99 | 14901
107 | 3094
800 | 1938
(...)

If I use any different value from '0' in the ct07c_cust field, the
planner choose the 'right' index:

client=# explain analyze SELECT ct14mes, ct14ano, ct07c_cust,
ct07emp01, ct03tradut, ct03emp01, ct14emp04, ct14debito, ct14credit
FROM ad_CT14T WHERE ct14emp04 = '2' AND ct03emp01 = '2' AND ct03tradut
= '14930' AND ct07emp01 = '2' AND ct07c_cust = '99' AND ct14ano =
'2003' AND ct14mes = '4';

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using ad_ict14t_1 on ad_ct14t (cost=0.00..5.28 rows=1
width=42) (actual time=5.504..5.504 rows=0 loops=1)
Index Cond: ((ct14emp04 = 2) AND (ct03emp01 = 2) AND (ct03tradut =
14930) AND (ct07emp01 = 2) AND (ct07c_cust = 99) AND (ct14ano = 2003)
AND (ct14mes = 4))
Total runtime: 5.548 ms
(3 rows)

With '0' in the ct07c_cust field, they choose a more slow way:

client=# explain analyze SELECT ct14mes, ct14ano, ct07c_cust,
ct07emp01, ct03tradut, ct03emp01, ct14emp04, ct14debito, ct14credit
FROM CT14T WHERE ct14emp04 = '2' AND ct03emp01 = '2' AND ct03tradut =
'57393' AND ct07emp01 = '2' AND ct07c_cust = '0' AND ct14ano = '2002'
AND ct14mes = '5';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Index Scan using ict14t1 on ct14t (cost=0.00..5.32 rows=1 width=42)
(actual time=211.007..211.007 rows=0 loops=1)
Index Cond: ((ct07emp01 = 2) AND (ct07c_cust = 0))
Filter: ((ct14emp04 = 2) AND (ct03emp01 = 2) AND (ct03tradut =
57393) AND (ct14ano = 2002) AND (ct14mes = 5))
Total runtime: 211.062 ms
(4 rows)

Again, if I create a table for test from this table (AD_CT14T) and
only create the index used in the first query plan, the results are ok
(ct07c_cust=0 / same query above):

client=# create table ad_ct14t as select * from ct14t;
SELECT
client=# create index ad_ict14t_abc on ad_ct14t(ct14emp04, ct03emp01,
ct03tradut, ct07emp01, ct07c_cust, ct14ano, ct14mes) where ct14emp04 =
'2' AND ct03emp01 = '2' AND ct07emp01 = '2';
CREATE
client=# explain analyze SELECT ct14mes, ct14ano, ct07c_cust,
ct07emp01, ct03tradut, ct03emp01, ct14emp04, ct14debito, ct14credit
FROM AD_CT14T WHERE ct14emp04 = '2' AND ct03emp01 = '2' AND ct03tradut
= '57393' AND ct07emp01 = '2' AND ct07c_cust = '0' AND ct14ano =
'2002' AND ct14mes = '5';

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using ad_ict14t_abc on ad_ct14t (cost=0.00..5.28 rows=1
width=42) (actual time=0.043..0.043 rows=0 loops=1)
Index Cond: ((ct14emp04 = 2) AND (ct03emp01 = 2) AND (ct03tradut =
57393) AND (ct07emp01 = 2) AND (ct07c_cust = 0) AND (ct14ano = 2002)
AND (ct14mes = 5))
Total runtime: 0.091 ms
(3 rows)

I don't know why the planner prefer to use a less specific index
(ict14t1) and do a filter than use an index that matches with the
WHERE parameter...

Best regards,

Alexandre

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kenneth Marshall 2010-03-19 21:16:38 Re: GiST index performance
Previous Message Yeb Havinga 2010-03-19 20:49:30 Re: GiST index performance