Re: PG using index+filter instead only use index

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexandre de Arruda Paes <adaldeia(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PG using index+filter instead only use index
Date: 2010-03-20 02:15:56
Message-ID: 28832.1269051356@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Alexandre de Arruda Paes <adaldeia(at)gmail(dot)com> writes:
> 2010/3/19 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> The cost estimates look a bit unusual to me; are you using nondefault
>> cost parameters, and if so what are they?

> The non default value in cost parameters is different only in
> random_page_cost that are set to 2.5 and default_statistics_target set
> to 300.

Okay, so with random_page_cost = 2.5, those cost estimates definitely
indicate that it's expecting only one heap tuple to be visited, for
either choice of index.

> I don't know why the planner prefer to use a less specific index
> (ict14t1) and do a filter than use an index that matches with the
> WHERE parameter...

The cost estimate formulas bias the system against using a larger index
when a smaller one will do. That seven-column index is probably at
least twice as large as the two-column index, so it's hardly
unreasonable to assume that scanning it will take more I/O and cache
space and CPU time than using a smaller index, if all else is equal.
Now of course all else is not equal if the smaller index is less
selective than the larger one, but the cost estimates indicate that the
planner thinks the two-column index condition is sufficient to narrow
things down to only one heap tuple anyway.

The fact that the smaller index is actually slower indicates that this
estimate is off, ie (ct07emp01 = 2) AND (ct07c_cust = 0) actually
selects more than one heap tuple. It's hard to speculate about why that
estimate is wrong on the basis of the information you've shown us
though. Perhaps there is a strong correlation between the values of
those two columns?

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Reydan Cankur 2010-03-20 04:50:38 pgbench installation
Previous Message Kenneth Marshall 2010-03-19 21:16:38 Re: GiST index performance