Re: [NOVICE] WHERE clause not used when index is used

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tobias Florek <postgres(at)ibotty(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [NOVICE] WHERE clause not used when index is used
Date: 2016-03-01 18:32:10
Message-ID: 18923.1456857130@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-novice

Petr Jelinek <petr(at)2ndquadrant(dot)com> writes:
> On 01/03/16 18:37, Tom Lane wrote:
>> However, I'm not sure that's 100% of the issue, because in playing around
>> with this I was having a harder time reproducing the failure outside of
>> Tobias' example than I expected. There may be more than one bug, or there
>> may be other changes that sometimes mask the problem.

> I can only get the issue when the sort order of the individual keys does
> not correlate and the operator sorts according to the first column and
> there are duplicate values for the first column.

Yeah, I think the combination of ASC and DESC columns may be necessary to
break things. It needs closer analysis.

There is another behavorial difference between 9.4 and 9.5, which is that
the planner's costing of scans of this sort seems to have changed. I can
reproduce the problem now in the regression database:

regression=# select count(*) from (select * from tenk1 where (thousand,tenthous) < (9,5000) order by thousand desc, tenthous asc) ss;
count
-------
95 -- correct answer
(1 row)

regression=# create index on tenk1(thousand desc,tenthous asc);
CREATE INDEX
regression=# select count(*) from (select * from tenk1 where (thousand,tenthous) < (9,5000) order by thousand desc, tenthous asc) ss;
count
-------
100 -- WRONG
(1 row)

What was confusing me is that the plan's changed: HEAD gives

Aggregate (cost=7.29..7.29 rows=1 width=0)
-> Index Only Scan using tenk1_thousand_tenthous_idx on tenk1 (cost=0.29..6.04 rows=100 width=8)
Index Cond: (ROW(thousand, tenthous) < ROW(9, 5000))

whereas 9.4 prefers

Aggregate (cost=232.50..232.51 rows=1 width=0)
-> Sort (cost=231.00..231.25 rows=100 width=244)
Sort Key: tenk1.thousand, tenk1.tenthous
-> Bitmap Heap Scan on tenk1 (cost=5.06..227.67 rows=100 width=244)
Recheck Cond: (ROW(thousand, tenthous) < ROW(9, 5000))
-> Bitmap Index Scan on tenk1_thousand_tenthous_idx (cost=0.00..5.04 rows=100 width=0)
Index Cond: (ROW(thousand, tenthous) < ROW(9, 5000))

However you can force 9.4 to do it the same as HEAD by setting enable_sort
to zero:

Aggregate (cost=359.27..359.28 rows=1 width=0)
-> Index Scan using tenk1_thousand_tenthous_idx on tenk1 (cost=0.29..358.02 rows=100 width=244)
Index Cond: (ROW(thousand, tenthous) < ROW(9, 5000))

But 9.4 correctly answers "95" with either plan, and 9.5 gives the wrong
answer with either plan, so the plan change is not the cause of the bug.

I'm not sure if the costing change is a bug or not --- the non-bitmap scan
does seem to be cheaper in reality, but not by a couple orders of
magnitude as the planner now thinks.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-03-01 18:33:02 Re: Fixing wrong comment on PQmblen and PQdsplen.
Previous Message Robert Haas 2016-03-01 18:28:25 Re: RFC: replace pg_stat_activity.waiting with something more descriptive

Browse pgsql-novice by date

  From Date Subject
Next Message Tom Lane 2016-03-01 18:47:05 Re: [NOVICE] WHERE clause not used when index is used
Previous Message Petr Jelinek 2016-03-01 18:01:58 Re: [NOVICE] WHERE clause not used when index is used