Re: BUG #6278: Index scans on '>' condition on field with many NULLS

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Maksym Boguk <maxim(dot)boguk(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #6278: Index scans on '>' condition on field with many NULLS
Date: 2011-10-31 16:43:48
Message-ID: 28958.1320079428@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I wrote:
> I poked at this a bit last night. The reason it's happening is that the
> ">" key is only marked SK_BT_REQBKWD, not SK_BT_REQFWD, so _bt_checkkeys
> doesn't think it can stop when it hits the NULLs. Right at the moment
> it seems like we could mark that key with both flags, which leads to the
> conclusion that two flags are unnecessary and we could get by with only
> one direction-independent flag.

OK, I swapped some of this knowledge back in. The above idea would work
in simple cases with non-redundant scan keys, but in general not.
Consider an indexscan on x with

WHERE x > 4

We'll start scanning at the x>4 boundary point and run forward. The
scankey condition can only fail once we reach NULL index entries, at
which point we can stop, so in this case marking this key SK_BT_REQFWD
would be OK. However, consider something like

WHERE x > 4 AND x > 10

where _bt_preprocess_keys is unable to determine which key is more
restrictive. (This could only happen if the comparison constants are of
different types and we don't have a suitable cross-type comparison
operator in the opfamily. None of the standard opfamilies have such
omissions, which is why there's no regression test covering this.)
In such a case, _bt_first arbitrarily picks one of the keys to do the
initial positioning with. If it picks x>4, then the x>10 scankey will
fail until we reach index entries > 10 ... but we can't abandon the
scan just because x>10 is failing. This is why we have to have
direction-sensitive required-key flags. (If we were to start backing
up, failure of either of the redundant keys would indeed be
justification for stopping.)

We could possibly do something with marking only non-redundant >/>=
scankeys as SK_BT_REQFWD, and conversely for </<= keys. But this
looks like it'd be a pain to manage in _bt_preprocess_keys, and given
the difficulty of testing cases where it matters, I'm not excited about
going that direction.

What I'm currently thinking is that we could hack _bt_checkkeys to
stop the scan when it reaches NULLs if the current scankey is not an
ISNULL test and is marked *either* SK_BT_REQFWD or SK_BT_REQBKWD.
In such a case we know that the scankey cannot be satisfied by a NULL,
so it will fail until we reach the next range of this index column,
ie move to the next value of the next higher-order column (if any).
And that value can't pass the scankeys, because it must have an equality
constraint, else this key wouldn't be marked required.

BTW, I had thought that Maksym could work around this with something
like
WHERE value > 0.9999 and value < 1.1
or
WHERE value > 0.9999 and value is not null
but it turns out that neither of those work. The second key will be
marked SK_BT_REQFWD, but because _bt_checkkeys stops testing as soon
as any key has failed, it doesn't reach the second key and doesn't
realize that there's a failing SK_BT_REQFWD key available. That's
a little bit annoying, but the only clear fix would be to keep testing
keys after we already know the index entry has failed, and I think
that's probably going to be a net loss. If we fix the NULL case to
accept either flag, then the only case where this will matter is
redundant scan keys, and that's not a case that I think we should
optimize at the cost of pessimizing normal cases.

A patch along these lines should be pretty localized. Has anyone
got an opinion on whether to back-patch or not? This seems like a
performance bug, but given the lack of prior complaints, maybe we
shouldn't take any risks for it.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Pavel Golub 2011-10-31 17:30:12 Re: BUG #6279: quoting needed for column name with non ascii chars
Previous Message Robert Haas 2011-10-31 16:26:59 Re: BUG #6279: quoting needed for column name with non ascii chars