Re: POC, WIP: OR-clause support for indexes

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers(at)postgresql(dot)org, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, teodor(at)sigaev(dot)ru, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2023-11-06 13:51:45
Message-ID: 16a7e3aa-24c0-4986-8820-ea2857bd7b6b@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30.10.2023 17:06, Alexander Korotkov wrote:
> On Mon, Oct 30, 2023 at 3:40 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Oct 26, 2023 at 5:05 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>>> On Thu, Oct 26, 2023 at 12:59 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>> Alexander's example seems to show that it's not that simple. If I'm
>>>> reading his example correctly, with things like aid = 1, the
>>>> transformation usually wins even if the number of things in the OR
>>>> expression is large, but with things like aid + 1 * bid = 1, the
>>>> transformation seems to lose at least with larger numbers of items. So
>>>> it's not JUST the number of OR elements but also what they contain,
>>>> unless I'm misunderstanding his point.
>>> Alexander said "Generally, I don't see why ANY could be executed
>>> slower than the equivalent OR clause". I understood that this was his
>>> way of expressing the following idea:
>>>
>>> "In principle, there is no reason to expect execution of ANY() to be
>>> slower than execution of an equivalent OR clause (except for
>>> noise-level differences). While it might not actually look that way
>>> for every single type of plan you can imagine right now, that doesn't
>>> argue for making a cost-based decision. It actually argues for fixing
>>> the underlying issue, which can't possibly be due to some kind of
>>> fundamental advantage enjoyed by expression evaluation with ORs".
>>>
>>> This is also what I think of all this.
>> I agree with that, with some caveats, mainly that the reverse is to
>> some extent also true. Maybe not completely, because arguably the
>> ANY() formulation should just be straight-up easier to deal with, but
>> in principle, the two are equivalent and it shouldn't matter which
>> representation we pick.
>>
>> But practically, it may, and we need to be sure that we don't put in
>> place a translation that is theoretically a win but in practice leads
>> to large regressions. Avoiding regressions here is more important than
>> capturing all the possible gains. A patch that wins in some scenarios
>> and does nothing in others can be committed; a patch that wins in even
>> more scenarios but causes serious regressions in some cases probably
>> can't.
> +1
> Sure, I've identified two cases where patch shows regression [1]. The
> first one (quadratic complexity of expression processing) should be
> already addressed by usage of hash. The second one (planning
> regression with Bitmap OR) is not yet addressed.
>
> Links
> 1. https://www.postgresql.org/message-id/CAPpHfduJtO0s9E%3DSHUTzrCD88BH0eik0UNog1_q3XBF2wLmH6g%40mail.gmail.com
>
I also support this approach. I have almost finished writing a patch
that fixes the first problem related to the quadratic complexity of
processing expressions by adding a hash table.

I also added a check: if the number of groups is equal to the number of
OR expressions, we assume that no expressions need to be converted and
interrupt further execution.

Now I am trying to fix the last problem in this patch: three tests have
indicated a problem related to incorrect conversion. I don't think it
can be serious, but I haven't figured out where the mistake is yet.

I added log like that: ERROR:  unrecognized node type: 0.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
v8.1-Replace-OR-clause-to-ANY-expressions.patch text/x-patch 35.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2023-11-06 14:00:57 Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)
Previous Message Pavel Borisov 2023-11-06 13:42:38 Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)