Re: POC, WIP: OR-clause support for indexes

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, teodor(at)sigaev(dot)ru, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2023-10-30 14:06:41
Message-ID: CAPpHfdt58O5jzKMjkScNbGecM5JqfaMfdGLjs4rk5XisfjTNBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 30, 2023 at 3:40 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Oct 26, 2023 at 5:05 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > On Thu, Oct 26, 2023 at 12:59 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > Alexander's example seems to show that it's not that simple. If I'm
> > > reading his example correctly, with things like aid = 1, the
> > > transformation usually wins even if the number of things in the OR
> > > expression is large, but with things like aid + 1 * bid = 1, the
> > > transformation seems to lose at least with larger numbers of items. So
> > > it's not JUST the number of OR elements but also what they contain,
> > > unless I'm misunderstanding his point.
> >
> > Alexander said "Generally, I don't see why ANY could be executed
> > slower than the equivalent OR clause". I understood that this was his
> > way of expressing the following idea:
> >
> > "In principle, there is no reason to expect execution of ANY() to be
> > slower than execution of an equivalent OR clause (except for
> > noise-level differences). While it might not actually look that way
> > for every single type of plan you can imagine right now, that doesn't
> > argue for making a cost-based decision. It actually argues for fixing
> > the underlying issue, which can't possibly be due to some kind of
> > fundamental advantage enjoyed by expression evaluation with ORs".
> >
> > This is also what I think of all this.
>
> I agree with that, with some caveats, mainly that the reverse is to
> some extent also true. Maybe not completely, because arguably the
> ANY() formulation should just be straight-up easier to deal with, but
> in principle, the two are equivalent and it shouldn't matter which
> representation we pick.
>
> But practically, it may, and we need to be sure that we don't put in
> place a translation that is theoretically a win but in practice leads
> to large regressions. Avoiding regressions here is more important than
> capturing all the possible gains. A patch that wins in some scenarios
> and does nothing in others can be committed; a patch that wins in even
> more scenarios but causes serious regressions in some cases probably
> can't.

+1
Sure, I've identified two cases where patch shows regression [1]. The
first one (quadratic complexity of expression processing) should be
already addressed by usage of hash. The second one (planning
regression with Bitmap OR) is not yet addressed.

Links
1. https://www.postgresql.org/message-id/CAPpHfduJtO0s9E%3DSHUTzrCD88BH0eik0UNog1_q3XBF2wLmH6g%40mail.gmail.com

------
Regards,
Alexander Korotkov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-10-30 14:12:46 Re: Explicitly skip TAP tests under Meson if disabled
Previous Message Bruce Momjian 2023-10-30 14:01:14 Re: Why is DEFAULT_FDW_TUPLE_COST so insanely low?