Re: POC, WIP: OR-clause support for indexes

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2024-03-13 11:05:41
Message-ID: CAPpHfdsADL+agHnnjvi_42hNJWQdHK76SVPbNjsqAyHzxOxv8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 13, 2024 at 7:52 AM Andrei Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> On 12/3/2024 22:20, Alexander Korotkov wrote:
> > On Mon, Mar 11, 2024 at 2:43 PM Andrei Lepikhov
> >> I think you are right. It is probably a better place than any other to
> >> remove duplicates in an array. I just think we should sort and remove
> >> duplicates from entry->consts in one pass. Thus, this optimisation
> >> should be applied to sortable constants.
> >
> > Ok.
> New version of the patch set implemented all we have agreed on for now.
> We can return MAX_SAOP_ARRAY_SIZE constraint and Alena's approach to
> duplicates deletion for non-sortable cases at the end.
> >
> >> Hmm, we already tried to do it at that point. I vaguely recall some
> >> issues caused by this approach. Anyway, it should be done as quickly as
> >> possible to increase the effect of the optimization.
> >
> > I think there were provided quite strong reasons why this shouldn't be
> > implemented at the parse analysis stage [1], [2], [3]. The
> > canonicalize_qual() looks quite appropriate place for that since it
> > does similar transformations.
> Ok. Let's discuss these reasons. In Robert's opinion [1,3], we should do
> the transformation based on the cost model. But in the canonicalize_qual
> routine, we still make the transformation blindly. Moreover, the second
> patch reduces the weight of this reason, doesn't it? Maybe we shouldn't
> think about that as about optimisation but some 'general form of
> expression'?
> Peter [2] worries about the possible transformation outcomes at this
> stage. But remember, we already transform clauses like ROW() IN (...) to
> a series of ORs here, so it is not an issue. Am I wrong?
> Why did we discard the attempt with canonicalize_qual on the previous
> iteration? - The stage of parsing is much more native for building SAOP
> quals. We can reuse make_scalar_array_op and other stuff, for example.
> During the optimisation stage, the only list partitioning machinery
> creates SAOP based on a list of constants. So, in theory, it is possible
> to implement. But do we really need to make the code more complex?

As we currently do OR-to-ANY transformation at the parse stage, the
system catalog (including views, inheritance clauses, partial and
expression indexes, and others) would have a form depending on
enable_or_transformation at the moment of DDL execution. I think this
is rather wrong. The enable_or_transformation should be run-time
optimization which affects the resulting query plan, its result
shouldn't be persistent.

Regarding the ROW() IN (...) precedent.

1. AFAICS, this is not exactly an optimization. This transformation
allows us to perform type matching individually for every value.
Therefore it allows the execute some queries which otherwise would end
up with error.
2. I don't think this is a sample of good design. This is rather
hack, which is historically here, but we don't want to replicate this
experience.

Given all of the above, I think moving transformation to the
canonicalize_qual() would be the right way to go.

------
Regards,
Alexander Korotkov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-03-13 11:28:09 Re: Have pg_basebackup write "dbname" in "primary_conninfo"?
Previous Message John Naylor 2024-03-13 11:04:51 Re: [PoC] Improve dead tuple storage for lazy vacuum