Re: POC, WIP: OR-clause support for indexes

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, jian he <jian(dot)universality(at)gmail(dot)com>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2024-04-08 00:38:36
Message-ID: ZhM8jH8gsKm5Q-9p@pryzbyj2023
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 08, 2024 at 01:34:37AM +0300, Alexander Korotkov wrote:
> Hi!
>
> On Mon, Apr 1, 2024 at 9:38 AM Andrei Lepikhov
> <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> > On 28/3/2024 16:54, Alexander Korotkov wrote:
> > > The current patch has a boolean guc enable_or_transformation.
> > > However, when we have just a few ORs to be transformated, then we
> > > should get less performance gain from the transformation and higher
> > > chances to lose a good bitmap scan plan from that. When there is a
> > > huge list of ORs to be transformed, then the performance gain is
> > > greater and it is less likely we could lose a good bitmap scan plan.
> > >
> > > What do you think about introducing a GUC threshold value: the minimum
> > > size of list to do OR-to-ANY transformation?
> > > min_list_or_transformation or something.
> > I labelled it or_transformation_limit (see in attachment). Feel free to
> > rename it.
> > It's important to note that the limiting GUC doesn't operate
> > symmetrically for forward, OR -> SAOP, and backward SAOP -> OR
> > operations. In the forward case, it functions as you've proposed.
> > However, in the backward case, we only check whether the feature is
> > enabled or not. This is due to our existing limitation,
> > MAX_SAOP_ARRAY_SIZE, and the fact that we can't match the length of the
> > original OR list with the sizes of the resulting SAOPs. For instance, a
> > lengthy OR list with 100 elements can be transformed into 3 SAOPs, each
> > with a size of around 30 elements.
> > One aspect that requires attention is the potential inefficiency of our
> > OR -> ANY transformation when we have a number of elements less than
> > MAX_SAOP_ARRAY_SIZE. This is because we perform a reverse transformation
> > ANY -> OR at the stage of generating bitmap scans. If the BitmapScan
> > path dominates, we may have done unnecessary work. Is this an occurrence
> > that we should address?
> > But the concern above may just be a point of improvement later: We can
> > add one more strategy to the optimizer: testing each array element as an
> > OR clause; we can also provide a BitmapOr path, where SAOP is covered
> > with a minimal number of partial indexes (likewise, previous version).
>
> I've revised the patch. Did some beautification, improvements for
> documentation, commit messages etc.
>
> I've pushed the 0001 patch without 0002. I think 0001 is good by
> itself given that there is the or_to_any_transform_limit GUC option.
> The more similar OR clauses are here the more likely grouping them
> into SOAP will be a win. But I've changed the default value to 5.

The sample config file has the wrong default

+#or_to_any_transform_limit = 0

We had a patch to catch this kind of error, but it was closed (which IMO
was itself an error).

--
Justin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-04-08 00:42:12 Re: Popcount optimization using AVX512
Previous Message Andres Freund 2024-04-08 00:10:13 Re: Extension Enhancement: Buffer Invalidation in pg_buffercache