Re: POC, WIP: OR-clause support for indexes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, teodor(at)sigaev(dot)ru, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject: Re: POC, WIP: OR-clause support for indexes
Date: 2023-10-25 19:54:12
Message-ID: CA+Tgmob=ebuCHFSw327b55DJzE3JtOuZ5owxob+MgErb4me_Ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Oct 14, 2023 at 6:37 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> Regarding the GUC parameter, I don't see we need a limit. It's not
> yet clear whether a small number or a large number of OR clauses are
> more favorable for transformation. I propose to have just a boolean
> enable_or_transformation GUC.

That's a poor solution. So is the GUC patch currently has
(or_transform_limit). What you need is a heuristic that figures out
fairly reliably whether the transformation is going to be better or
worse. Or else, do the whole thing in some other way that is always
same-or-better.

In general, adding GUCs makes sense when the user knows something that
we can't know. For example, shared_buffers makes some sense because,
even if we discovered how much memory the machine has, we can't know
how much of it the user wants to devote to PostgreSQL as opposed to
anything else. And track_io_timing makes sense because we can't know
whether the user wants to pay the price of gathering that additional
data. But GUCs are a poor way of handling cases where the real problem
is that we don't know what code to write. In this case, some queries
will be better with enable_or_transformation=on, and some will be
better with enable_or_transformation=off. Since we don't know which
will work out better, we make the user figure it out and set the GUC,
possibly differently for each query. That's terrible. It's the query
optimizer's whole job to figure out which transformations will speed
up the query. It shouldn't turn around and punt the decision back to
the user.

Notice that superficially-similar GUCs like enable_seqscan aren't
really the same thing at all. That's just for developer testing and
debugging. Nobody expects that you have to adjust that GUC on a
production system - ever.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-10-25 20:02:01 Re: Remove dead code in pg_ctl.c
Previous Message Tom Lane 2023-10-25 19:43:44 Re: Making aggregate deserialization (and WAL receive) functions slightly faster