Re: Reduce "Var IS [NOT] NULL" quals during constant folding

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tender Wang <tndrwang(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reduce "Var IS [NOT] NULL" quals during constant folding
Date: 2025-07-03 00:30:07
Message-ID: CAMbWs48LSd-G6LacDuVkw9eupRny=3rgNmAk+v40sLmJFfW-vQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 2, 2025 at 6:44 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> On 2/7/2025 11:14, Richard Guo wrote:
> > On Wed, Jul 2, 2025 at 4:32 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> >> Therefore, it would be better to find a way to refactor the
> >> `preprocess_relation_rtes` function to gather table statistics lazily
> >> into the hash table when they are needed. For example, we could do this
> >> at the moment of creating the `RelOptInfo` or before a subquery pull-up,
> >> without modifying the RTE at all.

> > All the catalog information collected in preprocess_relation_rtes() is
> > needed very early in the planner. I don't see how we could move that
> > logic to a later stage, such as at the moment of creating RelOptInfos
> > as you mentioned.

> I apologise for the confusion in my previous message. I am not
> suggesting that we postpone this. Instead, I would like an explanation
> of why you believe that accessing the table statistics earlier could
> negatively impact planner performance. As I mentioned before, I have
> only envisioned rare instances where join eliminations may reduce the
> number of relations and clause evaluations resulting in a constant.

I wonder how you arrived at the conclusion that these cases are rare.
If they truly are, then why have we invested so much effort in
optimizing for them?

I also wonder why you think we should collect all catalog information
at the very early stage of the planner, given that most of it is only
used much later -- after RelOptInfos have been created. If the goal
is to avoid redundant catalog retrieval for the same relation in
get_relation_info(), perhaps adding a caching mechanism within that
function would be a more targeted solution. I don't see a strong
reason for moving get_relation_info() to the very beginning of the
planner.
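
To make the caching idea concrete, here is a minimal standalone sketch of a per-relation cache that performs the expensive lookup only on first request. It is not PostgreSQL code: RelId, RelInfo, RelInfoCache, load_relation_info() and cached_relation_info() are all hypothetical names invented for illustration, and the "catalog read" is simulated by a counter. A real version inside get_relation_info() would presumably be keyed by relation OID and would have to deal with invalidation, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: none of these names exist in PostgreSQL. */
typedef unsigned int RelId;         /* plays the role of a relation OID */

typedef struct RelInfo
{
    RelId   relid;
    int     natts;                  /* placeholder for catalog-derived data */
} RelInfo;

#define CACHE_SLOTS 64              /* power of two, fixed size for the sketch */

typedef struct RelInfoCache
{
    bool    used[CACHE_SLOTS];
    RelInfo slots[CACHE_SLOTS];
    int     nused;                  /* number of occupied slots */
    int     nloads;                 /* number of simulated catalog reads */
} RelInfoCache;

/* Simulated expensive catalog read; counts how often it runs. */
RelInfo
load_relation_info(RelInfoCache *cache, RelId relid)
{
    RelInfo info;

    info.relid = relid;
    info.natts = (int) (relid % 7) + 1;     /* arbitrary fake payload */
    cache->nloads++;
    return info;
}

/* Return cached info for relid, loading it only on the first request. */
const RelInfo *
cached_relation_info(RelInfoCache *cache, RelId relid)
{
    size_t  i = relid & (CACHE_SLOTS - 1);

    /* Linear probing: stop at a matching entry or at the first free slot. */
    while (cache->used[i])
    {
        if (cache->slots[i].relid == relid)
            return &cache->slots[i];        /* hit: no catalog access */
        i = (i + 1) & (CACHE_SLOTS - 1);
    }

    /* Miss: the table never grows, so always keep one slot free. */
    assert(cache->nused < CACHE_SLOTS - 1);
    cache->slots[i] = load_relation_info(cache, relid);
    cache->used[i] = true;
    cache->nused++;
    return &cache->slots[i];
}
```

With a zero-initialized RelInfoCache, two calls for the same relid return the same pointer and trigger only one simulated load. The point is only that repeated requests for one relation can be deduplicated at the place where the information is fetched, without fetching everything up front.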

Thanks
Richard
