From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Richard Guo <guofenglinux(at)gmail(dot)com>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "bruce(at)momjian(dot)us" <bruce(at)momjian(dot)us>, lepihov(at)gmail(dot)com |
Subject: | Re: plan shape work |
Date: | 2025-09-12 15:08:51 |
Message-ID: | 2129114.1757689731@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Sep 11, 2025 at 2:19 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> If we're going to attach more labeling to the plan nodes, I'd
>> prefer to do what I suggested and label the nodes with the specific
>> outer join that they think they are implementing. With Richard's
>> proposal it will remain impossible to tell which node is doing what.
> Conceptually, I prefer your idea of one RTI per join node, but I don't
> understand how to make it work. Let's say that, as in Richard's
> example, the query is written as (A leftjoin B on (Pab)) leftjoin C on
> (Pbc) but we end up with a plan tree that looks like this:
> Something Join (RTIs: 1 2 3 4 5)
>   -> Scan on A (RTI: 1)
>   -> Whatever Join (RTIs: 2 4)
>        -> Scan on B (RTI: 2)
>        -> Scan on C (RTI: 4)
After thinking about this for a while, I believe that Richard and I
each had half of the right solution ;-). Let me propose some new
terminology in hopes of clarifying matters:
* A join plan node "starts" an outer join if it performs the
null-extension step corresponding to that OJ (specifically,
if it is the first join doing null-extension over the minimum
RHS of that OJ).
* A join plan node "completes" an outer join if its output
nulls all the values that that OJ should null when done
according to syntactic order.
In simple cases where we have not applied OJ identity 3, every
outer-join plan node starts and completes a single OJ relid.
But if we have applied identity 3 in the forward direction,
as per your example above, it's different. The physically
lower join node starts OJ 5, but doesn't complete it. The
upper node starts OJ 3, and completes both 3 and 5. I think
that it's possible for the topmost join to complete more than
two OJs, if we have a nest of multiple OJs that can all be
re-ordered via identity 3.
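For illustration only, the labeling of the example plan can be sketched as a toy model. This is not PostgreSQL code: the `PlanNode` class, its `starts` and `completes` fields, and the helper functions are hypothetical names invented for this sketch. It also assumes that every OJ relid is started by exactly one node and completed by exactly one node.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanNode:
    """Toy plan node; field names are hypothetical, not PostgreSQL's."""
    name: str
    rtis: frozenset                      # relids covered by this node
    starts: Optional[int] = None         # OJ relid this join starts, if any
    completes: frozenset = frozenset()   # OJ relids this join completes
    children: list = field(default_factory=list)


def walk(node):
    """Yield the node and all of its descendants, parents first."""
    yield node
    for child in node.children:
        yield from walk(child)


# The example plan: A leftjoin (B leftjoin C), after applying identity 3.
scan_a = PlanNode("Scan on A", frozenset({1}))
scan_b = PlanNode("Scan on B", frozenset({2}))
scan_c = PlanNode("Scan on C", frozenset({4}))
lower = PlanNode("Whatever Join", frozenset({2, 4}),
                 starts=5, children=[scan_b, scan_c])
upper = PlanNode("Something Join", frozenset({1, 2, 3, 4, 5}),
                 starts=3, completes=frozenset({3, 5}),
                 children=[scan_a, lower])


def label_summary(root):
    """List (name, starts, completes) for every join node in the tree."""
    return [(n.name, n.starts, sorted(n.completes))
            for n in walk(root) if n.children]


def check_labels(root, oj_relids):
    """Assumed invariant: each OJ is started once and completed once."""
    started = sorted(n.starts for n in walk(root) if n.starts is not None)
    completed = sorted(r for n in walk(root) for r in n.completes)
    return started == sorted(oj_relids) and completed == sorted(oj_relids)
```

Calling `label_summary(upper)` lists the lower join as starting OJ 5 with an empty completion set, and the upper join as starting OJ 3 and completing both 3 and 5.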
I was arguing for labeling plan nodes according to which OJ they
start (always a unique relid). Richard was arguing for labeling
according to which OJ(s) they complete (zero, one, or more relids).
But I now think it's probably worth doing both. We need the
completion bitmapsets if we want to cross-check Var nullingrels,
because those correspond to the nullingrels that should get added
at each join's output. I think that we also want the start labels
though. For one thing, if the start nodes are not identified,
it's impossible to understand how much of the tree is the "no
man's land" where a C variable may or may not have gone to null
on its way to becoming a C* variable. But in general I think
that we'll want to be able to identify an outer-join plan node
even if it does not complete its OJ.
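A rough sketch of why both labels are useful, again as a toy model and not PostgreSQL code: the `Node` class and the `annotate` function are hypothetical. It walks the example plan bottom-up and reports, for each join, the nullingrels added at its output (its completion set) and the OJs that have been started below but not yet completed, which marks the "no man's land".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Toy plan node; field names are hypothetical, not PostgreSQL's."""
    name: str
    starts: Optional[int] = None         # OJ relid this join starts, if any
    completes: frozenset = frozenset()   # OJ relids this join completes
    children: list = field(default_factory=list)


def annotate(node, report):
    """Post-order walk over the plan tree.

    Returns the (started, completed) OJ relid sets for the subtree and
    appends (name, nullingrels_added, pending) to report for each join,
    where pending = started but not yet completed at this node's output.
    """
    started, completed = set(), set()
    for child in node.children:
        s, c = annotate(child, report)
        started |= s
        completed |= c
    if node.starts is not None:
        started.add(node.starts)
    completed |= node.completes
    if node.children:  # only join nodes are reported
        report.append((node.name,
                       sorted(node.completes),
                       sorted(started - completed)))
    return started, completed


# Example plan: the lower join starts OJ 5; the upper join starts OJ 3
# and completes both 3 and 5.
lower = Node("Whatever Join", starts=5,
             children=[Node("Scan on B"), Node("Scan on C")])
upper = Node("Something Join", starts=3, completes=frozenset({3, 5}),
             children=[Node("Scan on A"), lower])

report = []
annotate(upper, report)
for name, added, pending in report:
    print(f"{name}: adds nullingrels {added}, pending OJs {pending}")
```

In this model the lower join's output has OJ 5 pending, and nothing is pending above the upper join. Without the start label on the lower join, the pending set could not be computed from the completion sets alone.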
regards, tom lane