Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno
Date: 2019-12-20 16:12:53
Message-ID: 26144.1576858373@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I started to think a little harder about the rough ideas I sketched
> yesterday in [1] about making the planner deal with outer joins in
> a less ad-hoc manner.
> [1] https://www.postgresql.org/message-id/7771.1576452845%40sss.pgh.pa.us

After further study, the idea of using join alias vars to track
outer-join semantics basically doesn't work at all. Join alias vars in
their current form (ie, references to the output columns of a JOIN RTE)
aren't suitable for the purpose of representing possibly-nulled inputs
to that same RTE. There are two big stumbling blocks:

* In the presence of JOIN USING, we don't necessarily have a JOIN output
column that is equivalent to either input column. The output is
definitely not equal to the nullable side of an OJ, since it won't go to
NULL; and it might not be equivalent to the non-nullable side either,
because JOIN USING might've coerced it to some common datatype.

* We also don't have any output column that could represent a whole-row
reference to either input table. I thought about representing that with
a RowExpr of join output Vars, but that fails to preserve the existing
semantics: a whole-row reference to the nullable side goes to NULL, not
to a row of NULLs, when we're null-extending the join.

So that kind of crashed and burned. We could maybe fake it by inventing
some new conventions about magic attnums of the join RTE that correspond
to the values we want, but that seems really messy and error-prone.

The alternatives that seem plausible at this point are

(1) Create some sort of wrapper node indicating "the contents of this
expression might be replaced by NULL". This is basically what the
planner's PlaceHolderVars do, so maybe we'd just be talking about
introducing those at some earlier stage.

(2) Explicitly mark Vars as being nullable by some outer join. I think
we could probably get this down to one additional integer field in
struct Var, so it wouldn't be too much of a penalty.

The wrapper approach is more general since you can wrap something
that's not necessarily a plain Var; but it's also bulkier and so
probably a bit less efficient. I'm not sure which idea I like better.

With either approach, we could either make parse analysis inject the
nullability markings, or wait to do it in the planner. On a purely
abstract system structural level, I like the former better: it is
exactly the province of parse analysis to decide what are the semantics
of what the user typed, and surely what a Var means is part of that.
OTOH, if we do it that way, the planner potentially has to rearrange the
markings after it does join strength reduction; so maybe it's best to
just wait till after that planning phase to address this at all.

Any thoughts about that?

Anyway, I had started to work on getting parse analysis to label
outer-join-nullable Vars properly, and soon decided that no matter how
we do it, there's not enough information available at the point where
parse analysis makes a Var. The referenced RTE is not, in itself,
enough info, and I don't think we want to decorate RTEs with more info
that's only needed during parse analysis. What would be saner is to add
any extra info to the ParseNamespaceItem structs. But that requires
some refactoring to allow the ParseNamespaceItems, not just the
referenced RTEs, to be passed down through Var lookup/construction.
So attached is a patch that refactors things that way. As proof of
concept, I added the rangetable index to ParseNamespaceItem, and used
that to get rid of the RTERangeTablePosn() searches that we formerly had
in a bunch of places. Now, RTERangeTablePosn() isn't likely to be all
that expensive, but still this should be a little faster and cleaner.
Also, I was able to confine the fuzzy-lookup heuristic stuff to within
parse_relation.c instead of letting it bleed out to the rest of the
parser.

This seems to me to be good cleanup regardless of whether we ever
ask parse analysis to label outer-join-nullable Vars. So, barring
objection, I'd like to push it soon.

regards, tom lane

Attachment Content-Type Size
clean-up-parser-Var-creation-1.patch text/x-diff 47.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-12-20 17:11:31 Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno
Previous Message Mark Lorenz 2019-12-20 16:02:01 Re: Created feature for to_date() conversion using patterns 'YYYY-WW', 'YYYY-WW-D', 'YYYY-MM-W' and 'YYYY-MM-W-D'