Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno
Date: 2019-12-16 17:00:43
Message-ID: 15848.1576515643@sss.pgh.pa.us
Lists: pgsql-hackers

I started to think a little harder about the rough ideas I sketched
yesterday in [1] about making the planner deal with outer joins in
a less ad-hoc manner. One thing that emerged very quickly is that
I was misremembering how the parser creates join alias Vars.
Consider for example

create table t1(a int, b int);
create table t2(x int, y int);

select a, t1.a, x, t2.x from t1 left join t2 on b = y;

The Vars that the parser will produce in the SELECT's targetlist have,
respectively,

:varno 3
:varattno 1

:varno 1
:varattno 1

:varno 3
:varattno 3

:varno 2
:varattno 1

(where "3" is the rangetable index of the unnamed join relation,
whose output columns are a, b, x, y, so that x is its column 3).
So as far as the parser is concerned, a "join alias" var is just
one that you named by referencing the join output column; it's
not tracking whether the value is one that's affected by the join
semantics.

What I'd like, in order to make progress with the planner rewrite,
is that all four Vars in the tlist have varno 3, showing that
they are (potentially) semantically distinct from the Vars in
the JOIN ON clause (which'd have varnos 1 and 2 in this example).
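
To make the difference concrete, here is a minimal sketch in C. It assumes
a stripped-down stand-in for the Var node (TlistVar is a hypothetical name,
not the real struct) and compares the tlist the parser emits today with
the one wanted here. The varattno values in the "wanted" array are an
assumption: they are the join output column numbers that t1.a and t2.x
would map to (1 and 3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a Var; not PostgreSQL's struct. */
typedef struct TlistVar
{
    int varno;      /* rangetable index */
    int varattno;   /* column number within that rangetable entry */
} TlistVar;

/* Rangetable index of the unnamed join relation in the example query. */
#define JOIN_RTINDEX 3

/* select a, t1.a, x, t2.x ... as the parser produces the Vars today. */
static const TlistVar tlist_today[4] = {
    {3, 1},     /* a    */
    {1, 1},     /* t1.a */
    {3, 3},     /* x    */
    {2, 1},     /* t2.x */
};

/* The same tlist under the proposal (varattno values are assumed). */
static const TlistVar tlist_wanted[4] = {
    {3, 1},     /* a    */
    {3, 1},     /* t1.a */
    {3, 3},     /* x    */
    {3, 3},     /* t2.x */
};

/* True if every Var in the list references the given rangetable entry. */
static bool
all_vars_reference_rel(const TlistVar *vars, size_t nvars, int rtindex)
{
    assert(vars != NULL);
    for (size_t i = 0; i < nvars; i++)
    {
        if (vars[i].varno != rtindex)
            return false;
    }
    return true;
}
```

Calling all_vars_reference_rel() with JOIN_RTINDEX returns false for
tlist_today and true for tlist_wanted, which is the property the planner
rewrite would like to rely on.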

This is a pretty small change as far as most of the system is
concerned; there should be no place that fails to cope with a
join alias Var, since it would already have been legal to write
a join alias Var anywhere that would change.

However, it's a bit sticky for ruleutils.c, which needs to be
able to regurgitate these Vars in their original spellings.
(This is "needs", not "wants", because there are various
conditions under which we don't have the option of spelling
it either way. For instance, if both tables expose columns
named "z", then you must write "t1.z" or "t2.z"; the columns
won't have unique names at the join level.)

What I'd like to do about that is redefine the existing
varnoold/varoattno fields as being the "syntactic" identifier
of the Var, versus the "semantic" identifier that varno/varattno
would be, and have ruleutils.c always use varnoold/varoattno
when trying to print a Var.

I think that this approach would greatly clarify what those fields
mean and how they should be manipulated --- for example, it makes
it clear that _equalVar() should ignore varnoold/varoattno, since
Vars with the same semantic meaning should be considered equal
even if they were spelled differently.
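
A rough sketch of that split, in C. Everything here is illustrative:
SketchVar, sketch_equal_var() and sketch_deparse_var() are hypothetical
names, the syntactic fields use one of the candidate spellings
(varnosyn/varattnosyn), and the name tables are hard-wired for the
t1/t2 example. The real ruleutils.c logic for deciding when to qualify
a column name is more involved than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a Var; not PostgreSQL's struct. */
typedef struct SketchVar
{
    int varno;          /* semantic: relation the value comes from */
    int varattno;       /* semantic: column within that relation */
    int varnosyn;       /* syntactic: relation as the user spelled it */
    int varattnosyn;    /* syntactic: column as the user spelled it */
} SketchVar;

/* Names for the example; index 0 is unused, index 3 is the unnamed join. */
static const char *const rel_names[4] = {NULL, "t1", "t2", NULL};
static const char *const col_names[4][5] = {
    {NULL},
    {NULL, "a", "b"},
    {NULL, "x", "y"},
    {NULL, "a", "b", "x", "y"},
};

/* Equality looks only at the semantic identifier. */
static bool
sketch_equal_var(const SketchVar *a, const SketchVar *b)
{
    return a->varno == b->varno && a->varattno == b->varattno;
}

/* Deparsing looks only at the syntactic identifier. */
static void
sketch_deparse_var(const SketchVar *v, char *buf, size_t buflen)
{
    assert(v->varnosyn >= 1 && v->varnosyn <= 3);
    assert(v->varattnosyn >= 1 && v->varattnosyn <= 4);

    const char *rel = rel_names[v->varnosyn];
    const char *col = col_names[v->varnosyn][v->varattnosyn];

    assert(col != NULL);
    if (rel != NULL)
        snprintf(buf, buflen, "%s.%s", rel, col);   /* e.g. t1.a */
    else
        snprintf(buf, buflen, "%s", col);           /* join column: a */
}
```

With this layout, the Vars for "a" and "t1.a" in the example would both
carry varno 3, varattno 1 and so compare equal, while their syntactic
fields (3,1) and (1,1) let them be printed back as "a" and "t1.a".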

While at it, I'd be inclined to rename those fields, since the
existing names aren't even consistently spelled, much less meaningful.
Perhaps "varsno/varsattno" or "varnosyn/varattnosyn".

Thoughts?

regards, tom lane

[1] https://www.postgresql.org/message-id/7771.1576452845%40sss.pgh.pa.us
