Tracking notnull attributes inside Var

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Tracking notnull attributes inside Var
Date: 2022-05-15 03:11:46
Message-ID: CAKU4AWoOfVA3ZS8tnrX1dME5vgfj+Jtn_JVMRvwyC6CakX0CbA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

notnulls discussion is forked from UniqueKey stuff, you can see the
attachment
for the UnqiueKey introduction. Tom raised his opinion to track the
nullability
inside Var[1][2][3], this thread would start from there based on my
understanding.

Generally tracking the null attributes inside Var would have something like:

struct Var
{
...;
int nullable; // -1 unknown, 0 - not nullable. 1 - nullable
};

and then semantics of Var->nullable must be attached to a RelOptInfo. For
example:

CREATE TABLE t1(a int, b int);

SELECT abs(a) FROM t1 WHERE a > -100;

The var in RelOptInfo->reltarget should have nullable = 0 but the var in
RelOptInfo->baserestrictinfo should have nullable = 1; The beauty of this
are: a). It can distinguish the two situations perfectly b). Whenever we
want
to know the nullable attribute of a Var for an expression, it is super easy
to
know. In summary, we need to maintain the nullable attribute at 2 different
places. one is the before the filters are executed(baserestrictinfo,
joininfo,
ec_list at least). one is after the filters are executed
(RelOptInfo.reltarget
only?)

Come to JoinRel, we still need to maintain the 2 different cases as well.

As for the joinrel.reltarget, currently it looks up the inputrel's
reltarget to
get the Var, so it is easy to inherit from Var->nullable from inputrel, but
we need to consider the new changes introduced by current join,
Like new NOT nullable attributes because of join clauses OR new nullable
attributes because of outer join. Everything looks good for now.

The hard part is RelOptInfo.joininfo & root->eq_classes. All of them uses
the shared RestrictInfo, and it is unclear which Var->nullable should be
used in
them. To not provide a wrong answer, I think we can assume nullable=-1
(unknown)
and let the upper layer decides what to do (do we have known use cases to
use
the nullable attribute here?).

More considerations about this strategy:
1. We might use more memory for different var copies, the only known cases
RelOptInfo->reltarget for now.
2. _equalVar() has more complex semantics: shall we consider nulls or not.

My recent experience reminds me of another interesting use case of UniqueKey
which may reduce the planning time a lot IIUC (Value 3 in then attachment).
Since
PG15 has just been released, I wonder if more people have time to discuss
this topic
again. Do I think the way in the right direction?

[1] https://www.postgresql.org/message-id/1551312.1613142245%40sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/CAApHDvrRwhWCPKUD5H-EQoezHf%3DfnUUsPgTAnXsEOV8f8SF7XQ%40mail.gmail.com
[3] https://www.postgresql.org/message-id/1664320.1625577290%40sss.pgh.pa.us

--
Best Regards
Andy Fan

Attachment Content-Type Size
uniquekey.README application/octet-stream 2.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikolay Shaplov 2022-05-15 09:41:39 Re: [PATCH] New [relation] option engine
Previous Message Ranier Vilela 2022-05-14 21:46:53 Avoid unecessary MemSet call (src/backend/utils/cache/relcache.c)