Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>, Floris Van Nee <florisvannee(at)optiver(dot)com>
Subject: Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)
Date: 2021-03-04 16:03:11
Message-ID: 20210304160311.6tyl2lxx5e2oirye@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Thu, Feb 18, 2021 at 08:58:13PM +0800, Andy Fan wrote:

Thanks for continuing work on this patch!

> On Tue, Feb 16, 2021 at 12:01 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> > On Fri, 12 Feb 2021 at 15:18, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Feb 12, 2021 at 9:02 AM David Rowley <dgrowleyml(at)gmail(dot)com>
> > wrote:
> > >> The reason I don't really like this is that it really depends where
> > >> you want to use RelOptInfo.notnullattrs. If someone wants to use it
> > >> to optimise something before the base quals are evaluated then they
> > >> might be unhappy that they found some NULLs.
> > >>
> > >
> > > Do you mean the notnullattrs is not set correctly before the base quals
> > are
> > > evaluated? I think we have lots of data structures which are set just
> > after some
> > > stage. but notnullattrs is special because it is set at more than 1
> > stage. However
> > > I'm doubtful it is unacceptable, Some fields ever change their meaning
> > at different
> > > stages like Var->varno. If a user has a misunderstanding on it, it
> > probably will find it
> > > at the testing stage.
> >
> > You're maybe focusing too much on your use case for notnullattrs. It
> > only cares about NULLs in the result for each query level.
> >
> > .... thinks of an example...
> >
> > OK, let's say I decided that COUNT(*) is faster than COUNT(id) so
> > decided that I might like to write a patch which rewrite the query to
> > use COUNT(*) when it was certain that "id" could not contain NULLs.
> >
> > The query is:
> >
> > SELECT p.partid, p.partdesc,COUNT(s.saleid) FROM part p LEFT OUTER
> > JOIN sales s ON p.partid = s.partid GROUP BY p.partid;
> >
> > sale.saleid is marked as NOT NULL in pg_attribute. As the writer of
> > the patch, I checked the comment for notnullattrs and it says "Not
> > null attrs, start from -FirstLowInvalidHeapAttributeNumber", so I
> > should be ok to assume since sales.saleid is marked in notnullattrs
> > that I can rewrite the query?!
> >
> > The documentation about the RelOptInfo.notnullattrs needs to be clear
> > what exactly it means. I'm not saying your representation of how to
> > record NOT NULL in incorrect. I'm saying that you need to be clear
> > what exactly is being recorded in that field.
> >
> > If you want it to mean "attribute marked here cannot output NULL
> > values at this query level", then you should say something along those
> > lines.
> >
> > However, having said that, because this is a Bitmapset of
> > pg_attribute.attnums, it's only possible to record Vars from base
> > relations. It does not seem like you have any means to record
> > attributes that are normally NULLable, but cannot produce NULL values
> > due to a strict join qual.
> >
> > e.g: SELECT t.nullable FROM t INNER JOIN j ON t.nullable = j.something;
> >
> > I'd expect the RelOptInfo for t not to contain a bit for the
> > "nullable" column, but there's no way to record the fact that the join
> > RelOptInfo for {t,j} cannot produce a NULL for that column. It might
> > be quite useful to know that for the UniqueKeys patch.
> >
>
> I checked again and found I do miss the check on JoinExpr->quals. I have
> fixed it in v3 patch. Thanks for the review!
>
> In the attached v3, commit 1 is the real patch, and commit 2 is just add
> some logs to help local testing. notnull.sql/notnull.out is the test case
> for
> this patch, both commit 2 and notnull.* are not intended to be committed
> at last.

Just to clarify, this version of notnullattrs here is the latest one,
and another one from "UniqueKey on Partitioned table" thread should be
disregarded?

> Besides the above fix in v3, I changed the comments alongs the notnullattrs
> as below and added a true positive helper function is_var_nullable.

With "true positive" you mean it will always correctly say if a Var is
nullable or not? I'm not sure about this, but couldn't be there still
some cases when a Var belongs to nullable_baserels, but still has some
constraints preventing it from being nullable (e.g. a silly example when
the not nullable column belong to the table, and the query does full
join of this table on itself using this column)?

Is this function necessary for the following patches? I've got an
impression that the discussion in this thread was mostly evolving about
correct description when notnullattrs could be used, not making it
bullet proof.

> Bitmapset *notnullattrs;

It looks like RelOptInfo has its own out function _outRelOptInfo,
probably the notnullattrs should be also present there as BITMAPSET_FIELD?

As a side note, I've attached those two new threads to CF item [1],
hopefully it's correct.

[1]: https://commitfest.postgresql.org/32/2433/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2021-03-04 16:10:19 Re: buildfarm windows checks / tap tests on windows
Previous Message Fujii Masao 2021-03-04 16:02:25 Re: About to add WAL write/fsync statistics to pg_stat_wal view