Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proving IS NOT NULL inference for ScalarArrayOpExpr's
Date: 2019-01-14 14:05:17
Message-ID: CAAaqYe_YuJ++pnUUE6u1r2ak62tHHvQ0=Jg=L-A2U-kTjSOHZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Sun, Jan 13, 2019 at 3:06 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> There's still a logical problem here, which is that in order to
> prove that the ScalarArrayOpExpr yields NULL when its LHS does,
> you also have to prove that the RHS is not an empty array.
>
> Otherwise you're up against the fact that the OR of zero boolean
> expressions is false not null:
>
> regression=# select null::int = any(array[]::int[]);
> ?column?
> ----------
> f
> (1 row)

I've constructed a test case running the test_predtest function on the
expression
"select x is null, x = any(array[]::int[]) from integers" which I
think gets at the logic bug you're noting. In doing this I noticed
something I don't fully understand: on master this shows that "x =
any(array[]::int[])" proves "x is null" (while my patch shows the
opposite). Is this because the second expression being false means
there's can be no valid value for x? It also claims it refutes the
same, which is even more confusing.

> While this is doable when the RHS is an array constant or ArrayExpr, I'm
> afraid it's likely to be a bit tedious. (Hm, actually predicate_classify
> already has logic to extract the number of elements in these cases ...
> I wonder if there's any use in trying to share code?)

It seems to me that this would mean that an IS NOT NULL index would
still not be proven to be usable for a non-constant array (e.g., from
a subquery calculated array), and in fact I've included a (failing)
test demonstrating that fact in the attached patch. This feeds into my
question above about there possibly being another corollary/proof that
could be added for this case.

Semi-related: I noticed that predicate_classify handles an ArrayExpr
by using list_length; this seems unnecessary if we know we can fail
fast. I realize it's not a common problem, but I have seen extremely
long arrays before, and maybe we should fall out of the count once we
hit max+1 items. My new code also has this "problem", but I figured it
was worth getting this logically correct before attempting to optimize
that especially since it already exists in one place.

> Minor stylistic quibble: I don't like where you inserted the new code in
> clause_is_strict_for, because it renders the comment a few lines above
> that unrelatable to the code. I'd put it at the bottom, after the
> is_funcclause stanza.

Fixed in this revision.

James Coleman

Attachment Content-Type Size
saop_is_not_null-v4.patch application/octet-stream 8.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2019-01-14 14:37:20 Re: insensitive collations
Previous Message Amit Khandekar 2019-01-14 13:06:59 Re: Pluggable Storage - Andres's take