Proving IS NOT NULL inference for ScalarArrayOpExpr's

From: James Coleman <jtc331(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Proving IS NOT NULL inference for ScalarArrayOpExpr's
Date: 2018-11-10 21:33:26
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

I've recently been investigating improving our plans for queries like:
SELECT * FROM t WHERE IN (1, 2..1000);
where the table "t" has a partial index on "foo" where "foo IS NOT NULL".

Currently the planner generates an index [only] scan so long as the number
items in the IN expression is <= 100, but as soon as you add the 101st item
it reverts to seq scan. If we add the explicit null check like:
SELECT * FROM t WHERE IN (1, 2..1000) AND foo IS NOT NULL;
then we go back to the desired index scan.

This occurs because predtest.c stops expanding ScalarArrayOpExpr's with
constant array arguments into OR trees when the array size is > 100. The
of the predicate proving code then becomes unable to infer that foo is not
and therefore the planner cannot prove that the partial index is correct to

(Please pardon technical details in the below background that may be way
I don't have a lot of experience with the Postgres codebase yet, and am
trying to build a mental model of things.)

At first I was imagining having the parse keep track of whether an array
expr contained any nulls and perhaps adding generated quals (in an
class?) to allow the planner to easily prove the index was correct. I'd been
going down this track because in my mind the issue was because the planner
needed to verify whether all of the array elements were not null.

But as I started to dig into the predtest.c NOT NULL proofs and add test
I realized that at least in many normal op cases we can safely infer that
is not null when "foo <op> <array>" is true even if the array contains null

This is such a simple change that it seems like I must be missing a case
the above doesn't hold true, but I can't immediately think of any, and
with the attached patch all existing tests pass (including some additional
ones I added for predtest to play around with it).

Am I missing something obvious? Is this a valid approach?

Other outstanding questions:

Should I add additional tests for predtest? It already seems to cover some
test cases with scalar array ops, but I'd be happy to add more if desired.

Should I add a test case for the resulting plan with "foo IN (...)" with an
array with more than 100 elements?

James Coleman

Attachment Content-Type Size
saop_is_not_null-v1.patch application/octet-stream 548 bytes


Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2018-11-10 22:29:54 Re: Skylake-S warning
Previous Message Andres Freund 2018-11-10 19:37:03 Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA