Re: A bug in mapping attributes in ATExecAttachPartition()

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: Re: A bug in mapping attributes in ATExecAttachPartition()
Date: 2017-06-09 05:01:34
Message-ID: 15af9ad0-f65d-13e8-988c-56d9d5c2b89f@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/06/08 19:25, Ashutosh Bapat wrote:
> On Thu, Jun 8, 2017 at 3:13 PM, Amit Langote
>>> I think this code could be removed entirely in light of commit
>>> 3ec76ff1f2cf52e9b900349957b42d28128b7bc7.
>>
>> I am assuming you think that because now we emit IS NOT NULL constraint
>> internally for any partition keys that do not allow null values (including
>> all the keys of range partitions as of commit
>> 3ec76ff1f2cf52e9b900349957b42d28128b7bc7). But those IS NOT NULL
>> constraint expressions are inconclusive as far as the application of
>> predicate_implied_by() to determine if we can skip the scan is concerned.
>> So even if predicate_implied_by() returned 'true', we cannot conclude,
>> just based on that result, that there are not any null values in the
>> partition keys.
>
> I am not able to understand this. Are you saying that
> predicate_implied_by() returns true even when it's not implied when
> NOT NULL constraints are involved? That sounds like a bug in
> predicate_implied_by(), which should be fixed instead of adding code
> to pepper over it?

No, it's not a bug of predicate_implied_by(). I meant to say
predicate_implied_by() isn't exactly designed for ATExecAttachPartition's
purpose, especially its treatment of IS NOT NULL constraints is not
suitable for this application. To prove that the table cannot contain
NULLs when it shouldn't because of the partition constraint, we must look
for explicit NOT NULL constraints on the partition key columns, instead of
relying on the predicate_implied_by()'s proof. See the following
explanation for why that is so (or at least I think is so):

There is this text in the header comment of
predicate_implied_by_simple_clause(), which is where the individual pairs
of expressions are compared and/or passed to operator_predicate_proof(),
which mentions that the treatment of IS NOT NULL predicate is based on the
assumption that 'restrictions' that are passed to predicate_implied_by()
are a query's WHERE clause restrictions, *not* CHECK constraints that are
checked when inserting data into a table.

* When the predicate is of the form "foo IS NOT NULL", we can conclude that
* the predicate is implied if the clause is a strict operator or function
* that has "foo" as an input. In this case the clause must yield NULL when
* "foo" is NULL, which we can take as equivalent to FALSE because we know
* we are within an AND/OR subtree of a WHERE clause. (Again, "foo" is
* already known immutable, so the clause will certainly always fail.)
* Also, if the clause is just "foo" (meaning it's a boolean variable),
* the predicate is implied since the clause can't be true if "foo" is NULL.

As mentioned above, note the following part: which we can take as
equivalent to FALSE because we know we are within an AND/OR subtree of a
WHERE clause.

Which is not what we are passing to predicate_implied_by() in
ATExecAttachPartition(). We are passing it the table's CHECK constraint
clauses which behave differently for the NULL result on NULL input - they
*allow* the row to be inserted. Which means that there will be rows with
NULLs in the partition key, even if predicate_refuted_by() said that there
cannot be. We will end up *wrongly* skipping the validation scan if we
relied on just the predicate_refuted_by()'s result. That's why there is
code to check for explicit NOT NULL constraints on the partition key
columns. If there are, it's OK then to assume that all the partition
constraints are satisfied by existing constraints. One more thing: if any
partition key happens to be an expression, which there cannot be NOT NULL
constraints for, we just give up on skipping the scan, because we don't
have any declared knowledge about whether those keys are also non-null,
which we want for partitions that do not accept null values.

Does that make sense?

Thanks,
Amit

PS: Also interesting to note is the difference in behavior between
ExecQual() and ExecCheck() on NULL result.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Supun Nakandala 2017-06-09 05:56:32 How to refer to resource files from UDFs written in C
Previous Message Mithun Cy 2017-06-09 04:31:35 Re: Proposal : For Auto-Prewarm.