Re: Teach predtest about IS [NOT] <boolean> proofs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Teach predtest about IS [NOT] <boolean> proofs
Date: 2024-01-22 17:57:41
Message-ID: 3797929.1705946261@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

James Coleman <jtc331(at)gmail(dot)com> writes:
> 0001 does the initial pure refactor. 0003 makes a lot of modifications
> to what we can prove about implication and refutation. Finally, 0003
> isn't intended to be committed, but attempts to validate more
> holistically that none of the changes creates any invalid proofs
> beyond the mostly happy-path tests added in 0004.

> I ended up not tackling changing how test_predtest tests run for now.
> That's plausibly still useful, and I'd be happy to add that if you
> generally agree with the direction of the patch and with that
> abstraction being useful.

> I added some additional verifications to the test_predtest module to
> prevent additional obvious flaws.

I looked through 0001 and made some additional cosmetic changes,
mostly to get comments closer to the associated code; I also
ran pgindent on it (see v5-0001 attached). That seems pretty
committable to me at this point. I also like your 0002 additions to
test_predtest.c (although why the mixture of ERROR and WARNING?
ISTM they should all be WARNING, so we can press on with the test).

One other thought is that maybe separating out
predicate_implied_not_null_by_clause should be part of 0001?

I'm less excited about the rest of v4-0002.

@@ -740,6 +747,16 @@ predicate_refuted_by_recurse(Node *clause, Node *predicate,
!weak))
return true;

+ /*
+ * Because weak refutation expands the allowed outcomes for B
+ * from "false" to "false or null", we can additionally prove
+ * weak refutation in the case that strong refutation is proven.
+ */
+ if (weak && not_arg &&
+ predicate_implied_by_recurse(predicate, not_arg,
+ true))
+ return true;
+
switch (pclass)
{
case CLASS_AND:

I don't buy this bit at all. If the prior recursive call fails to
prove weak refutation in a case where strong refutation holds, how is
that not a bug lower down? Moreover, in order to mask such a bug,
you're doubling the time taken by failed proofs, which is an
unfortunate thing --- we don't like spending a lot of time on
something that fails to improve the plan.

@@ -1138,32 +1155,114 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
Assert(list_length(op->args) == 2);
rightop = lsecond(op->args);

- /*
- * We might never see a null Const here, but better check
- * anyway
- */
- if (rightop && IsA(rightop, Const) &&
- !((Const *) rightop)->constisnull)
+ if (rightop && IsA(rightop, Const))
{
+ Const *constexpr = (Const *) rightop;
Node *leftop = linitial(op->args);

- if (DatumGetBool(((Const *) rightop)->constvalue))
- {
- /* X = true implies X */
- if (equal(predicate, leftop))
- return true;
- }
+ if (constexpr->constisnull)
+ return false;
+
+ if (DatumGetBool(constexpr->constvalue))
+ return equal(predicate, leftop);
else
- {
- /* X = false implies NOT X */
- if (is_notclause(predicate) &&
- equal(get_notclausearg(predicate), leftop))
- return true;
- }
+ return is_notclause(predicate) &&
+ equal(get_notclausearg(predicate), leftop);
}
}
}
break;

I don't understand what this bit is doing ... and the fact that
the patch removes all the existing comments and adds none isn't
helping that. What it seems to mostly be doing is adding early
"return false"s, which I'm not sure is a good thing, because
it seems possible that operator_predicate_proof could apply here.

+ case IS_UNKNOWN:
+ /*
+ * When the clause is in the form "foo IS UNKNOWN" then
+ * we can prove weak implication of a predicate that
+ * is strict for "foo" and negated. This doesn't work
+ * for strong implication since if "foo" is "null" then
+ * the predicate will evaluate to "null" rather than
+ * "true".
+ */

The phrasing of this comment seems randomly inconsistent with others
making similar arguments.

+ case IS_TRUE:
/*
- * If the predicate is of the form "foo IS NOT NULL",
- * and we are considering strong implication, we can
- * conclude that the predicate is implied if the
- * clause is strict for "foo", i.e., it must yield
- * false or NULL when "foo" is NULL. In that case
- * truth of the clause ensures that "foo" isn't NULL.
- * (Again, this is a safe conclusion because "foo"
- * must be immutable.) This doesn't work for weak
- * implication, though. Also, "row IS NOT NULL" does
- * not act in the simple way we have in mind.
+ * X implies X is true
+ *
+ * We can only prove strong implication here since
+ * `null is true` is false rather than null.
*/

This hardly seems like an improvement on the comment. (Also, here and
elsewhere, could we avoid using two different types of quotes?)

+ /* X is unknown weakly implies X is not true */
+ if (weak && clausebtest->booltesttype == IS_UNKNOWN &&
+ equal(clausebtest->arg, predbtest->arg))
+ return true;

Maybe I'm confused, but why is it only weak?

+ /*
+ * When we know what the predicate is in the form
+ * "foo IS UNKNOWN" then we can prove strong and
+ * weak refutation together. This is because the
+ * limits imposed by weak refutation (allowing
+ * "false" instead of just "null") is equivalently
+ * helpful since "foo" being "false" also refutes
+ * the predicate. Hence we pass weak=false here
+ * always.
+ */

This comment doesn't make sense to me either.

+ /* TODO: refactor this into switch statements also? */

Let's drop the TODO comments.

+ /*
+ * We can recurse into "not foo" without any additional processing because
+ * "not (null)" evaluates to null. That doesn't work for allow_false,
+ * however, since "not (false)" is true rather than null.
+ */
+ if (is_notclause(clause) &&
+ clause_is_strict_for((Node *) get_notclausearg(clause), subexpr, false))
+ return true;

Not exactly convinced by this. The way the comment is written, I'd
expect to not call clause_is_strict_for at all if allow_false. If
it's okay to call it anyway and pass allow_false = false, you need
to defend that, which this comment isn't doing.

regards, tom lane

Attachment Content-Type Size
v5-0001-Use-switch-statements-in-predicate_-implied-refut.patch text/x-diff 14.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-01-22 18:04:40 Re: psql: Allow editing query results with \gedit
Previous Message Fujii Masao 2024-01-22 17:52:25 Re: Network failure may prevent promotion