Re: Teach predtest about IS [NOT] <boolean> proofs

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Teach predtest about IS [NOT] <boolean> proofs
Date: 2023-12-14 00:35:01
Message-ID: CAAaqYe-CGx7sEm1QRNaY=Pfti4KinLOPtgVr4LKoXY6wLa68Ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for taking a look!

On Wed, Dec 13, 2023 at 1:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> James Coleman <jtc331(at)gmail(dot)com> writes:
> > Attached is a patch that solves that issue. It also teaches predtest about
> > quite a few more cases involving BooleanTest expressions (e.g., how they
> > relate to NullTest expressions). One thing I could imagine being an
> > objection is that not all of these warrant cycles in planning. If that
> > turns out to be the case there's not a particularly clear line in my mind
> > about where to draw that line.
>
> I don't have an objection in principle to adding more smarts to
> predtest.c. However, we should be wary of slowing down cases where
> no BooleanTests are present to be optimized. I wonder if it could
> help to use a switch on nodeTag rather than a series of if(IsA())
> tests. (I'd be inclined to rewrite the inner if-then-else chains
> as switches too, really. You get some benefit from the compiler
> noticing whether you've covered all the enum values.)

I think I could take this on; would you prefer it as a patch in this
series? Or as a new patch thread?

> I note you've actively broken the function's ability to cope with
> NULL input pointers. Maybe we don't need it to, but I'm not going
> to accept a patch that just side-swipes that case without any
> justification.

I should have explained that. I don't think I've broken it:

1. predicate_implied_by_simple_clause() is only ever called by
predicate_implied_by_recurse()
2. predicate_implied_by_recurse() starts with:
pclass = predicate_classify(predicate, &pred_info);
3. predicate_classify(Node *clause, PredIterInfo info) starts off with:
Assert(clause != NULL);

I believe this means we are currently guaranteed by the caller to
receive a non-NULL pointer, but I could be missing something.

The same argument (just substituting the equivalent "refute" function
names) applies to predicate_refuted_by_simple_clause().

> Another way in which the patch needs more effort is that you've
> not bothered to update the large comment block atop the function.
> Perhaps, rather than hoping people will notice comments that are
> potentially offscreen from what they're modifying, we should relocate
> those comment paras to be adjacent to the relevant parts of the
> function?

Splitting up that block comment makes sense to me.

> I've not gone through the patch in detail to see whether I believe
> the proposed proof rules. It would help to have more comments
> justifying them.

Most of them are sufficiently simple -- e.g., X IS TRUE implies X --
that I don't think there's a lot to say in justification. In some
cases I've noted the cases that force only strong or weak implication.

There are a few cases, though, (e.g., "X is unknown weakly implies X
is not true") that, reading over this again, don't immediately strike
me as obvious, so I'll expand on those.

> > As noted in a TODO in the patch itself, I think it may be worth refactoring
> > the test_predtest module to run the "x, y" case as well as the "y, x" case
> > with a single call so as to eliminate a lot of repetition in
> > clause/expression test cases. If reviewers agree that's desirable, then I
> > could do that as a precursor.
>
> I think that's actively undesirable. It is not typically the case that
> a proof rule for A => B also works in the other direction, so this would
> encourage wasting cycles in the tests. I fear it might also cause
> confusion about which direction a proof rule is supposed to work in.

That makes sense in the general case.

Boolean expressions seem like a special case in that regard: (subject
to what it looks like) would you be OK with a wrapping function that
does both directions (with output that shows which direction is being
tested) used only for the cases where we do want to check both
directions?

Thanks,
James Coleman

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-12-14 00:48:36 Simplify newNode()
Previous Message Masahiko Sawada 2023-12-14 00:22:11 Re: [PoC] Improve dead tuple storage for lazy vacuum