Re: Teach predtest about IS [NOT] <boolean> proofs

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Teach predtest about IS [NOT] <boolean> proofs
Date: 2023-12-22 15:00:50
Message-ID: CAAaqYe_H=_SsisUuJUtTfsyw50+o+iqJtMgKd1B0qBAv3V_WNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 14, 2023 at 4:38 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> James Coleman <jtc331(at)gmail(dot)com> writes:
> > On Wed, Dec 13, 2023 at 1:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I don't have an objection in principle to adding more smarts to
> >> predtest.c. However, we should be wary of slowing down cases where
> >> no BooleanTests are present to be optimized. I wonder if it could
> >> help to use a switch on nodeTag rather than a series of if(IsA())
> >> tests. (I'd be inclined to rewrite the inner if-then-else chains
> >> as switches too, really. You get some benefit from the compiler
> >> noticing whether you've covered all the enum values.)
>
> > I think I could take this on; would you prefer it as a patch in this
> > series? Or as a new patch thread?
>
> No, keep it in the same thread (and make a CF entry, if you didn't
> already). It might be best to make a series of 2 patches, first
> just refactoring what's there per this discussion, and then a
> second one to add BooleanTest logic.

CF entry is already created; I'll keep it here then.

> >> I note you've actively broken the function's ability to cope with
> >> NULL input pointers. Maybe we don't need it to, but I'm not going
> >> to accept a patch that just side-swipes that case without any
> >> justification.
>
> > [ all callers have previously used predicate_classify ]
>
> OK, fair enough. The checks for nulls are probably from ancient
> habit, but I agree we could remove 'em here.
>
> >> Perhaps, rather than hoping people will notice comments that are
> >> potentially offscreen from what they're modifying, we should relocate
> >> those comment paras to be adjacent to the relevant parts of the
> >> function?
>
> > Splitting up that block comment makes sense to me.
>
> Done, let's make it so.
>
> >> I've not gone through the patch in detail to see whether I believe
> >> the proposed proof rules. It would help to have more comments
> >> justifying them.
>
> > Most of them are sufficiently simple -- e.g., X IS TRUE implies X --
> > that I don't think there's a lot to say in justification. In some
> > cases I've noted the cases that force only strong or weak implication.
>
> Yeah, it's the strong-vs-weak distinction that makes me cautious here.
> One's high-school-algebra instinct for what's obviously true tends to
> not think about NULL/UNKNOWN, and you do have to consider that.
>
> >>> As noted in a TODO in the patch itself, I think it may be worth refactoring
> >>> the test_predtest module to run the "x, y" case as well as the "y, x" case
>
> >> I think that's actively undesirable. It is not typically the case that
> >> a proof rule for A => B also works in the other direction, so this would
> >> encourage wasting cycles in the tests. I fear it might also cause
> >> confusion about which direction a proof rule is supposed to work in.
>
> > That makes sense in the general case.
>
> > Boolean expressions seem like a special case in that regard: (subject
> > to what it looks like) would you be OK with a wrapping function that
> > does both directions (with output that shows which direction is being
> > tested) used only for the cases where we do want to check both
> > directions?
>
> Using a wrapper where appropriate would remove the inefficiency
> concern, but I still worry whether it will promote confusion about
> which direction we're proving things in. You'll need to be very clear
> about the labeling.

I've not yet applied all of your feedback, but I wanted to get an
initial read on your thoughts on how using switch statements ends up
looking. Attached is a single (pure refactor) patch that converts the
various if/else levels that check things like node tag and
boolean/null test type into switch statements. I've retained 'default'
keyword usages for now for simplicity (my intuition is that we
generally prefer to list out all options for compiler safety benefits,
though I'm not 100% sure that's useful in the outer node tag check
since it's unlikely that someone adding a new node would modify
this...).

My big question is: are you comfortable with the indentation explosion
this creates? IMO it's a lot wordier, but it is also more obvious what
the structural goal is. I'm not sure how we want to make the right
trade-off though.

Once there's agreement on this part, I'll add back a second patch
applying my changes on top of the refactor as well as apply other
feedback (e.g., splitting up the block comment).

Regards,
James Coleman

Attachment Content-Type Size
v2-0001-WIP-use-switch-statements.patch application/octet-stream 7.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-12-22 15:24:15 Re: Avoid computing ORDER BY junk columns unnecessarily
Previous Message Japin Li 2023-12-22 14:50:08 Re: [DOC] Introducing Quick Start Guide to PL/pgSQL and PL/Python Documentation