Re: Fast path for empty relids in check_outerjoin_delay()

From: Richard Guo <riguo(at)pivotal(dot)io>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fast path for empty relids in check_outerjoin_delay()
Date: 2019-01-10 06:31:18
Message-ID: CAN_9JTwR-7ChFQTGu10bphH+Ak-NhENNJp+YsSvtxwSJgLfKRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 8, 2019 at 11:29 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Richard Guo <riguo(at)pivotal(dot)io> writes:
> > On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut <
> > peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> >> I think code readability and maintainability would be improved by having
> >> fewer special cases and fast paths. In this particular case, I'd even
> >> remove the existing fast path and just let the subsequent loop run zero
> >> times. I doubt there is a performance benefit to the existing coding.
> >> And if there is no performance benefit, then it's not really a fast
> >> path, just another path.
>
> > This code was added in commit d7a6a0, by Tom.
> > Hi Tom, what is your opinion about the fast path in
> > check_outerjoin_delay()?
>
> I quite reject Peter's argument that the existing fast path is not
> worthwhile. It's a cheap test and it saves cycles whenever the query has
> no outer joins, which is common for simple queries. (The general rule
> I've followed for planner fast-path cases is to make simple queries as
> cheap as possible; you don't want the planner expending a whole lot of
> overhead on trivial cases.) Moreover, while it's true that the loop
> will fall through quickly if root->join_info_list is nil, there's still
> a bms_copy, bms_int_members, and bms_free that will be expended uselessly
> if we remove the fast-path check.
>
> I'm not, however, convinced that the case of *relids_p being empty is
> common enough to justify adding a test for that. In fact, it looks
> to me like it's guaranteed to be nonempty for the main call sites in
> distribute_qual_to_rels.
>
> Possibly what would make more sense is to add the consideration to
> check_equivalence_delay, which is the only call site that can pass
> an empty set; that is
>
> + /* A variable-free expression cannot be outerjoin-delayed */
> + if (restrictinfo->left_relids)
> + {
> /* must copy restrictinfo's relids to avoid changing it */
> relids = bms_copy(restrictinfo->left_relids);
> /* check left side does not need delay */
> if (check_outerjoin_delay(root, &relids, &nullable_relids, true))
> return false;
> + }
>
> and likewise for the right side.
>
> The bigger picture here, of course, is that check_outerjoin_delay's
> API is not at all well matched to what check_equivalence_delay needs:
> it has to make the extra bms_copy shown above, and it has no use
> for either of the relid sets that check_outerjoin_delay wants to
> return. I believe it's like that because check_outerjoin_delay is
> much older than the EquivalenceClass logic, and I didn't want to
> redesign it when I put in ECs, and I also didn't want to just
> duplicate all that logic. But maybe it's time to put some more thought
> into that, and try to find a refactoring of check_outerjoin_delay that
> suits both callers better.
>

I am thinking about whether we can quickly tell a qual is
outerjoin_delayed or not by some bms_overlap operations.

A qual is regarded as being outerjoin_delayed if:

1) It references any nullable rels of some OJ below this qual, and
2) We haven't included all the rels of that OJ in relids.

For 1), we can collect all the nullable rels of OJs below this qual in
deconstruct_recurse() (like nullable_baserels, initsplan.c:976) and
bms_overlap this qual's relids with the collected nullable_relids.

For 2), I haven't figured out a good way.

Is this a viable direction? Any thoughts?

>
> Anyway, I concur with Peter that the patch you've presented should
> probably be rejected: I doubt it's a win performance-wise and I don't
> see that it adds anything to readability either. But if you want to
> think about a more effective refactoring of this logic, maybe there
> is something good to be had out of that.
>

Thanks Tom. I understand your concern.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Imai, Yoshikazu 2019-01-10 06:37:27 RE: speeding up planning with partitions
Previous Message Michael Paquier 2019-01-10 06:09:45 Re: BUG #15548: Unaccent does not remove combining diacritical characters