Re: Clean up remove_rel_from_query() after self-join elimination commit

From: Tender Wang <tndrwang(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Clean up remove_rel_from_query() after self-join elimination commit
Date: 2026-04-07 01:43:27
Message-ID: CAHewXNmrGtEwhCFeJMtx-e=3TcD2Q0CUt6CVNbr6Hu0zbTg+tQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Richard Guo <guofenglinux(at)gmail(dot)com> 于2026年4月6日周一 16:11写道:
>
> While working on reusing remove_rel_from_query() for inner-join
> removal, I noticed that the function is in pretty rough shape. The
> self-join elimination (SJE) commit grafted self-join removal onto
> remove_rel_from_query(), which was originally written for left-join
> removal only.

Yes, I noticed this a few days ago when I tried to remove a redundant
filter added during SJE.

>This resulted in several issues:
> 1. Comments throughout remove_rel_from_query() still assumed only
> left-join removal, making the code misleading. For example:
>
> * This is relevant in case the outer join we're deleting is nested inside
> * other outer joins:
>

The current logic of remove_rel_from_query() is not easy to read. For example,
It distinguishes between left-join removal and SJE based on whether
sjinfo is NULL.

> 2. This was called even for left-join removal, with subst=-1. This is
> pointless and confusing.
>
> ChangeVarNodesExtended((Node *) phv->phexpr, relid, subst, 0,
> replace_relid_callback);
>
Yes, it is.
> 3. phinfo->ph_lateral was adjusted for left-join removal, which is
> also confusing ...
>
> phinfo->ph_lateral = adjust_relid_set(phinfo->ph_lateral, relid, subst);
>
> /*
> * ph_lateral might contain rels mentioned in ph_eval_at after the
> * replacement, remove them.
> */
> phinfo->ph_lateral = bms_difference(phinfo->ph_lateral,
> phinfo->ph_eval_at);
>
> ... since you can find this Assert just above:
>
> Assert(sjinfo == NULL || !bms_is_member(relid, phinfo->ph_lateral));
>
> 4. The comment about attr_needed reconstruction was in
> remove_rel_from_query(), but the actual rebuild is performed by
> the callers.
>
> 5. The EquivalenceClass processing in remove_rel_from_query():
>
> /*
> * Likewise remove references from EquivalenceClasses.
> */
> foreach(l, root->eq_classes)
> {
> EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
>
> if (bms_is_member(relid, ec->ec_relids) ||
> (sjinfo == NULL || bms_is_member(sjinfo->ojrelid, ec->ec_relids)))
> remove_rel_from_eclass(ec, sjinfo, relid, subst);
> }
>
> The condition always evaluates to true for self-join elimination
> (i.e., sjinfo == NULL), meaning every EquivalenceClass gets adjusted.
> But this is redundant because the caller remove_self_join_rel()
> already handles ECs via update_eclasses().
>
> 6. In remove_self_join_rel(), I notice this code:
>
> /* At last, replace varno in root targetlist and HAVING clause */
> ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
> toKeep->relid, 0, replace_relid_callback);
> ChangeVarNodesExtended((Node *) root->processed_groupClause,
> toRemove->relid, toKeep->relid, 0,
> replace_relid_callback);
>
> The comment mentions "HAVING clause", but neither processed_tlist nor
> processed_groupClause has anything to do with the HAVING clause.
> Furthermore, processed_groupClause contains SortGroupClause nodes,
> which have no Var nodes to rewrite, so calling ChangeVarNodesExtended
> on it is pointless.

I didn't find as many as you listed here. I agree that we should do
something for
current logic.
>
> The attached patch is an attempt to fix all these issues. It also
> aims to leave the code better structured for adding new types of join
> removal (such as inner-join removal) in the future.

I looked through the attached patch. LGTM.

--
Thanks,
Tender Wang

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2026-04-07 01:45:42 Re: Eliminating SPI / SQL from some RI triggers - take 3
Previous Message Josh Kupershmidt 2026-04-07 01:42:17 Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements