| From: | Tender Wang <tndrwang(at)gmail(dot)com> |
|---|---|
| To: | Richard Guo <guofenglinux(at)gmail(dot)com> |
| Cc: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Clean up remove_rel_from_query() after self-join elimination commit |
| Date: | 2026-04-07 01:43:27 |
| Message-ID: | CAHewXNmrGtEwhCFeJMtx-e=3TcD2Q0CUt6CVNbr6Hu0zbTg+tQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Richard Guo <guofenglinux(at)gmail(dot)com> 于2026年4月6日周一 16:11写道:
>
> While working on reusing remove_rel_from_query() for inner-join
> removal, I noticed that the function is in pretty rough shape. The
> self-join elimination (SJE) commit grafted self-join removal onto
> remove_rel_from_query(), which was originally written for left-join
> removal only.
Yes, I noticed this a few days ago when I tried to remove a redundant
filter added during SJE.
>This resulted in several issues:
> 1. Comments throughout remove_rel_from_query() still assumed only
> left-join removal, making the code misleading. For example:
>
> * This is relevant in case the outer join we're deleting is nested inside
> * other outer joins:
>
The current logic of remove_rel_from_query() is not easy to read. For example,
It distinguishes between left-join removal and SJE based on whether
sjinfo is NULL.
> 2. This was called even for left-join removal, with subst=-1. This is
> pointless and confusing.
>
> ChangeVarNodesExtended((Node *) phv->phexpr, relid, subst, 0,
> replace_relid_callback);
>
Yes, it is.
> 3. phinfo->ph_lateral was adjusted for left-join removal, which is
> also confusing ...
>
> phinfo->ph_lateral = adjust_relid_set(phinfo->ph_lateral, relid, subst);
>
> /*
> * ph_lateral might contain rels mentioned in ph_eval_at after the
> * replacement, remove them.
> */
> phinfo->ph_lateral = bms_difference(phinfo->ph_lateral,
> phinfo->ph_eval_at);
>
> ... since you can find this Assert just above:
>
> Assert(sjinfo == NULL || !bms_is_member(relid, phinfo->ph_lateral));
>
> 4. The comment about attr_needed reconstruction was in
> remove_rel_from_query(), but the actual rebuild is performed by
> the callers.
>
> 5. The EquivalenceClass processing in remove_rel_from_query():
>
> /*
> * Likewise remove references from EquivalenceClasses.
> */
> foreach(l, root->eq_classes)
> {
> EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
>
> if (bms_is_member(relid, ec->ec_relids) ||
> (sjinfo == NULL || bms_is_member(sjinfo->ojrelid, ec->ec_relids)))
> remove_rel_from_eclass(ec, sjinfo, relid, subst);
> }
>
> The condition always evaluates to true for self-join elimination
> (i.e., sjinfo == NULL), meaning every EquivalenceClass gets adjusted.
> But this is redundant because the caller remove_self_join_rel()
> already handles ECs via update_eclasses().
>
> 6. In remove_self_join_rel(), I notice this code:
>
> /* At last, replace varno in root targetlist and HAVING clause */
> ChangeVarNodesExtended((Node *) root->processed_tlist, toRemove->relid,
> toKeep->relid, 0, replace_relid_callback);
> ChangeVarNodesExtended((Node *) root->processed_groupClause,
> toRemove->relid, toKeep->relid, 0,
> replace_relid_callback);
>
> The comment mentions "HAVING clause", but neither processed_tlist nor
> processed_groupClause has anything to do with the HAVING clause.
> Furthermore, processed_groupClause contains SortGroupClause nodes,
> which have no Var nodes to rewrite, so calling ChangeVarNodesExtended
> on it is pointless.
I didn't find as many as you listed here. I agree that we should do
something for
current logic.
>
> The attached patch is an attempt to fix all these issues. It also
> aims to leave the code better structured for adding new types of join
> removal (such as inner-join removal) in the future.
I looked through the attached patch. LGTM.
--
Thanks,
Tender Wang
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2026-04-07 01:45:42 | Re: Eliminating SPI / SQL from some RI triggers - take 3 |
| Previous Message | Josh Kupershmidt | 2026-04-07 01:42:17 | Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements |