Re: Removing unneeded self joins

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Subject: Re: Removing unneeded self joins
Date: 2025-06-26 05:40:36
Message-ID: aFzdVCWEhbUgn91k@paquier.xyz
Lists: pgsql-hackers

On Sat, Apr 26, 2025 at 11:05:27PM +0300, Alexander Korotkov wrote:
>> I did some improvements to PHVs patch: revised comments and commit
>> message. I'm going to push it if no objections.
>
> Uh, v2 was there already. That should be v3.

I was doing some work on pg_hint_plan, evaluating the amount of
breakage caused by v18.

Commit fc069a3a6319 (Implement Self-Join Elimination) stands out
because it has basically broken the reliability of join_search_hook by
doing the self-join eliminations *before* we have a chance to call the
hook on a joinlist in make_rel_from_joinlist(). For example, take a
query as simple as this:
SELECT * FROM t1 tab1, t1 tab2 WHERE tab1.c1 = tab2.c1;

Before this commit, it was possible to apply hints to each individual
alias, driving the planner behavior, with levels_needed at 2 because
the join list had two members. After this commit, the self-join
elimination does its job first, reduces the join list to one member,
removes the knowledge of the join and prevents any control that we had
before in this code path.
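
To make the ordering problem concrete, here is a toy model in C. It is
not PostgreSQL code: ToyRel, toy_record_hook(),
toy_eliminate_self_joins() and toy_levels_seen_by_hook() are all
hypothetical names invented for this sketch, and the elimination step
simply collapses aliases of the same table, whereas the real
optimization also has to prove that the join is on a unique key. The
sketch further assumes that a one-member join list needs no join search
at all, so the hook is never reached in that case.

```c
#include <assert.h>
#include <string.h>

/* Toy model only; none of this is PostgreSQL source code. */
typedef struct ToyRel
{
    const char *alias;          /* e.g. "tab1" */
    const char *table;          /* e.g. "t1" */
} ToyRel;

/* Hypothetical hook type: receives the number of join-list members. */
typedef void (*toy_join_search_hook_type) (int levels_needed,
                                           const ToyRel *rels);

static int  last_levels_needed = 0;

/* Hypothetical extension callback: only records what it was shown. */
static void
toy_record_hook(int levels_needed, const ToyRel *rels)
{
    (void) rels;
    last_levels_needed = levels_needed;
}

/*
 * Crude stand-in for self-join elimination: keep one entry per table.
 * Assumption: uniqueness of the join key is taken for granted here.
 */
static int
toy_eliminate_self_joins(ToyRel *rels, int nrels)
{
    int         kept = 0;

    for (int i = 0; i < nrels; i++)
    {
        int         dup = 0;

        for (int j = 0; j < kept; j++)
            if (strcmp(rels[j].table, rels[i].table) == 0)
                dup = 1;
        if (!dup)
            rels[kept++] = rels[i];
    }
    return kept;
}

/*
 * Returns the levels_needed value observed by the hook, or 0 if the
 * hook was never called.
 */
int
toy_levels_seen_by_hook(int eliminate_before_hook)
{
    ToyRel      rels[] = {{"tab1", "t1"}, {"tab2", "t1"}};
    int         nrels = 2;
    toy_join_search_hook_type hook = toy_record_hook;

    last_levels_needed = 0;
    if (eliminate_before_hook)
        nrels = toy_eliminate_self_joins(rels, nrels);
    assert(nrels >= 1);

    /* Assumption: a single-member list skips the join search entirely. */
    if (nrels > 1)
        hook(nrels, rels);
    return last_levels_needed;
}
```

With elimination disabled, the toy hook sees both aliases
(levels_needed of 2); with elimination running first, the list has a
single member and the hook is not consulted, which is the loss of
control described above.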

Perhaps it is fair to say that this new limitation in pg_hint_plan
should be documented and its tests reworked to avoid that, but it
seems to me that the change of behavior of join_search_hook could
also hurt some existing use cases. I will not disagree that it may be
more useful for modules to know that the joins have been reduced by the
time the hook is called, but cdf0231c88bd, which introduced
join_search_hook back in 2007, added it for a different reason than
the case I'm seeing as broken on top of v18 today.

Anyway, it seems to me that we may need to do something here before
the release. Note that if the consensus is "you should update your
module and not rely on the past behavior", I'm OK with that. I just
wanted to raise the issue before this goes GA. And well, I have a
pretty big pool of users that rely on this module, so...
--
Michael
