Re: Removing unneeded self joins

From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2023-05-25 09:40:35
Message-ID: f1eee810-ccd5-cdee-775f-80c87c74f49b@postgrespro.ru
Lists: pgsql-hackers

On 3/6/23 10:30, Michał Kłeczek wrote:
> Hi All,
>
> I just wanted to ask about the status and plans for this patch.
> I can see it being stuck at “Waiting for Author” status in several
> commit fests.
>
> I think this patch would be really beneficial for us as we heavily use
> views to structure our code.
> Each view is responsible for providing some calculated values and they
> are joined in a query to retrieve the full set of information.
>
> Not sure how the process works and how I could help (I am absolutely
> not capable of helping with coding I am afraid - but could sponsor a
> (small :) ) bounty to speed things up).

Yes, I am still working on this feature. Because of the significant
changes to the optimizer code that Tom & Richard have been making over
the last few months, I didn't touch it for a while. But now this work
can be continued.

The current patch is rebased on current master. Because of the recently
introduced nullable_rels logic, ojrelids are now spread widely across
the planner's bitmapsets, so the join-elimination (JE) logic had to be
changed.

But now, I'm less happy with the code. It seems we need to refactor it:
1. According to reports from some performance engineers, the feature can
add ~0.5% overhead on trivial queries that have no joins at all. We
should examine the patch and find a quick, cheap early exit for
statements that contain no joins or, more strictly, no self joins.
2. During join elimination we replace clauses like 'x=x' with 'x IS NOT
NULL'. This is a weak point because we change the clause's semantics
(from mergejoinable to non-mergejoinable, in this example) and could
forget to update some RestrictInfo fields consistently.
3. In previous versions we changed the remove_rel_from_query routine so
that it could be used by both the 'Useless outer join' and 'Self join'
elimination optimizations. Now, because of the 'ojrelid' field, it looks
too complicated. Do we need to split this routine again?
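
To make points 1 and 2 more concrete, here is a minimal sketch in Python.
It is not planner code and is not taken from the patch: the function
names, and the representation of a query as a plain list of table OIDs,
are assumptions made only for illustration. The first function shows the
kind of cheap pre-check meant in point 1 (a self join needs the same
table at least twice). The rest models SQL three-valued logic to show
why 'x=x' and 'x IS NOT NULL' keep the same rows when used as a filter.

```python
from collections import Counter
from typing import Optional, Sequence


def may_contain_self_join(rel_oids: Sequence[int]) -> bool:
    """Hypothetical cheap pre-check: a self join is only possible when
    the same table OID is referenced at least twice in the query."""
    if len(rel_oids) < 2:
        return False  # no join at all, return immediately
    return any(count > 1 for count in Counter(rel_oids).values())


def sql_eq(a, b) -> Optional[bool]:
    """SQL '=' under three-valued logic; None stands for NULL."""
    if a is None or b is None:
        return None
    return a == b


def is_not_null(a) -> bool:
    """SQL 'a IS NOT NULL'; never yields NULL itself."""
    return a is not None


def qual_accepts(cond: Optional[bool]) -> bool:
    """A WHERE/JOIN qual keeps a row only when the condition is true;
    both false and NULL reject the row."""
    return cond is True
```

Used as a filter, both forms accept exactly the non-NULL values, e.g.
qual_accepts(sql_eq(x, x)) == is_not_null(x) for x in (None, 0, 7).
Note that the two expressions are not equal as values (for a NULL input
'x=x' is NULL while 'x IS NOT NULL' is false); only their effect as a
qual is the same. The sketch says nothing about mergejoinability, which
is exactly the RestrictInfo-level difference that worries me.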

--
Regards
Andrey Lepikhov
Postgres Professional

Attachment Content-Type Size
v40-0001-Remove-self-joins.patch text/x-patch 93.8 KB
