Re: Removing unneeded self joins

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2023-10-23 09:47:53
Message-ID: CAPpHfduoAHTp6O7Q8PDe_w=30v0n_06pGe5M65iR4DYcfaYnRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 23, 2023 at 6:43 AM Andrei Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> On 22/10/2023 05:01, Alexander Korotkov wrote:
> > On Thu, Oct 19, 2023 at 6:16 AM Andrei Lepikhov
> > <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> >> On 19/10/2023 01:50, Alexander Korotkov wrote:
> >>> This query took 3778.432 ms with self-join removal disabled, and
> >>> 3756.009 ms with self-join removal enabled. So, no measurable
> >>> overhead. Similar to the higher number of joins. Can you imagine
> >>> some extreme case when self-join removal could introduce significant
> >>> overhead in comparison with other optimizer parts? If not, should we
> >>> remove self_join_search_limit GUC?
> >> Thanks,
> >> It was Zhihong Yu who worried about that case [1]. And my purpose was to
> >> show a method to avoid such a problem if it would be needed.
> >> I guess the main idea here is that we have a lot of self-joins, but only
> >> few of them (or no one) can be removed.
> >> I can't imagine a practical situation when we can be stuck in the
> >> problems here. So, I vote to remove this GUC.
> >
> > I've removed the self_join_search_limit. Anyway there is
> > enable_self_join_removal if the self join removal algorithm causes any
> > problems. I also did some grammar corrections for the comments. I
> > think the patch is getting to the committable shape. I noticed some
> > failures on commitfest.cputube.org. I'd like to check how this
> > version will pass it.
>
> I have observed the final patch. A couple of minor changes can be made
> (see attachment).

Thank you, Andrei! I've integrated your changes into the patch.

> Also, I see room for improvement, but it can be done later. For example,
> we limit the optimization to only ordinary tables in this patch. It can
> be extended at least with partitioned and foreign tables soon.

Yes, I think it's reasonable to postpone some improvements. It's
important to get the basic feature in, make sure it's safe and stable.
Then we can make improvements incrementally.

I think this patch makes substantial improvement to query planning.
It has received plenty of reviews. The code is currently in quite
good shape. I didn't manage to find the cases when this optimization
causes significant overhead to planning time. Even if such cases will
be spotted there is a GUC option to disable this feature. So, I'll
push this if there are no objections.

------
Regards,
Alexander Korotkov

Attachment Content-Type Size
0001-Remove-useless-self-joins-v47.patch application/octet-stream 100.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Junwang Zhao 2023-10-23 09:55:32 Re: Add trailing commas to enum definitions
Previous Message Laurenz Albe 2023-10-23 09:40:02 Re: Fix output of zero privileges in psql