Re: Removing unneeded self joins

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2023-10-04 07:34:15
Message-ID: CAPpHfdt-0kVV7O==aJEbjY2iGYBu+XBzTHEbPv_6sVNeC7fffQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Wed, Oct 4, 2023 at 9:56 AM Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
wrote:
> On 4/10/2023 07:12, Alexander Korotkov wrote:
> > On Tue, Sep 12, 2023 at 4:58 PM Andrey Lepikhov
> > <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> >> On 5/7/2023 21:28, Andrey Lepikhov wrote:
> >>> During the significant code revision in v.41 I lost some replacement
> >>> operations. Here is the fix and extra tests to check this in the
future.
> >>> Also, Tom added the JoinDomain structure five months ago, and I added
> >>> code to replace relids for that place too.
> >>> One more thing, I found out that we didn't replace SJs, defined by
> >>> baserestrictinfos if no one self-join clause have existed for the
join.
> >>> Now, it is fixed, and the test has been added.
> >>> To understand changes readily, see the delta file in the attachment.
> >> Here is new patch in attachment. Rebased on current master and some
> >> minor gaffes are fixed.
> >
> > I went through the thread and I think the patch gets better shape. A
> > couple of notes from my side.
> > 1) Why replace_relid() makes a copy of lids only on insert/replace of
> > a member, but performs deletion in-place?
>
> Shortly speaking, it was done according to the 'Paranoid' strategy.
> The main reason for copying before deletion was the case with the rinfo
> required_relids and clause_relids. They both point to the same Bitmapset
> in some cases. And we feared such things for other fields.
> Right now, it may be redundant because we resolved the issue mentioned
> above in replace_varno_walker.

OK, but my point is still that you should be paranoid in all the cases or
none of the cases. Right now (newId < 0) branch doesn't copy source
relids, but bms_is_member(oldId, relids) does copy. Also, I think whether
we copy or not should be reflected in the function comment.

/*
* Substitute newId by oldId in relids.
*/
static Bitmapset *
replace_relid(Relids relids, int oldId, int newId)
{
if (oldId < 0)
return relids;

if (newId < 0)
/* Delete relid without substitution. */
return bms_del_member(relids, oldId);

if (bms_is_member(oldId, relids))
return bms_add_member(bms_del_member(bms_copy(relids), oldId),
newId);

return relids;
}

> Relid replacement machinery is the most contradictory code here. We used
> a utilitarian approach and implemented a simplistic variant.

> > 2) It would be nice to skip the insertion of IS NOT NULL checks when
> > they are not necessary. [1] points that infrastructure from [2] might
> > be useful. The patchset from [2] seems committed mow. However, I
> > can't see it is directly helpful in this matter. Could we just skip
> > adding IS NOT NULL clause for the columns, that have
> > pg_attribute.attnotnull set?
> Thanks for the links, I will look into that case.

OK, thank you.

------
Regards,
Alexander Korotkov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Orlov Aleksej 2023-10-04 08:08:49 RE: [PATCH] Fix memory leak in memoize for numeric key
Previous Message Laurenz Albe 2023-10-04 07:20:16 Re: unnest multirange, returned order