Re: Removing unneeded self joins

From: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
Cc: "Gregory Stark (as CFM)" <stark(dot)cfm(at)gmail(dot)com>, Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Subject: Re: Removing unneeded self joins
Date: 2023-10-11 03:51:26
Message-ID: b2f522c6-c150-4f1f-a722-f70e9a75fe13@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/10/2023 02:29, Alena Rybakina wrote:
> I have reviewed your patch and I noticed a few things.

Thanks for your efforts,

> I have looked at the latest version of the code, I assume that the error
> lies in the replace_varno_walker function, especially in the place where
> we check the node by type Var, and does not form any NullTest node.

It's not a bug, it's an optimization we discussed with Alexander above.

> Secondly, I added some code in some places to catch erroneous cases and
> added a condition when we should not try to apply the self-join-removal
> transformation due to the absence of an empty self-join list after
> searching for it and in general if there are no joins in the query.
> Besides, I added a query for testing and wrote about it above. I have
> attached my diff file.
Ok, I will look at this
> In addition, I found a comment for myself that was not clear to me. I
> would be glad if you could explain it to me.
>
> You mentioned superior outer join in the comment, unfortunately, I
> didn't find anything about it in the PostgreSQL code, and this
> explanation remained unclear to me. Could you explain in more detail
> what you meant?
I meant here that only clauses pushed by reconsider_outer_join_clauses()
can be found in the joininfo list, and they are not relevant, as you can
understand.
Having written that, I realized that it was a false statement. ;) -
joininfo can also contain results of previous SJE iterations, look:

CREATE TABLE test (oid int PRIMARY KEY);
CREATE UNIQUE INDEX ON test((oid*oid));
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c2.oid AND c1.oid*c2.oid=c3.oid*c3.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c1.oid=c3.oid AND c1.oid*c3.oid=c2.oid*c2.oid;
explain
SELECT count(*)
FROM test c1, test c2, test c3
WHERE c3.oid=c2.oid AND c3.oid*c2.oid=c1.oid*c1.oid;

Having executed this SQL code, you could see that in the last query, the
SJE feature didn't delete one of the JOINs because of the reason I had
written above.
It's not an one-minute fix - I will try to propose solution a bit later.

--
regards,
Andrey Lepikhov
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-10-11 04:02:14 Re: stopgap fix for signal handling during restore_command
Previous Message Peter Geoghegan 2023-10-11 03:51:10 Re: interval_ops shall stop using btequalimage (deduplication)