Re: Removing unneeded self joins

From: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2019-03-20 12:20:35
Message-ID: 552e481b-2feb-75fd-4e9f-4199bfd1c1f3@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/14/19 14:21, David Rowley wrote:

> What do you think?

Let's recap the conditions when we can remove a self-join. It is when
for each outer row, 1) at most one inner row matches the join clauses,
and 2) it is the same row as the outer one. I'm not sure what (2) means
precisely in a general case, but for a plain table, we can identify
these rows by ctid. So when both sides have the same unique index with
the same clauses, we conclude that we are always dealing with the same
row (as identified by ctid) on both sides, hence the join can be
replaced with a scan.

The code I wrote just checks for the above conditions. The data we need
for these checks is a byproduct of checking the relations for
uniqueness, which we do anyway, so we just cache it for a negligible cost.

I didn't write it in a more generic way because I don't understand the
conditions for generic case. In your DISTINCT example, the join can be
removed indeed. But if we select some columns from the inner side apart
from the join ones, we can't remove the join anymore:

select * from t1, (select distinct on (a) a, b from t1) tt where t1.a =
tt.a;

I think this might be a different kind of optimization, where we remove
the self-join if the inner side is unique, and no inner columns are
selected besides the join ones.

Also, reading your letter I realized that I don't commute the index
clauses correctly before comparing them in is_unique_self_join, so I
fixed this in the new version of the patch.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v12-0001-Remove-unique-self-joins.patch text/x-patch 62.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2019-03-20 12:29:35 Re: Add exclusive backup deprecation notes to documentation
Previous Message David Rowley 2019-03-20 11:51:08 Re: Should we add GUCs to allow partition pruning to be disabled?