Re: Removing unneeded self joins

From: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Removing unneeded self joins
Date: 2019-02-28 14:09:34
Message-ID: 2b74af78-3417-167d-ad72-d54ec558d8be@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tom,

Thanks for the update.

On 2/22/19 03:25, Tom Lane wrote:
> * My compiler was bitching about misplaced declarations, so I moved
> some variable declarations accordingly. I couldn't help noticing
> that many of those wouldn't have been a problem in the first place
> if you were following project style for loops around list_delete_cell
> calls, which usually look more like this:

Changed them to conform to project style. After a couple of segfaults I
remembered why I wrote them the way I did: you can't use plain
'continue' with our style of loops and also have to update prev, and I
can't seem to remember to do that.

> I notice though that there's one unexplained plan change remaining
> in join.out:
>
> @@ -4365,11 +4365,13 @@ explain (costs off)
> select p.* from
> (parent p left join child c on (p.k = c.k)) join parent x on p.k = x.k
> where p.k = 1 and p.k = 2;
> - QUERY PLAN
> ---------------------------
> + QUERY PLAN
> +------------------------------------------------
> Result
> One-Time Filter: false
> -(2 rows)
> + -> Index Scan using parent_pkey on parent x
> + Index Cond: (k = 1)
> +(4 rows)
>
> -- bug 5255: this is not optimizable by join removal
> begin;
>
> That sure looks like a bug. I don't have time to look for the
> cause right now.

David also asked about this before. This is the same plan you'd get for
'select * from parent where k = 1 and k = 2', and this plan is exposed
by join removal. So this is not a bug in join removal itself.

> I also noticed that the test results show that when a table
> is successfully optimized away, the remaining reference seems
> to have the alias of the second reference not the first one.
> That seems a little ... weird. It's just cosmetic of course, but
> why is that?

Can you point me to a query that looks wrong?

> Also, I did notice that you'd stuck a declaration for
> "struct UniqueIndexInfo" into paths.h, which then compelled you
> to include that header in planmain.h. This seems like poor style;
> I'd have been inclined to put the struct in pathnodes.h instead.

Moved.

> I wonder why you didn't include the Relids into UniqueIndexInfo as well
> ... and maybe make it a proper Node so that unique_for_rels could be
> printed by outfuncs.c.

We also prove and cache the uniqueness for subqueries, for which the
UniqueIndexInfo is not relevant, that's why it was optional and stored
in a parallel list. Now I changed it to UniqueRelInfo which always has
outerrelids and optionally the unique index.

I also fixed a bug with not updating the references in HAVING clause.
New version is attached.

--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
v11-0001-Remove-self-joins-on-a-unique-column.patch text/x-patch 62.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-02-28 14:12:25 Re: Row Level Security − leakproof-ness and performance implications
Previous Message Dean Rasheed 2019-02-28 14:07:46 Re: BUG #15623: Inconsistent use of default for updatable view