Re: Implement hook for self-join simplification

From: Leif Harald Karlsen <leifhka(at)ifi(dot)uio(dot)no>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Implement hook for self-join simplification
Date: 2022-06-24 18:43:16
Message-ID: 42c704f3d1394fb6a127e6e7eb93265a@ifi.uio.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Andrey,

Thank you for the quick answer, and for the pointer to the patch! This looks like just the thing I need!

On a more general note: What would, in general, be the best way to implement such optimizations? Is there a good way to do this as an extension, or is a patch the preferred way?

Kind regards,
Leif Harald Karlsen
Senior Lecturer
Department of Informatics
University of Oslo
________________________________
From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Sent: 24 June 2022 19:27:50
To: Leif Harald Karlsen; pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Implement hook for self-join simplification

On 24/6/2022 18:58, Leif Harald Karlsen wrote:
> I have a made a small deductive database on top of PostgreSQL for
> educational/research purposes. In this setting, due to certain
> VIEW-constructions, queries often end up being self-joins on primary
> keys, e.g.:
> SELECT t1.id, t2.val
> FROM t AS t1 JOIN t AS t2 USING (id);
>
> where t(id) is a primary key. This query is equivalent to the much more
> efficient:
> SELECT id, val FROM t AS t1;
>
> However, PostgreSQL currently does not seem to implement this
> simplification. Therefore, I have looked into writing an extension that
> performs this, but I am struggling a bit with finding out when this
> simplification should be done, i.e. which hook I should implement.
It is true, but you can use a proposed patch that adds such
functionality [1].

I tried to reproduce your case:
CREATE TABLE t(id int PRIMARY KEY, val text);
explain verbose
SELECT t1.id, t2.val FROM t AS t1 JOIN t AS t2 USING (id);

With this patch you will get a plan:
Seq Scan on public.t t2
Output: t2.id, t2.val
Filter: (t2.id IS NOT NULL)

The approach, implemented in this patch looks better because removes
self-joins on earlier stage than the path generation stage. Feel free to
use it in your research.

[1]
https://www.postgresql.org/message-id/a1d6290c-44e0-0dfc-3fca-66a68b3109ef@postgrespro.ru

--
regards,
Andrey Lepikhov
Postgres Professional

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2022-06-24 18:43:18 Re: pg_upgrade (12->14) fails on aggregate
Previous Message Andrey Borodin 2022-06-24 18:28:31 Re: array_cat anycompatible change is breaking xversion upgrade tests