Re: [HACKERS] Removing LEFT JOINs in more cases

From: Antonin Houska <ah(at)cybertec(dot)at>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Removing LEFT JOINs in more cases
Date: 2018-09-18 08:02:49
Message-ID: 12936.1537257769@linux-at0a
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:

> I've attached an updated version of this patch.

Following are the findings from my review.

On the LATERAL references:

This query (proposed upthread by Tom and adjusted by me so it can be executed
on the your test tables)

select distinct t1.id from t1 left join t2 on t1.id=t2.b, lateral nextval('foo');

is subject to join removal because in rel_distinctified_after_join() you check
brel->lateral_vars, which is NIL in this case.

On the other hand I'm not sure that user should expect any number of
executions of the function in this particular case because the actual join
order can change: it can happen that the function scan is first joined to t1
and t2 is joined to the result. The case that requires more caution is that
the function references both t1 and t2, i.e. *all* tables of the left join
from which you want to remove the inner relation. Maybe that's the reason you
introduced the test for brel->lateral_vars, but you forgot to check the actual
variables in the list.

On the volatile function in the targetlist:

Volatile function in the DISTINCT ON expression isn't necessarily noticed by
your checks:

select distinct on (nextval('foo')) t1.id from t1 left join t2 on t1.id=t2.b;

Perhaps you should simply check the whole targetlist if there's no GROUP BY
clause.

On coding:

* join_is_removable(): Just like rel_distinctified_after_join() is called
before rel_is_distinct_for() - obviously because it's cheaper - I suggest
to call query_distinctifies_rows() before rel_supports_distinctness().

* header comment of query_distinctifies_rows() mentions rel_tuples_needed,
but I can't find such a function.

* rel_distinctified_after_join() does not really use the "rel"
argument. Besides removing it you probably should rename the function.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2018-09-18 08:27:16 RE: Changing the setting of wal_sender_timeout per standby
Previous Message Alexander Korotkov 2018-09-18 07:51:28 Re: Difference in TO_TIMESTAMP results.