Re: Discussion on missing optimizations

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Adam Brusselback <adambrusselback(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Discussion on missing optimizations
Date: 2017-10-09 04:41:20
Message-ID: CAKJS1f_8Fqw2-D4g03=dk5y_MWuGu-QsOVyJZqF_2XqTSppytQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7 October 2017 at 14:48, Andres Freund <andres(at)anarazel(dot)de> wrote:
> 3. JOIN Elimination
>
> There's been a lot of discussion and several patches. There's a bunch of
> problems here, one being that there's cases (during trigger firing,
> before the constraint checks) where foreign keys don't hold true, so we
> can't quite generally make these optimization. Another concern is
> whether the added plan time is actually worthwhile.

I looked over this and it seems there's some pretty low hanging fruit
in there that we can add with just a handful of lines of new code.

This is the case for LEFT JOINs with a DISTINCT clause. Normally we
can only remove an unused LEFT JOINed relation if there are some
unique properties that ensure that the join does not duplicate any
outer row. We don't need to worry about this when there's a DISTINCT
clause, as the DISTINCT would remove any duplicates anyway. If I'm not
mistaken, we also don't need to bother looking at the actual distinct
clause's exprs since we'll already know that nothing is in there
regarding the relation being removed. The benefit to this could be
two-fold, as 1) we don't need to join to the unused relation and 2) we
don't need to remove any row duplication caused by the join.

While someone might argue that this is not going to be that common a
case to hit, if we consider how cheap this is to implement, it does
seem to be worth doing a couple of NULL checks in the planner for it.

The only slight downside I can see to this is that we're letting a few
more cases through rel_supports_distinctness() which is also used for
unique joins, and these proofs are not valid in those. However, it may
not be worth the trouble doing anything about that as relations
without unique indexes are pretty uncommon (at least in my world).

Thoughts?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
remove_left_join_distinct.patch application/octet-stream 6.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2017-10-09 04:47:50 Re: Discussion on missing optimizations
Previous Message Pavel Stehule 2017-10-09 04:07:35 Re: is possible cache tupledesc templates in execution plan? significant performance issue, maybe bug?