Re: Performance improvement for joins where outer side is unique

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2017-01-29 21:34:00
Message-ID: CAKJS1f8670hKSaC8FhzaURu83kMr9EkLCzRJK0=sqqq-BJSAHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28 January 2017 at 05:04, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>> I agree that special handling of one join type is not so pretty.
>> However, LEFT JOINs still remain a bit special as they're the only
>> ones we currently perform join removal on, and the patch modifies that
>> code to make use of the new flag for those. This can improve planner
>> performance of join removal when a join is removed successfully, as
>> the previous code had to recheck uniqueness of each remaining LEFT
>> JOIN again, whereas the new code only checks uniqueness of ones not
>> previously marked as unique. This too likely could be done with the
>> cache, although I'm a bit concerned with populating the cache, then
>> performing a bunch of LEFT JOIN removals and leaving relids in the
>> cache which no longer exist. Perhaps it's OK. I've just not found
>> proofs in my head yet that it is.
>
> TBH, I do not like that tie-in at all. I don't believe that it improves
> any performance, because if analyzejoins.c detects that the join is
> unique, it will remove the join; therefore there is nothing to cache.
> (This statement depends on the uniqueness test being the last removability
> test, but it is.) And running mark_unique_joins() multiple times is ugly
> and adds cycles whenever it cannot prove a join unique, because it'll keep
> trying to do so. So I'm pretty inclined to drop the connection to
> analyzejoins.c altogether, along with mark_unique_joins(), and just use
> the generic positive/negative cache mechanism you added for all join types.

I can make this change, but before I do I just want to point that I
don't think what you've said here is entirely accurate.

Let's assume unique joins are very common place, and join removals are
not so common. If a query has 5 left joins, and only one of which can
be removed, then the new code will most likely perform 5 unique join
checks, whereas the old code would perform 9, as those unique checks
are performed again once the 1 relation is removed for the remaining
4.

However I'll go make the change as something needs fixed in that area
anyway, as LEFT JOINs use the additional quals, whereas other join
types don't, which is broken.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2017-01-29 21:49:18 Re: \if, \elseif, \else, \endif (was Re: PSQL commands: \quit_if, \quit_unless)
Previous Message Andres Freund 2017-01-29 21:12:19 Re: Create a separate test file for exercising system views