Re: Performance improvement for joins where outer side is unique

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2016-03-08 23:57:09
Message-ID: CAKJS1f-V7xEh6+YHrCRNaZzbk7mbgPvi9MdXTCD8eJ=ROuq2Mg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 23 January 2016 at 05:36, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> On 12/17/2015 02:17 PM, David Rowley wrote:
>>
>> On 17 December 2015 at 19:11, Simon Riggs <simon(at)2ndquadrant(dot)com
>> <mailto:simon(at)2ndquadrant(dot)com>> wrote:
>>
>> On 17 December 2015 at 00:17, Tomas Vondra
>> <tomas(dot)vondra(at)2ndquadrant(dot)com <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>>
>> wrote:
>>
>> I'd go with match_first_tuple_only.
>>
>>
>> +1
>>
>> unique_inner is a state that has been detected,
>> match_first_tuple_only is the action we take as a result.
>>
>>
>> Ok great. I've made it so in the attached. This means the comment in the
>> join code where we perform the skip can be a bit less verbose and all
>> the details can go in where we're actually setting the
>> match_first_tuple_only to true.
>
>
> OK. I've looked at the patch again today, and it seems broken bv 45be99f8 as
> the partial paths were not passing the unique_inner to the create_*_path()
> functions. The attached patch should fix that.
>
> Otherwise I think the patch is ready for committer - I think this is a
> valuable optimization and I see nothing wrong with the code.

I've attached an updated patch which updates it to fix the conflicts
with the recent upper planner changes.
I also notice that some regression tests, which I think some of which
Tom updated in the upper planner changes have now changed back again
due to the slightly reduced costs on hash and nested loop joins where
the inner side is unique. I checked the costs of one of these by
disabling hash join and noticed that the final totla cost is the same,
so it's not too surprising that they keep switching plans with these
planner changes going in. I verified that these remain as is when I
comment out the cost changing code in this patch.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
unique_joins_dbcecda_2016-03-09.patch application/octet-stream 78.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2016-03-09 00:09:10 Re: pgcrypto: add s2k-count
Previous Message Joshua D. Drake 2016-03-08 23:32:01 Re: silent data loss with ext4 / all current versions