Re: Performance improvement for joins where outer side is unique

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2015-03-20 08:11:25
Message-ID: CAApHDvpfA9wj_13VzfiqydcPszzpvBkPVmK8Wdr7EnC=svGdBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20 March 2015 at 16:11, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

>
> I think this satisfies your wish and implemented in non
> exhaustive-seearch-in-jointree manner. It still don't have
> regressions for itself but I don't see siginificance in adding
> it so much...
>
>
This seems quite a bit better. Having the inner_unique variable as part of
the JoinPath struct seems much better than what I had. This seems to remove
the requirement of my patch that all joins to that RelOptInfo be unique.

I also quite like the more recent change to make_hashjoin and co. just pass
the JoinPath as a parameter.

I don't really like the "(inner unique)" being tagged onto the end of the
join node, but there's not much point in spending too much time talking
about that right now. There's much better things to talk about. I'm sure we
can all bikeshed around that one later.

In joinpath.c you have a restriction to only perform the unique check for
inner joins.. This should work ok for left joins too, but it would probably
be more efficient to have the left join removal code analyse the
SpecialJoinInfos during checks for left join removals. I think it would
just be a matter of breaking down the join removal code similar to how I
did in my patch, but this time add a bool inner_unique to the
SpecialJoinInfo struct. The join_is_legal() function seems to select the
correct SpecialJoinInfo if one exists, so add_paths_to_joinrel() shouldn't
need to call relation_has_unique_index_for() if it's a LEFT JOIN, as we'll
already know if it's unique by just looking at the property.

You've also lost the ability to detect that subqueries are unique:

create table j1(id int primary key);
create table j2(value int not null);
explain select * from j1 inner join (select distinct value from j2) j2 on
j1.id=j2.value;

The left join removal code properly detects this, so I think unique joins
should too.

I can continue working on your patch if you like? Or are you planning to go
further with it?

Regards
David Rowley

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2015-03-20 08:33:07 Re: WIP: multivariate statistics / proof of concept
Previous Message Tatsuo Ishii 2015-03-20 04:10:27 Re: ERRCODE_T_R_DEADLOCK_DETECTED