Re: Performance improvement for joins where outer side is unique

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2016-04-06 21:10:20
Message-ID: CAKJS1f9-dq_D3K9RV0PJyOib2iV5pd8Lj5pRCDwA8w-yeXCsxw@mail.gmail.com
Lists: pgsql-hackers

On 7 April 2016 at 08:01, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> On 7 April 2016 at 04:05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Starting to look at this again. I wonder, now that you have the generic
>> caching mechanism for remembering whether join inner sides have been
>> proven unique, is it still worth having the is_unique_join field in
>> SpecialJoinInfo? It seems like that's creating a separate code path
>> for special joins vs. inner joins that may not be buying us much.
>> It does potentially save lookups in the unique_rels cache, if you already
>> have the SpecialJoinInfo at hand, but I'm not sure what that's worth.
>
> I quite like that field where it is, as it should make
> remove_useless_joins() a bit more efficient. After a LEFT JOIN is
> removed, the previous code would go off and re-check that all the
> joins are unique; now we cache the result, which saves repeating
> that work for joins already marked as unique.
>
> Certainly changing that would mean one less special case in
> joinpath.c, as the JOIN_LEFT case could be handled the same as the
> other cases. If I do change that, though, I'd probably move
> is_innerrel_unique_for() into analyzejoins.c and put the special case
> for JOIN_LEFT in that function, so that it calls
> specialjoin_is_unique_join(), then caches sjinfo->min_righthand in
> the unique_rels cache if the result comes back positive, and in the
> non_unique_rels cache if negative... But it seems a bit crazy to go to
> the trouble of all that caching when we can just throw the result in
> a struct field in the case of special joins. Maybe we could just hide
> both the new joinpath.c functions in analyzejoins.c and call it quits.
> It's not as if there are no special cases for JOIN_LEFT in that file.

We could also get rid of SpecialJoinInfo.is_unique_join and just
store this as optimal_jointype, which would be initialised to
jointype in make_outerjoininfo() and then set in mark_unique_joins().
This would simplify the test in get_optimal_jointype(), perhaps to:

    if (IS_OUTER_JOIN(jointype))
        return sjinfo->optimal_jointype;
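
To make the optimal_jointype idea a bit more concrete, here is a rough standalone sketch. It is not the patch itself. The types are cut-down stand-ins for the real planner structs, JOIN_LEFT_UNIQUE is a hypothetical join type used only for illustration, and the inner_is_unique flag stands in for whatever the real uniqueness proof would return. IS_OUTER_JOIN is also simplified here and covers only the join types in this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not PostgreSQL's real definitions. */
typedef enum JoinType
{
    JOIN_INNER,
    JOIN_LEFT,
    JOIN_LEFT_UNIQUE            /* hypothetical: LEFT JOIN with unique inner side */
} JoinType;

/* Simplified: only knows about the join types defined above. */
#define IS_OUTER_JOIN(jt) ((jt) == JOIN_LEFT || (jt) == JOIN_LEFT_UNIQUE)

typedef struct SpecialJoinInfo
{
    JoinType    jointype;           /* join type as written in the query */
    JoinType    optimal_jointype;   /* takes the place of is_unique_join */
    bool        inner_is_unique;    /* assumption: result of the uniqueness proof */
} SpecialJoinInfo;

/* Initialise optimal_jointype to the plain jointype. */
static void
make_outerjoininfo(SpecialJoinInfo *sjinfo, JoinType jointype,
                   bool inner_is_unique)
{
    sjinfo->jointype = jointype;
    sjinfo->optimal_jointype = jointype;
    sjinfo->inner_is_unique = inner_is_unique;
}

/* Record the better join type once, for each LEFT JOIN proven unique. */
static void
mark_unique_joins(SpecialJoinInfo *sjinfos, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (sjinfos[i].jointype == JOIN_LEFT && sjinfos[i].inner_is_unique)
            sjinfos[i].optimal_jointype = JOIN_LEFT_UNIQUE;
    }
}

/* For outer joins the answer is just a field read; no cache lookup needed. */
static JoinType
get_optimal_jointype(JoinType jointype, const SpecialJoinInfo *sjinfo)
{
    if (IS_OUTER_JOIN(jointype))
    {
        assert(sjinfo != NULL);
        return sjinfo->optimal_jointype;
    }

    /* Inner joins would consult the unique_rels cache here; omitted. */
    return jointype;
}
```

The point of the sketch is only that the outer join path in get_optimal_jointype() collapses to a single field read once mark_unique_joins() has run; the inner join path would still need the cache lookup.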

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
