Re: Performance improvement for joins where outer side is unique

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2016-04-10 03:30:35
Message-ID: CAKJS1f_MWkWqLZvqqp2xzzFhbpq=PBhEO0dK3i32oCUqcdJP-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8 April 2016 at 11:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I did some performance testing on the attached somewhat-cleaned-up patch,
> and convinced myself that the planning time penalty is fairly minimal:
> on the order of a couple percent in simple one-join queries, and less
> than that in very large queries. Oddly, it seems that the result cacheing
> done in get_optimal_jointype() is only barely worth the trouble in typical
> cases; though if you get a query large enough to require GEQO, it's a win
> because successive GEQO attempts can re-use cache entries made by earlier
> attempts. (This may indicate something wrong with my testing procedure?
> Seems like diking out the cache should have made more difference.)

It'll really depend on whether you're testing a positive case or a
negative one, since a GEQO join search will be the only time the
non-unique cache is filled, then if you're testing the negative case
then this is the only time you'll get any non-unique cache benefits.
On the other hand, if you were testing with a positive case then I
guess the unique index check is cheaper than we thought. Maybe adding
a couple of other unique indexes before defining the one which proves
the join unique (so that they've got a lower OID) would be enough to
start showing the cache benefits of the unique rel cache.

> I did find by measurement that the negative-cache-entry code produces
> exactly zero hits unless you're in GEQO mode, which is not really
> surprising given the order in which the join search occurs. So in the
> attached patch I made the code not bother with making negative cache
> entries unless using GEQO, to hopefully save a few nanoseconds.

Yeah, I also noticed this in my testing, and it makes sense given how
the standard join search works. Thanks for making that improvement,
I've not yet looked at the code, but it sounds like a good idea.

> I rebased over f338dd758, did a little bit of code cleanup and fixed some
> bugs in the uniqueness detection logic, but have not reviewed the rest of
> the patch since it's likely all gonna change if we reconsider the JoinType
> representation.
>
> Anyway, I think it would be reasonable to give this patch a few more
> days in view of David's being away through the weekend. But the RMT
> has final say on that.

Thanks for making the updates and committing the refactor of
analyzejoins.c. The RMT have ruled no extension will be given, so I'll
get this into shape for 9.7 CF1.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-04-10 04:26:06 Re: Move PinBuffer and UnpinBuffer to atomics
Previous Message David Rowley 2016-04-10 03:00:17 Re: Performance improvement for joins where outer side is unique