Re: Performance improvement for joins where outer side is unique

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2016-04-07 23:59:31
Message-ID: 10298.1460073571@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

I wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> FWIW the feature freeze rules state that it is allowed for a committer
>> to request an extension to the feature freeze date for individual
>> patches:
>> https://www.postgresql.org/message-id/CA%2BTgmoY56w5FOzeEo%2Bi48qehL%2BBsVTwy-Q1M0xjUhUCwgGW7-Q%40mail.gmail.com
>> It seems to me that the restrictions laid out there are well met for
>> this patch, if you only need a couple of additional days for this patch
>> to get in.

> Hmm ... the changes I'm thinking about here are certainly pretty
> mechanical, if tedious. The main question mark I have hanging over
> this patch is whether the planning-time penalty is too large --- but
> that's something that can be tested with the patch as it stands.
> Let me go investigate that a bit before requesting an extension.

I did some performance testing on the attached somewhat-cleaned-up patch,
and convinced myself that the planning time penalty is fairly minimal:
on the order of a couple percent in simple one-join queries, and less
than that in very large queries. Oddly, it seems that the result cacheing
done in get_optimal_jointype() is only barely worth the trouble in typical
cases; though if you get a query large enough to require GEQO, it's a win
because successive GEQO attempts can re-use cache entries made by earlier
attempts. (This may indicate something wrong with my testing procedure?
Seems like diking out the cache should have made more difference.)

I did find by measurement that the negative-cache-entry code produces
exactly zero hits unless you're in GEQO mode, which is not really
surprising given the order in which the join search occurs. So in the
attached patch I made the code not bother with making negative cache
entries unless using GEQO, to hopefully save a few nanoseconds.

I rebased over f338dd758, did a little bit of code cleanup and fixed some
bugs in the uniqueness detection logic, but have not reviewed the rest of
the patch since it's likely all gonna change if we reconsider the JoinType
representation.

Anyway, I think it would be reasonable to give this patch a few more
days in view of David's being away through the weekend. But the RMT
has final say on that.

regards, tom lane

Attachment Content-Type Size
unique_joins_2016-04-07-2.patch text/x-diff 75.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Victor Pontis 2016-04-08 00:16:06 Re: [GENERAL] pg_restore casts check constraints differently
Previous Message Tom Lane 2016-04-07 23:33:41 Re: GIN data corruption bug(s) in 9.6devel