Re: Inlining comparators as a performance optimisation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inlining comparators as a performance optimisation
Date: 2011-11-19 02:55:54
Message-ID: CA+TgmoZoVmS+Th=xz9tNTbLtmf1nv8UG52JaVsMbXkPAnXH=NA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 18, 2011 at 4:38 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> I understand that we highly value extensibility and genericity (yes,
> that's a real word). We may not always be well served by that
> tendency.

True (except that genericity is not a synonym for generality AFAICT).
A good fraction of the optimizations that we've done over the last,
well, pick any arbitrary period of time consists in adding special
cases that occur frequently enough to be worth optimizing - e.g. HOT,
or fast relation locks - often to extremely good effect.

> Firstly, 1/4 of a quite large gain is still a pretty good gain.
> Secondly, I probably didn't actually isolate the effects of inlining,
> nor the overall benefit of the compiler knowing the comparator at
> compile-time. I just removed the inline keyword.

Maybe we should look at trying to isolate that a bit better.

It strikes me that we could probably create an API that would support
doing either of these things depending on the wishes of the underlying
datatype. For example, imagine that we're sorting with <(int4, int4).
We associate a PGPROC-callable function with that operator that
returns "internal", really a pointer to a struct. The first element
of the struct is a pointer to a comparison function that qsort() (or a
tape sort) can invoke without a trampoline; the second is a wholesale
replacement for qsort(); either or both can be NULL. Given that, it
seems to me that we could experiment with this pretty easily, and if
it turns out that only one of them is worth doing, it's easy to drop
one element out of the structure.

Or do you have another plan for how to do this?

> Fair enough, but it's not the only test I did - I posted other numbers
> for the same query when the table was 48mb, and we saw a proportional
> improvement, consistent with a per-comparison win. I'm supposed to be
> on leave for a few days at the moment, so I won't be very active this
> weekend, but I'm rather curious as to where you or others would like
> to see me go with benchmarks.
>
> I should point out that we currently don't have much idea how big of a
> win applying these principles could be for index creation times...it
> could possibly be very significant. My profiling of index creation
> makes this looks promising.

Have you done any benchmarks where this saves seconds or minutes,
rather than milliseconds? That would certainly make it more exciting,
at least to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2011-11-19 03:08:16 Re: testing ProcArrayLock patches
Previous Message Andres Freund 2011-11-19 02:12:52 Re: EXPLAIN (plan off, rewrite off) for benchmarking