Re: Inlining comparators as a performance optimisation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inlining comparators as a performance optimisation
Date: 2011-09-26 03:46:29
Message-ID: CA+TgmoaB75LmGUDAoLe-eH_Bedmak=C9MxBr34WMOA3if4N0RA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 25, 2011 at 10:12 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> [ new results ]

Nice results. I think these are far more convincing than the last
set, because (1) the gains are bigger and (2) they survive -O2 and (3)
you tested an actual query, not just qsort() itself.

I don't want to take time to review this in detail right now, because
I don't think it would be fair to put this ahead of things that were
submitted for the current CommitFest, but I'm impressed.

> On the subject of highly ambitious optimisations to sorting, one
> possibility I consider much more practicable than GPU-accelerated
> sorting is simple threading; quicksort can be parallelised very
> effectively, due to its divide-and-conquer nature. If we could agree
> on a threading abstraction more sophisticated than forking, it's
> something I'd be interested in looking at. To do so would obviously
> entail lots of discussion about how that relates to whatever way we
> eventually decide on implementing parallel query, and that's obviously
> a difficult discussion.

I have the same feeling about about this that I do about almost every
executor optimization that anyone proposes: the whole project would be
entirely simple and relatively painless if it weren't for the need to
make planner changes. I mean, deciding on a threading interface is
likely to be a somewhat contentious discussion, with differences of
opinion on whether we should do it and what the API should look like.
But at the end of the day it's not rocket science, and I expect that
we would end up with something reasonable. What seems much harder is
figuring out how to decide when to perform quicksort in parallel vs.
single-threaded, and how much parallelism would be appropriate. I
haven't seen anyone propose even a shadow of an idea about how to make
such decisions intelligently, either in general or in specific cases.

The other issue is that, while threading would be possibly suitable
for this particular case, at least for built-in datatypes with
comparison operations that basically reduce to single machine-language
comparison instructions, it's hard to see how we could take it much
further. It would be unsafe for these multiple threads of execution
to do anything that could possibly throw an error or anything that
touches a lightweight lock or, really, just about anything at all.
Trying to make the entire backend thread-safe - or even any
significant portion of it - seems like a colossal effort that will
most likely fail, but maybe not without eating an enormous amount of
developer time first. And without doing that, I don't think we could
even extend this as far as, say, numeric, whose functions do things
like palloc() and ereport() internally. So I feel like this whole
approach might be a dead-end - there's a narrow range of cases where
it could be made to work, I think, but after that I think you hit a
titanium wall.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-09-26 03:49:46 Re: contrib/sepgsql regression tests are a no-go
Previous Message Robert Haas 2011-09-26 03:22:56 Re: [v9.2] Fix Leaky View Problem