On Sun, Sep 25, 2011 at 10:12 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> [ new results ]
Nice results. I think these are far more convincing than the last
set, because (1) the gains are bigger and (2) they survive -O2 and (3)
you tested an actual query, not just qsort() itself.
I don't want to take time to review this in detail right now, because
I don't think it would be fair to put this ahead of things that were
submitted for the current CommitFest, but I'm impressed.
> On the subject of highly ambitious optimisations to sorting, one
> possibility I consider much more practicable than GPU-accelerated
> sorting is simple threading; quicksort can be parallelised very
> effectively, due to its divide-and-conquer nature. If we could agree
> on a threading abstraction more sophisticated than forking, it's
> something I'd be interested in looking at. To do so would obviously
> entail lots of discussion about how that relates to whatever way we
> eventually decide on implementing parallel query, and that's obviously
> a difficult discussion.
I have the same feeling about about this that I do about almost every
executor optimization that anyone proposes: the whole project would be
entirely simple and relatively painless if it weren't for the need to
make planner changes. I mean, deciding on a threading interface is
likely to be a somewhat contentious discussion, with differences of
opinion on whether we should do it and what the API should look like.
But at the end of the day it's not rocket science, and I expect that
we would end up with something reasonable. What seems much harder is
figuring out how to decide when to perform quicksort in parallel vs.
single-threaded, and how much parallelism would be appropriate. I
haven't seen anyone propose even a shadow of an idea about how to make
such decisions intelligently, either in general or in specific cases.
The other issue is that, while threading would be possibly suitable
for this particular case, at least for built-in datatypes with
comparison operations that basically reduce to single machine-language
comparison instructions, it's hard to see how we could take it much
further. It would be unsafe for these multiple threads of execution
to do anything that could possibly throw an error or anything that
touches a lightweight lock or, really, just about anything at all.
Trying to make the entire backend thread-safe - or even any
significant portion of it - seems like a colossal effort that will
most likely fail, but maybe not without eating an enormous amount of
developer time first. And without doing that, I don't think we could
even extend this as far as, say, numeric, whose functions do things
like palloc() and ereport() internally. So I feel like this whole
approach might be a dead-end - there's a narrow range of cases where
it could be made to work, I think, but after that I think you hit a
The Enterprise PostgreSQL Company
In response to
pgsql-hackers by date
|Next:||From: Robert Haas||Date: 2011-09-26 03:49:46|
|Subject: Re: contrib/sepgsql regression tests are a no-go|
|Previous:||From: Robert Haas||Date: 2011-09-26 03:22:56|
|Subject: Re: [v9.2] Fix Leaky View Problem|