Re: Inlining comparators as a performance optimisation

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Inlining comparators as a performance optimisation
Date: 2011-11-29 21:36:26
Message-ID: 201111292236.27140.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tuesday, November 29, 2011 07:48:37 PM Peter Geoghegan wrote:
> On 29 November 2011 15:31, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > These are exciting advanced you are producing and I am hopeful we can
> > get this included in Postgres 9.2.
>
> Thanks Bruce.
>
> >I have mentioned already that I
> >
> > think parallelism is the next big Postgres challenge, and of course, one
> > of the first areas for parallelism is sorting.
>
> I'm not sure that sorting has that much to recommend it as an initial
> target of some new backend parallelism other than being easy to
> implement. I've observed the qsort_arg specialisations in this patch
> out-perform stock qsort_arg by as much as almost 3 times. However, the
> largest decrease in a query's time that I've observed was 45%, and
> that was for a contrived worst-case for quicksort, but about 25% is
> much more typical of queries similar to the ones I've shown, for more
> normative data distributions. While that's a respectable gain, it
> isn't a paradigm shifting one, and it makes parallelising qsort itself
> for further improvements quite a lot less attractive - there's too
> many other sources of overhead.
I think that logic is faulty.

For one I doubt that anybody is honestly suggesting paralellism inside qsort
itself. It seems more likely/sensible to implement that on the level of
mergesorting.
Index builds for example could hugely benefit from improvements on that level.
With index build you often get pretty non-optimal data distributions btw...

I also seriously doubt that you will find an area inside pg's executor where
you find that paralellizing them will provide a near linear scale without
much, much more work.

Also I wouldn't consider sorting the easiest target - especially on a qsort
level - for parallelization as you constantly need to execute user defined
operators with multiple input tuples which has the usual problems.
COPY parsing + inserting or such seems to be way easier target for example.
Even doing hashing + aggregation in different threads seems likely to be
easier.

Andres

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Jaskiewicz 2011-11-29 21:43:40 Re: Inlining comparators as a performance optimisation
Previous Message Bruce Momjian 2011-11-29 21:35:18 Re: Allow pg_dumpall to use dumpmem.c functions, simplify exit code