Re: [PERFORM] A Better External Sort?

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Ron Peacetree" <rjpeace(at)earthlink(dot)net>, pgsql-hackers(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] A Better External Sort?
Date: 2005-09-30 04:22:27
Message-ID: BF620B93.10588%llonergan@greenplum.com
Lists: pgsql-hackers pgsql-performance

Jeff,

On 9/29/05 10:44 AM, "Jeffrey W. Baker" <jwbaker(at)acm(dot)org> wrote:

> On Thu, 2005-09-29 at 10:06 -0700, Luke Lonergan wrote:
> Looking through tuplesort.c, I have a couple of initial ideas. Are we
> allowed to fork here? That would open up the possibility of using the
> CPU and the I/O in parallel. I see that tuplesort.c also suffers from
> the kind of postgresql-wide disease of calling all the way up and down a
> big stack of software for each tuple individually. Perhaps it could be
> changed to work on vectors.

Yes!
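
To make the "work on vectors" idea concrete, here is a minimal sketch in Python, not anything taken from tuplesort.c. The helper name in_batches and the default batch size are assumptions made for illustration. The point is only that a caller pays the per-call overhead once per batch instead of once per tuple.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(tuples: Iterable[T], batch_size: int = 1024) -> Iterator[List[T]]:
    """Yield lists of up to batch_size items (hypothetical helper).

    Downstream code then handles a whole batch per call rather than
    one tuple per call.
    """
    it = iter(tuples)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # input exhausted
        yield batch


if __name__ == "__main__":
    for batch in in_batches(range(10), batch_size=4):
        print(batch)
```

The batch size is a tuning knob: larger batches amortize more call overhead but hold more tuples in memory at once.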

> I think the largest speedup will be to dump the multiphase merge and
> merge all tapes in one pass, no matter how large M. Currently M is
> capped at 6, so a sort of 60GB with 1GB sort memory needs 13 passes over
> the tape. It could be done in a single pass heap merge with N*log(M)
> comparisons, and, more importantly, far less input and output.

Yes again, see above.
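
For reference, the single-pass heap merge Jeff describes can be sketched roughly as follows. This is an illustrative Python version under simple assumptions (each run is an already-sorted iterable, items are mutually comparable), not the tuplesort.c code. The names merge_runs and _EXHAUSTED are made up for the example. The heap never holds more than M entries, so each of the N output items costs O(log M) comparisons.

```python
import heapq
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

_EXHAUSTED = object()  # marker for a run that has no more items


def merge_runs(runs: List[Iterable[T]]) -> Iterator[T]:
    """Merge M sorted runs in one pass with a min-heap of at most M entries."""
    iterators = [iter(run) for run in runs]

    # Seed the heap with the first item of every non-empty run.
    # Entries are (item, run_index); the index breaks ties between equal items.
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, _EXHAUSTED)
        if first is not _EXHAUSTED:
            heap.append((first, idx))
    heapq.heapify(heap)

    while heap:
        item, idx = heap[0]  # smallest item across all runs
        yield item
        following = next(iterators[idx], _EXHAUSTED)
        if following is _EXHAUSTED:
            heapq.heappop(heap)  # this run is finished
        else:
            # Replace the root with the next item from the same run.
            heapq.heapreplace(heap, (following, idx))


if __name__ == "__main__":
    print(list(merge_runs([[1, 4, 7], [2, 5, 8], [3, 6, 9]])))
```

Each input item is read once and written once, which is where the I/O saving over a multiphase merge comes from. The cost is that all M runs must be open and buffered at the same time.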

> I would also recommend using external processes to asynchronously
> feed the tuples into the heap during the merge.

Simon Riggs is working on this idea a bit. It's slightly less interesting to
us because we already have a multiprocessing executor. Our problem is that
4 x slow is still far too slow.
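
As a rough illustration of asynchronously feeding a merge, here is a small Python sketch. It swaps in a background thread and a bounded queue where Jeff suggests external processes, purely to keep the example short. The function name prefetch and the depth parameter are assumptions, not part of any existing code.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

_DONE = object()  # marker placed on the queue when the source is exhausted


def prefetch(source: Iterable[T], depth: int = 1024) -> Iterator[T]:
    """Read source on a background thread, buffering up to depth items."""
    buf: queue.Queue = queue.Queue(maxsize=depth)

    def reader() -> None:
        try:
            for item in source:
                buf.put(item)  # blocks while the buffer is full
        finally:
            buf.put(_DONE)  # always unblock the consumer, even on error

    threading.Thread(target=reader, daemon=True).start()

    while True:
        item = buf.get()  # blocks while the buffer is empty
        if item is _DONE:
            return
        yield item


if __name__ == "__main__":
    print(list(prefetch(range(10), depth=4)))
```

Wrapping each sorted run in such a reader lets the reads for one run overlap with comparisons on another. Whether that helps in practice depends on where the time actually goes, which is the question this thread is trying to answer.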

> What's the timeframe for 8.2?

Let's test it out in Bizgres!

- Luke
