Re: Timsort performance, quicksort (was: Re: Memory usage during sorting)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Timsort performance, quicksort (was: Re: Memory usage during sorting)
Date: 2012-04-24 20:55:18
Message-ID: CA+TgmoY0j_js_qmkqJS+5eFwAQykVrQq9D5kOYp_u6-Ok1+UiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 24, 2012 at 12:30 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Tue, Apr 24, 2012 at 4:17 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Based on that, I'm inclined to propose rejiggering things so that the
>> presorted-input check runs only at the top level, and not during any
>> recursive steps.
>
> Just a thought. What about running only every nth step. Maybe
> something like every 4th step.

If there's actually some advantage to that, sure. But so far, it
looks like less is more.

> But actually I'm confused. This seems to be misguided to me. Quicksort
> isn't stable so even if you have a partially sorted data set the
> recursive partitions are going to be at best partially sorted after a
> pivot.

Exactly. That's why I think we should do it only at the topmost
level, before we've done any pivoting. Doing it at any lower level is
apparently a waste of energy and counterproductive.

> I haven't walked through it but suspect even your
> all-but-one-sorted data set is not finding
> the data sorted in either partition on the next iteration.

I suspect that, too. Actually, I'm a bit confused about why it's
picking such terrible pivots. Our median-of-medians optimization
should be doing better than this, I would think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-04-24 21:01:29 Re: Timsort performance, quicksort (was: Re: Memory usage during sorting)
Previous Message Robert Haas 2012-04-24 20:51:34 Re: New sync commit mode remote_write