Re: Using quicksort for every external sort run

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Subject: Re: Using quicksort for every external sort run
Date: 2016-02-08 00:50:44
Message-ID: CAM3SWZS8Ppe3pC8HJSKc+6HgOxUseBDm-d7uoaKzT1PdH6wb=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Feb 7, 2016 at 10:51 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
>> > You don't want to change the behavior of the current patch for the
>> > second or subsequent run; that should remain a quicksort, pure and
>> > simple. Do I have that right?
>>
>> Yes.
>
> I'm not even sure this is necessary. The idea of missing out on
> producing a single sorted run sounds bad but in practice since we
> normally do the final merge on the fly there doesn't seem like there's
> really any difference between reading one tape or reading two or three
> tapes when outputing the final results. There will be the same amount
> of I/O happening and a 2-way or 3-way merge for most data types should
> be basically free.

I basically agree with you, but it seems possible to fix the
regression (generally misguided though those regressed cases are).
It's probably easiest to just fix it.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2016-02-08 00:53:20 Re: GIN pending list clean up exposure to SQL
Previous Message Kouhei Kaigai 2016-02-08 00:28:45 Re: CustomScan in a larger structure (RE: CustomScan support on readfuncs.c)