Re: Using quicksort for every external sort run

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Subject: Re: Using quicksort for every external sort run
Date: 2016-02-08 02:57:00
Message-ID: CAM3SWZQUEesxa-KvGZzpuozJPN-ue6PtKo=Arh3Oo+Z-7shiSw@mail.gmail.com
Lists: pgsql-hackers

On Sun, Feb 7, 2016 at 4:50 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> I'm not even sure this is necessary. The idea of missing out on
>> producing a single sorted run sounds bad but in practice since we
>> normally do the final merge on the fly there doesn't seem like there's
>> really any difference between reading one tape or reading two or three
>> tapes when outputting the final results. There will be the same amount
>> of I/O happening and a 2-way or 3-way merge for most data types should
>> be basically free.
>
> I basically agree with you, but it seems possible to fix the
> regression (generally misguided though those regressed cases are).
> It's probably easiest to just fix it.
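
To make the quoted point about the on-the-fly final merge concrete, here
is a minimal Python sketch. It is not PostgreSQL's tuplesort code: the
name `merge_runs` is hypothetical, and plain in-memory lists stand in
for the sorted runs ("tapes") purely for illustration. It only shows
that the consumer sees one sorted stream whether it comes from one run
or is merged lazily from several.

```python
import heapq
from typing import Iterable, Iterator, List


def merge_runs(runs: List[Iterable[int]]) -> Iterator[int]:
    """Lazily merge already-sorted runs into one sorted stream.

    Hypothetical helper for illustration. Each element of `runs` must
    already be sorted; heapq.merge pulls one item at a time from each
    run, so nothing is materialized up front.
    """
    return heapq.merge(*runs)


# Three small sorted runs standing in for tapes (made-up values).
three_runs = [[1, 4, 9], [2, 3, 10], [5, 6, 7]]
merged = list(merge_runs(three_runs))

# The same values delivered as a single sorted run.
single_run = [[1, 2, 3, 4, 5, 6, 7, 9, 10]]
direct = list(merge_runs(single_run))
```

Both calls read every element exactly once and yield the same sequence;
the only extra work in the three-run case is the per-element comparison
inside the merge.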

On a related note, we should probably come up with a way of totally
supplanting the work_mem model with something smarter in the next
couple of years. Something that treats memory as a shared resource
even when it's allocated privately, per-process. This external sort
stuff really smooths out the cost function of sorts. ISTM that that
makes the idea of dynamic memory budgets (in place of a one-size-fits-all
work_mem) seem desirable for the first time. That said, I really
don't have a good sense of how to go about moving in that direction at
this point. It seems less than ideal that DBAs have to be so
conservative in sizing work_mem.

--
Peter Geoghegan
