Re: Using quicksort for every external sort run

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Greg S <stark(at)mit(dot)edu>
Subject: Re: Using quicksort for every external sort run
Date: 2015-12-18 21:10:43
Message-ID: CAM3SWZRyHad4+uiGejNvQk26SiE=BBXLDyVF4vf005Y3Oh2sCQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 18, 2015 at 12:50 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> BTW, I'm not necessarily determined to make the new special-purpose
>> allocator work exactly as proposed. It seemed useful to prioritize
>> simplicity, and so currently there is one big "huge palloc()" with
>> which we blow our memory budget, and that's it. However, I could
>> probably be more clever about "freeing ranges" initially preserved for
>> a now-exhausted tape. That kind of thing.
>
> What about the case where we think that there will be a lot of data
> and have a lot of work_mem available, but then the user sends us 4
> rows because of some mis-estimation?

The memory patch only changes the final on-the-fly merge phase. There
is no estimate involved there.

I continue to use whatever "slots" (memtuples) are available for the
final on-the-fly merge. However, I allocate all remaining memory that
I have budget for at once. My remarks about the efficient use of that
memory were only really about how each tape uses its part of that
memory over time.

Again, to emphasize, this is only for the final on-the-fly merge phase.
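
To illustrate the shape of the idea, here is a minimal standalone
sketch. It is not code from the patch: every name in it (MergeArena,
TapeSlice, arena_create, slice_alloc, slice_reset) is hypothetical, it
uses malloc() rather than palloc(), and the even split between tapes is
an assumption made for simplicity. It shows one allocation covering the
whole budget, with each tape handing out space from its own contiguous
part and reusing that part once its current batch of tuples is consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; nothing is copied from tuplesort.c. */

typedef struct TapeSlice
{
    char       *base;       /* start of this tape's part of the buffer */
    size_t      size;       /* bytes in that part */
    size_t      used;       /* bytes handed out since the last reset */
} TapeSlice;

typedef struct MergeArena
{
    char       *buffer;     /* the single up-front allocation */
    int         ntapes;
    TapeSlice  *slices;
} MergeArena;

/* Allocate the whole budget at once; split it evenly (an assumption). */
static MergeArena *
arena_create(size_t budget, int ntapes)
{
    MergeArena *arena;
    size_t      per_tape;
    int         i;

    assert(budget > 0 && ntapes > 0);
    /* Round each part down to 8 bytes so every slice base stays aligned. */
    per_tape = (budget / (size_t) ntapes) & ~(size_t) 7;

    arena = malloc(sizeof(MergeArena));
    if (arena == NULL)
        return NULL;
    arena->buffer = malloc(budget);
    arena->slices = malloc(sizeof(TapeSlice) * (size_t) ntapes);
    if (arena->buffer == NULL || arena->slices == NULL)
    {
        free(arena->buffer);
        free(arena->slices);
        free(arena);
        return NULL;
    }
    arena->ntapes = ntapes;
    for (i = 0; i < ntapes; i++)
    {
        arena->slices[i].base = arena->buffer + (size_t) i * per_tape;
        arena->slices[i].size = per_tape;
        arena->slices[i].used = 0;
    }
    return arena;
}

/* Hand out space for one tuple from a tape's part; NULL if it is full. */
static void *
slice_alloc(MergeArena *arena, int tape, size_t len)
{
    TapeSlice  *slice = &arena->slices[tape];
    size_t      need = (len + 7) & ~(size_t) 7;    /* 8-byte alignment */
    void       *result;

    if (need > slice->size - slice->used)
        return NULL;
    result = slice->base + slice->used;
    slice->used += need;
    return result;
}

/* Reuse a tape's part once all tuples stored in it have been consumed. */
static void
slice_reset(MergeArena *arena, int tape)
{
    arena->slices[tape].used = 0;
}

static void
arena_destroy(MergeArena *arena)
{
    free(arena->slices);
    free(arena->buffer);
    free(arena);
}
```

In this sketch, tuples read from one tape end up adjacent in memory,
which is the locality point. Nothing is freed tuple by tuple; a tape's
part is simply reset before its next batch is loaded.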

>> With the on-the-fly merge memory patch, I'm improving locality of
>> access (for each "tuple proper"/"tuple itself"). If I also happen to
>> improve the situation around palloc() fragmentation at the same time,
>> then so much the better, but that's clearly secondary.
>
> I don't really understand this comment.

I just mean that I wrote the memory patch with memory locality in
mind, not palloc() fragmentation or other overhead.

--
Peter Geoghegan
