From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)heroku(dot)com> |
Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: 9.5: Better memory accounting, towards memory-bounded HashAgg |
Date: | 2014-12-30 16:53:43 |
Message-ID: | CAMkU=1wguPocxMU+2dcLctrVFJgA4hB-EP8=xxB6JhA6CgsccA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Dec 28, 2014 at 12:45 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Sun, Dec 28, 2014 at 12:37 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> > Do others have similar numbers? I'm quite surprised at how little
> > work_mem seems to matter for these plans (HashJoin might be a different
> > story though). I feel like I made a mistake -- can someone please do a
> > sanity check on my numbers?
>
> I have seen external sorts that were quicker than internal sorts
> before. With my abbreviated key patch, under certain circumstances
> external sorts are faster, while presumably the same thing is true of
> int4 attribute sorts today. Actually, I saw a 10MB work_mem setting
> that was marginally faster than a multi-gigabyte one that fit the
> entire sort in memory. It probably has something to do with caching
> effects dominating over the expense of more comparisons, since higher
> work_mem settings that still resulted in an external sort were slower
> than the 10MB setting.
>
> I was surprised by this too, but it has been independently reported by
> Jeff Janes.
>
I don't recall (at the moment) seeing our external sort actually faster
than quick-sort, but I've very reliably seen external sorts get faster with
less memory than with more. It is almost certainly a CPU caching issue.
Very large simple binary heaps are horrible on the CPU cache. And for
sort-by-reference values, quick sort is also pretty bad.
With a slow enough data bus between the CPU and main memory, I don't doubt
that a 'tapesort' with small work_mem could actually be faster than
quicksort with large work_mem. But I don't recall seeing it myself. But
I'd be surprised that a tapesort as currently implemented would be faster
than a quicksort if the tapesort is using just one byte less memory than
the quicksort is.
But to Jeff Davis's question, yes, tapesort is not very sensitive to
work_mem, and to the extent it is sensitive it is in the other direction of
more memory being bad. Once work_mem is so small that it takes multiple
passes over the data to do the merge, then small memory would really be a
problem. But on modern hardware you have to get pretty absurd settings
before that happens.
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2014-12-30 16:59:17 | Re: Documentation of bt_page_items()'s ctid field |
Previous Message | David Johnston | 2014-12-30 16:21:39 | Re: [HACKERS] ON_ERROR_ROLLBACK |