Re: Memory usage during sorting

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory usage during sorting
Date: 2012-03-20 20:00:14
Message-ID: CA+TgmoY6ZnS1HZF68-Lcv1RHVbP9V-uE5JWMeJBrLuEmJ2y+Sg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 20, 2012 at 3:41 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Tue, Mar 20, 2012 at 5:04 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> No.  It does the opposite: it slows it down.  This is a highly
>> surprising result but it's quite repeatable: removing comparisons
>> makes it slower.  As previously pontificated, I think this is probably
>> because the heap can fill up with next-run tuples that are cheap to
>> compare against, and that spares us having to do "real" comparisons
>> involving the actual datatype comparators.
>
> Frankly that analysis didn't make any sense to me at the time.
> Comparing integers is fast, sure, but it's still slower than not
> having to do any comparison at all.

I think you're underestimating how much it costs to call the
datatype-specific comparator. My belief is that it's wicked
expensive. The COMPARETUP() macro extracts a function pointer from
the Tuplesortstate and calls it; we end up comparetup_heap, which
calls ApplySortComparator(), which pulls the comparator function out
of the state and then calls that. Since I was sorting strings, which
have no sortsupport, we then end up in comparison_shim(), which uses
the FunctionCallInvoke method to extract the actual function pointer
and jump into bttextcmp(), which unpacks its arguments and then calls
text_cmp(), which unpacks its arguments some more and then calls
varstr_cmp() where the actual work happens. That's not trivial either
- we have to call lc_collate_is_c() and then memcmp(). I have no
problem believing that 6 levels of function calls, 3 of which involve
jumps through function pointers, followed by lc_collate_is_c() and
memcmp() is 100x or more as expensive than the lone integer comparison
that happens when the tupindex values don't match - that's a single
instruction.

> Fwiw I think more interesting than improving tapesort would be
> implementing wholly different algorithms like radix sort or the
> repeated quicksort. Being able to handle different algorithms that
> require a different API would be the first step to being able to
> handle parallel algorithms using threads or GPUs.

Yeah, I think I'm going to try implementing
quicksort-the-whole-batch-and-dump-it-out-as-a-run algorithm, just to
see how good or bad that is compared to what we have now. We may not
end up doing anything that remotely resembles that, in the end, but I
want to see the numbers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message MUHAMMAD ASIF 2012-03-20 20:15:20 Re: vacuumlo issue
Previous Message Alexander LAW 2012-03-20 19:50:14 Re: BUG #6510: A simple prompt is displayed using wrong charset