From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Date: 2014-09-12 18:38:58
Message-ID: CA+TgmoYh=dcUeHrRiO1aWma6-593DvB32gyqv2+VhB0EP-Pg3A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 12, 2014 at 5:28 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 09/12/2014 12:46 AM, Peter Geoghegan wrote:
>>
>> On Thu, Sep 11, 2014 at 1:50 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
>> wrote:
>>>
>>> I think I said pretty clearly that it was.
>>
>>
>> I agree that you did, but it wasn't clear exactly what factors you
>> were asking me to simulate.
>
>
> All factors.
>
>> Do you want me to compare the same string a million times in a loop,
>> both with a strcoll() and with a memcmp()?
>
>
> Yes.
>
>> Should I copy it into a buffer to add a NUL byte?
>
>
> Yes.
>
>> Or should it be a new string each time, with a cache miss expected
>> some proportion of the time?
>
>
> Yes.
>
> I'm being facetious - it's easy to ask for tests when you're not the one
> running them. But seriously, please do run all the tests that you think
> make sense.
>
> I'm particularly interested in the worst case. What is the worst case for
> the proposed memcmp() check? Test that. If the worst case regresses
> significantly, then we need to have a discussion of how likely that worst
> case is to happen in real life, what the performance is like in more
> realistic almost-worst-case scenarios, does it need to be tunable, is the
> trade-off worth it, etc. But if the worst case regresses less than, say, 1%,
> and there are some other cases where you get a 300% speed up, then I think
> it's safe to say that the optimization is worth it, without any more testing
> or discussion.

+1 to all that, including the facetious parts.

Based on the discussion thus far, it seems the trade-off may be
different for short strings vs. long strings. If the strings are
small enough to fit in the L1 CPU cache, then memcmp() followed by
strcoll() may not be much more expensive than strcoll() alone. That
should be easy to figure out: write a standalone C program that
creates a big array of arbitrary, fairly short strings, say 32 bytes
each, that differ near the end but not at the beginning. Then have
the program either do strcoll() on every pair (O(n^2) comparisons)
or, under a #define, memcmp() followed by strcoll() on every pair.
It should be easy to see whether the memcmp() adds 1%, 25%, or 100%.
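
To be concrete, here's a rough, untested sketch of the sort of
program I have in mind (the NSTRINGS/STRSIZE knobs and the
MEMCMP_FIRST #define are arbitrary names of mine, not anything from
the patch):

#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#ifndef NSTRINGS
#define NSTRINGS 2000           /* NSTRINGS^2 pairwise comparisons */
#endif
#ifndef STRSIZE
#define STRSIZE 32              /* bytes per string, including the NUL */
#endif

static char strings[NSTRINGS][STRSIZE];

int
main(void)
{
    long long greater = 0;
    clock_t start, stop;
    int i, j;

    setlocale(LC_COLLATE, "");

    /* All 'x' except the last few bytes, so pairs differ only near the end. */
    for (i = 0; i < NSTRINGS; i++)
    {
        memset(strings[i], 'x', STRSIZE - 1);
        snprintf(strings[i] + STRSIZE - 8, 8, "%07d", i);
    }

    start = clock();
    for (i = 0; i < NSTRINGS; i++)
    {
        for (j = 0; j < NSTRINGS; j++)
        {
            int cmp;

#ifdef MEMCMP_FIRST
            /* Cheap equality check first; call strcoll() only on a mismatch. */
            if (memcmp(strings[i], strings[j], STRSIZE) == 0)
                cmp = 0;
            else
                cmp = strcoll(strings[i], strings[j]);
#else
            cmp = strcoll(strings[i], strings[j]);
#endif
            /* Consume the result so the compiler can't discard the loop. */
            greater += (cmp > 0);
        }
    }
    stop = clock();

    printf("%d x %d comparisons, %lld ordered-after results, %.3f seconds\n",
           NSTRINGS, NSTRINGS, greater,
           (double) (stop - start) / CLOCKS_PER_SEC);
    return 0;
}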

Then, repeat the same thing with strings that are big enough to blow
out the L1 cache, say 1MB in length. Some intermediate sizes (64kB?)
might be worth testing, too. Again, it should be easy to see what the
overhead is. Once we know that, we can make intelligent decisions
about whether this is a good idea or not, and when. If you attach the
test programs, other people (e.g. me) can also try them on other
systems (e.g. Mac OS X) to see whether the characteristics there are
different from what you saw on your system.
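
With a sketch like the one above, that just means recompiling the
same file with and without -DMEMCMP_FIRST at each size (the file name
here is invented):

cc -O2 -o bench-32 bench.c
cc -O2 -DMEMCMP_FIRST -o bench-32-memcmp bench.c
cc -O2 -DNSTRINGS=64 -DSTRSIZE=1048576 -o bench-1mb bench.c
cc -O2 -DNSTRINGS=64 -DSTRSIZE=1048576 -DMEMCMP_FIRST -o bench-1mb-memcmp bench.c

and then running each binary under whatever locale is of interest,
e.g. LC_COLLATE=en_US.UTF-8 ./bench-32.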

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
