Re: B-Tree support function number 3 (strxfrm() optimization)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Wim Lewis <wiml(at)omnigroup(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Date: 2014-07-29 00:23:54
Message-ID: CAM3SWZTy3MxgXbG6373cWh9rex=RhmAj6-kuXx03yRX5-vYKgQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 28, 2014 at 5:14 PM, Wim Lewis <wiml(at)omnigroup(dot)com> wrote:
> A quick glance at OSX's strxfrm() suggests they're using an implementation of strxfrm() from FreeBSD. You can find the source here:
>
> http://www.opensource.apple.com/source/Libc/Libc-997.90.3/string/FreeBSD/strxfrm.c
>
> (and a really quick glance at the contents of libc on OSX 10.9 reinforces this--- I don't see any calls into their CoreFoundation unicode string APIs.)

Something isn't quite accounted for, then. The FreeBSD behavior is to
append the primary weights only. That makes their returned blobs
smaller than those you'll see on Linux, but it also appears to imply
that their implementation is substandard (the PostgreSQL port uses ICU
on FreeBSD for a reason, I suppose). However, FreeBSD did not add
extra, redundant "header bytes" right in the primary level when I
tested it, whereas I'm told that Mac OS X does. I guess it could be
that the collations shipped differ, but I can't think why that would
be. It does seem peculiar that the Mac OS X blobs are always
printable. That isn't the case with glibc, where the only restriction
of that kind is that there are no NUL bytes, and the Unicode collation
algorithm standard specifically says that non-printable blobs are
okay.
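
For anyone who wants to compare blobs across platforms, something
along these lines should do. It is only a rough sketch, not code from
the patch: the helper names (xfrm_blob, blob_is_printable, dump_blob)
are made up for illustration, and "printable" is assumed to mean the
ASCII range 0x20 to 0x7e. It relies on nothing beyond standard
strxfrm().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Transform src with strxfrm() under the current LC_COLLATE locale.
 * Returns a malloc'd, NUL-terminated blob and stores its length
 * (excluding the terminator) in *len. Returns NULL if malloc fails.
 */
char *
xfrm_blob(const char *src, size_t *len)
{
    size_t      needed;
    char       *blob;

    assert(src != NULL && len != NULL);

    /* With a zero-size buffer, strxfrm() only reports the needed length. */
    needed = strxfrm(NULL, src, 0);
    blob = malloc(needed + 1);
    if (blob == NULL)
        return NULL;

    *len = strxfrm(blob, src, needed + 1);
    return blob;
}

/* Return 1 if every byte is printable ASCII (0x20..0x7e), else 0. */
int
blob_is_printable(const char *blob, size_t len)
{
    size_t      i;

    for (i = 0; i < len; i++)
    {
        unsigned char c = (unsigned char) blob[i];

        if (c < 0x20 || c > 0x7e)
            return 0;
    }
    return 1;
}

/* Print the blob as hex so output from two platforms can be diffed. */
void
dump_blob(const char *label, const char *blob, size_t len)
{
    size_t      i;

    printf("%s (%zu bytes):", label, len);
    for (i = 0; i < len; i++)
        printf(" %02x", (unsigned char) blob[i]);
    printf(" printable=%d\n", blob_is_printable(blob, len));
}
```

To use it, call setlocale(LC_COLLATE, "") (from <locale.h>) first,
then pass the same few strings through xfrm_blob() and dump_blob() on
each platform. Comparing blob lengths against the input lengths should
also give a hint about how many weight levels, or extra header bytes,
an implementation emits.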

--
Peter Geoghegan
