Re: strcmp() tie-breaker for identical ICU-collated strings

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strcmp() tie-breaker for identical ICU-collated strings
Date: 2017-07-11 01:44:01
Message-ID: CAH2-Wz=KryJQHCGhGpTfthDcxLEvbszMGsSan=gYBHs74DXicg@mail.gmail.com

On Fri, Jun 9, 2017 at 11:09 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Isn't that what strxfrm() is?
>
> Yeah, just with bugs. If ICU has a non-buggy equivalent, then we can
> make this work.

I agree that it probably isn't worth using strxfrm() again, simply
because the glibc implementation is buggy, and glibc as a project is
not at all concerned about how badly that would affect PostgreSQL.

I would like to point out on this thread that the strcmp() tie-breaker
is also a big blocker to implementing normalized keys in B-Tree
indexes (at least if you want them for collated text, which I think
you really need in order to make the implementation effort worth it).
This is discussed in a section of the normalized keys wiki page that I
created recently [1].
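
To make the conflict concrete, here is a minimal sketch using the libc
interfaces rather than ICU or the actual PostgreSQL code. The function
names (cmp_with_tiebreak, make_sort_key, cmp_by_sort_key) are
hypothetical, chosen only for illustration. The first comparator breaks
collation ties with strcmp(); the second compares strxfrm() sort keys
alone, which is roughly what a normalized key would do.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Collation-aware comparison with a strcmp() tie-breaker: strings that
 * the collation treats as equal are still ordered by their raw bytes,
 * so only byte-identical strings compare as equal.
 */
static int
cmp_with_tiebreak(const char *a, const char *b)
{
    int result = strcoll(a, b);

    if (result == 0)
        result = strcmp(a, b);
    return result;
}

/*
 * Hypothetical helper: build a strxfrm() sort key for s under the
 * current LC_COLLATE setting.  The caller must free() the result.
 */
static char *
make_sort_key(const char *s)
{
    size_t len = strxfrm(NULL, s, 0);   /* required length, sans NUL */
    char  *key = malloc(len + 1);

    if (key == NULL)
        abort();
    strxfrm(key, s, len + 1);
    return key;
}

/*
 * Comparison that looks only at the binary sort keys.  It has no
 * tie-breaker, so it returns 0 whenever the collation says "equal",
 * even if the original strings differ byte-wise.
 */
static int
cmp_by_sort_key(const char *a, const char *b)
{
    char *ka = make_sort_key(a);
    char *kb = make_sort_key(b);
    int   result = strcmp(ka, kb);

    free(ka);
    free(kb);
    return result;
}
```

Under a collation that treats two byte-wise different strings as
equal, cmp_by_sort_key() returns 0 where cmp_with_tiebreak() does not,
so a key built from the collation's output alone cannot reproduce the
ordering that the tie-breaker currently imposes.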

[1] https://wiki.postgresql.org/wiki/Key_normalization#ICU.2C_text_equality_semantics.2C_and_hashing
--
Peter Geoghegan
