Re: insensitive collations

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: insensitive collations
Date: 2019-02-21 17:11:27
Message-ID: 3548cbab-2c29-be93-0c9c-ec24136626ae@2ndquadrant.com
Lists: pgsql-hackers

On 2019-02-21 09:36, Peter Eisentraut wrote:
>> * Why have you disabled this optimization?
>>
>>>     /* Fast pre-check for equality, as discussed in varstr_cmp() */
>>> -   if (len1 == len2 && memcmp(a1p, a2p, len1) == 0)
>>> +   if ((!sss->locale || sss->locale->deterministic) &&
>>> +       len1 == len2 && memcmp(a1p, a2p, len1) == 0)
>>
>> I don't see why this is necessary. A non-deterministic collation
>> cannot indicate that bitwise identical strings are non-equal.
> Right, I went too far there.
>
>> * Perhaps you should add a "Tip" referencing the feature to the
>> contrib/citext documentation.
> Good idea.

Here is another patch that fixes these two points.
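
To illustrate the first point, here is a minimal standalone sketch, not code from the patch. It assumes a toy ASCII case-insensitive comparison (toy_ci_compare, a hypothetical stand-in for a nondeterministic collation) and a hypothetical wrapper compare_with_fast_path. The idea it shows is that the memcmp() pre-check can only ever report "equal" for bitwise identical input, which any collation must also treat as equal, so the check is safe to keep. What a nondeterministic collation changes is the opposite direction: bitwise different strings may still compare as equal, so they have to go through the full comparison.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a nondeterministic collation: plain ASCII
 * case-insensitive ordering. This is an assumption for illustration,
 * not how PostgreSQL or ICU compare strings.
 */
static int
toy_ci_compare(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;

    for (size_t i = 0; i < n; i++)
    {
        int ca = tolower((unsigned char) a[i]);
        int cb = tolower((unsigned char) b[i]);

        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}

/*
 * Hypothetical wrapper: try the bitwise pre-check first, then fall back
 * to the full comparison. *used_fast_path records which path was taken.
 */
static int
compare_with_fast_path(const char *a, size_t alen,
                       const char *b, size_t blen,
                       bool *used_fast_path)
{
    *used_fast_path = false;

    /* Bitwise identical strings are equal under any collation. */
    if (alen == blen && memcmp(a, b, alen) == 0)
    {
        *used_fast_path = true;
        return 0;
    }

    /* Bitwise different strings may still be equal; ask the collation. */
    return toy_ci_compare(a, alen, b, blen);
}

/* Small self-check covering the three interesting cases. */
static void
check_fast_path(void)
{
    bool fast;

    /* Identical bytes: the pre-check answers, result is "equal". */
    assert(compare_with_fast_path("abc", 3, "abc", 3, &fast) == 0 && fast);

    /* Different bytes, equal under the toy collation: slow path. */
    assert(compare_with_fast_path("abc", 3, "ABC", 3, &fast) == 0 && !fast);

    /* Different bytes, genuinely unequal: slow path. */
    assert(compare_with_fast_path("abc", 3, "abd", 3, &fast) < 0 && !fast);
}
```

Calling check_fast_path() from a main() exercises all three cases. Only the second case depends on the collation being nondeterministic, and it never reaches the pre-check's "equal" branch.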

I have also worked on the tests hoping to appease the cfbot.

Older ICU versions (<54) don't support all the locale customization
options, so many of my new tests in collate.icu.utf8.sql will fail on
older systems. What should we do about that? Add a separate test file?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v7-0001-Collations-with-nondeterministic-comparison.patch text/plain 152.9 KB
