Re: Getting server crash on Windows when using ICU collation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Getting server crash on Windows when using ICU collation
Date: 2017-06-16 10:30:47
Message-ID: CAA4eK1LVW+cWuVt5yU=ECF+QnkJ6mmkOCjsiqO5M7dScK4EoKQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 15, 2017 at 11:18 PM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> Hi,
>
> On Thu, Jun 15, 2017 at 8:36 PM, Peter Eisentraut
> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>> On 6/12/17 00:38, Ashutosh Sharma wrote:
>>> PFA patch that fixes the issue described in the above thread. As mentioned
>>> there, the crash happens in the varstr_cmp() function, and only on
>>> Windows, because in varstr_cmp(), if the collation provider is ICU, we
>>> don't even attempt to call the ICU functions to compare the strings. In
>>> fact, we directly call the OS function wcscoll*(), which is not
>>> expected. Thanks.
>>
>> Maybe just
>>
>> diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
>> index a0dd391f09..2506f4eeb8 100644
>> --- a/src/backend/utils/adt/varlena.c
>> +++ b/src/backend/utils/adt/varlena.c
>> @@ -1433,7 +1433,7 @@ varstr_cmp(char *arg1, int len1, char *arg2, int len2, Oid collid)
>>
>> #ifdef WIN32
>> /* Win32 does not have UTF-8, so we need to map to UTF-16 */
>> - if (GetDatabaseEncoding() == PG_UTF8)
>> + if (GetDatabaseEncoding() == PG_UTF8 && (!mylocale || mylocale->provider == COLLPROVIDER_LIBC))
>> {
>> int a1len;
>> int a2len;
>
> Oh, yes, this looks like the simplest and possibly the ideal way to
> fix the issue. Attached is the patch. Thanks for the inputs.
>

How will this compare UTF-8 strings when the database encoding is UTF-8?
It seems to me that, ideally, it should use ucol_strcollUTF8 to compare
them; however, with the patch, it will always use ucol_strcoll, as we
never define the HAVE_UCOL_STRCOLLUTF8 flag on Windows. We have some
multi-byte tests in the src/test/mb directory; see if we can use those to
verify these changes. I admit that I have not tried to execute those on
Windows, so I have no idea whether they even work.
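
To make the concern concrete, here is a rough standalone model of the
dispatch as I understand it with the proposed condition applied. It is
only a sketch, not the code in varlena.c: the enum, struct, and function
names are all made up for illustration, the is_win32 and
have_strcoll_utf8 parameters stand in for the WIN32 and
HAVE_UCOL_STRCOLLUTF8 compile-time macros, and no ICU or OS function is
actually called.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's definitions. */
typedef enum { ENC_UTF8, ENC_OTHER } Encoding;
typedef enum { PROVIDER_LIBC, PROVIDER_ICU } Provider;
typedef struct { Provider provider; } Locale;

typedef enum
{
    PATH_WIN32_UTF16,        /* map to UTF-16, compare with the OS wide-char collation */
    PATH_ICU_STRCOLL_UTF8,   /* compare UTF-8 input directly via ICU */
    PATH_ICU_STRCOLL_UTF16,  /* convert to UTF-16 first, then compare via ICU */
    PATH_LIBC_STRCOLL        /* plain libc comparison */
} ComparePath;

/*
 * Decide which comparison path is taken. A NULL locale models the
 * "no explicit locale" case from the patched condition (!mylocale).
 */
static ComparePath
choose_compare_path(bool is_win32, bool have_strcoll_utf8,
                    Encoding enc, const Locale *loc)
{
    /* Patched Windows branch: only when the provider is not ICU. */
    if (is_win32 && enc == ENC_UTF8 &&
        (loc == NULL || loc->provider == PROVIDER_LIBC))
        return PATH_WIN32_UTF16;

    if (loc != NULL && loc->provider == PROVIDER_ICU)
    {
        /* The UTF-8 fast path exists only if the build flag was defined. */
        if (have_strcoll_utf8 && enc == ENC_UTF8)
            return PATH_ICU_STRCOLL_UTF8;
        return PATH_ICU_STRCOLL_UTF16;
    }

    return PATH_LIBC_STRCOLL;
}
```

In this model, a Windows build (is_win32 true, have_strcoll_utf8 false)
with an ICU locale and UTF-8 input always ends up in
PATH_ICU_STRCOLL_UTF16, which is the case I am asking about above.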

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
