Re: Win32 unicode vs ICU

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Magnus Hagander" <mha(at)sollentuna(dot)net>
Cc: pgsql-hackers(at)postgreSQL(dot)org, "Palle Girgensohn" <girgen(at)pingpong(dot)net>
Subject: Re: Win32 unicode vs ICU
Date: 2005-08-20 16:17:47
Message-ID: 24642.1124554667@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

[ moving to -hackers for wider discussion ]

"Magnus Hagander" <mha(at)sollentuna(dot)net> wrote in
http://archives.postgresql.org/pgsql-patches/2005-08/msg00039.php

>> I've been working with Palle's ICU patch to make it work on
>> win32, and I believe I have it done. While doing it I noticed
>> that ICU basically converts to UTF16 and back - I previously
>> thought it worked on UTF8 strings. Based on this I also tried
>> out an implementation for the win32-unicode problem that does
>> *not* require ICU. It uses the win32 native functions to map
>> to utf16 and back, and then to process the text there. And I
>> got through with much less code than the ICU version, while
>> doing the same thing.
>>
>> I am unsure of how to proceed. As I see it there are three paths:
>> 1) Use native win32 functionality only on win32
>> 2) Use ICU functionality only on win32
>> 3) Allow both ICU and native functionality, compile time
>> switch --with-icu (same as unix with the ICU patch)

We need to figure out what we're going to do about this. Given where
we are in the release cycle, I am pretty strongly tempted to just apply
the smaller patch (just map utf8/utf16 using Windows native functions)
for PG 8.1.
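
For illustration, the approach in the smaller patch could look roughly like
the following. This is a minimal sketch, not code from either patch:
utf8_to_wide() and compare_utf8() are hypothetical names, and the non-Windows
branch is only a stand-in (it uses mbstowcs() with the current locale's
encoding) so that the sketch compiles elsewhere. The Windows branch uses
MultiByteToWideChar() with CP_UTF8, then compares the wide strings with
wcscoll().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Hypothetical helper: convert a NUL-terminated UTF-8 string into a newly
 * allocated wide string.  Returns NULL on failure; the caller frees it.
 */
wchar_t *
utf8_to_wide(const char *src)
{
#ifdef _WIN32
    /* A zero-size output buffer asks for the required length in wchar_t
     * units; it includes the terminator because the input length is -1. */
    int      n = MultiByteToWideChar(CP_UTF8, 0, src, -1, NULL, 0);
    wchar_t *dst;

    if (n <= 0)
        return NULL;
    dst = malloc((size_t) n * sizeof(wchar_t));
    if (dst == NULL)
        return NULL;
    if (MultiByteToWideChar(CP_UTF8, 0, src, -1, dst, n) <= 0)
    {
        free(dst);
        return NULL;
    }
    return dst;
#else
    /* Stand-in for other platforms (an assumption of this sketch only):
     * convert using the current locale's multibyte encoding. */
    size_t   n = mbstowcs(NULL, src, 0);
    wchar_t *dst;

    if (n == (size_t) -1)
        return NULL;
    dst = malloc((n + 1) * sizeof(wchar_t));
    if (dst == NULL)
        return NULL;
    mbstowcs(dst, src, n + 1);
    return dst;
#endif
}

/*
 * Hypothetical helper: locale-aware comparison of two UTF-8 strings by way
 * of their wide-character forms.
 */
int
compare_utf8(const char *a, const char *b)
{
    wchar_t *wa;
    wchar_t *wb;
    int      result;

    assert(a != NULL && b != NULL);
    wa = utf8_to_wide(a);
    wb = utf8_to_wide(b);
    if (wa == NULL || wb == NULL)
        result = strcmp(a, b);      /* fall back to byte order */
    else
        result = wcscoll(wa, wb);   /* collate in the current locale */
    free(wa);
    free(wb);
    return result;
}
```

A caller would use compare_utf8() wherever strcoll() is applied to
server-encoded strings today; the strcmp() fallback is an arbitrary choice
for the sketch, not a recommendation.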

I think that ICU would be interesting as the base for a much larger
patch that gets us away from depending on libc's locale support at all
(in particular, getting rid of the "one locale per database" problem).
But it seems like a heck of a big dependency to incur for any lesser goal.

I feel it makes sense to apply the smaller patch in any case, so that
there's a Win32 solution not requiring ICU (i.e., I can't see an argument
for doing (2) rather than (3)).

Comments?

Also,

> And another question - my native patch touches the same
> functions as the ICU patch. Can somebody who knows the
> internals confirm or deny that these are all the required
> locations, or do we need to modify more?

There is a strxfrm() call in src/backend/utils/adt/selfuncs.c,
which probably needs to be looked at too.
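
As a reminder of what such a call does, here is a generic sketch of the
usual strxfrm() pattern. It is not taken from selfuncs.c, and make_sort_key()
is a hypothetical name. strxfrm() works on byte strings in the locale's
multibyte encoding, which is presumably why a call site like this would need
the same treatment as the strcoll() ones.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build a sort key for s such that strcmp() on two
 * keys orders them the way strcoll() orders the original strings.
 * Returns NULL on allocation failure; the caller frees the key.
 */
char *
make_sort_key(const char *s)
{
    size_t len;
    char  *key;

    assert(s != NULL);

    /* With a zero-size buffer, strxfrm() only reports the key length
     * (not counting the terminating NUL). */
    len = strxfrm(NULL, s, 0);
    key = malloc(len + 1);
    if (key == NULL)
        return NULL;

    /* Second call fills the buffer with the transformed string. */
    strxfrm(key, s, len + 1);
    return key;
}
```

On Win32 with a UTF-8 database, a wide-character variant (wcsxfrm() on the
converted string) would be the natural counterpart, though whether that is
the right fix for that call site is an open question here.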

regards, tom lane
