Re: Bug with PATHs having non-ASCII characters

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Chuck McDevitt <cmcdevitt(at)greenplum(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug with PATHs having non-ASCII characters
Date: 2010-01-07 10:26:13
Message-ID: 9837222c1001070226l274aef23h1ff34156cdae4227@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 7, 2010 at 02:37, Takahiro Itagaki
<itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
>
> Chuck McDevitt <cmcdevitt(at)greenplum(dot)com> wrote:
>
>> Just an FYI regarding this bug:
>> http://archives.postgresql.org/pgsql-bugs/2009-12/msg00267.php
>>
>> The wide-char version of any WIN32 API call will accept or return
>> data in UTF-16 encoded Unicode, regardless of the local environment's
>> single-byte or multi-byte (MBCS) code page settings.
>
> I have a Windows-specific patch for open(), attached for reference.
> But we need to consider other issues:
>
>  - We need to consider not only open(), but also opendir(),
>    stat() and symlink().
>
>  - An entirely different fix is needed for non-Windows platforms.
>    Probably we will convert encodings from GetDatabaseEncoding() to
>    GetPlatformEncoding() in MBCS, but this is not needed on Windows.
>    We should consider how to avoid scattering ifdef blocks for the switching.

Shouldn't we develop this with "multi-platform" in mind from the
start, instead of doing a Windows-specific patch? It may be that we
end up with two completely different codepaths, but more likely we can
share some of the code between them?

>  - Those conversions are not free. We might need to avoid conversions
>    for paths under $PGDATA because we only use ASCII names in the server.
>    I used a test with IS_HIGHBIT_SET in the attached patch, but I'm not
>    sure whether it is the best method.

If we're going to end up with our own wrapper anyway, can't we just pass
it an extra parameter saying whether we want conversion or not? That way
we can skip the conversion in cases where we know the path is safe, but
do it whenever user input is included?
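
To make the idea concrete, here is a rough sketch of such a wrapper. It is
not taken from the attached patch: the names PathConversionMode,
path_has_high_bit(), path_needs_conversion() and open_path_sketch() are made
up for illustration, and the actual encoding conversion is left as a
placeholder because it would be platform-specific. The high-bit scan only
mirrors the kind of test IS_HIGHBIT_SET allows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag: does the caller want encoding conversion considered? */
typedef enum
{
    PATH_TRUSTED_ASCII,         /* e.g. server-generated names under $PGDATA */
    PATH_MAY_NEED_CONVERSION    /* path may include user input */
} PathConversionMode;

/* True if any byte has its high bit set, i.e. the path is not pure ASCII. */
static bool
path_has_high_bit(const char *path)
{
    const unsigned char *p;

    for (p = (const unsigned char *) path; *p != '\0'; p++)
    {
        if (*p & 0x80)
            return true;
    }
    return false;
}

/* Decide whether a conversion step is needed for this call. */
static bool
path_needs_conversion(const char *path, PathConversionMode mode)
{
    assert(path != NULL);

    if (mode == PATH_TRUSTED_ASCII)
        return false;           /* caller vouches for the path; skip the scan */
    return path_has_high_bit(path);
}

/* Hypothetical wrapper: one entry point, conversion decided by the flag. */
static FILE *
open_path_sketch(const char *path, const char *fmode, PathConversionMode mode)
{
    if (path_needs_conversion(path, mode))
    {
        /*
         * Placeholder (assumption): a real implementation would convert the
         * path here, e.g. to UTF-16 for a wide-char call on Windows or to
         * the platform encoding elsewhere. This sketch passes it unchanged.
         */
    }
    return fopen(path, fmode);
}
```

Callers that build paths purely from server-generated names would pass
PATH_TRUSTED_ASCII and pay nothing, while anything that includes user input
would pass PATH_MAY_NEED_CONVERSION and get the scan, plus the conversion
only when a non-ASCII byte is actually present.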

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
