Re: Encoding issues in console and eventlog on win32

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Encoding issues in console and eventlog on win32
Date: 2009-10-06 12:09:10
Message-ID: 9837222c0910060509k3a360a55q75568223d8eec002@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2009/9/15 Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>:
>
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>
>> Can't we use MultiByteToWideChar() to convert directly to the required
>> encoding, avoiding the double conversion?
>
> Here is an updated version of the patch.
> I use direct conversion in pgwin32_toUTF16() if a corresponding codepage
> is available. If not available, I still use double conversion.
>
> Now pgwin32_toUTF16() is exported from mbutil.c. I used the function
> in following parts, although the main target of the patch is eventlog.
>
>  * WriteConsoleW() - write unredirected stderr log.
>  * ReportEventW()  - write evenlog.
>  * CreateFileW()   - open non-ascii filename (ex. COPY TO/FROM 'mb-path').
>
> This approach is only available for Windows because any other platform
> don't support locale-independent and wide-character-based system calls.
> Other platforms require a different approach, but even then we'd still
> better have win32-specific routines because UTF16 is the native encoding
> in Windows.

I did a quick check of this, and here are the things I would like to
have changed:

First of all, the change to port/open.c seems to be unrelated to the
rest, and should be a separate patch, correct? I'm sure there's a
usecase for it, but it's not actually included in the patches
description, so I assume this was a mistake?

Per your own comments earlier, and in the code, what will happen if
pg_do_encoding_conversion() calls ereport()? Didn't you say we need a
non-throwing version of it?

pgwin32_toUTF16() needs error checking on the API calls, and needs to
do something reasonable if it fails. For example, it can fail because
of out of memory error. I suggest just returning the error code in
some way in that case, and have the callers fall back to logging in
the incorrect encoding - in a lot of cases that will produce an at
least partially readable message. A second message should also be
logged saying that the conversion failed - this needs to be done
directly with the eventlog API functions and not ereport, so we don't
end up in infinite recursion.

The encoding_to_codepage array needs to go in encnames.c, where other
such tables are. Perhaps it can even be integrated in pg_enc2name_tbl
as a separate field?

I don't have the time to clean this up right now, so if you have,
please do so and resubmit. If not, I can clean it up later and apply.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2009-10-06 12:54:58 Re: [PATCH] DefaultACLs
Previous Message Andreas 'ads' Scherbaum 2009-10-06 11:57:06 Patch: create or replace language