Re: Encoding issues in console and eventlog on win32

From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Encoding issues in console and eventlog on win32
Date: 2009-10-07 02:28:08
Message-ID: 20091007101008.9563.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Magnus Hagander <magnus(at)hagander(dot)net> wrote:

> First of all, the change to port/open.c seems to be unrelated to the
> rest, and should be a separate patch, correct? I'm sure there's a
> usecase for it, but it's not actually included in the patches
> description, so I assume this was a mistake?

It was just a demo for pgwin32_toUTF16(). I'll remove this part from
the patch, but I think we also need to fix the encoding mismatch issue
in path strings. I'll re-submit for the next commitfest.

> Per your own comments earlier, and in the code, what will happen if
> pg_do_encoding_conversion() calls ereport()? Didn't you say we need a
> non-throwing version of it?

We are hard to use encoding conversion functions in logging routines
because they could throw errors if there are some unconvertable characters.
Non-throwing version will convert such characters into '?' or escaped form
(something like \888 or \xFF). If there where such infrastructure, we can
support "log_encoding" settings and convert messages in platform-dependent
encoding before writing to syslog or console.

> pgwin32_toUTF16() needs error checking on the API calls, and needs to
> do something reasonable if it fails.

Now it returns NULL and caller writes messages in the original encoding.
Also I added the following error checks before calling pgwin32_toUTF16()
(errordata_stack_depth < ERRORDATA_STACK_SIZE - 1)
to avoid recursive errors, but I'm not sure it is really meaningful.
Please remove or rewrite this part if it is not a right way.

> The encoding_to_codepage array needs to go in encnames.c, where other
> such tables are. Perhaps it can even be integrated in pg_enc2name_tbl
> as a separate field?

I added pg_enc2name.codepage. Note that this field is needed only
on Windows, but now exported for all platforms. If you don't like
the useless field, the following macro could be a help.

#ifdef WIN32
#define def_enc2name(name, codepage) { #name, PG_##name, codepage }
#else
#define def_enc2name(name, codepage) { #name, PG_##name }
#endif
pg_enc2name pg_enc2name_tbl[] =
{
def_enc2name(SQL_ASCII),
def_enc2name(EUC_JP),
...

> I don't have the time to clean this up right now, so if you have,
> please do so and resubmit. If not, I can clean it up later and apply.

Patch attached.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
eventlog_20091007.patch application/octet-stream 13.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Smith 2009-10-07 07:17:20 Re: COPY enhancements
Previous Message Emmanuel Cecchet 2009-10-07 02:07:31 Re: COPY enhancements