Re: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal

From: Reinhard Max <max(at)suse(dot)de>
To: Vsevolod Lobko <seva(at)sevasoft(dot)kiev(dot)ua>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Patch for pl/tcl Tcl_ExternalToUtf and Tcl_UtfToExternal
Date: 2001-09-04 12:08:05
Message-ID: Pine.LNX.4.33.0109041317540.8768-100000@wotan.suse.de
Lists: pgsql-patches

Hi,

Sorry for joining this discussion late.
I've been on vacation for two weeks.

On Thu, 23 Aug 2001, Vsevolod Lobko wrote:

> > > Patch assumes that database encoding and system encoding of Tcl is
> > > equal.
> >
> > Hmm, is that a tenable assumption? I don't know, I'm just asking.
>
> Yes, because it does 8-bit to unicode conversion and must to know
> codepage for 8-bit characters. Unfortunately charset names for tcl
> and postgres does not match, so this demands additional field in
> charset tables or additional table :((

I think you can't assume that a database always has the same encoding
as Tcl's system encoding. For pl/tcl you could set the system encoding
to the database's encoding, but then you'd need that additional name
conversion table anyway, be it a database table or hardcoded. For PgTcl
it is definitely up to the user which system encoding the interpreter
has.
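
To illustrate what such a hardcoded name conversion table could look
like, here is a minimal sketch in C. It is not taken from the patch:
the struct, the table, and the function name pg_to_tcl_encoding are
hypothetical, and the table lists only a few pairs as examples rather
than a complete mapping.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical pair: a PostgreSQL encoding name and the name Tcl uses
 * for the same character set. */
struct enc_name_map {
    const char *pg_name;
    const char *tcl_name;
};

/* Illustrative subset only; a real table would have to cover every
 * encoding the server supports. */
static const struct enc_name_map enc_names[] = {
    {"UNICODE", "utf-8"},
    {"LATIN1",  "iso8859-1"},
    {"LATIN2",  "iso8859-2"},
    {"KOI8",    "koi8-r"},
    {"EUC_JP",  "euc-jp"},
};

/* Case-insensitive comparison of two ASCII names. */
static int names_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}

/* Return the Tcl encoding name for a PostgreSQL encoding name,
 * or NULL if the table has no entry for it. */
const char *pg_to_tcl_encoding(const char *pg_name)
{
    size_t i;

    if (pg_name == NULL)
        return NULL;
    for (i = 0; i < sizeof(enc_names) / sizeof(enc_names[0]); i++) {
        if (names_equal(enc_names[i].pg_name, pg_name))
            return enc_names[i].tcl_name;
    }
    return NULL;
}
```

The returned name could then be handed to Tcl_GetEncoding() to obtain
the encoding used for the conversion, instead of relying on Tcl's
system encoding. A NULL result would have to be treated as "no
conversion possible" by the caller.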

I, for example, create my databases in UNICODE (to get PostgreSQL
working with Tcl 8.3 without patching pl/tcl or PgTcl), but my
Tcl interpreter's system encoding is iso-8859-1.

So basically there are two possibilities:

a) Patch pl/tcl and PgTcl to do the code conversion, but do it right
by using the Database's encoding instead of Tcl's system encoding.

b) Require databases to be in UNICODE if they are to be accessed
from Tcl >= 8.1 so that the strings that come out of the database
are already UTF-8.

For b) it would be nice to have a per-database attribute that
specifies the default client encoding that is used for clients that
don't explicitly set an encoding. I think of something like:

$ createdb --encoding UNICODE --default-client-encoding LATIN1 foo

This database could be used from Tcl without any code conversion, but
would look like it was in LATIN1 for other clients (e.g. psql) if they
don't explicitly set an encoding.
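
The lookup order I have in mind for such an attribute can be sketched
in C as follows. This is only an illustration of the proposal, not
existing PostgreSQL code; the function and parameter names are made up.

```c
#include <stddef.h>

/* Hypothetical resolution of the encoding a client sees:
 *   1. the encoding the client set explicitly, if any;
 *   2. otherwise the proposed per-database default client encoding;
 *   3. otherwise the database's own encoding.
 * NULL or an empty string means "not set". */
const char *effective_client_encoding(const char *client_setting,
                                      const char *db_default_client_encoding,
                                      const char *db_encoding)
{
    if (client_setting != NULL && client_setting[0] != '\0')
        return client_setting;
    if (db_default_client_encoding != NULL
        && db_default_client_encoding[0] != '\0')
        return db_default_client_encoding;
    return db_encoding;
}
```

With the database from the example above, a Tcl client would
explicitly set UNICODE and get UTF-8 strings, while a client that sets
nothing would fall through to LATIN1.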

I'd vote for b), because I think there is a general movement towards
Unicode anyway.

cu
Reinhard
