Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: barry(at)xythos(dot)com, pgsql-hackers(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)
Date: 2001-05-09 02:40:22
Message-ID: 18497.989376022@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-jdbc

Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> writes:
>> Tom also mentioned that it might be possible for the server to support
>> setting the character set for a database even when multibyte wasn't
>> enabled. That would then allow clients like jdbc to get a value from
>> non-multibyte enabled servers that would be more meaningful than the
>> current SQL_ASCII. If this were done, then the 'UNKNOWN' hack would
>> not be necessary.

> Tom's suggestion does not sound reasonable to me. If PostgreSQL is not
> built with MULTIBYTE, then it means there would be no such idea as
> "encoding" in PostgreSQL because there is no program to handle
> encodings. Thus it would be meaningless to assign an "encoding" to a
> database if MULTIBYTE is not enabled.

Why? Without the MULTIBYTE code, the backend cannot perform character
set translations --- but it's perfectly possible that someone might not
need translations. A lot of European sites are probably very happy
as long as the server gives them back the same 8-bit characters they
stored. But what they would like, if they have to deal with tools like
JDBC, is to *identify* what character set they are storing data in, so
that their data will be correctly translated to Unicode or whatever.
The obvious way to do that is to allow them to set the value that
getdatabaseencoding() will return.

Essentially, my point is that identifying the character set is useful
to support outside-the-database character set conversions, whether or
not we have compiled the code for inside-the-database conversions.
Moreover, the stored data certainly has some encoding, whether or not
the database contains code that knows enough to do anything useful about
the encoding. So it's not "meaningless" to be able to store and report
an encoding value.
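
To make the outside-the-database case concrete, here is a minimal Java
sketch of what a client could do once the server reports a meaningful
encoding name. It is an illustration only, not the JDBC driver's actual
code. It assumes the client has already fetched the name that
getdatabaseencoding() returns; the class EncodingSketch, its methods,
and the two-entry name table are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: maps a server-reported encoding name to a Java
// charset so that the client, not the backend, does the conversion.
final class EncodingSketch {

    // Deliberately small, illustrative table. SQL_ASCII is absent on
    // purpose: it says nothing about what the 8-bit bytes mean.
    private static final Map<String, Charset> KNOWN = Map.of(
            "LATIN1", StandardCharsets.ISO_8859_1,
            "UNICODE", StandardCharsets.UTF_8);

    // Returns the Java charset for a reported name, or empty if the
    // name does not identify a character set in the table above.
    static Optional<Charset> charsetFor(String pgEncoding) {
        return Optional.ofNullable(
                KNOWN.get(pgEncoding.toUpperCase(Locale.ROOT)));
    }

    // Decodes raw bytes from the server into a Java (Unicode) string
    // when the encoding is identified; otherwise returns empty rather
    // than guessing.
    static Optional<String> decode(byte[] raw, String pgEncoding) {
        return charsetFor(pgEncoding).map(cs -> new String(raw, cs));
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xE9}; // one 8-bit character as stored
        System.out.println(decode(raw, "LATIN1").isPresent());
        System.out.println(decode(raw, "SQL_ASCII").isPresent());
    }
}
```

Note that nothing here requires the backend to translate anything: the
server only has to store the bytes unchanged and report a name for the
character set they are in.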

I am not sure how much of the MULTIBYTE code would have to be activated
to allow this, but surely it's only a small fraction of the complete
feature.

regards, tom lane
