Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)

From: Barry Lind <barry(at)xythos(dot)com>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)
Date: 2001-05-08 05:16:03
Message-ID: 3AF78113.6080907@xythos.com
Lists: pgsql-hackers pgsql-jdbc

Tatsuo Ishii wrote:

>>>>> Thus I would be happy if getdatabaseencoding() returned 'UNKNOWN' or
>>>>> something similar when in fact it doesn't know what the encoding is
>>>>> (i.e. when not compiled with multibyte).
>>>>
>>> Is that ok for Java? I thought Java needs to know the encoding
>>> beforehand so that it could convert to/from Unicode.
>>
>> That is actually the original issue that started this thread. If you
>> want the full thread see the jdbc mail archive list. A user was
>> complaining that when running on a database without multibyte enabled,
>> that through psql he could insert and retrieve 8bit characters, but in
>> jdbc the 8bit characters were converted to ?'s.
>
>
> Still I don't see what you are wanting in the JDBC driver if
> PostgreSQL would return "UNKNOWN" indicating that the backend is not
> compiled with MULTIBYTE. Do you want exact the same behavior as prior
> 7.1 driver? i.e. reading data from the PostgreSQL backend, assume its
> encoding default to the Java client (that is set by locale or
> something else) and convert it to UTF-8. If so, that would make sense
> to me...

My suggestion is that if the jdbc client can determine that the server
character set is UNKNOWN (i.e. no multibyte), it should use some
appropriate default character set to perform conversions to UCS2
(LATIN1 would probably make the most sense as a default). If the
character set is SQL_ASCII and multibyte is enabled, the jdbc driver
would keep its existing behavior (i.e. only support 7bit characters,
just like the backend does).

Note that the user is always able to override the character set used for
conversion by setting the charSet property.
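
To make the proposed selection order concrete, here is a rough sketch in
Java. It is only an illustration of the suggestion above, not the driver's
actual code: the class and method names (EncodingChoice, chooseCharset) are
made up, the "UNKNOWN" value is the one proposed in this thread and does not
exist in the backend today, and the fallback for other encodings is an
assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the proposal; not part of the real driver.
final class EncodingChoice {

    // serverEncoding: what getdatabaseencoding() would report.
    // charSetOverride: value of the user's charSet property, or null if unset.
    static Charset chooseCharset(String serverEncoding, String charSetOverride) {
        // 1. An explicit charSet property always wins.
        if (charSetOverride != null && !charSetOverride.isEmpty()) {
            return Charset.forName(charSetOverride);
        }
        // 2. Proposed: backend built without multibyte reports UNKNOWN,
        //    so fall back to LATIN1 and keep 8bit characters intact.
        if ("UNKNOWN".equals(serverEncoding)) {
            return StandardCharsets.ISO_8859_1;
        }
        // 3. Multibyte enabled and SQL_ASCII: 7bit only, like the backend.
        if ("SQL_ASCII".equals(serverEncoding)) {
            return StandardCharsets.US_ASCII;
        }
        // 4. Any other encoding: mapping backend names to Java names is
        //    outside this sketch, so use the JVM default (an assumption).
        return Charset.defaultCharset();
    }
}
```

With this ordering, a byte such as 0xE9 read from an UNKNOWN server would be
decoded as LATIN1 and survive the conversion, while a user who knows the
data is in some other encoding can still force it through charSet.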

Tom also mentioned that it might be possible for the server to support
setting the character set for a database even when multibyte isn't
enabled. That would allow clients like jdbc to get a value from
non-multibyte enabled servers that is more meaningful than the
current SQL_ASCII. If this were done, the 'UNKNOWN' hack would
not be necessary.

thanks,
--Barry

