Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: barry(at)xythos(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)
Date: 2001-05-09 01:23:05
Message-ID: 20010509102305C.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-jdbc

> > Still I don't see what you are wanting in the JDBC driver if
> > PostgreSQL would return "UNKNOWN" indicating that the backend is not
> > compiled with MULTIBYTE. Do you want exactly the same behavior as the
> > pre-7.1 driver? i.e. reading data from the PostgreSQL backend, assuming
> > its encoding defaults to that of the Java client (set by locale or
> > something else), and converting it to UTF-8. If so, that would make sense
> > to me...
>
> My suggestion would be that if the jdbc client was able to determine if
> the server character set was UNKNOWN (i.e. no multibyte) that it would
> then use some appropriate default character set to perform conversions
> to UCS2 (LATIN1 would probably make the most sense as a default). The
> jdbc driver would perform its existing behavior if the character set was
> SQL_ASCII and multibyte was enabled (i.e. only support 7bit characters
> just like the backend does).
>
> Note that the user is always able to override the character set used for
> conversion by setting the charSet property.

I see. However, I would say we cannot change the current behavior
of the backend until 7.2 is out. It is our policy that we do not
add or change existing functionality while we are in a minor release
cycle.

What about doing it like this:

1. call pg_encoding_to_char(1) (actually any number except 0 is ok)

2. if it returns "SQL_ASCII", then you can assume that MULTIBYTE is
not enabled.

This is pretty ugly, but should work.
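
To make the two steps concrete, here is a rough Java sketch of how the
driver side might look. It is only an illustration under assumptions:
the class EncodingProbe and its methods are hypothetical names, not
existing driver code; the LATIN1 fallback is Barry's proposal from the
quoted text, not something agreed on; and the mapping from backend
encoding names to Java charset names is left to whatever the driver
already does.

```java
// Hypothetical helper, not part of the actual JDBC driver.
final class EncodingProbe {

    // Probe query from step 1; any argument other than 0 is said to work.
    static final String PROBE_SQL = "SELECT pg_encoding_to_char(1)";

    // Step 2: a result of "SQL_ASCII" is taken to mean the backend was
    // built without MULTIBYTE. A null result is treated the same way
    // (an assumption made here for safety).
    static boolean multibyteEnabled(String probeResult) {
        return probeResult != null && !"SQL_ASCII".equals(probeResult.trim());
    }

    // Chooses the name of the character set to use for conversion.
    //   charSetProperty:  user override (the charSet property), may be null
    //   probeResult:      string returned by PROBE_SQL
    //   databaseEncoding: encoding name the backend reports for the database
    static String chooseCharset(String charSetProperty,
                                String probeResult,
                                String databaseEncoding) {
        // The user can always override the conversion character set.
        if (charSetProperty != null && !charSetProperty.isEmpty()) {
            return charSetProperty;
        }
        // No MULTIBYTE: fall back to LATIN1, as proposed in the thread.
        if (!multibyteEnabled(probeResult)) {
            return "ISO-8859-1";
        }
        // MULTIBYTE enabled: keep the reported encoding and let the
        // driver's existing logic map it to a Java charset.
        return databaseEncoding;
    }
}
```

In the driver, the probe string would come from running PROBE_SQL
through java.sql.Statement.executeQuery() and reading column 1 with
ResultSet.getString(1). The decision logic is kept separate from the
query so it can be checked without a running backend.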

> Tom also mentioned that it might be possible for the server to support
> setting the character set for a database even when multibyte wasn't
> enabled. That would then allow clients like jdbc to get a value from
> non-multibyte enabled servers that would be more meaningful than the
> current SQL_ASCII. If this were done, then the 'UNKNOWN' hack would
> not be necessary.

Tom's suggestion does not sound reasonable to me. If PostgreSQL is not
built with MULTIBYTE, then there is no such notion as an "encoding" in
PostgreSQL, because there is no code to handle encodings. Thus it would
be meaningless to assign an "encoding" to a database if MULTIBYTE is
not enabled.
--
Tatsuo Ishii
