Re: [Fwd: Patch for MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)]]

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Barry Lind <barry(at)xythos(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [Fwd: Patch for MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)]]
Date: 2001-06-01 20:57:45
Message-ID: 200106012057.f51Kvjv01558@candle.pha.pa.us
Lists: pgsql-patches


Patch applied. Thanks.

> The following patch for JDBC fixes an issue with jdbc running on a
> non-multibyte database losing 8-bit characters. This patch will cause
> the jdbc driver to ignore the encoding reported by the database when
> multibyte isn't enabled and use the JVM default in that case.
>
> thanks,
> --Barry
>
>
> -------- Original Message --------
> Subject: Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug
> with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)
> Date: Fri, 25 May 2001 17:12:09 -0700
> From: Barry Lind
> To: Tatsuo Ishii , tgl(at)sss(dot)pgh(dot)pa(dot)us
> References: <3AF74768(dot)8060807(at)xythos(dot)com>
> <20010508110249R(dot)t-ishii(at)sra(dot)co(dot)jp> <3AF78113(dot)6080907(at)xythos(dot)com>
> <20010509102305C(dot)t-ishii(at)sra(dot)co(dot)jp>
>
>
>
> Tatsuo, Tom,
>
> Since the two of you were the only two that seemed to care about this
> thread, I am addressing you directly. I want to come to some sort of
> resolution. Since it doesn't appear that anything is going to be
> changed in the backend code in 7.2 to address the issue here, I will
> submit the attached patch to the jdbc code.
>
> This patch uses the function pg_encoding_to_char(1) to determine that
> multibyte is not enabled on the server (as suggested by Tatsuo), and in
> that case will use the default JVM character set to convert data from
> the backend. This is instead of the current behaviour that will force
> all data to 7bit ascii in the non-multibyte case because
> getdatabaseencoding() always returns SQL_ASCII for non-multibyte databases.
>
> If I don't hear anything, I will go ahead and submit this patch.
>
> thanks for your help on this issue.
>
> --Barry
>
>
> Tatsuo Ishii wrote:
>
> >>> Still I don't see what you are wanting in the JDBC driver if
> >>> PostgreSQL would return "UNKNOWN" indicating that the backend is not
> >>> compiled with MULTIBYTE. Do you want exactly the same behavior as the
> >>> pre-7.1 driver? i.e. reading data from the PostgreSQL backend, assume
> >>> its encoding is the default of the Java client (that is set by locale
> >>> or something else) and convert it to UTF-8. If so, that would make
> >>> sense to me...
> >>
> >> My suggestion would be that if the jdbc client was able to determine if
> >> the server character set was UNKNOWN (i.e. no multibyte) that it would
> >> then use some appropriate default character set to perform conversions
> >> to UCS2 (LATIN1 would probably make the most sense as a default). The
> >> jdbc driver would perform its existing behavior if the character set was
> >> SQL_ASCII and multibyte was enabled (i.e. only support 7bit characters
> >> just like the backend does).
> >>
> >> Note that the user is always able to override the character set used for
> >> conversion by setting the charSet property.
> >
> >
> > I see. However I would say we could not change the current behavior
> > of the backend until 7.2 is out. It is our policy that we do not
> > add or change existing functionality while we are in the minor release
> > cycle.
> >
> > What about doing like this:
> >
> > 1. call pg_encoding_to_char(1) (actually any number except 0 is ok)
> >
> > 2. if it returns "SQL_ASCII", then you could assume that MULTIBYTE is
> > not enabled.
> >
> > This is pretty ugly, but should work.
> >
> >> Tom also mentioned that it might be possible for the server to support
> >> setting the character set for a database even when multibyte wasn't
> >> enabled. That would then allow clients like jdbc to get a value from
> >> non-multibyte enabled servers that would be more meaningful than the
> >> current SQL_ASCII. If this were done, then the 'UNKNOWN' hack would
> >> not be necessary.
> >
> >
> > Tom's suggestion does not sound reasonable to me. If PostgreSQL is not
> > built with MULTIBYTE, then it means there would be no such idea as
> > "encoding" in PostgreSQL because there is no program to handle
> > encodings. Thus it would be meaningless to assign an "encoding" to a
> > database if MULTIBYTE is not enabled.
> > --
> > Tatsuo Ishii
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> > (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
> >
> >
>
>
>

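The quoted message above says that forcing data to 7-bit ASCII drops 8-bit characters. The following is a minimal sketch of that effect, not code from the driver: the class and method names are hypothetical, and it uses java.nio.charset.StandardCharsets, which is newer than the JDKs this thread targeted. It decodes the same raw byte once as US-ASCII and once as LATIN1 (ISO-8859-1), the default suggested in the thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of org.postgresql.
public class EightBitLossDemo {

    // Decode raw backend bytes as 7-bit ASCII. Bytes above 0x7F are not
    // valid US-ASCII, so the String constructor substitutes U+FFFD.
    static String decodeAs7BitAscii(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    // Decode the same bytes as ISO-8859-1, where every byte value maps
    // directly to the code point with the same number.
    static String decodeAsLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0xE9 is LATIN SMALL LETTER E WITH ACUTE in ISO-8859-1.
        byte[] raw = { 'c', 'a', 'f', (byte) 0xE9 };

        String ascii = decodeAs7BitAscii(raw);
        String latin1 = decodeAsLatin1(raw);

        // Print code points in hex so the result does not depend on the
        // console encoding.
        System.out.println("US-ASCII last char:   U+"
                + Integer.toHexString(ascii.charAt(3)).toUpperCase());
        System.out.println("ISO-8859-1 last char: U+"
                + Integer.toHexString(latin1.charAt(3)).toUpperCase());
    }
}
```

The first three bytes decode identically either way; only the byte above 0x7F differs, which is why the problem shows up only with non-ASCII data.
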
> *** ./org/postgresql/Connection.java.orig Fri May 25 16:23:02 2001
> --- ./org/postgresql/Connection.java Fri May 25 16:26:55 2001
> ***************
> *** 267,273 ****
> //
> firstWarning = null;
>
> ! java.sql.ResultSet initrset = ExecSQL("set datestyle to 'ISO'; select getdatabaseencoding()");
>
> String dbEncoding = null;
> //retrieve DB properties
> --- 267,274 ----
> //
> firstWarning = null;
>
> ! java.sql.ResultSet initrset = ExecSQL("set datestyle to 'ISO'; " +
> ! "select case when pg_encoding_to_char(1) = 'SQL_ASCII' then 'UNKNOWN' else getdatabaseencoding() end");
>
> String dbEncoding = null;
> //retrieve DB properties
> ***************
> *** 319,324 ****
> --- 320,330 ----
>
> } else if (dbEncoding.equals("WIN")) {
> dbEncoding = "Cp1252";
> + } else if (dbEncoding.equals("UNKNOWN")) {
> + //This isn't a multibyte database so we don't have an encoding to use
> + //We leave dbEncoding null which will cause the default encoding for the
> + //JVM to be used
> + dbEncoding = null;
> } else {
> dbEncoding = null;
> }
>
>

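The second hunk of the patch leaves dbEncoding null when the server reports 'UNKNOWN', so that the JVM default encoding is used. A rough standalone sketch of that fallback follows. The class and both helper methods are hypothetical, only the "WIN" and "UNKNOWN" branches visible in the diff are reproduced, and the real Connection.java handles more encoding names than shown here.

```java
import java.nio.charset.Charset;

// Hypothetical sketch; the real logic lives inside org.postgresql.Connection.
public class EncodingFallbackSketch {

    // Map the name returned by the patched startup query to a Java
    // encoding name, or null when no specific encoding should be used.
    static String toJavaEncoding(String reported) {
        if (reported == null) {
            return null;
        } else if (reported.equals("WIN")) {
            return "Cp1252";
        } else if (reported.equals("UNKNOWN")) {
            // Not a multibyte build: no encoding to use, so fall back
            // to the JVM default.
            return null;
        } else {
            // Other names are handled by branches outside the diff;
            // this sketch treats them as unrecognized.
            return null;
        }
    }

    // Resolve a possibly-null encoding name to a Charset, using the JVM
    // default when the name is null.
    static Charset resolve(String javaEncoding) {
        return javaEncoding == null
                ? Charset.defaultCharset()
                : Charset.forName(javaEncoding);
    }

    public static void main(String[] args) {
        String mapped = toJavaEncoding("UNKNOWN");
        System.out.println("mapped name: " + mapped);
        System.out.println("charset used: " + resolve(mapped));
    }
}
```

Because "UNKNOWN" and an unrecognized name both end up null, the new branch in the patch mainly documents intent. The behavioral change comes from the modified query, which stops a non-multibyte server from being reported as SQL_ASCII.
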

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
