Re: [Fwd: Patch for MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)]]

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Barry Lind <barry(at)xythos(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [Fwd: Patch for MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)]]
Date: 2001-05-31 23:48:29
Message-ID: 200105312348.f4VNmTa10061@candle.pha.pa.us
Lists: pgsql-patches
Your patch has been added to the PostgreSQL unapplied patches list at:

	http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

> The following patch for JDBC fixes an issue with jdbc running on a 
> non-multibyte database losing 8-bit characters.  This patch will cause 
> the jdbc driver to ignore the encoding reported by the database when 
> multibyte isn't enabled and use the JVM default in that case.
> 
> thanks,
> --Barry
> 
> 
> -------- Original Message --------
> Subject: Re: [HACKERS] MULTIBYTE and SQL_ASCII (was Re: [JDBC] Re: A bug 
> with pgsql 7.1/jdbc and non-ascii (8-bit) chars?)
> Date: Fri, 25 May 2001 17:12:09 -0700
> From: Barry Lind 
> To: Tatsuo Ishii , tgl(at)sss(dot)pgh(dot)pa(dot)us
> References: <3AF74768(dot)8060807(at)xythos(dot)com> 
> <20010508110249R(dot)t-ishii(at)sra(dot)co(dot)jp> <3AF78113(dot)6080907(at)xythos(dot)com> 
> <20010509102305C(dot)t-ishii(at)sra(dot)co(dot)jp>
> 
> 
> 
> Tatsuo, Tom,
> 
> Since the two of you were the only two that seemed to care about this 
> thread, I am addressing you directly.  I want to come to some sort of 
> resolution.  Since it doesn't appear that anything is going to be 
> changed in the backend code in 7.2 to address the issue here, I will 
> submit the attached patch to the jdbc code.
> 
> This patch uses the function pg_encoding_to_char(1) to determine whether 
> multibyte is enabled on the server (as suggested by Tatsuo). If it is not, 
> the driver will use the default JVM character set to convert data from 
> the backend, instead of the current behaviour, which forces all data to 
> 7-bit ASCII in the non-multibyte case because getdatabaseencoding() 
> always returns SQL_ASCII for non-multibyte databases.
> 
> If I don't hear anything, I will go ahead and submit this patch.
> 
> thanks for your help on this issue.
> 
> --Barry
> 
> 
> Tatsuo Ishii wrote:
> 
> >>> Still I don't see what you are wanting in the JDBC driver if
> >>> PostgreSQL would return "UNKNOWN" indicating that the backend is not
> >>> compiled with MULTIBYTE. Do you want exactly the same behavior as the
> >>> pre-7.1 driver? i.e. read data from the PostgreSQL backend, assume its
> >>> encoding is the Java client's default (set by locale or something
> >>> else), and convert it to UTF-8. If so, that would make sense
> >>> to me...
> >> 
> >> My suggestion would be that if the jdbc client was able to determine if 
> >> the server character set was UNKNOWN (i.e. no multibyte) that it would 
> >> then use some appropriate default character set to perform conversions 
> >> to UCS2 (LATIN1 would probably make the most sense as a default).  The 
> >> jdbc driver would perform its existing behavior if the character set was 
> >> SQL_ASCII and multibyte was enabled (i.e. only support 7bit characters 
> >> just like the backend does).
> >> 
> >> Note that the user is always able to override the character set used for 
> >> conversion by setting the charSet property.
> > 
> > 
> > I see.  However I would say we could not change the current behavior
> > of the backend until 7.2 is out. It is our policy that we do not
> > add or change existing functionality while we are in the minor release
> > cycle.
> > 
> > What about doing it like this:
> > 
> > 1. call pg_encoding_to_char(1)	(actually any number except 0 is ok)
> > 
> > 2. if it returns "SQL_ASCII", then you could assume that MULTIBYTE is
> > not enabled.
> > 
> > This is pretty ugly, but should work.
> > 
> >> Tom also mentioned that it might be possible for the server to support 
> >> setting the character set for a database even when multibyte wasn't 
> >> enabled.  That would then allow clients like jdbc to get a value from 
> >> non-multibyte enabled servers that would be more meaningful than the 
> >> current SQL_ASCII.  If this were done, then the 'UNKNOWN' hack would 
> >> not be necessary.
> > 
> > 
> > Tom's suggestion does not sound reasonable to me. If PostgreSQL is not
> > built with MULTIBYTE, then it means there would be no such idea as
> > "encoding" in PostgreSQL because there is no program to handle
> > encodings. Thus it would be meaningless to assign an "encoding" to a
> > database if MULTIBYTE is not enabled.
> > --
> > Tatsuo Ishii
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
> > 
> > 
> 
> 
> 

> *** ./org/postgresql/Connection.java.orig	Fri May 25 16:23:02 2001
> --- ./org/postgresql/Connection.java	Fri May 25 16:26:55 2001
> ***************
> *** 267,273 ****
>         //
>         firstWarning = null;
>   
> !       java.sql.ResultSet initrset = ExecSQL("set datestyle to 'ISO'; select getdatabaseencoding()");
>   
>         String dbEncoding = null;
>         //retrieve DB properties
> --- 267,274 ----
>         //
>         firstWarning = null;
>   
> !       java.sql.ResultSet initrset = ExecSQL("set datestyle to 'ISO'; " +
> !         "select case when pg_encoding_to_char(1) = 'SQL_ASCII' then 'UNKNOWN' else getdatabaseencoding() end");
>   
>         String dbEncoding = null;
>         //retrieve DB properties
> ***************
> *** 319,324 ****
> --- 320,330 ----
>   
>           } else if (dbEncoding.equals("WIN")) {
>             dbEncoding = "Cp1252";
> +         } else if (dbEncoding.equals("UNKNOWN")) {
> +           //This isn't a multibyte database so we don't have an encoding to use
> +           //We leave dbEncoding null which will cause the default encoding for the
> +           //JVM to be used
> +           dbEncoding = null;
>           } else {
>             dbEncoding = null;
>           }
> 
> 
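
For readers following the patch, here is a minimal Java sketch of the fallback it describes. It is not the driver's code: the class and method names (EncodingSketch, resolveJavaEncoding, decode) are hypothetical. Only the WIN -> Cp1252 mapping and the UNKNOWN -> null branch are taken from the diff above; the driver's other mappings are not shown in the diff and are left out here. The sketch also illustrates why decoding as 7-bit ASCII loses 8-bit characters, assuming the data is a single LATIN1 byte.

```java
import java.nio.charset.Charset;

final class EncodingSketch {

    // Hypothetical helper mirroring the branches visible in the diff.
    // Returns a Java charset name, or null meaning "use the JVM default".
    static String resolveJavaEncoding(String dbEncoding) {
        if (dbEncoding == null) {
            return null;
        }
        switch (dbEncoding) {
            case "WIN":
                return "Cp1252";   // mapping shown in the diff context
            case "UNKNOWN":
                return null;       // server built without MULTIBYTE: no usable encoding
            default:
                return null;       // other mappings omitted in this sketch
        }
    }

    // Hypothetical helper: decode backend bytes with the resolved charset,
    // falling back to the JVM default when no charset name is given.
    static String decode(byte[] data, String javaEncoding) {
        Charset cs = (javaEncoding == null)
                ? Charset.defaultCharset()
                : Charset.forName(javaEncoding);
        return new String(data, cs);
    }

    public static void main(String[] args) {
        byte[] eAcute = {(byte) 0xE9};  // e-acute in LATIN1 (assumed sample data)

        // Decoding as 7-bit ASCII cannot represent the byte and substitutes U+FFFD.
        String asAscii = decode(eAcute, "US-ASCII");
        // Decoding as ISO-8859-1 preserves the character.
        String asLatin1 = decode(eAcute, "ISO-8859-1");

        System.out.println("US-ASCII keeps the character: " + asAscii.equals("\u00E9"));
        System.out.println("ISO-8859-1 keeps the character: " + asLatin1.equals("\u00E9"));
        System.out.println("UNKNOWN resolves to: " + resolveJavaEncoding("UNKNOWN"));
    }
}
```

With the JVM default in use, the result depends on the client's locale settings, which is why the thread also points out that the user can override the conversion with the charSet property.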


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman(at)candle(dot)pha(dot)pa(dot)us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
