Re: Problems with charsets, investigated...

From: Alexandre Aufrere <alexandre(dot)aufrere(at)inet6(dot)fr>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Problems with charsets, investigated...
Date: 2004-08-06 18:32:08
Message-ID: 20040806183208.75F47400E5@smtp.ies.inet6.fr
Lists: pgsql-jdbc
Well, no, actually I want to use LATIN1/ISO-8859-1 everywhere.
So my appserver should get ISO-8859-1 strings from the driver, and not
UTF-8.
Why? Because we have a whole bunch of apps developed in ISO-8859-1, as
well as a lot of data in LATIN1, and it's out of the question to convert
everything to UTF-8/UNICODE.

For me, the driver should get strings encoded according to the system
properties of the JVM it runs in. Or at least there should be a way to
tell the driver what charset to use. In other words, the current behaviour
is precisely NOT transparent to me, because I end up with a database in
LATIN1 whose data is converted to UTF-8 before I retrieve it from the
JDBC driver, which 1) would give me more work to convert back to
ISO-8859-1, and 2) would not be backward compatible (meaning we would have
to test a LOT of apps again to check we're breaking nothing).
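
For reference, here is a minimal sketch of what "converting back to
ISO-8859-1" amounts to on the Java side. It assumes the driver hands back
ordinary java.lang.String values; a Java String is UTF-16 internally, so a
byte encoding only comes into play when text is turned into bytes. The
class and method names are made up for illustration, and StandardCharsets
needs Java 7 or later (older JVMs would use getBytes("ISO-8859-1")).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of any driver.
final class Latin1RoundTrip {

    /** Encodes a Java string as ISO-8859-1 bytes (one byte per character). */
    static byte[] toLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Decodes ISO-8859-1 bytes back into a Java string. */
    static String fromLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

With this, toLatin1("caf\u00e9") yields 4 bytes, whereas the same string
encoded as UTF-8 takes 5. Characters that ISO-8859-1 cannot represent are
replaced with '?' by getBytes.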

So my hack just reads the file.encoding Java system property, requests
data from the PostgreSQL server in the matching encoding, and handles it
accordingly (namely, if file.encoding is ISO-8859-1, it requests LATIN1 and
handles everything it gets as ISO-8859-1).
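
The mapping step of such a hack could look roughly like the sketch below.
This is not the actual patch, only an illustration: the class and method
names are hypothetical, only the ISO-8859-1 case described above is
mapped, and everything else falls back to UNICODE (the encoding that
newer servers call UTF8).

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; not part of the PostgreSQL JDBC driver.
final class ClientEncodingGuess {

    /** Maps a Java charset name to a PostgreSQL encoding name; falls back to UNICODE. */
    static String pgEncodingFor(String javaEncoding) {
        String canonical;
        try {
            // Resolves aliases to the canonical Java name, e.g. "ISO-8859-1".
            canonical = Charset.forName(javaEncoding).name();
        } catch (IllegalArgumentException e) {
            // Null, malformed, or unsupported charset name.
            return "UNICODE";
        }
        return canonical.equals("ISO-8859-1") ? "LATIN1" : "UNICODE";
    }

    /** Same mapping, driven by the JVM's file.encoding system property. */
    static String pgEncodingForJvm() {
        return pgEncodingFor(System.getProperty("file.encoding"));
    }
}
```
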
Now, IMHO, ideally, the default behaviour of the JDBC driver should be to
get the encoding from the pg_database table and deduce from it what
encoding to use for the strings. And of course, there should be an easy
way to change that for people who want it another way.
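
A rough sketch of that lookup, from the application side rather than
inside the driver: it asks the server for the database encoding by name
using the pg_encoding_to_char() and current_database() SQL functions, then
maps the name to a Java Charset. The class and method names are
hypothetical, only LATIN1 is mapped explicitly, and UTF-8 is an assumed
default for everything else.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

// Hypothetical helper for illustration; not part of the PostgreSQL JDBC driver.
final class DatabaseEncodingLookup {

    /** Returns the encoding name of the current database, e.g. "LATIN1", or null if not found. */
    static String databaseEncodingName(Connection conn) throws SQLException {
        String sql = "SELECT pg_encoding_to_char(encoding) FROM pg_database "
                   + "WHERE datname = current_database()";
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            return rs.next() ? rs.getString(1) : null;
        }
    }

    /** Maps a PostgreSQL encoding name to a Java Charset; UTF-8 is the assumed default. */
    static Charset charsetForPgEncoding(String pgEncoding) {
        if (pgEncoding == null) {
            return StandardCharsets.UTF_8;
        }
        switch (pgEncoding.toUpperCase(Locale.ROOT)) {
            case "LATIN1":
                return StandardCharsets.ISO_8859_1;
            default:
                // UNICODE / UTF8 and any encoding this sketch does not map.
                return StandardCharsets.UTF_8;
        }
    }
}
```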

I don't know exactly how it worked in previous versions; the fact is
that with the LANG environment variables set everywhere to en_US.ISO-8859-1
and the encoding in pg_database set to 8 (LATIN1), it just worked (we have
been using PostgreSQL+Java+Enhydra for a long, long time). Any change
that would force us to handle charsets explicitly might be
"ideally" right, but it is not backward compatible and will cause a lot of
problems for us (and I'm quite sure not only for us).

Lastly, it's quite possible that I missed something somewhere, so I
apologize in advance for being utterly dumb ;-)

Regards,

Alexandre Aufrere

----------------------------------------------------
From: Kris Jurka <books(at)ejurka(dot)com>
To: Alexandre Aufrere <alexandre(dot)aufrere(at)inet6(dot)fr>
Subject: Re: [JDBC] Problems with charsets, investigated...
Date: Fri, 6 Aug 2004 11:05:54 -0500 (EST)
>
>
> On Fri, 6 Aug 2004, Alexandre Aufrere wrote:
>
> > Java correctly sets its file.encoding property to the charset specified
> > in the LANG environment variable. However, it appears that whatever I
> > set this variable to, the JDBC driver seems to use UTF-8.
> >
>
> I'm not sure what problem or issue you think this is addressing, but it is
> not something we want to do.  The driver communicates with the server
> using UTF-8, so you should not be adjusting this and it is entirely
> transparent to the user.  What you do after retrieving data is your
> business and you are welcome to save it or display it in any encoding you
> desire, but the driver wants to communicate with the server using UTF-8.
>
> Kris Jurka

