From: | "Dario V(dot) Fassi" <software(at)sistemat(dot)com(dot)ar> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Oliver Jowett <oliver(at)opencloud(dot)com>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org> |
Subject: | Re: Very strange Error in Updates |
Date: | 2004-07-15 16:39:58 |
Message-ID: | 40F6B35E.4010600@sistemat.com.ar |
Lists: | pgsql-hackers pgsql-jdbc |
My problem is that the data is already inside the PostgreSQL server (with
SQL_ASCII encoding), inserted by Win32/ODBC clients.
Now from JDBC I can't handle any row with a field that contains one or more
8-bit characters.
At the same time, Win32/ODBC programs continue to use the data without any problem.
This leaves me in a position that is hard to explain.
One more question: using PreparedStatement.setBytes(), can I get the same
treatment that ODBC gives those fields?
Thanks to all for your help.
Dario.
Tom Lane wrote:
>Oliver Jowett <oliver(at)opencloud(dot)com> writes:
>
>
>>The JDBC driver always speaks UNICODE when it can, since that matches
>>Java's internal string representation. I suspect that what's happening is:
>>
>>
>>0) the driver sets client_encoding = UNICODE during connection setup
>>
>>
>Right.
>
>
>>1) the driver encodes the parameter as UNICODE (== UTF8); for characters
>>above 127 this encoding will result in more than one byte per character.
>>
>>
>
>Right.
>
>
>>2) the server converts from client_encoding UNICODE to database encoding
>>SQL_ASCII; for characters that are invalid in SQL_ASCII (>127) it does
>>some arbitrary conversion, probably just copying the illegal values
>>unchanged.
>>
>>
>
>Not really. SQL_ASCII encoding basically means "we don't know what this
>data is, just store it verbatim". So the UTF-8 string sent by the
>driver is stored verbatim.
>
>
>>3) you end up with extra characters in the resulting value which exceeds
>>the varchar's size.
>>
>>
>
>Right. Since the server does not know what encoding is in use, it falls
>back to the assumption that 1 character == 1 byte, under which
>assumption the string violates the varchar(30) constraint.
>
>Had the server known which encoding was in use, it would have counted
>the characters correctly.
>
>
>>The solution is to use a database encoding that matches your data.
>>
>>
>
>Actually, if you intend to access the database primarily through JDBC,
>it'd be best to use server encoding UNICODE. The JDBC driver will
>always want UNICODE on the wire, and I see no reason to force extra
>character set conversions. Non-UNICODE-aware clients can be handled by
>setting client_encoding properly.
>
> regards, tom lane
>
>
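The length mismatch described in the quoted reply can be reproduced without a database. The following is a minimal Java sketch, not the driver's actual code: the class name `EncodingLengthDemo`, its helper methods, and the sample value are hypothetical, and it assumes the ODBC clients stored single-byte (Latin-1 style) data. It compares the number of characters in a string with the number of bytes it occupies in UTF-8, which is what a server counting 1 character == 1 byte would see.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of the JDBC driver.
public class EncodingLengthDemo {

    // Number of characters (Unicode code points) in the string.
    static int charCount(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes when the string is encoded as UTF-8,
    // the encoding the driver uses on the wire according to the thread.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes in ISO-8859-1 (assumed here as a stand-in for
    // whatever single-byte encoding the ODBC clients wrote).
    static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // True if the UTF-8 form fits a limit that is enforced in bytes,
    // as happens when the server assumes 1 character == 1 byte.
    static boolean fitsWhenCountingBytes(String s, int limit) {
        return utf8Bytes(s) <= limit;
    }

    public static void main(String[] args) {
        // Sample value with three characters above 127 (written as escapes
        // so the source file encoding does not matter).
        String value = "Jos\u00e9 Mar\u00eda Nu\u00f1ez";

        System.out.println("characters:   " + charCount(value));
        System.out.println("UTF-8 bytes:  " + utf8Bytes(value));
        System.out.println("Latin-1 bytes: " + latin1Bytes(value));
        System.out.println("fits in 16 when counting bytes: "
                + fitsWhenCountingBytes(value, 16));
    }
}
```

For this sample the string has 16 characters and 16 Latin-1 bytes but 19 UTF-8 bytes, so it would fit a 16-character column by character count yet fail a check that counts bytes. The same effect applies to the varchar(30) column discussed above once a value near the limit contains 8-bit characters.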