Re: Very strange Error in Updates

From: "Dario V(dot) Fassi" <software(at)sistemat(dot)com(dot)ar>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oliver Jowett <oliver(at)opencloud(dot)com>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Very strange Error in Updates
Date: 2004-07-15 16:39:58
Message-ID: 40F6B35E.4010600@sistemat.com.ar
Lists: pgsql-hackers pgsql-jdbc


My problem is that the data is already inside the PostgreSQL server (with
SQL_ASCII encoding), inserted by Win32/ODBC clients.

Now from JDBC I can't handle any row that has a field containing one or
more 8-bit characters.
At the same time, Win32/ODBC programs continue to use that data without
any problem.
This leaves me in a position that is hard to explain.
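
A minimal sketch of why such rows may fail under JDBC, assuming the ODBC clients stored the text in a single-byte encoding such as ISO-8859-1 (the thread does not say which encoding they used). The class and method names are hypothetical. A driver that expects UTF-8 on the wire cannot decode a lone 8-bit byte, while a client that treats the bytes as single-byte text never notices.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDBC driver.
class StoredBytesDemo {

    // Returns true only if the bytes are a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] raw) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "Espa\u00f1a";

        // Assumption: what a single-byte client would have stored verbatim.
        byte[] singleByte = text.getBytes(StandardCharsets.ISO_8859_1);
        // What a UTF-8 speaking client would have stored.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println("single-byte data valid as UTF-8: " + isValidUtf8(singleByte));
        System.out.println("UTF-8 data valid as UTF-8:       " + isValidUtf8(utf8));
    }
}
```

Under that assumption, pure 7-bit rows pass in both encodings, which would match the observation that only fields with 8-bit characters cause trouble.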

One more question: by using PreparedStatement.setBytes(), can I get the
same treatment that ODBC gives those fields?
Thanks to all for your help.
Dario.

Tom Lane wrote:

>Oliver Jowett <oliver(at)opencloud(dot)com> writes:
>
>
>>The JDBC driver always speaks UNICODE when it can, since that matches
>>Java's internal string representation. I suspect that what's happening is:
>>
>>
>>0) the driver sets client_encoding = UNICODE during connection setup
>>
>>
>Right.
>
>
>>1) the driver encodes the parameter as UNICODE (== UTF8); for characters
>>above 127 this encoding will result in more than one byte per character.
>>
>>
>
>Right.
>
>
>>2) the server converts from client_encoding UNICODE to database encoding
>>SQL_ASCII; for characters that are invalid in SQL_ASCII (>127) it does
>>some arbitrary conversion, probably just copying the illegal values
>>unchanged.
>>
>>
>
>Not really. SQL_ASCII encoding basically means "we don't know what this
>data is, just store it verbatim". So the UTF-8 string sent by the
>driver is stored verbatim.
>
>
>>3) you end up with extra characters in the resulting value which exceeds
>>the varchar's size.
>>
>>
>
>Right. Since the server does not know what encoding is in use, it falls
>back to the assumption that 1 character == 1 byte, under which
>assumption the string violates the varchar(30) constraint.
>
>Had the server known which encoding was in use, it would have counted
>the characters correctly.
>
>
>>The solution is to use a database encoding that matches your data.
>>
>>
>
>Actually, if you intend to access the database primarily through JDBC,
>it'd be best to use server encoding UNICODE. The JDBC driver will
>always want UNICODE on the wire, and I see no reason to force extra
>character set conversions. Non-UNICODE-aware clients can be handled by
>setting client_encoding properly.
>
> regards, tom lane
>
>
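
The length problem described in the quoted text can be sketched in a few lines of Java. This is only an illustration, not server code: the class and method names are hypothetical, and the byte-counting check merely imitates the "1 character == 1 byte" assumption Tom describes for SQL_ASCII. The limit of 30 comes from the varchar(30) mentioned above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating byte count versus character count.
class VarcharLengthDemo {

    static final int LIMIT = 30; // varchar(30), as in the thread

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Imitates a server that does not know the encoding: every byte counts as one character.
    static boolean fitsCountingBytes(String s, int limit) {
        return utf8ByteLength(s) <= limit;
    }

    // Imitates a server that knows the encoding: counts real characters (code points).
    static boolean fitsCountingCharacters(String s, int limit) {
        return s.codePointCount(0, s.length()) <= limit;
    }

    public static void main(String[] args) {
        // Build a 30-character value: 28 ASCII letters plus two 8-bit characters.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 28; i++) {
            sb.append('a');
        }
        sb.append('\u00f1').append('\u00f1');
        String value = sb.toString();

        System.out.println("characters:            " + value.codePointCount(0, value.length()));
        System.out.println("UTF-8 bytes:           " + utf8ByteLength(value));
        System.out.println("fits, counting bytes:  " + fitsCountingBytes(value, LIMIT));
        System.out.println("fits, counting chars:  " + fitsCountingCharacters(value, LIMIT));
    }
}
```

Each character above 127 takes two or more bytes in UTF-8, so a value that holds exactly 30 characters can exceed 30 bytes and be rejected when bytes are counted.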
