Re: Very strange Error in Updates

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Dario V(dot) Fassi" <software(at)sistemat(dot)com(dot)ar>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Very strange Error in Updates
Date: 2004-07-15 23:34:45
Message-ID: 40F71495.20402@opencloud.com
Lists: pgsql-hackers pgsql-jdbc

Tom Lane wrote:

>>2) the server converts from client_encoding UNICODE to database encoding
>>SQL_ASCII; for characters that are invalid in SQL_ASCII (>127) it does
>>some arbitrary conversion, probably just copying the illegal values
>>unchanged.
>
>
> Not really. SQL_ASCII encoding basically means "we don't know what this
> data is, just store it verbatim". So the UTF-8 string sent by the
> driver is stored verbatim.

Hmm, so SQL_ASCII is not really a first-class encoding -- it doesn't do
encoding conversions at all? It's going to break horribly in the face of
clients using different client_encoding values, and somewhat less
horribly even when everything uses a client_encoding of UNICODE (i.e.
string lengths are wrong)?
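
To make the failure mode concrete, here is a rough simulation in Python. It is not server code: it only models a SQL_ASCII database as a store that keeps whatever bytes it is handed, which is my reading of Tom's description. The sample string and variable names are made up for illustration.

```python
# Simulation only: a SQL_ASCII database modelled as a verbatim byte store.
text = "caf\u00e9"                 # "cafe" with e-acute: 4 characters, one > 127

# A client using client_encoding UNICODE sends UTF-8 bytes on the wire.
wire_bytes = text.encode("utf-8")

# SQL_ASCII: no conversion is applied, the bytes are stored as received.
stored = wire_bytes

# Anything that counts bytes rather than characters now sees the wrong length.
char_length = len(text)            # 4
byte_length = len(stored)          # 5, the e-acute takes two bytes in UTF-8

# A second client that assumes LATIN1 gets the same bytes, unconverted,
# and decodes the two-byte sequence as two separate characters.
seen_by_latin1_client = stored.decode("latin-1")

print(char_length, byte_length, ascii(seen_by_latin1_client))
```

Pure 7-bit data passes through this path unchanged, which is why the problem only shows up on certain data.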

I wonder if the server behaviour could be somehow changed so that people
don't shoot themselves in the foot so often (variants on this problem
come up again and again). The problem is that it works most of the
time, only breaking on certain data, so it's not instantly apparent that
you have a problem.

What about refusing to change client_encoding to something other than
SQL_ASCII on SQL_ASCII databases? (This would make the JDBC driver
unusable against those databases even for data that currently appears to
work, though.)

Or perhaps the JDBC driver could issue a warning whenever it notices the
underlying encoding is SQL_ASCII (this means another round-trip on
connection setup even when using V3 though). Or refuse to even try to
encode strings with characters >127 when the database encoding is SQL_ASCII.
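
The last option could look roughly like the following, sketched in Python rather than driver code. The function name check_encodable and the way the database encoding is passed in are hypothetical; nothing like this exists in the driver today.

```python
def check_encodable(value: str, database_encoding: str) -> None:
    """Hypothetical guard: reject characters > 127 for SQL_ASCII databases.

    Raises ValueError for the first offending character; returns None
    when the string is safe to send or the database is not SQL_ASCII.
    """
    if database_encoding.upper() != "SQL_ASCII":
        return  # a real encoding: the server can convert properly
    for index, ch in enumerate(value):
        if ord(ch) > 127:
            raise ValueError(
                f"character U+{ord(ch):04X} at index {index} cannot be "
                "stored reliably in a SQL_ASCII database"
            )


check_encodable("plain ascii", "SQL_ASCII")    # passes silently
check_encodable("caf\u00e9", "UNICODE")        # passes: not SQL_ASCII
```

The cost is that data which currently appears to work (7-bit only) keeps working, while anything else fails loudly at the point of the write instead of silently later.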

>>The solution is to use a database encoding that matches your data.
>
> Actually, if you intend to access the database primarily through JDBC,
> it'd be best to use server encoding UNICODE. The JDBC driver will
> always want UNICODE on the wire, and I see no reason to force extra
> character set conversions. Non-UNICODE-aware clients can be handled by
> setting client_encoding properly.

Sure -- it just depends on what other clients use the db. By the sounds
of it in this case the other client is an ODBC client that isn't aware
of encodings at all. I suppose this can be handled by the default
client_encoding setting in postgresql.conf?
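
For contrast with the SQL_ASCII case, this is roughly what I would expect to happen with a UNICODE database and a client whose client_encoding is set to LATIN1. Again a Python simulation under assumptions, not server code: the CODECS table and send_to_client are invented for the example.

```python
# Hypothetical mapping from PostgreSQL encoding names to Python codec names.
CODECS = {"UNICODE": "utf-8", "LATIN1": "latin-1"}


def send_to_client(stored_utf8: bytes, client_encoding: str) -> bytes:
    """Simulate server-side conversion from a UNICODE database to the client."""
    text = stored_utf8.decode(CODECS["UNICODE"])
    return text.encode(CODECS[client_encoding])


stored = "caf\u00e9".encode("utf-8")           # what the JDBC driver wrote
for_odbc = send_to_client(stored, "LATIN1")    # single byte 0xE9 for e-acute
for_jdbc = send_to_client(stored, "UNICODE")   # unchanged UTF-8 bytes
```

Here the encoding-unaware client receives bytes in the encoding it implicitly assumes, provided the configured client_encoding actually matches that assumption.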

-O
