Re: Bug or not about ASCII and Multi-Byte character set

From: Andreas Pflug <pgadmin(at)pse-consulting(dot)de>
To: Marc Herbert <Marc(dot)Herbert(at)emicnetworks(dot)com>
Cc: pgsql-odbc(at)postgresql(dot)org
Subject: Re: Bug or not about ASCII and Multi-Byte character set
Date: 2005-08-19 14:11:48
Message-ID: 4305E8A4.6000306@pse-consulting.de
Lists: pgsql-odbc
Marc Herbert wrote:

>If SQL_ASCII is/was equivalent to "ignoring encoding", then it
>looks/looked pretty misnamed! 
>
It's not. It should be used for ASCII only, but the database system will 
not barf if you offer it a byte with the upper bit set. You're simply on 
your own.
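
To illustrate what "on your own" means, here is a minimal Python sketch. It is not taken from PostgreSQL or the ODBC driver; the sample bytes and the helper name is_pure_ascii are assumptions made for this example. It shows that bytes with the upper bit set have no single meaning once the storage layer stops tracking their encoding:

```python
# Illustrative only: the sample strings and helper name are assumptions.

def is_pure_ascii(data: bytes) -> bool:
    """Return True if no byte has the upper bit (0x80) set."""
    return all(b < 0x80 for b in data)

# A client that thinks in Latin-1 stores the French word "été".
raw = "été".encode("latin-1")            # b'\xe9t\xe9'

# The store accepted the bytes, but they are not ASCII.
stored_is_ascii = is_pure_ascii(raw)

# A second client that also assumes Latin-1 reads the text back correctly.
as_latin1 = raw.decode("latin-1")

# A client that assumes UTF-8 cannot decode the same bytes at all.
try:
    as_utf8 = raw.decode("utf-8")
except UnicodeDecodeError:
    as_utf8 = None

# The reverse mix-up does not fail, it silently yields the wrong text.
mojibake = "é".encode("utf-8").decode("latin-1")

print(stored_is_ascii, as_latin1, as_utf8, mojibake)
```

Nothing in the stored bytes says which reading is the intended one; only the application that wrote them knows.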

>Encoding ignorance should rather be called SQL_BINARY. A BINARY setting
>for strings makes sense, just like when transferring text files using
>FTP: you just don't trust FTP for encodings and use it like a
>filesystem. BINARY just means that: "don't mess up encodings and
>let something else deal with the issue".
>  
>
No, binary would include 0x00, which is definitely *not* a character but 
the string terminator. If SQL_ASCII were implemented today, there would 
probably be a check that the upper bit is clear, and anything else would 
be rejected. But since this part is really, really old, it can't be 
changed without breaking zillions of old apps that used to ignore 
proper storage encoding.
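
The kind of check meant here could look roughly like the following Python sketch. It is hypothetical: check_strict_ascii is a name invented for this example, and nothing like it is claimed to exist in the server.

```python
# Hypothetical strict check; not actual PostgreSQL behaviour.

def check_strict_ascii(data: bytes) -> None:
    """Raise ValueError unless every byte is a non-NUL 7-bit ASCII value."""
    for offset, byte in enumerate(data):
        if byte == 0x00:
            # 0x00 is the string terminator, not a character.
            raise ValueError(f"NUL byte at offset {offset}")
        if byte & 0x80:
            # Upper bit set: outside the ASCII range.
            raise ValueError(f"non-ASCII byte 0x{byte:02x} at offset {offset}")

check_strict_ascii(b"hello")            # passes silently

for bad in (b"caf\xe9", b"a\x00b"):
    try:
        check_strict_ascii(bad)
    except ValueError as err:
        print("rejected:", err)
```

A check like this is exactly what would break the old apps mentioned above, since they rely on non-ASCII bytes passing through untouched.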

>I guess some people knew what they did and simply did not mix
>driver/apps, or in a way they mastered and that worked.
>  
>
The latter, with the obvious chance of breaking as soon as another app 
accesses the data. This is certainly not the design goal of an RDBMS.

>Well, while reading the complaints it seems this BINARY mode was
>there before (by "accident"?), 
>
No.

>Looks like people fixed issues by themselves before,
>
They didn't fix anything; they worked around the wrongly chosen server 
encoding. I perfectly understand this, because initially I made the same 
mistake.

> and Postgres
>recent fixing does not interact nicely with theirs?
>  
>
Automatically choosing the right client encoding and properly converting 
in the driver had (and maybe still has) bugs, but fixes for these will 
certainly follow the rules that proper design requires, not cater to 
ill-designed apps.

>PS: BTW "unicode" is not one encoding but many different ones.
>  
>
Doesn't matter. It always means the native Unicode encoding of the system 
in question: UTF-8 in the backend, UTF-16 on Win32, and UTF-32 or UTF-8 on 
Linux, depending on the interface definition. The *driver* has to take care 
of the proper conversion, *if* it is instructed correctly (i.e. given the 
correct server encoding).
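
A rough Python sketch of that conversion step follows. The function name recode and the sample text are assumptions for illustration; this is not the driver's actual code. It shows the same text in the three encodings, and what happens when the converter is told the wrong source encoding:

```python
# Illustrative sketch; 'recode' is a made-up helper, not a driver API.

def recode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Decode bytes from the declared source encoding, re-encode for the target."""
    return data.decode(source_encoding).encode(target_encoding)

text = "é€"

# The same two characters occupy a different number of bytes per encoding.
utf8 = text.encode("utf-8")         # 2 + 3 bytes
utf16 = text.encode("utf-16-le")    # 2 + 2 bytes
utf32 = text.encode("utf-32-le")    # 4 + 4 bytes
print(len(utf8), len(utf16), len(utf32))

# Correctly instructed: UTF-8 from the backend becomes UTF-16 for the client.
good = recode(utf8, "utf-8", "utf-16-le")

# Wrongly instructed: the UTF-8 bytes are treated as Latin-1. No error is
# raised, but the client receives different characters.
bad = recode(utf8, "latin-1", "utf-16-le")

print(good == utf16, bad.decode("utf-16-le") == text)
```

The conversion itself is mechanical; the only thing that can go wrong silently is the declared source encoding, which is why the server encoding has to be set correctly.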

Regards,
Andreas

