Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text

From:	Bart Samwel <bart(at)samwel(dot)tk>
To:	Marc Herbert <Marc(dot)Herbert(at)continuent(dot)com>
Cc:	pgsql-odbc(at)postgresql(dot)org
Subject:	Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text
Date:	2006-04-01 01:26:58
Message-ID:	442DD6E2.5070500@samwel.tk
Lists:	pgsql-odbc

Marc Herbert wrote:
> Johann Zuschlag <zuschlag2(at)online(dot)de> writes:
>
>> I've read about the problems with the NULL bytes on Unix machines.
>
> This problem is not related to Unix at all but to the programming
> language used. Most standard C functions use the zero byte convention
> as a string terminator, so it becomes a forbidden character in C.
>
> On the other hand, String objects in C++ and Java use a separate length
> field, and having NULs inside a string is a no-brainer there.
>
> The ODBC API has been designed for C and Cobol. Cobol does not forbid
> zero as a character either. When browsing the ODBC spec you'll notice
> it carefully caters for the two ways.
>
>
> Guess which programming language is used by PostgreSQL.
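
To make the quoted point concrete, here is a minimal sketch of the two
conventions side by side: a zero-terminated C string versus a string
that carries its own length (the same split the ODBC API makes between
an explicit length argument and the SQL_NTS marker). The helper names
c_length, counted_string and demo_embedded_zero are made up for this
illustration; only std::strlen and std::string are real library names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as C string functions see it: counting stops at the first zero byte.
std::size_t c_length(const char* s) {
    return std::strlen(s);
}

// Length-carrying string built from an explicit (pointer, length) pair,
// so an embedded zero byte is kept as an ordinary character.
std::string counted_string(const char* data, std::size_t len) {
    return std::string(data, len);
}

// Hypothetical demo helper: the same 5 data bytes measured both ways.
void demo_embedded_zero() {
    const char raw[] = "ab\0cd";            // 5 data bytes + implicit terminator
    assert(sizeof raw == 6);
    assert(c_length(raw) == 2);             // "cd" is invisible to strlen
    std::string s = counted_string(raw, 5);
    assert(s.size() == 5);                  // all 5 bytes survive
    assert(s[2] == '\0');                   // including the zero byte
}
```

The data is identical in both cases; what differs is only who keeps
track of where the string ends.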

C and C++ even have a special alternative character type "wchar_t" for
this, just so that people can handle both 8-bit char* and wide wchar_t*
strings (wchar_t is 16 bits on Windows and usually 32 bits on Unix-like
systems). In wchar_t* strings, zero bytes are not a problem because only
a full-width zero wchar_t counts as the terminator (and AFAIK the Unicode
standard does allow U+0000 to be interpreted as NUL, aka end-of-string).
The downside of this solution is that hardly any application actually
uses it, and everybody is stuck with 8-bit ASCII plus a random local
codepage unless special support is added. Why didn't they just upgrade
chars to 32 bits and be done with it... :-/
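
A small sketch of what that means in practice, assuming a platform
where wchar_t is wider than char (true on the usual Windows and Unix
compilers, but the exact width varies). The function names are invented
for the example; std::wcslen and std::strlen are the standard library
calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Number of wide characters before the terminating L'\0'.
std::size_t wide_length(const wchar_t* s) {
    return std::wcslen(s);
}

// What a byte-oriented function sees when handed the same memory:
// it stops at the first zero *byte*, wherever that falls.
std::size_t bytes_before_first_zero(const wchar_t* s) {
    return std::strlen(reinterpret_cast<const char*>(s));
}

// Hypothetical demo helper.
void demo_wide_zero_bytes() {
    const wchar_t text[] = L"AB";
    // Two wide characters plus the wide terminator.
    assert(sizeof text == 3 * sizeof(wchar_t));
    // The wide API ignores the zero padding bytes inside 'A' and 'B'.
    assert(wide_length(text) == 2);
    // The byte API trips over the padding of the very first character
    // (after 1 byte on little-endian machines, 0 on big-endian ones).
    assert(bytes_before_first_zero(text) <= 1);
}
```

So a wide string is safe only as long as every layer it passes through
uses the wide functions; hand it to one char*-based routine and it is
cut off almost immediately.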

Cheers,
Bart

In response to

Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text at 2006-03-31 19:12:13 from Marc Herbert

Responses

Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text at 2006-04-03 08:55:30 from Marc Herbert
