Re: UTF8 encoding and non-text data types

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe <dev(at)freedomcircle(dot)net>
Cc: Medi Montaseri <montaseri(at)gmail(dot)com>, Steve Midgley <public(at)misuse(dot)org>, pgsql-sql(at)postgresql(dot)org
Subject: Re: UTF8 encoding and non-text data types
Date: 2008-01-14 23:24:33
Message-ID: 8278.1200353073@sss.pgh.pa.us
Lists: pgsql-sql

Joe <dev(at)freedomcircle(dot)net> writes:
> Tom Lane wrote:
>> Well, you've got two problems there. The first and biggest is that
>> &#NNN; is an HTML notation, not a SQL notation; no SQL database is going
>> to think that that string in its input is a representation of a single
>> Unicode character. The other problem is that even if this did happen,
>> code points 1777 and nearby are not digits; they're something or other
>> in Arabic, apparently.
>>
> Precisely. 1777 through 1780 decimal equate to code points U+06F1
> through U+06F4, which correspond to the Arabic numerals 1 through 4.

Oh? Interesting. But even if we wanted to teach Postgres about that,
wouldn't there be a pretty strong risk of getting confused by Arabic's
right-to-left writing direction? Wouldn't be real helpful if the entry
came out as 4321 when the user wanted 1234. Definitely seems like
something that had better be left to the application side, where there's
more context about what the string means.

regards, tom lane
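
For illustration only, here is a minimal sketch of the application-side handling suggested above. It assumes the input arrives as HTML numeric character references, and that the decoded digits are in logical order, most significant digit first. The function name parse_entity_digits is hypothetical and does not come from this thread.

```python
import html
import unicodedata


def parse_entity_digits(raw: str) -> int:
    """Decode HTML character references, then read the result as a base-10 integer.

    Assumption: the decoded characters are stored most significant digit first.
    """
    text = html.unescape(raw)  # e.g. "&#1777;" becomes U+06F1
    # Accept only characters in Unicode category Nd (decimal digit).
    if not text or not all(unicodedata.category(ch) == "Nd" for ch in text):
        raise ValueError(f"not a string of decimal digits: {raw!r}")
    value = 0
    for ch in text:
        # unicodedata.decimal maps any Nd character to its value 0..9.
        value = value * 10 + unicodedata.decimal(ch)
    return value


print(parse_entity_digits("&#1777;&#1778;&#1779;&#1780;"))  # 1234
```

The resulting plain integer can then be passed to the database as an ordinary numeric parameter, so the server never sees the HTML notation.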
