Re: UTF8 encoding and non-text data types

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe <dev(at)freedomcircle(dot)net>
Cc: Medi Montaseri <montaseri(at)gmail(dot)com>, Steve Midgley <public(at)misuse(dot)org>, pgsql-sql(at)postgresql(dot)org
Subject: Re: UTF8 encoding and non-text data types
Date: 2008-01-14 23:24:33
Message-ID: 8278.1200353073@sss.pgh.pa.us
Lists: pgsql-sql

Joe <dev(at)freedomcircle(dot)net> writes:
> Tom Lane wrote:
>> Well, you've got two problems there.  The first and biggest is that
>> &#NNN; is an HTML notation, not a SQL notation; no SQL database is going
>> to think that that string in its input is a representation of a single
>> Unicode character.  The other problem is that even if this did happen,
>> code points 1777 and nearby are not digits; they're something or other
>> in Arabic, apparently.
>> 
> Precisely. 1777 through 1780 decimal equate to code points U+06F1 
> through U+06F4, which correspond to the Arabic numerals 1 through 4.

Oh?  Interesting.  But even if we wanted to teach Postgres about that,
wouldn't there be a pretty strong risk of getting confused by Arabic's
right-to-left writing direction?  Wouldn't be real helpful if the entry
came out as 4321 when the user wanted 1234.  Definitely seems like
something that had better be left to the application side, where there's
more context about what the string means.

			regards, tom lane
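
A minimal sketch of what handling this on the application side could look like, in Python. The function name `entity_digits_to_int` is hypothetical and not from this thread. The sketch assumes the input consists only of HTML numeric character references for decimal digits, and that the digits arrive in logical order, most significant first, so no reordering is attempted. The application decodes the references and converts the result to an ordinary integer before it ever reaches the database.

```python
import html


def entity_digits_to_int(raw: str) -> int:
    """Decode HTML character references, then parse the result as an integer.

    Hypothetical helper for illustration. Assumes `raw` holds only decimal
    digits (in any script) once the references are decoded, in logical order.
    """
    # html.unescape turns "&#1777;" into the single character U+06F1.
    text = html.unescape(raw)

    # str.isdecimal() is True only when every character is a Unicode decimal
    # digit; it is False for the empty string and for letters or punctuation.
    if not text.isdecimal():
        raise ValueError(f"not a plain decimal number: {raw!r}")

    # int() accepts Unicode decimal digits, not just ASCII 0-9.
    return int(text)


if __name__ == "__main__":
    # Code points 1777..1780 are U+06F1..U+06F4.
    print(entity_digits_to_int("&#1777;&#1778;&#1779;&#1780;"))  # prints 1234
```

The resulting integer can then be passed to the database as a normal numeric parameter, so the server never sees the `&#NNN;` notation at all.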
