Re: Facing issue in using special characters

From: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Facing issue in using special characters
Date: 2019-03-18 21:19:23
Message-ID: 20190318211923.eaobabpawdcjm5fh@hjp.at
Lists: pgsql-general pgsql-hackers pgsql-performance

On 2019-03-17 15:01:40 +0000, Warner, Gary, Jr wrote:
> Many of us have faced character encoding issues because we are not in control
> of our input sources and made the common assumption that UTF-8 covers
> everything.

UTF-8 covers "everything" in the sense that there is a round-trip from
each character in every commonly-used charset/encoding to Unicode and
back.
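
A minimal sketch of such a round-trip, assuming Python's built-in
"iso-8859-15" and "utf-8" codecs (the variable names are only
illustrative): every byte value of the legacy charset is decoded to
Unicode, stored as UTF-8, and converted back.

```python
# Round-trip every byte value of a legacy 8-bit charset through Unicode.
# Assumption: Python's built-in "iso-8859-15" codec is used as the example charset.
legacy_bytes = bytes(range(256))

as_unicode = legacy_bytes.decode("iso-8859-15")  # legacy bytes -> Unicode string
as_utf8 = as_unicode.encode("utf-8")             # Unicode string -> UTF-8 bytes

# And back again: UTF-8 bytes -> Unicode string -> legacy bytes
restored = as_utf8.decode("utf-8").encode("iso-8859-15")

round_trip_ok = restored == legacy_bytes
print(round_trip_ok)
```

The same check should work for other commonly-used charsets, provided
only byte sequences that the charset actually defines are fed in.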

The actual code may of course be different. For example, the € sign is
0xA4 in iso-8859-15, but U+20AC in Unicode. So you need an
encoding/decoding step.
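
As a small illustration of that step, again assuming Python's built-in
codecs (the names raw, euro and misread are made up for this sketch):
the same byte yields the € sign only if it is decoded with the charset
it was written in.

```python
# The byte 0xA4 as it would appear in iso-8859-15 data.
raw = b"\xa4"

# Decoding step: interpret the byte according to its real charset.
euro = raw.decode("iso-8859-15")
code_point = f"U+{ord(euro):04X}"

# Encoding step: write the character out as UTF-8 (three bytes).
utf8_bytes = euro.encode("utf-8")

# Decoding with the wrong charset silently gives a different character:
# in iso-8859-1 the byte 0xA4 is the generic currency sign U+00A4.
misread = raw.decode("iso-8859-1")

print(code_point, utf8_bytes, f"U+{ord(misread):04X}")
```

Skipping the decoding step and passing the raw byte 0xA4 off as UTF-8
would not work either, since a lone 0xA4 is not a valid UTF-8 sequence.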

And "commonly-used" means just that. Unicode covers a lot of character
sets, but it can't cover every character set ever invented (I invented
my own character sets when I was sixteen. Nobody except me ever used
them and they have long succumbed to bit rot).

> In my lab, as an example, some of our social media posts have included ZawGyi
> Burmese character sets rather than Unicode Burmese. (Because Myanmar developed
> technology In a closed to the world environment, they made up their own
> non-standard character set which is very common still in Mobile phones.).

I'd be surprised if there was a character set which is "very common in
Mobile phones", even in a relatively poor country like Myanmar. Does
ZawGyi actually include characters which aren't in Unicode, or are they
just encoded differently?

hp

--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp(at)hjp(dot)at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>
