From: | "Warner, Gary, Jr" <gar(at)uab(dot)edu> |
---|---|
To: | M Tarkeshwar Rao <m(dot)tarkeshwar(dot)rao(at)ericsson(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, "pgsql-hackers-owner(at)postgresql(dot)org" <pgsql-hackers-owner(at)postgresql(dot)org> |
Subject: | Re: Facing issue in using special characters |
Date: | 2019-03-17 15:01:40 |
Message-ID: | B446C5BC-7195-4BA0-80E6-A15D5CBDF365@uab.edu |
Lists: | pgsql-general pgsql-hackers pgsql-performance |
Many of us have faced character encoding issues because we are not in control of our input sources and made the common assumption that UTF-8 covers everything.
In my lab, as an example, some of our social media posts have used the ZawGyi Burmese character set rather than Unicode Burmese. (Because Myanmar developed its technology in an environment closed to the world, they made up their own non-standard character set, which is still very common on mobile phones.) We had fully tested the app with Unicode Burmese, but honestly didn’t know ZawGyi was even a thing that we would see in our dataset. We’ve also had problems with non-Unicode word separators in Arabic.
What we’ve found to be helpful is to view the troublesome text in a hex editor and determine which non-standard characters may be causing the problem.
It may be that some data conversion is necessary before insertion. But the first step is knowing WHICH characters are causing the issue.
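If a hex editor is not handy, the same inspection can be roughed out in a few lines of Python. This is only a sketch under my own assumptions: the function names `find_invalid_utf8` and `describe_non_ascii` are made up for illustration, and the input is assumed to be the raw bytes you were about to insert. The first helper reports byte offsets that are not valid UTF-8; the second lists every non-ASCII character with its code point and Unicode name. The second is the one that matters for cases like ZawGyi, where the bytes may well decode as valid UTF-8 but the code points are not what you expected.

```python
import unicodedata


def find_invalid_utf8(data: bytes):
    """Return (offset, hex of offending bytes) for each span that is not valid UTF-8."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything from the current position onward.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start / err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end].hex()))
            pos = end  # skip the bad bytes and keep scanning
    return problems


def describe_non_ascii(text: str):
    """Return (index, 'U+XXXX', Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]
```

For example, `find_invalid_utf8(b"abc\xffdef")` returns `[(3, "ff")]`, and `describe_non_ascii("a\u200cb")` returns `[(1, "U+200C", "ZERO WIDTH NON-JOINER")]`, which is the kind of invisible separator that is easy to miss by eye.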