From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Cc: | Tatsuo Ishii <ishii(at)postgreSQL(dot)org> |
Subject: | Re: Errors in our encoding conversion tables |
Date: | 2015-11-28 20:24:22 |
Message-ID: | 32464.1448742262@sss.pgh.pa.us |
Lists: | pgsql-hackers |
I wrote:
> There's a discussion over at
> http://www.postgresql.org/message-id/flat/2sa.Dhu5.1hk1yrpTNFy.1MLOlb@seznam.cz
> of an apparent error in our WIN1250 -> LATIN2 conversion.
Attached is an updated patch (against today's HEAD) showing proposed
changes to bring cyrillic_and_mic.c and latin2_and_win1250.c into sync
with the Unicode Consortium's conversion data.
In addition, I've attached the C program I used to generate the proposed
new conversion tables from the Unicode/*.map files, a simple SQL script
to print out the conversion behavior for the affected conversions, and
a diff of the script's output between 9.5 and the proposed patch.
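
For readers without the attachments, the general idea of deriving a direct byte-to-byte table from two encoding-to-Unicode mappings can be sketched roughly as follows. This is only an illustrative sketch, not the attached buildmap.c: the function name derive_table, the HIGH_BYTES constant, and the in-memory array layout are all assumptions made for the example, and it handles only the high half (0x80..0xFF) of single-byte encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIGH_BYTES 128			/* bytes 0x80..0xFF (hypothetical layout) */

/*
 * Hypothetical helper: derive a src -> dst byte table from two
 * byte -> Unicode mappings.
 *
 * src_to_ucs[i] and dst_to_ucs[i] hold the code point assigned to byte
 * 0x80 + i in each encoding, or 0 if that byte is unassigned.
 * On return, out[i] holds the dst byte equivalent to src byte 0x80 + i,
 * or 0 when the destination encoding has no such character.
 */
void
derive_table(const uint32_t *src_to_ucs,
			 const uint32_t *dst_to_ucs,
			 unsigned char *out)
{
	assert(src_to_ucs != NULL && dst_to_ucs != NULL && out != NULL);

	for (size_t i = 0; i < HIGH_BYTES; i++)
	{
		out[i] = 0;				/* default: no translation */

		if (src_to_ucs[i] == 0)
			continue;			/* source byte is unassigned */

		/* Look for a destination byte carrying the same code point. */
		for (size_t j = 0; j < HIGH_BYTES; j++)
		{
			if (dst_to_ucs[j] == src_to_ucs[i])
			{
				out[i] = (unsigned char) (0x80 + j);
				break;
			}
		}
	}
}
```

With tables built this way, a source byte gets a translation only when both encodings assign it the same Unicode code point. For instance, WIN1250 0xA5 and LATIN2 0xA1 are both U+0104, so 0xA5 would map to 0xA1, while WIN1250 0x80 (U+20AC) has no LATIN2 counterpart and would stay untranslated.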
While the changes in the WIN1250 <-> LATIN2 conversions just amount to
removal of some translations that seem to have no basis in reality, the
changes in the Cyrillic mappings are quite a bit more extensive. It would
be good if we could get those checked by some native Russian speakers.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
encoding-conversion-corrections-2.patch | text/x-diff | 16.4 KB |
buildmap.c | text/x-c | 3.2 KB |
checkconv.sql | text/plain | 2.8 KB |
diffs9.5vspatch | text/x-diff | 59.7 KB |