Re: GB18030-2022 Support in PostgreSQL

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: GB18030-2022 Support in PostgreSQL
Date: 2025-09-24 09:18:40
Message-ID: CAEoWx2m0d-DGC+VRkq8O_cZLR_z=o_BP5p6exV5hc3C8JiNOJg@mail.gmail.com
Lists: pgsql-hackers

On Sep 24, 2025, at 15:04, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
> On Sep 24, 2025, at 14:42, John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>>
>> Sounds good. Were you also interested in seeing if EUC_CN can use the
>> same UCM file? That would allow us to get rid of the XML file.
>
> Sure, let me take a look.

I found that both EUC_CN and UHC use the same XML file, so I updated both.

I didn't delete gb-18030-2000.xml in this patch, because that would make the
patch file very large. You can just add the deletion to the commit when you
push it.

Basically, the changes are all borrowed from the previous commit. With this
patch, regenerating the map files produces no changes to them, which is
expected:

```
% make utf8_to_uhc.map utf8_to_euc_cn.map
'/usr/bin/perl' -I . UCS_to_UHC.pl
- Writing UTF8=>UHC conversion table: utf8_to_uhc.map
- Writing UHC=>UTF8 conversion table: uhc_to_utf8.map
'/usr/bin/perl' -I . UCS_to_EUC_CN.pl
- Writing UTF8=>EUC_CN conversion table: utf8_to_euc_cn.map
- Writing EUC_CN=>UTF8 conversion table: euc_cn_to_utf8.map

% git diff # no map file change
%
```
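For anyone who wants to script the same check without depending on the state of the working tree, here is a minimal sketch. It is not part of the patch: the helper name `maps_unchanged` and the `/tmp/maps.saved` location are made up for illustration. It compares saved copies of the map files byte for byte against the regenerated ones and lists any that differ.

```shell
# Hypothetical helper, not part of the patch: compare saved copies of the
# map files against the regenerated ones and report any that differ.
#   $1 = directory holding the saved copies
#   $2 = directory holding the regenerated files
# Returns 0 if every saved *.map file is byte-identical to its counterpart.
maps_unchanged() {
    saved_dir=$1
    new_dir=$2
    status=0
    for saved in "$saved_dir"/*.map; do
        name=$(basename "$saved")
        # cmp -s is silent and exits non-zero on any difference
        # (or if either file is missing).
        if ! cmp -s "$saved" "$new_dir/$name"; then
            echo "changed: $name"
            status=1
        fi
    done
    return "$status"
}

# Example use from src/backend/utils/mb/Unicode (temporary path is an assumption):
#   mkdir /tmp/maps.saved && cp *.map /tmp/maps.saved/
#   make utf8_to_uhc.map utf8_to_euc_cn.map
#   maps_unchanged /tmp/maps.saved . && echo "no map file change"
```

If the saved directory contains no `.map` files, the helper reports a change rather than silently passing, so an empty snapshot is not mistaken for a clean result.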

I am not sure whether the UCM file should also be upgraded to the 2022
version, but if that is needed, let's do it in a separate commit.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment: v1-0001-Generate-EUC_CN-and-UHC-mappings-from-the-Unicode.patch (application/octet-stream, 5.9 KB)
