From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net> |
Subject: | Re: GB18030-2022 Support in PostgreSQL |
Date: | 2025-08-11 09:15:00 |
Message-ID: | CANWCAZY=6qq3obTNyUXq3gVa2an8mrC4c4SoDSi44NB9T0osOw@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Aug 11, 2025 at 3:22 PM Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
Hi,
For future reference, please don't quote my entire message below yours
-- it clutters the archives and also removes context.
> Yes, I did a diff between 2000.ucm and 2022.ucm when I worked on the patch. The diff between 2000.ucm and 2022.ucm is quite small:
That would match my expectation. In case it wasn't clear before, my
preference is to split this patch into two patches: First convert to
.ucm, then update to 2022 revision. Then the small diff will be
obvious to everyone who looks at the second commit.
> For your question:
>
> "9 characters are no longer required by the new standard, but are
> retained in this patch for compatibility"
>
> How is that done?
>
>
> The 9 mappings are not changed between 2000.ucm and 2022.ucm. For example, GB18030 code 0xFD9C is one of the 9 no-longer-required codes, but the mapping:
>
> <UF92C> \xFD\x9C |0
>
> Still appears in 2022.ucm, so that this character is retained.
Thanks for clarifying -- by saying "retained in the patch", the commit
message implied to me that the patch added something not in the
upstream file.
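
For anyone who wants to check this sort of thing themselves, here is a
minimal sketch (not part of the patch) of parsing .ucm mapping lines and
comparing two revisions. It assumes each mapping line has the shape quoted
above, i.e. `<UXXXX> \xNN... |flag`. The helper names parse_ucm and
diff_mappings are hypothetical, and the sample inputs are toy data: only
the U+F92C line comes from this thread.

```python
import re

# One mapping line, e.g.:  <UF92C> \xFD\x9C |0
UCM_LINE = re.compile(
    r"^<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)\s+\|(\d)$"
)


def parse_ucm(text):
    """Return {code point: (encoded bytes, fallback flag)} for mapping lines."""
    mappings = {}
    for line in text.splitlines():
        m = UCM_LINE.match(line.strip())
        if m is None:
            continue  # skip headers, comments, and CHARMAP markers
        codepoint = int(m.group(1), 16)
        encoded = bytes.fromhex(m.group(2).replace("\\x", ""))
        mappings[codepoint] = (encoded, int(m.group(3)))
    return mappings


def diff_mappings(old, new):
    """Return (removed, added, changed) between two parsed mapping dicts."""
    removed = {cp: v for cp, v in old.items() if cp not in new}
    added = {cp: v for cp, v in new.items() if cp not in old}
    changed = {
        cp: (old[cp], new[cp])
        for cp in old.keys() & new.keys()
        if old[cp] != new[cp]
    }
    return removed, added, changed


# Toy stand-ins for the two revisions (NOT the real file contents).
old_text = r"""
<U0041> \x41 |0
<UF92C> \xFD\x9C |0
"""
new_text = r"""
<U0041> \x41 |0
<U0042> \x42 |0
<UF92C> \xFD\x9C |0
"""

old_map = parse_ucm(old_text)
new_map = parse_ucm(new_text)
removed, added, changed = diff_mappings(old_map, new_map)

# A mapping is "retained" if it is present and identical in both revisions.
retained = 0xF92C in old_map and old_map[0xF92C] == new_map.get(0xF92C)
print(f"removed={len(removed)} added={len(added)} changed={len(changed)}")
print(f"U+F92C retained: {retained}")
```

Run against the real 2000 and 2022 files, a mapping such as U+F92C would
count as retained only if it shows up in none of the three result sets.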
--
John Naylor
Amazon Web Services