| From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
|---|---|
| To: | JiaoShuntian <jiaoshuntian(at)highgo(dot)com> |
| Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: GB18030-2022 Support in PostgreSQL |
| Date: | 2025-08-04 10:35:02 |
| Message-ID: | CANWCAZYzenc5nxx1Wm4dKv9hWbEzsge8FX=q-mtHj8NvhSwQww@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Aug 4, 2025 at 3:08 PM JiaoShuntian <jiaoshuntian(at)highgo(dot)com> wrote:
> I noticed that PostgreSQL currently supports GB18030 encoding based on the older GB18030-2000 standard (as seen in commits like extend GB18030 conversion). However, China has since updated its mandatory character set standard to GB18030-2022, which includes additional characters and stricter compliance requirements. GB18030-2022 is now the official standard in China, and ensuring PostgreSQL’s full compliance would be beneficial for users in Chinese-speaking regions.
This is a non-backwards-compatible change:
https://www.unicode.org/L2/L2022/22274-disruptive-changes.pdf
https://www.unicode.org/L2/L2023/23003r-gb18030-recommendations.pdf
There is a risk of breaking applications, although only a few dozen
mappings changed. If it were added as a separate encoding, users could
opt in.
--
John Naylor
Amazon Web Services