From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | John Naylor <johncnaylorls(at)gmail(dot)com>, JiaoShuntian <jiaoshuntian(at)highgo(dot)com(dot)w(dot)kunlunaq(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: GB18030-2022 Support in PostgreSQL |
Date: | 2025-08-04 13:51:01 |
Message-ID: | 851769.1754315461@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 2025-08-04 Mo 6:35 AM, John Naylor wrote:
>> There is a risk of breaking applications, although only a few dozen
>> mappings changed. If it were added as a separate encoding, users could
>> opt in.

> That makes sense ... naming the new encoding so as to avoid confusion
> might be a challenge.

We have precedent for that in SHIFT_JIS_2004. Presumably if we
make this a new encoding, it'd be GB18030_2022.

However, adding a new encoding ID is not without breakage risks
of its own, stemming from some code knowing the new ID and others
not. I recall that we had some actual problems of that ilk when
we added SHIFT_JIS_2004, and some of them were pretty subtle.
See e.g. this comment from src/bin/initdb/Makefile:

# Note: it's important that we link to encnames.o from libpgcommon, not
# from libpq, else we have risks of version skew if we run with a libpq
# shared library from a different PG version. Define
# USE_PRIVATE_ENCODING_FUNCS to ensure that that happens.

That was long enough ago that I have little faith either that that
fix still does what it intended to (the code has been rejiggered
significantly since the issue was last battle-tested), or that
there are not similar hazards elsewhere.

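
To make the "some code knows the new ID and others don't" hazard concrete, here is a minimal sketch, not the actual PostgreSQL code. Every name in it (the demo_enc enum, old_names, new_names, and the demo_* functions) is hypothetical and invented for illustration; the real tables live in encnames.c and look different. It only assumes that one component was built with a longer encoding-name table than another.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical encoding IDs for illustration; NOT PostgreSQL's pg_enc. */
typedef enum
{
    DEMO_ENC_UTF8 = 0,
    DEMO_ENC_GB18030,
    DEMO_ENC_GB18030_2022       /* known only to the newer table */
} demo_enc;

/* Name table as compiled into an older component (assumed). */
static const char *const old_names[] = {"UTF8", "GB18030"};

/* Name table as compiled into a newer component (assumed). */
static const char *const new_names[] = {"UTF8", "GB18030", "GB18030_2022"};

#define DEMO_LENGTHOF(a) (sizeof(a) / sizeof((a)[0]))

/* Map a name to its ID using the given table; -1 if the name is unknown. */
static int
demo_name_to_id(const char *const *names, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(names[i], name) == 0)
            return (int) i;
    return -1;
}

/* Map an ID back to a name using the given table; NULL if out of range. */
static const char *
demo_id_to_name(const char *const *names, size_t n, int id)
{
    if (id < 0 || (size_t) id >= n)
        return NULL;
    return names[id];
}

/*
 * Resolve a name with the newer table, then hand the resulting ID to code
 * that only has the older table.  Returns 1 if the older side recognizes
 * the ID, 0 if it does not.
 */
int
demo_new_to_old_roundtrip_ok(const char *name)
{
    int id = demo_name_to_id(new_names, DEMO_LENGTHOF(new_names), name);

    return demo_id_to_name(old_names, DEMO_LENGTHOF(old_names), id) != NULL;
}
```

In this sketch, demo_new_to_old_roundtrip_ok("GB18030") succeeds while demo_new_to_old_roundtrip_ok("GB18030_2022") fails, because the older table has no entry for the new ID. An older table that skipped the range check would read past the end of its array instead, which is the more subtle kind of failure.
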
So on the whole I'd lean a bit towards just redefining GB18030 as
meaning the new standard. The fact that we don't support it as a
server-side encoding perhaps makes that idea more tenable than it
would be if the encoding governed the interpretation of our own
stored data.

			regards, tom lane