pgsql: Fix encoding length for EUC_CN.

From: Thomas Munro <tmunro(at)postgresql(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix encoding length for EUC_CN.
Date: 2026-02-09 00:07:19
Message-ID: E1vpEoB-001x0W-0S@gemulon.postgresql.org
Lists: pgsql-committers

Fix encoding length for EUC_CN.

While EUC_CN supports only 1- and 2-byte sequences (CS0, CS1), the
mb<->wchar conversion functions also accept 3-byte sequences beginning
with SS2 or SS3.

Change pg_encoding_max_length() to return 3, not 2, to close a
hypothesized buffer overrun if a corrupted string is converted to wchar
and back again in a newly allocated buffer. We might reconsider that in
master (i.e., harmonize in the other direction), but this change seems
better for the back-branches.
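
To illustrate the sizing concern described above, here is a minimal
sketch, assuming a caller that sizes its output buffer as "number of
characters times the encoding's maximum length, plus a terminator". The
names worst_case_mb_bytes, OLD_EUC_CN_MAX_LEN and NEW_EUC_CN_MAX_LEN are
hypothetical and are not taken from wchar.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants, for illustration only (not from wchar.c). */
enum
{
    OLD_EUC_CN_MAX_LEN = 2,     /* maximum length reported before the fix */
    NEW_EUC_CN_MAX_LEN = 3      /* maximum length reported after the fix */
};

/*
 * Worst-case number of bytes needed to store nchars characters plus a
 * terminating NUL, given the encoding's reported maximum length.
 */
static size_t
worst_case_mb_bytes(size_t nchars, int max_len)
{
    assert(max_len > 0);
    return nchars * (size_t) max_len + 1;
}
```

Under these assumptions, four wide characters that each convert back to
a 3-byte sequence need 13 bytes including the terminator. A buffer sized
with a maximum length of 2 would hold only 9 bytes, while one sized with
3 holds exactly 13.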

Also change pg_euccn_mblen() to report SS2 and SS3 characters as having
length 3 (following the example of EUC_KR). Even though such characters
would not pass verification, it's remotely possible that invalid bytes
could be used to compute a buffer size for use in wchar conversion.
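
The length rule described above might look roughly like the following
sketch. It is written from the commit description only and is not the
code in src/common/wchar.c. The function name sketch_euccn_mblen and the
SKETCH_ macros are hypothetical; 0x8e and 0x8f are the usual EUC
single-shift byte values.

```c
#include <assert.h>
#include <stddef.h>

/* Usual EUC single-shift bytes; macro names are hypothetical. */
#define SKETCH_SS2 0x8e
#define SKETCH_SS3 0x8f

/*
 * Report the byte length implied by the first byte of a sequence:
 * SS2 or SS3 lead byte -> 3, any other high-bit byte -> 2, ASCII -> 1.
 * This only inspects the lead byte; it does not validate the sequence.
 */
static int
sketch_euccn_mblen(const unsigned char *s)
{
    assert(s != NULL);

    if (*s == SKETCH_SS2 || *s == SKETCH_SS3)
        return 3;
    if (*s & 0x80)
        return 2;
    return 1;
}
```

With this rule, a length computed over unverified bytes matches what the
wchar conversion would consume for an SS2 or SS3 lead byte, even though
such a sequence would still be rejected by verification.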

Security: CVE-2026-2006
Backpatch-through: 14
Author: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Reviewed-by: Noah Misch <noah(at)leadboat(dot)com>
Reviewed-by: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>

Branch
------
REL_18_STABLE

Details
-------
https://git.postgresql.org/pg/commitdiff/df0852fe037246289cc00b4d36da6c1f25ff5844

Modified Files
--------------
src/common/wchar.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
