| From: | Thomas Munro <tmunro(at)postgresql(dot)org> |
|---|---|
| To: | pgsql-committers(at)lists(dot)postgresql(dot)org |
| Subject: | pgsql: Replace pg_mblen() with bounds-checked versions. |
| Date: | 2026-02-09 00:07:46 |
| Message-ID: | E1vpEoc-001x3U-14@gemulon.postgresql.org |
| Lists: | pgsql-committers |
Replace pg_mblen() with bounds-checked versions.
A corrupted string could cause code that iterates with pg_mblen() to
overrun its buffer. Fix by converting all callers to one of the
following:
1. Callers with a null-terminated string now use pg_mblen_cstr(), which
raises an "illegal byte sequence" error if it finds a terminator in the
middle of the sequence.
2. Callers with a length or end pointer now use either
pg_mblen_with_len() or pg_mblen_range(), for the same effect, depending
on which of the two seems more convenient at each site.
3. A small number of cases pre-validate a string, and can use
pg_mblen_unbounded().
The traditional pg_mblen() function and COPYCHAR macro still exist for
backward compatibility, but are no longer used by core code and are
hereby deprecated. The same applies to the t_isXXX() functions.
Security: CVE-2026-2006
Backpatch-through: 14
Co-authored-by: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Co-authored-by: Noah Misch <noah(at)leadboat(dot)com>
Reviewed-by: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Reported-by: Paul Gerste (as part of zeroday.cloud)
Reported-by: Moritz Sanft (as part of zeroday.cloud)
Branch
------
REL_16_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/d837fb02925561091a70c5a6a74f42da57a022f9
Modified Files
--------------
contrib/btree_gist/btree_utils_var.c | 21 +++--
contrib/dict_xsyn/dict_xsyn.c | 8 +-
contrib/hstore/hstore_io.c | 2 +-
contrib/ltree/lquery_op.c | 4 +-
contrib/ltree/ltree.h | 2 +-
contrib/ltree/ltree_io.c | 16 ++--
contrib/ltree/ltxtquery_io.c | 4 +-
contrib/pageinspect/heapfuncs.c | 2 +-
contrib/pg_trgm/trgm.h | 4 +-
contrib/pg_trgm/trgm_op.c | 48 ++++++----
contrib/pg_trgm/trgm_regexp.c | 21 +++--
contrib/unaccent/unaccent.c | 7 +-
src/backend/catalog/pg_proc.c | 2 +-
src/backend/tsearch/dict_synonym.c | 8 +-
src/backend/tsearch/dict_thesaurus.c | 18 ++--
src/backend/tsearch/regis.c | 37 ++++----
src/backend/tsearch/spell.c | 123 ++++++++++++-------------
src/backend/tsearch/ts_locale.c | 109 ++++++++--------------
src/backend/tsearch/ts_utils.c | 4 +-
src/backend/tsearch/wparser_def.c | 3 +-
src/backend/utils/adt/encode.c | 6 +-
src/backend/utils/adt/formatting.c | 22 ++---
src/backend/utils/adt/jsonfuncs.c | 2 +-
src/backend/utils/adt/jsonpath_gram.y | 3 +-
src/backend/utils/adt/levenshtein.c | 14 +--
src/backend/utils/adt/like.c | 18 ++--
src/backend/utils/adt/like_match.c | 3 +-
src/backend/utils/adt/oracle_compat.c | 33 ++++---
src/backend/utils/adt/regexp.c | 9 +-
src/backend/utils/adt/tsquery.c | 25 +++---
src/backend/utils/adt/tsvector.c | 11 +--
src/backend/utils/adt/tsvector_op.c | 10 ++-
src/backend/utils/adt/tsvector_parser.c | 29 +++---
src/backend/utils/adt/varbit.c | 8 +-
src/backend/utils/adt/varlena.c | 34 ++++---
src/backend/utils/adt/xml.c | 11 ++-
src/backend/utils/mb/mbutils.c | 150 +++++++++++++++++++++++++++++--
src/include/mb/pg_wchar.h | 7 ++
src/include/tsearch/ts_locale.h | 36 ++++++--
src/include/tsearch/ts_utils.h | 14 ++-
src/test/modules/test_regex/test_regex.c | 3 +-
41 files changed, 532 insertions(+), 359 deletions(-)