Re: BUG #19406: substring(text) fails on valid UTF-8 toasted value in PostgreSQL 15.16

From: Noah Misch <noah(at)leadboat(dot)com>
To: ranvis(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: thomas(dot)munro(at)gmail(dot)com
Subject: Re: BUG #19406: substring(text) fails on valid UTF-8 toasted value in PostgreSQL 15.16
Date: 2026-02-14 05:38:21
Message-ID: 20260214053821.fa.noahmisch@microsoft.com
Lists: pgsql-bugs

On Fri, Feb 13, 2026 at 04:21:13PM -0800, Noah Misch wrote:
> Review welcome. I have a Valgrind test run ongoing.

Valgrind found the complaint below, but I think this is an instrumentation
problem. I've added a fix for that instrumentation. I also made minor edits
to the log message of the main patch, hence v3.

The release team is preparing to announce a 2026-02-26 out-of-cycle release in
light of this regression. I plan to push these fixes at 2026-02-14T20:00+0000
to unblock that formal announcement.

==00:00:01:11.756 3464664== VALGRINDERROR-BEGIN
==00:00:01:11.756 3464664== Unaddressable byte(s) found during client check request
==00:00:01:11.769 3464664== at 0xC6076A: pg_mblen_with_len (mbutils.c:1115)
==00:00:01:11.769 3464664== by 0xC07CD3: pg_mbcharcliplen_chars (varlena.c:807)
==00:00:01:11.770 3464664== by 0xC07AAD: text_substring (varlena.c:732)
==00:00:01:11.770 3464664== by 0xC07797: text_substr (varlena.c:553)
==00:00:01:11.770 3464664== by 0x779688: ExecInterpExpr (execExprInterp.c:953)
==00:00:01:11.770 3464664== by 0x77BBD6: ExecInterpExprStillValid (execExprInterp.c:2299)
==00:00:01:11.770 3464664== by 0x7DBBC4: ExecEvalExprNoReturn (executor.h:423)
==00:00:01:11.770 3464664== by 0x7DBC73: ExecEvalExprNoReturnSwitchContext (executor.h:464)
==00:00:01:11.770 3464664== by 0x7DBCD3: ExecProject (executor.h:496)
==00:00:01:11.770 3464664== by 0x7DC134: ExecScanExtended (execScan.h:234)
==00:00:01:11.770 3464664== by 0x7DC474: ExecSeqScanWithProject (nodeSeqscan.c:162)
==00:00:01:11.770 3464664== by 0x794561: ExecProcNodeFirst (execProcnode.c:469)
==00:00:01:11.770 3464664== Address 0x19340a5e is 12,062 bytes inside a block of size 12,064 alloc'd
==00:00:01:11.770 3464664== at 0x4C29F73: malloc (vg_replace_malloc.c:309)
==00:00:01:11.770 3464664== by 0xC79F03: AllocSetAllocLarge (aset.c:756)
==00:00:01:11.770 3464664== by 0xC7AA4C: AllocSetAlloc (aset.c:1033)
==00:00:01:11.770 3464664== by 0xC8B37E: palloc (mcxt.c:1408)
==00:00:01:11.770 3464664== by 0x4B0182: detoast_attr_slice (detoast.c:324)
==00:00:01:11.770 3464664== by 0xC5177F: pg_detoast_datum_slice (fmgr.c:1825)
==00:00:01:11.770 3464664== by 0xC07A11: text_substring (varlena.c:716)
==00:00:01:11.770 3464664== by 0xC07797: text_substr (varlena.c:553)
==00:00:01:11.770 3464664== by 0x779688: ExecInterpExpr (execExprInterp.c:953)
==00:00:01:11.770 3464664== by 0x77BBD6: ExecInterpExprStillValid (execExprInterp.c:2299)
==00:00:01:11.770 3464664== by 0x7DBBC4: ExecEvalExprNoReturn (executor.h:423)
==00:00:01:11.770 3464664== by 0x7DBC73: ExecEvalExprNoReturnSwitchContext (executor.h:464)
==00:00:01:11.770 3464664==
==00:00:01:11.770 3464664== VALGRINDERROR-END
{
<insert_a_suppression_name_here>
Memcheck:User
fun:pg_mblen_with_len
fun:pg_mbcharcliplen_chars
fun:text_substring
fun:text_substr
fun:ExecInterpExpr
fun:ExecInterpExprStillValid
fun:ExecEvalExprNoReturn
fun:ExecEvalExprNoReturnSwitchContext
fun:ExecProject
fun:ExecScanExtended
fun:ExecSeqScanWithProject
fun:ExecProcNodeFirst
}
2026-02-13 21:03:38.905 PST client backend[3464664] pg_regress/encoding ERROR: invalid byte sequence for encoding "UTF8": 0xe2 0x80
2026-02-13 21:03:38.905 PST client backend[3464664] pg_regress/encoding STATEMENT: SELECT SUBSTRING(c FROM 4001 FOR 1) FROM toast_3b_utf8;
**00:00:01:11.771 3464664** Valgrind detected 1 error(s) during execution of "SELECT SUBSTRING(c FROM 4001 FOR 1) FROM toast_3b_utf8;"
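
For readers following the log: 0xe2 is the lead byte of a three-byte UTF-8
character, and the reported address sits two bytes before the end of the
detoasted slice, which is consistent with the slice ending mid-character. The
following is a minimal sketch of a length-bounded character-length check. The
helpers utf8_expected_len() and bounded_mblen() are hypothetical names for
illustration only; this is not PostgreSQL's pg_mblen_with_len() and not the
attached patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: byte length a UTF-8 character should have, judged
 * from its lead byte alone.  Invalid lead bytes are treated as length 1,
 * which is a simplification for this sketch.
 */
static int
utf8_expected_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if ((lead & 0xE0) == 0xC0)
        return 2;
    if ((lead & 0xF0) == 0xE0)
        return 3;
    if ((lead & 0xF8) == 0xF0)
        return 4;
    return 1;
}

/*
 * Hypothetical helper: return the character's byte length if the whole
 * character fits within the "remaining" bytes starting at s, else -1.
 * Only s[0] is read, so nothing past the buffer end is touched.
 */
static int
bounded_mblen(const unsigned char *s, size_t remaining)
{
    int len;

    assert(s != NULL);
    if (remaining == 0)
        return -1;
    len = utf8_expected_len(s[0]);
    return ((size_t) len <= remaining) ? len : -1;
}
```

Given the bytes 61 e2 80 a6 ("a" followed by U+2026, chosen here only as an
example of a character starting with e2 80), bounded_mblen(buf + 1, 3)
returns 3, while bounded_mblen(buf + 1, 2) returns -1 because only e2 80 of
the character lies inside the slice.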

Attachment                             Content-Type  Size
toast-slice-mblen-v3.patch             text/plain    8.7 KB
mblen-valgrind-after-report-v1.patch   text/plain    1.7 KB
