| From: | Vladimir Valikaev <vladimir(at)4vrs(dot)com> |
|---|---|
| To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Cc: | Victor Sudakov <vas(at)4vrs(dot)com>, Eugene Nasonkin <niksa(at)4vrs(dot)com> |
| Subject: | BugReport: PostgreSQL 17.8. Processing UTF8 encoded strings |
| Date: | 2026-02-25 06:50:44 |
| Message-ID: | 9e005eef-a5dc-4ca3-8589-d7836c459e4d@4vrs.com |
| Lists: | pgsql-bugs |
Greetings,
After updating PostgreSQL from version 17.7 to 17.8, we encountered a
problem when extracting a substring from a UTF8 encoded string:
*ERROR: invalid byte sequence for encoding "UTF8": 0xe2*
_Server:_
Linux i-db-sandbox1.4vrs.com 6.1.0-43-cloud-amd64 #1 SMP PREEMPT_DYNAMIC
Debian 6.1.162-1 (2026-02-08) x86_64 GNU/Linux
$ cat /etc/apt/sources.list.d/pgdg.list
deb http://apt.postgresql.org/pub/repos/apt/ bookworm-pgdg main
deb-src http://apt.postgresql.org/pub/repos/apt/ bookworm-pgdg main
deb https://apt-archive.postgresql.org/pub/repos/apt bookworm-pgdg-archive main
PostgreSQL 17.8 (Debian 17.8-1.pgdg12+1) on x86_64-pc-linux-gnu,
compiled by gcc (Debian 12.2.0-14+deb12u1) 12.2.0, 64-bit
_Steps to reproduce (psql):_
db_fev:vladimir(at)i-db-sandbox1 => create table test123(id integer, m text);
CREATE TABLE
db_fev:vladimir(at)i-db-sandbox1 => insert into test123 (id,m) values (1, repeat('a', 1027)||E'\xe2\x80\x8d'||repeat('a', 1027));
INSERT 0 1
db_fev:vladimir(at)i-db-sandbox1 => select length(SUBSTRING(m from 1 for 256)) from test123;
*ERROR: invalid byte sequence for encoding "UTF8": 0xe2*
db_fev:vladimir(at)i-db-sandbox1 => select length(SUBSTRING(substring(m from 1 for length(m)) from 1 for 256)) from test123;
length
--------
256
(1 row)
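For readers who want to check the byte layout of the test value without a server, here is a minimal Python sketch. It only illustrates the data: the bytes E2 80 8D are the UTF-8 encoding of U+200D (ZERO WIDTH JOINER), they start at byte offset 1027, and the requested characters 1 to 256 are all ASCII. The helper names `describe` and `is_valid_utf8` are made up for this illustration, and nothing here is a claim about what the server does internally.

```python
# Rebuild the value from the INSERT above as raw bytes:
# 1027 ASCII bytes, then E2 80 8D, then 1027 ASCII bytes.
ZWJ = b"\xe2\x80\x8d"
value = b"a" * 1027 + ZWJ + b"a" * 1027


def describe(data: bytes) -> dict:
    """Summarize the byte and character layout of a UTF-8 value."""
    text = data.decode("utf-8")
    return {
        "octets": len(data),
        "characters": len(text),
        # 0-based byte offset of the first byte of the multibyte character
        "multibyte_offset": data.index(b"\xe2"),
        "codepoint": f"U+{ord(ZWJ.decode('utf-8')):04X}",
    }


def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if the chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


info = describe(value)
print(info)
# The first 256 bytes are plain ASCII and are valid on their own.
print(is_valid_utf8(value[:256]))
# A byte-level cut that keeps 0xe2 but drops 0x80 0x8d is not valid UTF-8.
print(is_valid_utf8(value[:1028]))
```

The point of the sketch is that a correct substring of the first 256 characters never needs to touch the multibyte character, which is why an error naming 0xe2 is unexpected for this query.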
_Database db_feb:_
Name   | Encoding | Locale Provider | LC_COLLATE | LC_CTYPE | Locale | ICU Rules |
-------+----------+-----------------+------------+----------+--------+-----------+
db_feb | UTF8     | libc            | C          | C        | [NULL] | [NULL]    |
The problem does not appear on PostgreSQL 17.7. Also, the problem does
not occur if the string is fully loaded into memory:
db_feb:vladimir(at)i-db-sandbox1 =# select length(SUBSTRING(substring(m from 1 for length(m)) from 1 for 256)) from test123;
length
--------
256
(1 row)
This bug report has also been sent to bugs(at)postgrespro(dot)ru
--
Best Regards,
Vladimir Valikaev
Streamline - Property Management Software