Re: Character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: Zhongpu Chen <chenloveit(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"
Date: 2026-05-02 01:25:20
Message-ID: CAEG8a3+ithNTBCsuu88tecUDx+VjABkCoOjuKyxgMX29hSTX-g@mail.gmail.com
Lists: pgsql-bugs

On Sat, May 2, 2026 at 12:09 AM Zhongpu Chen <chenloveit(at)gmail(dot)com> wrote:
>
>
> ```
> demo_euc_cn_db=# SET client_encoding TO 'EUC_CN';
> SET
> demo_euc_cn_db=# SELECT * FROM t WHERE id = 1;
> id | s
> ----+----
> 1 | ��
> (1 row)
> ```
>
> Since 0xA2A3 is invalid in EUC-CN, it cannot be mapped to any meaningful character. Currently, EUC-CN validation accepts every 2-byte sequence whose bytes fall within A1-EF, but this coarse-grained approach is flawed.

This seems more like a feature request than a bug. It would make sense
to close the bug report and start a discussion on the hackers mailing
list instead.
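
To make the gap concrete for that discussion, here is a minimal sketch outside the server, assuming Python's bundled "gb2312" codec is an acceptable stand-in for the GB 2312 mapping table. The helper names (`in_euc_range`, `has_gb2312_mapping`) are hypothetical, and the 0xA1-0xFE bounds are an assumption, not a quote from the PostgreSQL source. It contrasts a byte-range check with a mapping-table check for the sequence from the report.

```python
def in_euc_range(data: bytes, lo: int = 0xA1, hi: int = 0xFE) -> bool:
    """Coarse check: ASCII bytes pass; every other character must be a
    2-byte pair with both bytes inside [lo, hi] (bounds are assumed)."""
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain ASCII, single byte.
            i += 1
            continue
        if i + 1 >= len(data):
            # Truncated 2-byte character.
            return False
        if not (lo <= b <= hi and lo <= data[i + 1] <= hi):
            return False
        i += 2
    return True


def has_gb2312_mapping(data: bytes) -> bool:
    """Stricter check: every character must decode through Python's
    'gb2312' codec, used here as a stand-in for the real mapping table."""
    try:
        data.decode("gb2312")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    for seq in (b"\xa2\xa3", b"\xd6\xd0"):
        print(seq.hex(), in_euc_range(seq), has_gb2312_mapping(seq))
```

A sequence that passes the first check but fails the second is one that can be stored and then fail on conversion to UTF8, which is the behavior shown in the report.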

>
> On Fri, May 1, 2026 at 11:07 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
>>
>> On Fri, May 1, 2026 at 9:59 PM Zhongpu Chen <chenloveit(at)gmail(dot)com> wrote:
>> >
>> > ## Description
>> >
>> > The legacy encodings allow some invalid bytes, which will cause errors during SELECT operations.
>> >
>> > ## How to reproduce
>> >
>> > ```shell
>> > createdb -E EUC_CN -T template0 --locale=C demo_euc_cn_db
>> > ```
>> >
>> > ```sql
>> > demo_euc_cn_db=# CREATE TABLE t(id int, s varchar(10));
>> >
>> > demo_euc_cn_db=# INSERT INTO t VALUES(1, E'\xA2\xA3');
>> > INSERT 0 1
>> > demo_euc_cn_db=# SELECT * FROM t WHERE id = 1;
>> > ERROR: character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"
>>
>> Can you try the following statement before select?
>> SET client_encoding TO 'EUC_CN';
>>
>> > ```
>> >
>> > --
>> > Zhongpu Chen
>>
>>
>>
>> --
>> Regards
>> Junwang Zhao
>
>
>
> --
> Zhongpu Chen

--
Regards
Junwang Zhao
