| From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
|---|---|
| To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Cc: | jtvjtv(at)gmail(dot)com |
| Subject: | BUG #19354: JOHAB rejects valid byte sequences |
| Date: | 2025-12-13 18:52:36 |
| Message-ID: | 19354-eefe6d8b3e84f9f2@postgresql.org |
| Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 19354
Logged by: Jeroen Vermeulen
Email address: jtvjtv(at)gmail(dot)com
PostgreSQL version: 18.1
Operating system: Debian unstable x86-64, macOS, Windows, etc.
Description:
Using libpq, I connected to a UTF8 database and successfully set the client
encoding to JOHAB. This statement:
PQexec(connection, "SELECT '\x8a\x5c'");
returned an empty result with this error message:
ERROR: invalid byte sequence for encoding "JOHAB": 0x8a 0x5c
AFAICT, 0x8a 0x5c is a valid JOHAB sequence making up Hangul character "굎".
This is easily verified in Python:
print(b'\x8a\x5c'.decode('johab'))
It's the same story for some other valid sequences I tried, including this
character's "neighbours" 0x8a 0x5b and 0x8a 0x5d.
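The same check can be extended to all three byte pairs mentioned here. This is a minimal sketch, assuming only Python's built-in 'johab' codec; the names SEQUENCES, decode_johab, and check_sequences are made up for illustration and are not part of the original test code. It says nothing about PostgreSQL's own validation, only that an independent JOHAB implementation accepts these bytes and round-trips them.

```python
# Byte pairs taken from the report: the failing sequence and its two neighbours.
SEQUENCES = [b"\x8a\x5b", b"\x8a\x5c", b"\x8a\x5d"]


def decode_johab(seq: bytes) -> str:
    """Decode one JOHAB byte sequence with Python's built-in 'johab' codec."""
    return seq.decode("johab")


def check_sequences(sequences=SEQUENCES):
    """Return {hex bytes: decoded character}.

    Raises UnicodeDecodeError if the codec considers a sequence invalid.
    """
    results = {}
    for seq in sequences:
        char = decode_johab(seq)
        # Round-trip: the codec should map the character back to the same bytes.
        assert char.encode("johab") == seq
        results[seq.hex()] = char
    return results


if __name__ == "__main__":
    for hex_bytes, char in check_sequences().items():
        print(f"0x{hex_bytes} -> {char!r} (U+{ord(char):04X})")
```

If any of the pairs were not valid JOHAB according to Python's codec, the call would raise UnicodeDecodeError instead of printing a character.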
My test code did work with similar two-byte characters in BIG5, GB18030,
UTF-8, SJIS, and UHC. It just breaks with these JOHAB characters on all of
these x86-64 docker images: "archlinux", "debian", "debian:unstable",
"fedora", and "ubuntu". And I got the same results on macOS+homebrew,
Windows+MinGW with pacman-installed postgres, and a native Windows VM with
whatever-postgres-they-preinstall.