| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
| Cc: | assam258(at)gmail(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeroen Vermeulen <jtvjtv(at)gmail(dot)com>, VASUKI M <vasukianand0119(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: BUG #19354: JOHAB rejects valid byte sequences |
| Date: | 2026-04-15 02:06:18 |
| Message-ID: | 1910469.1776218778@sss.pgh.pa.us |
| Lists: | pgsql-bugs |
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> On Wed, Apr 15, 2026 at 1:20 PM Henson Choi <assam258(at)gmail(dot)com> wrote:
>> I understand the appeal of simply deleting a dead-looking encoding,
>> and Thomas' removal patch is clean work. However, Korean archival
>> data from the 1990s (government records, academic repositories, early
>> online corpora) does exist as JOHAB bytes; as a client encoding, JOHAB
>> in PostgreSQL provides a straightforward ingest path
>> (client_encoding=JOHAB, convert_from, then store as UTF-8). Once
>> removed, that path closes with no obvious alternative short of
>> preprocessing outside PostgreSQL. Fixing the verifier preserves the
>> capability at the cost of a ~30-line correction plus tests.
> The counter argument would be that you could use iconv
> --from-code=JOHAB ..., or libiconv, or the codecs available in Python,
> Java, etc for dealing with historical archived data, something that
> data archivists must be very aware of.

Sure. But it's not comfortable to remove a user-visible feature
we've had for decades. My own primary concern about it was that a
correct fix could require non-backwards-compatible behavior changes.
Henson's analysis says that that's not a problem. So assuming this
patch withstands review, I'd be much happier to see it applied than
to remove JOHAB.

No opinion at the moment about whether to back-patch.

regards, tom lane