pgsql: Add direct conversion routines between EUC_TW and Big5.

From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Add direct conversion routines between EUC_TW and Big5.
Date: 2021-01-28 12:56:19
Message-ID: E1l56qR-0005Mf-4n@gemulon.postgresql.org
Lists: pgsql-committers

Add direct conversion routines between EUC_TW and Big5.

Conversions between EUC_TW and Big5 were previously implemented by
converting the whole input to MIC (the MULE_INTERNAL encoding) first, and
then from MIC to the target encoding. Implement functions to convert
directly between the two.

The reason to do this now is that I'm working on a patch that will change
the conversion function signature so that if the input is invalid, we
convert as much as we can and return the number of bytes successfully
converted. That's not possible if we use an intermediary format, because
if an error happens in the intermediary -> final conversion, we lose track
of the location of the invalid character in the original input. Avoiding
the intermediate step makes the conversions faster, too.
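
To illustrate the "convert as much as we can and return the number of bytes
successfully converted" behaviour, here is a minimal sketch. It is not the
PostgreSQL code and does not implement EUC_TW or Big5: the function name
toy_convert_direct, the two-byte toy encoding, and the byte-swapping
"conversion" are all assumptions made up for the example. The point is only
that a single pass over the source can report a stop position that is a
source offset.

```c
#include <stddef.h>

/*
 * Hypothetical toy converter (illustration only, not PostgreSQL's API).
 *
 * Assumed toy source encoding: a byte below 0x80 is a single-byte
 * character; a byte at or above 0x80 starts a two-byte character whose
 * second byte must also be at or above 0x80. The "conversion" simply
 * swaps the two bytes of each two-byte character.
 *
 * Converts as much of src as is valid, NUL-terminates dest, and returns
 * the number of source bytes consumed. A return value smaller than len
 * is the offset in src of the first invalid or truncated character.
 * dest must have room for len + 1 bytes.
 */
static size_t
toy_convert_direct(const unsigned char *src, size_t len, unsigned char *dest)
{
    size_t      in = 0;     /* bytes consumed from src */
    size_t      out = 0;    /* bytes written to dest */

    while (in < len)
    {
        if (src[in] < 0x80)
        {
            /* single-byte character: copy as-is */
            dest[out++] = src[in++];
        }
        else
        {
            /* two-byte character: stop if the second byte is missing or invalid */
            if (in + 1 >= len || src[in + 1] < 0x80)
                break;
            dest[out++] = src[in + 1];
            dest[out++] = src[in];
            in += 2;
        }
    }
    dest[out] = '\0';
    return in;
}
```

With a two-step conversion, the second step would report a position in the
intermediate buffer instead, and because characters can have different
lengths in the intermediate encoding, that position does not map back to a
source offset without extra bookkeeping.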

Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/b9e3167f-f84b-7aa4-5738-be578a4db924%40iki.fi

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/6c5576075b0f93f2235ac8a82290fe3b6e82300d

Modified Files
--------------
.../euc_tw_and_big5/euc_tw_and_big5.c | 144 +++++++++++++++++++--
1 file changed, 134 insertions(+), 10 deletions(-)
