pgsql: Fix comments that claimed that mblen() only looks at first byte.

From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix comments that claimed that mblen() only looks at first byte.
Date: 2019-01-25 12:55:22
Message-ID: E1gn110-0005FJ-2T@gemulon.postgresql.org
Lists: pgsql-committers

Fix comments that claimed that mblen() only looks at first byte.

GB18030's mblen() function looks at both the first and the second byte of
a multibyte character to determine its length. copy.c assumed that mblen()
only looks at the first byte, but that assumption turns out to be harmless
because of the way the GB18030 encoding works. COPY will see a 4-byte
encoded character as two 2-byte encoded characters, which is enough for
COPY's purposes. It cannot mix those up with delimiter or escape
characters, because only single-byte ASCII characters are supported as
delimiters or escape characters.
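
The reasoning above can be illustrated with a small sketch. This is not
the code from copy.c or wchar.c; the function names below are hypothetical,
and the byte ranges are the assumed GB18030 layout (a second byte in
0x30..0x39 marks the 4-byte form). It contrasts a two-byte length check
with a first-byte-only scanner of the kind COPY is described as using:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: length of a GB18030 character, decided from its
 * first two bytes. The caller must ensure s[1] is readable whenever
 * s[0] has the high bit set.
 */
static int
gb18030_char_len(const unsigned char *s)
{
    if (s[0] < 0x80)
        return 1;               /* single-byte ASCII */
    if (s[1] >= 0x30 && s[1] <= 0x39)
        return 4;               /* second byte is an ASCII digit: 4-byte form */
    return 2;                   /* otherwise the 2-byte form */
}

/* Hypothetical first-byte-only view: any high-bit byte starts a 2-byte unit. */
static int
first_byte_only_len(unsigned char first)
{
    return (first < 0x80) ? 1 : 2;
}

/* Count the units a first-byte-only scanner sees in buf[0..n). */
static size_t
count_units_first_byte_only(const unsigned char *buf, size_t n)
{
    size_t      i = 0;
    size_t      units = 0;

    while (i < n)
    {
        i += (size_t) first_byte_only_len(buf[i]);
        units++;
    }
    return units;
}

/*
 * Find the first single-byte ASCII delimiter that sits at a unit start,
 * scanning with the first-byte-only rule. Returns its index, or -1.
 */
static long
find_delim(const unsigned char *buf, size_t n, unsigned char delim)
{
    size_t      i = 0;

    while (i < n)
    {
        if (buf[i] < 0x80 && buf[i] == delim)
            return (long) i;
        i += (size_t) first_byte_only_len(buf[i]);
    }
    return -1;
}
```

For the byte sequence 0x81 0x30 0x81 0x30 followed by "," and "a", the
two-byte check reports one 4-byte character, while the first-byte-only
scanner sees two 2-byte units. Both agree on where the comma is, and the
0x30 bytes are never examined as candidate delimiters, because each one is
skipped as the second half of a 2-byte unit.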

Discussion: https://www.postgresql.org/message-id/7704d099-9643-2a55-fb0e-becd64400dcb%40iki.fi

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/a5be6e9a1dfe820807f9ccb21dec5144982618e6

Modified Files
--------------
src/backend/commands/copy.c | 7 ++++++-
src/backend/utils/mb/wchar.c | 32 +++++++++++++++++++++++++-------
2 files changed, 31 insertions(+), 8 deletions(-)
