From: | 荒井元成 <n2029(at)ndensan(dot)co(dot)jp> |
---|---|
To: | "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Holger Jakobs'" <holger(at)jakobs(dot)com> |
Cc: | <pgsql-admin(at)lists(dot)postgresql(dot)org> |
Subject: | RE: About Unicode IVS |
Date: | 2022-03-29 11:03:45 |
Message-ID: | 013501d8435c$a8f1c9e0$fad55da0$@ndensan.co.jp |
Lists: | pgsql-admin |
Thank you for your reply.
In SQL Server, a base character followed by a variation selector is treated as one character, even though it consists of two code points. The collation is Japanese_XJIS_140_CS_AS_KS_WS_VSS_UTF8.
Moto.
-----Original Message-----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Tuesday, March 29, 2022 7:26 PM
To: Holger Jakobs <holger(at)jakobs(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org; n2029(at)ndensan(dot)co(dot)jp
Subject: Re: About Unicode IVS
Holger Jakobs <holger(at)jakobs(dot)com> writes:
> It's totally correct that the two characters are still two characters.
> You would have to normalize the string first, so that the combination
> becomes one character.
Yeah. In principle the normalize() function ought to do this for you. But it doesn't seem to shorten the given example for me; I'm not sure if that means the example is incorrect, or if it's a bug in normalize().
u8=# select octet_length(U&'\+008FBA' || U&'\+0E0102');
 octet_length
--------------
            7
(1 row)
u8=# select octet_length(normalize(U&'\+008FBA' || U&'\+0E0102'));
 octet_length
--------------
            7
(1 row)
regards, tom lane
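As a cross-check outside PostgreSQL, the same sequence can be run through Python's standard `unicodedata` module. This is only a sketch: the `describe` helper and the variable names are made up for illustration, and it assumes Python's Unicode tables are a fair reference point. U+E0102 is a variation selector with no canonical composition with the preceding ideograph, so none of the normalization forms should merge the pair, which would suggest that the unchanged length of 7 bytes in the psql example above is expected behaviour rather than a bug in normalize().

```python
import unicodedata

# U+8FBA (a CJK ideograph) followed by U+E0102 (VARIATION SELECTOR-19),
# the same sequence as in the psql example above.
ivs = "\u8fba\U000E0102"


def describe(s):
    """Return (number of code points, UTF-8 byte length) for s."""
    return len(s), len(s.encode("utf-8"))


# Measure the raw string, then the result of each normalization form.
raw = describe(ivs)
normalized = {
    form: describe(unicodedata.normalize(form, ivs))
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

print("raw:", raw)
for form, result in normalized.items():
    print(form + ":", result)
```

Every form reports the same two code points and 7 bytes as the raw string (3 bytes for U+8FBA plus 4 bytes for U+E0102). Counting the pair as one user-perceived character, as the SQL Server collation mentioned earlier in the thread does, is a separate matter from normalization.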