Re: UNICODE characters above 0x10000

From: "John Hansen" <john(at)geeknet(dot)com(dot)au>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: UNICODE characters above 0x10000
Date: 2004-08-07 03:04:21
Message-ID: 5066E5A966339E42AA04BA10BA706AE56085@rodrick.geeknet.com.au
Lists: pgsql-hackers pgsql-patches

My apologies for not reading the code properly.

Attached is a patch that uses pg_utf_mblen() instead of an indexed table.
It now also does bounds checks.
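
For readers without the attachment, here is a minimal sketch of the idea, not the patch itself. It takes the sequence length from the lead byte, refuses any sequence that would run past the end of the buffer, and checks that every trailing byte is a continuation byte. The names utf8_seq_len() and utf8_valid() are hypothetical; utf8_seq_len() only stands in for pg_utf_mblen() and is assumed to follow the same lead-byte rules. The sketch checks byte shape only and does not reject overlong encodings or surrogates.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_utf_mblen(): the sequence length implied
 * by the lead byte. Anything that is not a valid lead byte reports 1.
 */
static int
utf8_seq_len(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00)
        return 1;               /* ASCII */
    if ((*s & 0xe0) == 0xc0)
        return 2;
    if ((*s & 0xf0) == 0xe0)
        return 3;
    if ((*s & 0xf8) == 0xf0)
        return 4;               /* code points 0x10000 and above */
    return 1;                   /* continuation or invalid lead byte */
}

/*
 * Returns 1 if the len bytes at s are well-shaped UTF-8, 0 otherwise.
 * Never reads at or beyond s + len.
 */
static int
utf8_valid(const unsigned char *s, size_t len)
{
    size_t      i = 0;

    while (i < len)
    {
        int         l = utf8_seq_len(s + i);
        int         k;

        if (l == 1)
        {
            /* a single byte with the high bit set is never valid */
            if (s[i] & 0x80)
                return 0;
        }
        else
        {
            /* bounds check: the whole sequence must fit in the buffer */
            if ((size_t) l > len - i)
                return 0;
            /* every trailing byte must look like 10xxxxxx */
            for (k = 1; k < l; k++)
            {
                if ((s[i + k] & 0xc0) != 0x80)
                    return 0;
            }
        }
        i += (size_t) l;
    }
    return 1;
}
```

With this shape, the four-byte encoding of 0x10000 (f0 90 80 80) is accepted, while the same sequence cut off after three bytes is rejected without reading past the buffer.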

Regards,

John Hansen

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Saturday, August 07, 2004 4:37 AM
To: John Hansen
Cc: Hackers; Patches
Subject: Re: [HACKERS] UNICODE characters above 0x10000

"John Hansen" <john(at)geeknet(dot)com(dot)au> writes:
> Attached, as promised, small patch removing the limitation, adding
> correct utf8 validation.

Surely this is badly broken --- it will happily access data outside the
bounds of the given string. Also, doesn't pg_mblen already know the
length rules for UTF8? Why are you duplicating that knowledge?

regards, tom lane

Attachment Content-Type Size
wchar.c.patch application/octet-stream 2.1 KB
