Re: like/ilike improvements

From: "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: <andrew(at)supernews(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: like/ilike improvements
Date: 2007-05-25 08:16:59
Message-ID: E1539E0ED7043848906A8FF995BDA579021B259E@m0143.s-mxs.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


> > However, I have just about convinced myself that we don't need
> > IsFirstByte for matching "_" for UTF8, either preceded by "%" or
not,
> > as it should always be true. Can anyone come up with a counter
example?
>
> You have to be on a first byte before you can meaningfully
> apply NextChar, and you have to use NextChar or else you
> don't count characters correctly (eg "__" must match 2 chars
> not 2 bytes).

Well, for utf8 NextChar could advance to the next char even if the
current byte
position is in the middle of a multibyte char (skip over all 10xxxxxx).

(Assuming utf16 surrogate pairs are not encoded as 2 x 3bytes, which is
not valid utf8 anyway)

Andreas

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jaime Casanova 2007-05-25 10:49:29 Re: Reviewing temp_tablespaces GUC patch
Previous Message Guillaume Smet 2007-05-25 07:36:38 Re: Why not keeping positions in GIN?

Browse pgsql-patches by date

  From Date Subject
Next Message Andrew Dunstan 2007-05-25 10:55:32 Re: like/ilike improvements
Previous Message mark 2007-05-25 05:20:16 Re: like/ilike improvements