Re: UTF8MatchText

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject: Re: UTF8MatchText
Date: 2007-05-17 18:16:51
Message-ID: 4800.1179425811@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> Wait a second ... I just thought of a counterexample that destroys the
>> entire concept. Consider the pattern 'A__B', which clearly is supposed
>> to match strings of four *characters*. With the proposed patch in
>> place, it would match strings of four *bytes*. Which is not the correct
>> behavior.

> From what I can see the code is quite careful about when it calls
> NextByte vs NextChar, and after _ it calls NextChar.

Except that the entire point of this patch is to dumb down NextChar to
be the same as NextByte for UTF8 strings.
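
To make the counterexample concrete, here is a minimal sketch, not the actual
like_match.c code. The names toy_like and utf8_char_len are hypothetical, the
matcher handles only literal bytes and '_' (no '%', no escapes), and it assumes
valid NUL-terminated UTF-8 input. The flag chooses whether '_' advances by one
character or by one byte, which is roughly what collapsing NextChar into
NextByte would amount to.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Byte length of the UTF-8 character starting at s.
 * Assumes s points at the lead byte of a valid sequence.
 */
static size_t
utf8_char_len(const unsigned char *s)
{
    if (*s < 0x80)
        return 1;               /* ASCII */
    if ((*s & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((*s & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    return 4;                   /* 11110xxx */
}

/*
 * Hypothetical toy matcher: literal bytes and '_' only.
 * underscore_by_char = true  -> '_' consumes one whole character
 * underscore_by_char = false -> '_' consumes one byte
 */
static bool
toy_like(const char *text, const char *pat, bool underscore_by_char)
{
    const unsigned char *t = (const unsigned char *) text;
    const unsigned char *p = (const unsigned char *) pat;

    while (*p != '\0' && *t != '\0')
    {
        if (*p == '_')
            t += underscore_by_char ? utf8_char_len(t) : 1;
        else if (*p == *t)
            t++;                /* literal match, one byte at a time */
        else
            return false;
        p++;
    }

    /* Match only if pattern and text are exhausted together. */
    return *p == '\0' && *t == '\0';
}
```

With the text "A\xC3\xA9" "B" (three characters, four bytes), the byte-wise
mode accepts 'A__B' and the character-wise mode rejects it. With
"A\xC3\xA9\xC3\xA8" "B" (four characters, six bytes), it is the other way
around. The string literals are split before "B" so the hex escape does not
swallow it.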

regards, tom lane
