Re: UTF8MatchText

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: UTF8MatchText
Date:	2007-05-17 17:48:10
Message-ID:	464C955A.6050402@dunslane.net
Lists:	pgsql-hackers pgsql-patches

Tom Lane wrote:
> UTF8 has disjoint representations for
> first-bytes and not-first-bytes of MB characters, and thus it is
> impossible to make a false match in which an MB pattern character is
> matched to the end of one data character plus the start of another.
> In character sets without that property, we have to use the slow way to
> ensure we don't make out-of-sync matches.

Thanks. I will include this info in the comments.

cheers

andrew
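
For readers of the archive, the property Tom describes can be sketched in a few lines. This is only an illustration, not the PostgreSQL code: the helper names (`is_utf8_continuation`, `bytewise_find`) and the sample strings are assumptions chosen for the demo, and EUC_JP stands in for "a character set without that property".

```python
# Illustrative sketch only; helper names and sample data are hypothetical.

def is_utf8_continuation(byte: int) -> bool:
    """True for UTF-8 not-first bytes, which always look like 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def bytewise_find(data: str, pattern: str, encoding: str) -> int:
    """Byte offset of the first byte-wise match of pattern in data, or -1."""
    return data.encode(encoding).find(pattern.encode(encoding))


DATA = "\u3045\u3042"   # two hiragana characters: small U, then A
PATTERN = "\u30a4"      # katakana I, which does not occur in DATA

# In UTF-8 the first byte of any character is never a continuation byte,
# and every later byte of a multibyte character always is one.
for ch in DATA + PATTERN + "a":
    encoded = ch.encode("utf-8")
    assert not is_utf8_continuation(encoded[0])
    assert all(is_utf8_continuation(b) for b in encoded[1:])

# EUC_JP: both bytes of a two-byte character come from the same range, so
# the second byte of one character plus the first byte of the next can
# look like a third character. The search reports a false, out-of-sync hit.
print(bytewise_find(DATA, PATTERN, "euc_jp"))   # 1

# UTF-8: a pattern starts with a first-byte, which can never equal a
# continuation byte in the data, so no out-of-sync match is possible.
print(bytewise_find(DATA, PATTERN, "utf-8"))    # -1
```

The EUC_JP hit at offset 1 sits in the middle of the first character, which is exactly the kind of match the slow character-by-character path has to rule out.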

In response to

Re: UTF8MatchText at 2007-05-17 17:33:08 from Tom Lane
