Re: LIKE optimization in UTF-8 and locale-C

From:	Hannu Krosing <hannu(at)skype(dot)net>
To:	andrew(at)supernews(dot)com
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: LIKE optimization in UTF-8 and locale-C
Date:	2007-03-25 18:18:19
Message-ID:	1174846699.3344.8.camel@localhost.localdomain
Lists:	pgsql-hackers pgsql-patches

On one fine day, Fri, 2007-03-23 at 06:10, Andrew - Supernews wrote:
> On 2007-03-23, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
> > Thanks, it all made sense to me. My proposal was completely wrong.
>
> Actually, I think your proposal is fundamentally correct, merely incomplete.
>
> Doing octet-based rather than character-based matching of strings is a
> _design goal_ of UTF8. Treating UTF8 like any other multibyte charset and
> converting everything to wide-chars is, in my opinion, always going to
> result in suboptimal performance.

Yes, that was what I meant by proposing a UTF-8-specific UTF8MatchText(),
which should not convert everything to wide chars, but instead do
byte-by-byte comparison and just be aware of the UTF-8 encoding, where it is
easy to know how wide (how many bytes/octets) each encoded character
is.
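
To make the idea concrete, here is a rough sketch of what such byte-wise matching could look like. It is only an illustration, not the actual backend code: the names utf8_char_len() and utf8_like() are made up for this example, the input is assumed to be valid UTF-8, and escape characters and case-insensitive matching are left out.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes in a UTF-8 character, derived
 * from its lead byte alone. Assumes valid UTF-8; a stray continuation
 * or invalid byte is treated as one byte wide so we always advance.
 */
static int utf8_char_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;
}

/*
 * Hypothetical sketch of LIKE matching done directly on bytes.
 * '%' matches any sequence of characters, '_' matches exactly one
 * character (however many bytes it occupies), and everything else is
 * compared byte by byte. Returns 1 on match, 0 otherwise.
 */
static int utf8_like(const char *t, size_t tlen, const char *p, size_t plen)
{
    while (plen > 0)
    {
        if (*p == '%')
        {
            p++;
            plen--;
            if (plen == 0)
                return 1;       /* trailing % matches the rest */

            /* Try the rest of the pattern at each character boundary. */
            while (tlen > 0)
            {
                size_t w;

                if (utf8_like(t, tlen, p, plen))
                    return 1;
                w = (size_t) utf8_char_len((unsigned char) *t);
                if (w > tlen)
                    w = tlen;   /* truncated character at end of text */
                t += w;
                tlen -= w;
            }
            /* Text exhausted: the rest must match the empty string. */
            return utf8_like(t, 0, p, plen);
        }

        if (tlen == 0)
            return 0;           /* pattern wants more, text is done */

        if (*p == '_')
        {
            /* Skip one whole character without decoding it. */
            size_t w = (size_t) utf8_char_len((unsigned char) *t);

            if (w > tlen)
                w = tlen;
            t += w;
            tlen -= w;
            p++;
            plen--;
        }
        else
        {
            /* Literal: plain octet comparison, no wide-char conversion. */
            if (*p != *t)
                return 0;
            p++;
            plen--;
            t++;
            tlen--;
        }
    }
    return tlen == 0;
}
```

With this sketch, the text "OÜ" (three bytes in UTF-8) matches the pattern "O_" but not "O__", because '_' consumes the whole two-byte character. The only place the encoding matters is when stepping over a character for '_' or '%'; literal parts of the pattern never need decoding.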

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

In response to

Re: LIKE optimization in UTF-8 and locale-C at 2007-03-23 06:10:39 from Andrew - Supernews
