Re: LIKE optimization in UTF-8 and locale-C

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject: Re: LIKE optimization in UTF-8 and locale-C
Date: 2007-03-22 20:11:09
Message-ID: 1174594269.3826.6.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-patches

On a fine day, Thu, 2007-03-22 at 11:08, Tom Lane wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > I found LIKE operators are slower on multi-byte encoding databases
> > than single-byte encoding ones. It comes from difference between
> > MatchText() and MBMatchText().
>
> > We've had an optimization for single-byte encodings using
> > pg_database_encoding_max_length() == 1 test. I'll propose to extend it
> > in UTF-8 with locale-C case.
>
> If this works for UTF8, won't it work for all the backend-legal
> encodings?

I guess it works well for % but not for _; the latter has to know how
many bytes the current (multibyte) character covers.

The length is still easy to find out for UTF8 encoding, so it may be
feasible to write UTF8MatchText() that is still faster than
MBMatchText().
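
To make the idea concrete, here is a rough sketch, not the actual backend
code. The names utf8_char_len(), utf8_next() and utf8_like() are made up
for illustration; the matcher handles only % and _ (no escape character)
and assumes the input is valid UTF-8. Literal bytes are compared bytewise,
and only _ (and the retry loop for %) needs the character length, which
comes from the lead byte alone.

```c
#include <stdbool.h>
#include <stddef.h>

/* Byte length of a UTF-8 character, derived from its lead byte alone.
 * Hypothetical helper; assumes valid UTF-8 input. */
static size_t
utf8_char_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;                   /* stray continuation byte: step one byte */
}

/* Advance to the next character, never stepping past the terminator. */
static const char *
utf8_next(const char *s)
{
    size_t      n = utf8_char_len((unsigned char) *s);

    while (n-- > 0 && *s != '\0')
        s++;
    return s;
}

/* Simplified LIKE: supports % and _ only, no escape handling. */
static bool
utf8_like(const char *t, const char *p)
{
    while (*p != '\0')
    {
        if (*p == '%')
        {
            while (*p == '%')
                p++;            /* collapse runs of % */
            if (*p == '\0')
                return true;    /* trailing % matches the rest */
            for (;;)
            {
                /* retry the rest of the pattern at each character start */
                if (utf8_like(t, p))
                    return true;
                if (*t == '\0')
                    return false;
                t = utf8_next(t);
            }
        }
        if (*t == '\0')
            return false;
        if (*p == '_')
        {
            t = utf8_next(t);   /* _ consumes one whole character */
            p++;
        }
        else
        {
            if (*t != *p)
                return false;   /* literal: plain byte comparison */
            t++;
            p++;
        }
    }
    return *t == '\0';
}
```

With this, utf8_like("\xC3\xA9", "_") is true while the same text against
"__" is false, which is the case a purely bytewise matcher would get wrong.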

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com
