Re: [HACKERS] like/ilike improvements

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] like/ilike improvements
Date: 2007-06-01 22:58:18
Message-ID: 25310.1180738698@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> ITAGAKI Takahiro wrote:
>>         | SQL_ASCII | LATIN1 | UTF8  | EUC_JP
>> --------+-----------+--------+-------+--------
>> HEAD    |      8017 |   8029 | 16928 |  18213
>> Patched |      7899 |   7887 |  9985 |  10370 [ms]
>>
>> It improved the performance not only for UTF8, but also for other
>> multi-byte encodings and a bit for single-byte encodings.

> Interesting. I infer from these results that the biggest bang here comes
> from abandoning CHAREQ and doing all comparisons byte-wise.

It looks like CHAREQ and NextChar are both pretty expensive, no doubt
due to having to drill down through the MB encoding vectoring mechanism
to find out what to do.

A technique we might want to apply in future patches is to have an API
whereby we can get a direct function pointer to the appropriate mblen
or other encoding-dependent function, and then call directly to the
right place in the inner loops instead of having to go through the
intermediate vectoring function every time.
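
To make the idea concrete, here is a rough, self-contained sketch of what such an API could look like. It is not PostgreSQL's actual code: every name in it (mblen_fn, toy_mblen_table, toy_get_mblen_fn, the TOY_ENC_* ids, and the two-entry table) is hypothetical and invented for illustration. The first loop goes through an intermediate vectoring function on every character; the second fetches the function pointer once and calls it directly in the inner loop.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this is a toy model, not the backend API. */
typedef int (*mblen_fn) (const unsigned char *s);

enum { TOY_ENC_SINGLE_BYTE = 0, TOY_ENC_UTF8 = 1, TOY_ENC_COUNT = 2 };

/* Single-byte encodings: every character is one byte. */
static int toy_mblen_single(const unsigned char *s) { (void) s; return 1; }

/* UTF-8: character length is determined by the lead byte. */
static int toy_mblen_utf8(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    if ((*s & 0xF8) == 0xF0) return 4;
    return 1;                   /* invalid lead byte: step over one byte */
}

static const mblen_fn toy_mblen_table[TOY_ENC_COUNT] = {
    toy_mblen_single, toy_mblen_utf8
};

static int toy_current_encoding = TOY_ENC_UTF8;

/* Intermediate vectoring function: looks up the encoding on every call. */
static int toy_vectored_mblen(const unsigned char *s)
{
    return toy_mblen_table[toy_current_encoding](s);
}

/* Sketch of the suggested API: hand back the function pointer itself. */
static mblen_fn toy_get_mblen_fn(int enc)
{
    assert(enc >= 0 && enc < TOY_ENC_COUNT);
    return toy_mblen_table[enc];
}

/* Inner loop that pays for the vectoring on each character. */
static size_t count_chars_vectored(const unsigned char *s, size_t len)
{
    size_t i = 0, n = 0;

    while (i < len) { i += (size_t) toy_vectored_mblen(s + i); n++; }
    return n;
}

/* Inner loop that resolves the function once, then calls it directly. */
static size_t count_chars_direct(const unsigned char *s, size_t len, int enc)
{
    mblen_fn mblen = toy_get_mblen_fn(enc);    /* one lookup, outside the loop */
    size_t i = 0, n = 0;

    while (i < len) { i += (size_t) mblen(s + i); n++; }
    return n;
}
```

Both loops return the same count; the only difference is where the encoding lookup happens. In this toy the saving is a single table index per character, so it says nothing about how large the gain would be in the real backend.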

regards, tom lane
