
Multibyte LIKE optimization

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-patches(at)postgresql(dot)org, andrew+nonews(at)supernews(dot)com
Subject: Multibyte LIKE optimization
Date: 2007-03-30 08:40:08
Message-ID: 20070330142456.77F9.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Lists: pgsql-hackers, pgsql-patches
Andrew - Supernews <andrew+nonews(at)supernews(dot)com> wrote:

> Actually, I think your proposal is fundamentally correct, merely incomplete.

Yeah, I fixed the patch to handle '_' correctly.

> Doing octet-based rather than character-based matching of strings is a
> _design goal_ of UTF8.

I think all "safe ASCII-superset" encodings can be compared byte by byte,
not only UTF-8: every byte of their multibyte characters is larger than
127. I updated the patch on this assumption. It normally uses octet-based
matching and switches to character-based matching only at '_'.
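
To illustrate the idea described above, here is a minimal sketch in C. It is not the code in the attached patch: the names utf8_char_len, like_match and like_match_str are hypothetical, only UTF-8 is covered, input is assumed to be valid UTF-8, and escape characters are ignored. Literal pattern bytes are compared octet by octet; only '_' (and the retry step for '%') needs to know where a character ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte length of the UTF-8 character starting at s.
 * Assumes valid UTF-8 input. */
static size_t utf8_char_len(const unsigned char *s)
{
    if (*s < 0x80) return 1;            /* ASCII */
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    return 4;
}

/* Simplified LIKE matcher: '%' matches any run of characters, '_' matches
 * exactly one character, everything else is compared byte by byte. */
static bool like_match(const unsigned char *t, size_t tlen,
                       const unsigned char *p, size_t plen)
{
    while (plen > 0)
    {
        if (*p == '%')
        {
            p++; plen--;
            if (plen == 0)
                return true;            /* trailing % matches the rest */
            for (;;)
            {
                /* try the rest of the pattern at each character boundary */
                if (like_match(t, tlen, p, plen))
                    return true;
                if (tlen == 0)
                    return false;
                size_t n = utf8_char_len(t);
                if (n > tlen) n = tlen; /* guard against truncated input */
                t += n; tlen -= n;
            }
        }
        if (tlen == 0)
            return false;
        if (*p == '_')
        {
            /* character-based step: skip one whole character */
            size_t n = utf8_char_len(t);
            if (n > tlen) n = tlen;
            t += n; tlen -= n;
            p++; plen--;
        }
        else
        {
            /* octet-based step: plain byte comparison */
            if (*p != *t)
                return false;
            p++; plen--; t++; tlen--;
        }
    }
    return tlen == 0;
}

/* Convenience wrapper for NUL-terminated strings. */
static bool like_match_str(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);
    return like_match((const unsigned char *) text, strlen(text),
                      (const unsigned char *) pattern, strlen(pattern));
}
```

For example, like_match_str("a\xC3\xB1" "b", "a_b") is true because '_' consumes both bytes of the two-byte character, while the 'a' and 'b' comparisons never look at character boundaries.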

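With the patch, a selection using LIKE '%foo%' on multibyte text took roughly
30% less time. The percentages in parentheses are the patched time relative
to HEAD; the single-byte encodings are essentially unchanged.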

 encoding  |  HEAD   | patched
-----------+---------+---------
 SQL_ASCII |  7094ms |  7062ms
 LATIN1    |  7083ms |  7078ms
 UTF8      | 17974ms | 11635ms (64.7%)
 EUC_JP    | 17032ms | 12109ms (71.1%)


If this patch is acceptable, please drop the JOHAB encoding from the server
encodings before it is applied. Trailing bytes in JOHAB can be less than 128.
http://archives.postgresql.org/pgsql-hackers/2007-03/msg01475.php
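
The problem with trailing bytes below 128 can be sketched as follows. This is an illustration only: toy_char_len is a hypothetical rule (any byte >= 0x80 starts a two-byte character), and the byte pair 0x88 0x61 is an assumed stand-in for a JOHAB-style character whose trailing byte falls in the ASCII range. A byte-wise search finds the ASCII letter 'a' inside that character, while a search that only tests character boundaries does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Byte-wise substring search: ignores character boundaries entirely. */
static bool contains_bytes(const unsigned char *t, size_t tlen,
                           const unsigned char *p, size_t plen)
{
    assert(t != NULL && p != NULL);
    for (size_t i = 0; i + plen <= tlen; i++)
        if (memcmp(t + i, p, plen) == 0)
            return true;
    return false;
}

/* Hypothetical length rule for the toy encoding used here:
 * a byte >= 0x80 starts a two-byte character, anything else is ASCII. */
static size_t toy_char_len(unsigned char lead)
{
    return lead >= 0x80 ? 2 : 1;
}

/* Character-aware search: only tests positions where a character starts. */
static bool contains_chars(const unsigned char *t, size_t tlen,
                           const unsigned char *p, size_t plen)
{
    assert(t != NULL && p != NULL);
    size_t i = 0;
    while (i + plen <= tlen)
    {
        if (memcmp(t + i, p, plen) == 0)
            return true;
        i += toy_char_len(t[i]);
    }
    return false;
}
```

With the text bytes {0x88, 0x61} and the pattern "a", contains_bytes reports a match in the middle of the two-byte character, whereas contains_chars does not. In an encoding where every byte of a multibyte character is above 127, an ASCII pattern byte can never collide with a trailing byte this way.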

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


Attachment: mbtextmatch.patch
Description: application/octet-stream (6.7 KB)
