Re: like/ilike improvements

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: like/ilike improvements
Date: 2007-05-22 17:01:14
Message-ID: 14707.1179853274@sss.pgh.pa.us
Lists: pgsql-hackers, pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Tom Lane wrote:
>> I thought we'd determined that advancing bytewise for "%" was also 
>> risky, in two cases:
>> 
>> 1. Multibyte character set that is not UTF8 (more specifically, does not
>> have a guarantee that first bytes and not-first bytes are distinct)

> I thought we disposed of the idea that there was a problem with charsets 
> that didn't do first byte special.

We disposed of that in connection with a version of the patch that had
"%" advancing in NextChar units, so that comparison of ordinary
characters was always safely char-aligned.  Consider 2-byte characters
represented as {AB} etc:

	DATA	x{AB}{CD}y

	PATTERN	%{BC}%

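If "%" advances by bytes then this will find a spurious match, because
the bytes "B" and "C" are adjacent in the data even though they belong
to two different characters.  The only thing that prevents it is if "B"
can't be both a leading and a trailing byte of validly-encoded MB
characters.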
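
The DATA/PATTERN case above can be reproduced with a minimal sketch. It
is not the PostgreSQL code: the toy encoding (any byte >= 0x80 leads a
2-byte character, and the second byte is unrestricted), the byte values
chosen for A, B, C, D, and the helper names char_len, contains_bytewise
and contains_charwise are all assumptions made for illustration. The
pattern %{BC}% is treated as a plain substring search.

```python
def char_len(data: bytes, i: int) -> int:
    """Length of the character starting at offset i in a toy encoding
    (assumption): a byte >= 0x80 leads a 2-byte character, else 1 byte."""
    return 2 if data[i] >= 0x80 else 1


def contains_bytewise(data: bytes, needle: bytes) -> bool:
    """Match '%needle%' while letting '%' advance one byte at a time."""
    last_start = len(data) - len(needle)
    return any(data.startswith(needle, i) for i in range(last_start + 1))


def contains_charwise(data: bytes, needle: bytes) -> bool:
    """Match '%needle%' while letting '%' advance one whole character at a
    time, so comparisons only start on character boundaries."""
    i = 0
    while i + len(needle) <= len(data):
        if data.startswith(needle, i):
            return True
        i += char_len(data, i)
    return False


# Hypothetical byte values; each can be a leading or a trailing byte here.
A, B, C, D = 0x81, 0x82, 0x83, 0x84

data = b"x" + bytes([A, B, C, D]) + b"y"   # x{AB}{CD}y
needle = bytes([B, C])                     # the {BC} in %{BC}%

bytewise_result = contains_bytewise(data, needle)   # spurious match
charwise_result = contains_charwise(data, needle)   # no match
```

In this sketch the bytewise search reports a match at the offset that
falls in the middle of {AB}, while the character-aligned search never
starts a comparison there. In an encoding such as UTF8, where trailing
bytes are distinct from leading bytes, {BC} could not be a valid
character in the first place, which is why the bytewise shortcut is
safe there.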

			regards, tom lane
