Skip site navigation (1) Skip section navigation (2)

Re: LIKE optimization and locale

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LIKE optimization and locale
Date: 2000-11-26 18:16:58
Message-ID: 5047.975262618@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackers
Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> The real
> problem lies with multi-character collating elements, context dependent
> collation order, multi-pass sorting algorithms, etc.  I'm almost convinced
> that it is not possible to do any such optimization as we had for the most
> general case.

It's not impossible, surely.  The question is whether it's practical,
particularly from a manpower and maintenance standpoint.  We do not
want to build something that needs explicit knowledge of a ton of
locale-specific special cases.

The core problem is: given a string "foo", find a string "fop" that
is greater than any possible extension "foobar" of "foo".  We need
not find the least such string (else it would indeed be a hard
problem), just a reasonably close upper bound.  The algorithm we have
in 7.0.* increments the last byte(s) of "foo" until it finds
something greater than "foo".  That handles collation orders that are
different from numerical order, but it still breaks down in the cases
Peter mentions.

One variant I've been wondering about is to test a candidate bound
string against not only "foo", but all single-character extensions of
"foo", ie, "foo\001" through "foo\255".  That would catch situations
like the one most recently complained of, where the last character
of the proposed bound string is just a noise-character in dictionary
order.  But I'm afraid it's still not good enough to catch all cases
... and it doesn't generalize to MULTIBYTE very well anyway.

			regards, tom lane

In response to

Responses

pgsql-hackers by date

Next:From: Peter EisentrautDate: 2000-11-26 18:21:58
Subject: Re: tcl/FreeBSD 4.2-STABLE, multiple TCL versions installed
Previous:From: Oleg BartunovDate: 2000-11-26 14:55:58
Subject: RE: [HACKERS] Indexing for geographic objects?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group