Re: An idea on faster CHAR field indexing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Randall Parker" <randall(at)nls(dot)net>
Cc: "Giles Lean" <giles(at)nemeton(dot)com(dot)au>, "PostgreSQL-Dev" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: An idea on faster CHAR field indexing
Date: 2000-06-22 03:01:16
Message-ID: 7014.961642876@sss.pgh.pa.us
Lists: pgsql-hackers

"Randall Parker" <randall(at)nls(dot)net> writes:
> On Thu, 22 Jun 2000 11:12:54 +1000, Giles Lean wrote:
>> Yes. Some locales want strings to be ordered first by ignoring any
>> accents on characters, then using a tie-break on equal strings by doing
>> a comparison that includes the accents.

> I guess I don't see how this is really any different. Why order first
> by the character and second by the accent? For instance, if you know
> the relative order of the various forms of "o" then just give them all
> successive numbers and do a single pass sort. You just have to make
> sure that all the numbers in that set of numbers are greater than the
> number you assign to "m" and less than the number you assign to "p".

Nope. Would it were that easy. I don't have a keyboard that will
let me type a proper example, but consider

1. a o-with-type-1-accent c

2. a o-with-type-2-accent b

If type-1 accent sorts before type-2 then your proposal will consider
string 1 less than string 2. But the correct answer (in these locales)
is the other way round, because you mustn't look at the accents at all
unless you discover that the strings are otherwise equal. The
determining comparison here is that b < c, therefore string 2 < string 1.
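
To make the counterexample concrete, here is a rough Python sketch of the two orderings. It is only an illustration, not how any real locale implementation works: the helper names (split_char, single_pass_key, two_level_key) are made up for this example, o-circumflex and o-diaeresis stand in for the "type-1" and "type-2" accents, and the code-point order of the combining marks is assumed to be the accent order.

```python
import unicodedata


def split_char(ch):
    """Split one character into (base letters, combining marks)."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    marks = "".join(c for c in decomposed if unicodedata.combining(c))
    return base, marks


def single_pass_key(s):
    """One weight per character: accented forms of a letter sort next to it.

    This models the proposed single-pass scheme. Each character becomes a
    (base, marks) pair, and the pairs are compared left to right.
    """
    return [split_char(ch) for ch in unicodedata.normalize("NFC", s)]


def two_level_key(s):
    """Compare all base letters first; consult accents only on a tie."""
    pairs = [split_char(ch) for ch in unicodedata.normalize("NFC", s)]
    bases = [base for base, _ in pairs]
    marks = [mark for _, mark in pairs]
    return (bases, marks)


# Assumption: U+0302 (circumflex) plays the type-1 accent and
# U+0308 (diaeresis) plays the type-2 accent.
s1 = "a\u00f4c"  # a, o-with-type-1-accent, c
s2 = "a\u00f6b"  # a, o-with-type-2-accent, b

# The single-pass scheme decides at the accented "o" and puts string 1 first.
single_pass_order = sorted([s1, s2], key=single_pass_key)

# The two-level scheme ignores accents until the base letters tie,
# so b < c decides and string 2 comes first.
two_level_order = sorted([s1, s2], key=two_level_key)
```

With these keys, single_pass_order puts string 1 first while two_level_order puts string 2 first, which is the disagreement described above. The accents only matter in the two-level scheme when the base letters are identical, for example when comparing "a\u00f4b" with "a\u00f6b".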

regards, tom lane
