From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Artur Rataj <arataj(at)iitis(dot)gliwice(dot)pl> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Comparing strings with non-ASCII characters |
Date: | 2002-11-13 14:58:22 |
Message-ID: | 21525.1037199502@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Artur Rataj <arataj(at)iitis(dot)gliwice(dot)pl> writes:
> Anyway, why the index optimizations are disabled for the ~ operator
> and not for the < or > operators in the locale?
The < and > operators are compatible with the index ordering, by
definition. The trouble with ~ is that it is not doing ordering,
but pattern matching. You can exploit an index to narrow down
the candidates for a left-anchored pattern only when the index
ordering is strict lexicographic ... and in all too many non-C
locales, it ain't. Digraphs, multipass sorting rules, and things
like that are killers.
We have tried and failed (repeatedly) to find a way to use non-C
indexes for LIKE and ~. If you'd like to try again, you're more
than welcome, but I suggest you read the PG list archives to learn
the reasons why previous attempts didn't work.
The long-run answer to this will probably be to allow individual columns
to be declared as being of particular locales; then you could assign C
locale to a column you need to do pattern matching on. This seems to
require building our own locale library :-( ... so it's not going to
happen quickly.
BTW, you might think about using full-text search (contrib/tsearch
or contrib/fulltextindex) instead of pattern matching.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Stephan Szabo | 2002-11-13 15:13:03 | Re: index question.. |
Previous Message | Areski | 2002-11-13 14:42:13 | Re: Performance : Optimize query |