| From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, Andrew Sullivan <andrew(at)libertyrms(dot)info>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: default locale considered harmful? (was Re: [GENERAL] | 
| Date: | 2003-05-31 22:18:39 | 
| Message-ID: | 200305312218.h4VMIee21738@candle.pha.pa.us | 
| Lists: | pgsql-general pgsql-hackers | 
Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > So, my understanding is that you would create something such as:
> > 	CREATE INDEX iix ON tab (LIKE col)
> > and that does LIKE lookups and knows how to do col LIKE 'abc%', but it
> > can't be used for >= or ORDER BY, but it can be used for equality tests?
> 
> Hm.  Right at the moment, it wouldn't be used for equality tests unless
> you spelled equality as "a ~=~ b".  I wonder whether that's necessary
> though; couldn't we dispense with that operator and use ordinary
> equality as the BTEqual member of these opclasses?  Are there any
> locales that claim that not-physically-identical strings are equal?
Let me see if I understand.
Our default indexes will be able to do =, >, <, ORDER BY, and the
special index will be able to do LIKE, ORDER BY, and maybe equals.  Do I
have that correct?
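For reference, here is a minimal Python sketch of why a prefix pattern such as col LIKE 'abc%' can use a btree index when strings compare byte by byte (as in the C locale): the pattern reduces to a range scan between two bounds. The function name like_prefix_range is hypothetical, the sketch assumes the pattern contains no backslash escapes, and it is not the backend's actual code. In a locale whose collation is not byte-wise, strings sharing the prefix are not guaranteed to sort between these bounds, which is why the ordinary index cannot be used there.

```python
def like_prefix_range(pattern: bytes):
    """Return (lower, upper) so that, under plain byte-wise ordering,
    any string matching the LIKE pattern satisfies lower <= s < upper.
    upper is None when no finite upper bound exists.
    Returns None when the pattern has no fixed prefix.
    Assumption: the pattern contains no backslash escapes."""
    # The fixed prefix is everything before the first wildcard (% or _).
    end = len(pattern)
    for i, b in enumerate(pattern):
        if b in b"%_":
            end = i
            break
    prefix = pattern[:end]
    if not prefix:
        return None  # e.g. '%abc': nothing to anchor a range scan on

    # Upper bound: drop trailing 0xFF bytes, then increment the last byte.
    upper = bytearray(prefix)
    while upper and upper[-1] == 0xFF:
        upper.pop()
    if not upper:
        return prefix, None
    upper[-1] += 1
    return prefix, bytes(upper)


if __name__ == "__main__":
    print(like_prefix_range(b"abc%"))
```

With byte-wise comparison, col LIKE 'abc%' then behaves like col >= 'abc' AND col < 'abd', which a btree can scan directly.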
Looking at CVS, I see the warning about non-C locales has been removed. 
Should we instead mention the new LIKE index method?
	# (Be sure to maintain the correspondence with locale_is_like_safe() in selfuncs.c.)
	if test x`pg_getlocale COLLATE` != xC && test x`pg_getlocale COLLATE` != xPOSIX; then
	    echo "This locale setting will prevent the use of indexes for pattern matching"
	    echo "operations.  If that is a concern, rerun $CMDNAME with the collation order"
	    echo "set to \"C\".  For more information see the Administrator's Guide."
	fi
Doing LIKE with single-byte encodings would be easy because it would
take only 256 comparisons to find the min/max character values, but that
doesn't work with multi-byte encodings, right?
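To make the single-byte idea concrete, here is a rough Python sketch of that scan: walk the byte values once and keep whichever character the collation ranks lowest and highest. The function name collation_extremes and its parameters are hypothetical, the comparison function is passed in so the sketch does not depend on the machine's locale, and byte 0 is skipped because C-style string comparison cannot handle an embedded NUL. It says nothing about how a multi-byte encoding would be handled, where the set of characters is far too large to enumerate this way.

```python
import locale


def collation_extremes(compare=locale.strcoll, encoding="latin-1"):
    """Scan the single-byte values 1..255 and return (min_char, max_char)
    according to compare(a, b), which must return a negative, zero, or
    positive number like strcoll().
    Assumption: every byte value decodes to one character in `encoding`."""
    chars = [bytes([b]).decode(encoding) for b in range(1, 256)]
    lowest = highest = chars[0]
    for ch in chars[1:]:
        if compare(ch, lowest) < 0:
            lowest = ch
        if compare(ch, highest) > 0:
            highest = ch
    return lowest, highest


def codepoint_compare(a: str, b: str) -> int:
    """Stand-in for a byte-wise ('C' locale style) comparison."""
    return (a > b) - (a < b)


if __name__ == "__main__":
    print(collation_extremes(codepoint_compare))
```

Passing locale.strcoll after a locale.setlocale(locale.LC_COLLATE, ...) call would apply the same scan to a real collation; the result then depends on the locale data installed on the system.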
This LIKE/encoding problem is a tricky one because it gives poor
performance with little warning to users.
-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman(at)candle(dot)pha(dot)pa(dot)us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073