| From: | Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my> | 
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org> | 
| Cc: | Mage <mage(at)mage(dot)hu>, pgsql-general(at)postgreSQL(dot)org, Greg Stark <gsstark(at)mit(dot)edu> | 
| Subject: | Re: is this a bug or I am blind? | 
| Date: | 2005-12-17 03:49:48 | 
| Message-ID: | 5.2.1.1.1.20051217104647.02d1eef0@localhost | 
| Lists: | pgsql-general | 
At 01:40 PM 12/16/2005 -0500, Tom Lane wrote:
>Nobody's said anything about giving up locale-sensitive sorting.  The
>question is about locale-sensitive equality: does it really make sense
>that 'tty' = 'tyty'?  Would your answer change in the context
>'/dev/tty' = '/dev/tyty'?  Are you willing to *not have access* to a
>text comparison operator that will make the distinction?
>
>I'm inclined to think that this is more like the occasional need for
>accent-insensitive comparisons.  It seems generally agreed that you want
>something like smash('ab') = smash('áb') rather than making the
>strings equal in all contexts.
I agree.
I would prefer everything to be compared without any 
collation/corruption by default, with a function to pick the 
desired comparison behaviour. (Can all of that functionality be provided by the 
collate clause?)
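
To make the distinction concrete, here is a minimal sketch in Python rather than SQL. It assumes that `smash` means "strip accents and fold case"; the function body is a hypothetical stand-in, not anything PostgreSQL provides. Plain equality stays exact, and the looser comparison has to be asked for explicitly:

```python
import unicodedata


def smash(text: str) -> str:
    """Hypothetical helper: drop accents and fold case for loose matching."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep only the characters that are not combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# The default comparison is exact: different strings are never equal.
exact_equal = "ab" == "áb"

# The loose comparison is opt-in, through the explicit function call.
loose_equal = smash("ab") == smash("áb")

# Strings that differ in their letters stay distinct either way.
paths_equal = smash("/dev/tty") == smash("/dev/tyty")
```

With this arrangement a column of keys or hashes is compared exactly unless the query says otherwise, and only the columns that hold human names need to go through something like `smash`.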
This matters because most databases are multi-locale, whether the humans are aware of it 
or not:
the computer "locale", human locale #1, an unknown/international locale, human 
locale #2, ...
In a column for license keys, "tty" should rarely be the same as "tyty".
In a column for base64 data (crypto hashes, etc) "tty" should NEVER be the 
same as "tyty".
In a column for domain names, I doubt it is clear whether you want to match 
tty.ibm.hu just because tyty.ibm.hu exists.
But in a column for license owner names, one might want "tty" and "tyty" to 
be the same - one might have to have a multicolumn index depending on the 
owner's locale of choice.
For these reasons, I recommend that initdb always pick "non-mangled" 
text by default, no matter what the locale setting is, and that users 
be advised of the potential consequences of mangling (I would even 
say corrupting) all text in their databases by default.
Regards,
Link.