Case preserving - suggestions

From: Shachar Shemesh <psql(at)shemesh(dot)biz>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Case preserving - suggestions
Date: 2004-06-06 17:47:20
Message-ID: 40C358A8.7090002@shemesh.biz
Lists: pgsql-hackers

Hi list,

A PostgreSQL migration I am doing (the same one for which the OLE DB
driver was written) has finally passed the proof-of-concept stage
(phew). I now have lots of tidbits, tricks and tips for SQL Server
migration that I would love to put online. Is pgFoundry the right
place? I understand that the code snippets section is not yet
operative, but I would still like to publish them ASAP (i.e., before I
forget) and to have everything in one place.

One problem detected during that stage, however, is that the program
relies heavily on the collation being case insensitive. I am now
gathering information about adding case-preserving (but
case-insensitive) behavior to PostgreSQL. I already suggested handling
this by changing the procedures, and that idea was turned down. For
example, a UNIQUE constraint on a column must ensure that only one
instance of a string is present, compared case-insensitively. Then
again, folding everything to lower or upper case before storing it was
also rejected. Case preserving is what we are looking for.
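
To make the requirement concrete, here is a minimal sketch in Python
(not PostgreSQL code) of a "column" whose uniqueness check ignores case
while the stored value keeps its original spelling. The class and
method names are hypothetical, and str.casefold() stands in for
whatever folding rule a real collation would define.

```python
class CasePreservingUniqueColumn:
    """Toy model of a UNIQUE column that compares case-insensitively
    but stores each value exactly as it was inserted."""

    def __init__(self):
        # Maps the case-folded key to the original spelling.
        self._rows = {}

    def insert(self, value):
        key = value.casefold()  # assumption: casefold() is the folding rule
        if key in self._rows:
            raise ValueError(
                f"duplicate key: {value!r} conflicts with {self._rows[key]!r}"
            )
        self._rows[key] = value

    def lookup(self, value):
        # Returns the stored spelling, or None if no match exists.
        return self._rows.get(value.casefold())


col = CasePreservingUniqueColumn()
col.insert("Shachar")
print(col.lookup("SHACHAR"))  # prints "Shachar": the stored spelling
```

Inserting "SHACHAR" after "Shachar" raises ValueError here, which is
the behavior the UNIQUE constraint would need, while lower-casing on
insert would have lost the original spelling.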

Now, one idea that floated through my mind (I have not yet looked into
how difficult it would be to implement) was to define a new system-wide
collation, called, for example, en_USCI, and have that collation define
'a' and 'A' as "the same character". I'm looking for someone with more
experience with these things than me (i.e., just about anyone) to say
whether such a thing is doable. I know I can reorder sort criteria using
a collation, but can I make two characters actually be the same? As a
side note, I'll mention that MS SQL Server uses the collation field to
define case insensitivity.
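
As an illustration of what "the same character" would mean for a
collation, the following Python sketch defines a comparison function
that returns 0 for strings differing only in case. The name ci_compare
is hypothetical, and this says nothing about whether PostgreSQL's
equality operators would honor such a result; it only shows the
semantics being asked for.

```python
from functools import cmp_to_key


def ci_compare(left, right):
    """Hypothetical 'en_USCI'-style comparison: returns 0 for strings
    that differ only in case, otherwise a negative or positive number."""
    a, b = left.casefold(), right.casefold()
    return (a > b) - (a < b)


names = ["bob", "Alice", "BOB", "alice"]

# sorted() is stable, so strings that compare equal keep their input order.
ordered = sorted(names, key=cmp_to_key(ci_compare))
print(ordered)  # ['Alice', 'alice', 'bob', 'BOB']
```

Under this comparison 'a' and 'A' do not merely sort next to each
other; they are indistinguishable, which is a stronger property than
reordering sort criteria.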

Assuming that fails, how hard would it be to create a case insensitive
PostgreSQL? Would that be more like changing a couple of places (say,
hash computation and string compares), or would that entail making
hundreds of little changes all over the code? Is there anything in the
regression testing infrastructure that can help check such a change?
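
The reason hash computation and string compares are named together can
be sketched in Python: if equality ignores case but the hash does not,
equal values can end up in different hash buckets. Both classes below
are hypothetical illustrations, not a model of PostgreSQL internals.

```python
class CIText:
    """Hypothetical case-insensitive string wrapper."""

    def __init__(self, text):
        self.text = text  # original spelling is preserved

    def __eq__(self, other):
        return (
            isinstance(other, CIText)
            and self.text.casefold() == other.text.casefold()
        )

    def __hash__(self):
        # Hashes the same folded form that __eq__ compares.
        return hash(self.text.casefold())


class BrokenCIText(CIText):
    """Same comparison, but hashes the raw spelling, so values that
    compare equal usually get different hash codes."""

    def __hash__(self):
        return hash(self.text)


good = {CIText("Foo"), CIText("FOO")}
bad = {BrokenCIText("Foo"), BrokenCIText("FOO")}
print(len(good))  # 1: the two spellings collapse into one entry
print(len(bad))   # typically 2: the hash table never compares them
```

Any structure that relies on hashing (hash indexes, hash joins, hash
aggregation) would need the hash and the comparison changed together,
which is one way to start counting the places that would be affected.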

Many thanks,
Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting
http://www.lingnu.com/
