Skip site navigation (1) Skip section navigation (2)

Bringing PostgreSQL torwards the standard regarding case folding

From: Shachar Shemesh <psql(at)shemesh(dot)biz>
To: PostgreSQL development <pgsql-hackers(at)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>,Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>,Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Subject: Bringing PostgreSQL torwards the standard regarding case folding
Date: 2004-04-25 14:44:23
Message-ID: 408BCEC7.4080503@shemesh.biz (view raw or flat)
Thread:
Lists: pgsql-hackers
I'm opening a new thread, as the previous one was too nested, and 
contained too much emotions.

I'll start by my understanding of a summary of the thread so far. The 
solution we are seeking would have to satisfy the following conditions:
1. Setting should be on a per-database level. A per-server option is not 
good enough, and a per-session option is too difficult to implement, 
with no apparent justifiable return.
2. Old applications already working with PG's lowercase folding should 
have an option to continue working unmodified for the foreseeable future.

Solutions offered so far, and their status:
1. Tri-state. Folder upper, if failes, fold lower, if succeeds, warn.
2. Dual state. Fold lower or upper. Break if client is broken.
3. Create a database conversion tool to change existing case.

Solution 1:
As currently the case folding is performed disjointed from the actual 
use of the identifier, this solution requires quite a big amount of 
work. On the other hand, and on second thought, it's main benefit - 
gradual transition of applications from one to the other, is not really 
necessary once you declare the current behaviour as there to stay. 
Existing applications can simply choose to continue using whatever 
method they currently use. No need for migration.

As such, I think we can simply state that tri-state migration path 
solution can be discarded for the time being.

Solution 2:
Obviously, this is the way to go. We will have a dabase attribute that 
states whether things are lower or upper case there.

Solution 3:
(unrelated to the above)
There seems to be some ambiguity about how to handle the translation. 
Such a tool seems to require guessing which identifiers are accessed 
quoted, unquoted, or both. The last option, of course, will never work.

We may need such a tool, for some projects may wish to transform from 
one way to the other. It seems to me, however, that such a tool can wait 
a little.

Open issues:
1. What do we do with identifiers in "template1" upon database creation?
2. How do we handle queries to tables belonging the catalog that are 
shared between databases?

Observation: on a lowercase folding DB, any identifier that is not 
composed only of lowercase characters MUST can be automatically assumed 
to be accessed only through quoted mode.

I therefor suggest the following path to a solution:
1. CreateDB will be able to create databases from either type.
2. template1 will be defined to be one or the other. For the sake of 
this discussion, let's assume it's lowercase (current situation)
3. CreateDB, upon being asked to create a new DB that has uppercase 
folding, will copy over template1, as it currently does.
4. While copying, it will check each identifier. If the identifier is 
not lowercase only, it is safe to copy it verbatim.
5. If the identifier is lowercase only, convert it to uppercase only. I 
am assuming here that the authors of the client code chose an 
uppercase-folding database, so they should know what they are doing when 
accessing stuff from the standard offering.
6. I'm not sure what are the shared tables from the catalog. I don't 
think it so unreasonable to ask anyone doing catalog work to assume that 
catalog entries are case-sensitive. As such, maybe it's best to just 
leave the data as is.
7. Column headers, however, will have to have a solution. A point still 
open in current design.

I'm hoping this summary helps in furthering the discussion.

-- 
Shachar Shemesh
Lingnu Open Source Consulting
http://www.lingnu.com/


Responses

pgsql-hackers by date

Next:From: Rod TaylorDate: 2004-04-25 15:11:33
Subject: Re: Bringing PostgreSQL torwards the standard regarding
Previous:From: Peter EisentrautDate: 2004-04-25 14:22:28
Subject: Re: [HACKERS] What can we learn from MySQL?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group