Re: [HACKERS] Case Preservation disregarding case

From: "Chuck McDevitt" <cmcdevitt(at)greenplum(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Stephan Szabo" <sszabo(at)megazone(dot)bigpanda(dot)com>, "beau hargis" <beauh(at)bluefrogmobile(dot)com>, pgsql-sql(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Case Preservation disregarding case
Date: 2006-11-01 16:31:30
Message-ID: EB48EBF3B239E948AC1E3F3780CF8F88012F8CD6@MI8NYCMAIL02.Mi8.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

Sorry, my last mail wasn't well thought out. Yes, the
information_schema needs the case-folded name (although it might be ok
to add additional columns to the information_schema for extra
information).

But, stepping back from all that, what is it the users want?

1) When re-creating a CREATE TABLE statement from whatever catalog
info, they'd like the names to come back exactly as then entered them.
If I do:
CREATE TABLE BobsTable (WeeklySales numeric(10,2),
"SomeStrangeName" int);

They'd like to see exactly that when the CREATE TABLE gets
re-created, not what we do now:

CREATE TABLE bobstable (weeklysales numeric(10,2),
"SomeStrangeName" int);

2) When doing reports, they'd like the name as entered to be the title
of the column:
Select * from bobstable;

Would be nice if they saw this:
WeeklySales SomeStrangeName
----------- ---------------


For compatibility with existing apps and spec compliance, you'd still
want PQfname() to return the case-folded name.
But there isn't any reason you couldn't also return a "suggested title"
field (PQftitle?) which preserves the user's case.

You could also extend the concept of a PQftitle to make nicer titles for
expressions. Instead of
SELECT sum(WeeklySales) from BobsTable;

Producing "?column?" or somesuch to use in the report, it could return a
title like "sum(WeeklySales)"

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Tuesday, October 31, 2006 10:38 PM
To: Chuck McDevitt
Cc: Stephan Szabo; beau hargis; pgsql-sql(at)postgresql(dot)org;
pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] [SQL] Case Preservation disregarding case

"Chuck McDevitt" <cmcdevitt(at)greenplum(dot)com> writes:
> Equivalent, yes. But I can interpret that clause it mean I can show
> either the case folded or non-case-folded value in the information
> schema, as they are equivalent.

Well, that's an interesting bit of specs-lawyering, but I don't see
how you can defend it against these rules in SQL99 5.2:

21) For every <identifier body> IB there is exactly one
corresponding case-normal form CNF. CNF is an <identifier
body>
derived from IB as follows.

Let n be the number of characters in IB. For i ranging from
1
(one) to n, the i-th character M(i) of IB is translated into
the
corresponding character or characters of CNF as follows.

Case:

a) If M(i) is a lower case character or a title case
character
for which an equivalent upper case sequence U is defined
by
Unicode, then let j be the number of characters in U; the
next j characters of CNF are U.

b) Otherwise, the next character of CNF is M(i).

22) The case-normal form of the <identifier body> of a <regular
identifier> is used for purposes such as and including
determination of identifier equivalence, representation in
the Definition and Information Schemas, and representation
in
diagnostics areas.

NOTE 44 - Any lower-case letters for which there are no
upper-
case equivalents are left in their lower-case form.

Again, obviously we are not compliant because we fold to lower rather
than upper case, but I do not see how you can read (22) as not requiring
the information schema to show the upper-cased form. The output of
functions such as PQfname() might be considered closer to diagnostics
info than information schema, but that's covered too.

But the really serious problem with what you propose is that it would
allow two table columns with names that the system considers distinct
to show as the same string in the information schema and diagnostic
outputs. That can't be acceptable --- it's going to break any
application that does any nontrivial analysis of what it sees there,
not to mention that it violates various primary key constraints in
the information schema specification.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-11-01 17:05:25 Re: Writing WAL for relcache invalidation: pg_internal.init
Previous Message korryd 2006-11-01 15:58:52 [Fwd: pg_migrator: in-place upgrade tool at pgFoundry]

Browse pgsql-sql by date

  From Date Subject
Next Message Craig Servin 2006-11-01 16:50:33 Statement Triggers and access to other rows in the transaction
Previous Message Raghuraman K 2006-11-01 16:17:44 Distribution of results